- python3 -m recipe.cppo.cppo_main_ppo algorithm.adv_estimator=grpo algorithm.use_kl_in_reward=False data.train_files=../data/virl39k_verl/train.parquet data.val_files=../data/geometry3k_verl/test.parquet data.train_batch_size=512 data.max_prompt_length=512 data.max_response_length=4096 data.filter_overlong_prompts=True data.truncation=right data.image_key=images data.custom_cls.path=recipe/cppo/cppo_dataset.py data.custom_cls.name=cppo_RLHFDataset actor_rollout_ref.model.path=../pretrained_models/Qwen2.5-VL-3B-Instruct actor_rollout_ref.actor.optim.lr=1e-6 actor_rollout_ref.model.use_remove_padding=True actor_rollout_ref.actor.ppo_mini_batch_size=64 actor_rollout_ref.actor.ppo_micro_batch_size_per_gpu=8 actor_rollout_ref.actor.use_kl_loss=True actor_rollout_ref.actor.use_vision_cpl_loss=True actor_rollout_ref.actor.cpl_loss_coef=0.01 actor_rollout_ref.actor.cpl_use_vision_mask=True actor_rollout_ref.actor.cpl_vision_top_percent=0.5 actor_rollout_ref.actor.cpl_use_advantage_gating=True actor_rollout_ref.actor.cpl_tau=0.1 actor_rollout_ref.actor.kl_loss_coef=0.01 actor_rollout_ref.actor.kl_loss_type=low_var_kl actor_rollout_ref.actor.entropy_coeff=0 actor_rollout_ref.model.enable_gradient_checkpointing=True actor_rollout_ref.actor.fsdp_config.param_offload=False actor_rollout_ref.actor.fsdp_config.optimizer_offload=False actor_rollout_ref.rollout.log_prob_micro_batch_size_per_gpu=20 actor_rollout_ref.rollout.tensor_model_parallel_size=2 actor_rollout_ref.rollout.name=vllm actor_rollout_ref.rollout.engine_kwargs.vllm.disable_mm_preprocessor_cache=True actor_rollout_ref.rollout.gpu_memory_utilization=0.5 actor_rollout_ref.rollout.enable_chunked_prefill=False actor_rollout_ref.rollout.enforce_eager=False actor_rollout_ref.rollout.free_cache_engine=True actor_rollout_ref.rollout.n=5 actor_rollout_ref.ref.log_prob_micro_batch_size_per_gpu=20 actor_rollout_ref.ref.fsdp_config.param_offload=True trainer.critic_warmup=0 'trainer.logger=["console", "wandb"]' 
trainer.project_name=Qwen2.5-VL-3B-Instruct-virl39k-CPPO trainer.experiment_name=exp1_CPPO trainer.n_gpus_per_node=4 trainer.nnodes=1 trainer.save_freq=31 trainer.test_freq=31 trainer.val_before_train=False trainer.default_local_dir=checkpoints/Qwen2.5-VL-3B-Instruct-virl39k-CPPO/exp1_CPPO trainer.rollout_data_dir=outputs/Qwen2.5-VL-3B-Instruct-virl39k-CPPO/exp1_CPPO/rollout_dump_train trainer.validation_data_dir=outputs/Qwen2.5-VL-3B-Instruct-virl39k-CPPO/exp1_CPPO/rollout_dump_val trainer.total_epochs=2
In 'ppo_trainer': Could not find 'data/legacy_data'
Config search path:
provider=hydra, path=pkg://hydra.conf
provider=main, path=file:///home/jeehye/0214/cppo/verl/recipe/cppo/config
provider=hydra.searchpath in main, path=file://verl/trainer/config
provider=schema, path=structured://
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Running the command above fails with this error. Could you please provide the legacy_data.yaml file?
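For anyone hitting the same message, here is a minimal diagnostic sketch (not part of the repository) that checks each `file://` entry of the search path above for `data/legacy_data.yaml`. It assumes that Hydra resolves a relative `file://` path against the current working directory and that the config group file carries a `.yaml` extension. The helper name `find_config` is hypothetical.

```python
from pathlib import Path


def find_config(search_paths, group_file, cwd="."):
    """For each file:// search path entry, return (candidate_path, exists).

    Assumption: relative file:// entries are resolved against `cwd`,
    i.e. the directory the training command was launched from.
    """
    results = []
    for entry in search_paths:
        if not entry.startswith("file://"):
            # pkg:// and structured:// entries are not plain directories.
            continue
        base = Path(entry[len("file://"):])
        if not base.is_absolute():
            base = Path(cwd) / base
        candidate = base / group_file
        results.append((str(candidate), candidate.is_file()))
    return results


if __name__ == "__main__":
    # Search path entries copied from the error message above.
    paths = [
        "pkg://hydra.conf",
        "file:///home/jeehye/0214/cppo/verl/recipe/cppo/config",
        "file://verl/trainer/config",
        "structured://",
    ]
    for candidate, exists in find_config(paths, "data/legacy_data.yaml"):
        print("found  " if exists else "missing", candidate)
```

Run it from the same directory as the training command. If every candidate is reported as missing, then either the file is absent from the checkout or the relative entry `file://verl/trainer/config` points somewhere other than intended.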