
no legacy_data.yaml file #3

@naajeehxe

python3 -m recipe.cppo.cppo_main_ppo \
    algorithm.adv_estimator=grpo \
    algorithm.use_kl_in_reward=False \
    data.train_files=../data/virl39k_verl/train.parquet \
    data.val_files=../data/geometry3k_verl/test.parquet \
    data.train_batch_size=512 \
    data.max_prompt_length=512 \
    data.max_response_length=4096 \
    data.filter_overlong_prompts=True \
    data.truncation=right \
    data.image_key=images \
    data.custom_cls.path=recipe/cppo/cppo_dataset.py \
    data.custom_cls.name=cppo_RLHFDataset \
    actor_rollout_ref.model.path=../pretrained_models/Qwen2.5-VL-3B-Instruct \
    actor_rollout_ref.actor.optim.lr=1e-6 \
    actor_rollout_ref.model.use_remove_padding=True \
    actor_rollout_ref.actor.ppo_mini_batch_size=64 \
    actor_rollout_ref.actor.ppo_micro_batch_size_per_gpu=8 \
    actor_rollout_ref.actor.use_kl_loss=True \
    actor_rollout_ref.actor.use_vision_cpl_loss=True \
    actor_rollout_ref.actor.cpl_loss_coef=0.01 \
    actor_rollout_ref.actor.cpl_use_vision_mask=True \
    actor_rollout_ref.actor.cpl_vision_top_percent=0.5 \
    actor_rollout_ref.actor.cpl_use_advantage_gating=True \
    actor_rollout_ref.actor.cpl_tau=0.1 \
    actor_rollout_ref.actor.kl_loss_coef=0.01 \
    actor_rollout_ref.actor.kl_loss_type=low_var_kl \
    actor_rollout_ref.actor.entropy_coeff=0 \
    actor_rollout_ref.model.enable_gradient_checkpointing=True \
    actor_rollout_ref.actor.fsdp_config.param_offload=False \
    actor_rollout_ref.actor.fsdp_config.optimizer_offload=False \
    actor_rollout_ref.rollout.log_prob_micro_batch_size_per_gpu=20 \
    actor_rollout_ref.rollout.tensor_model_parallel_size=2 \
    actor_rollout_ref.rollout.name=vllm \
    actor_rollout_ref.rollout.engine_kwargs.vllm.disable_mm_preprocessor_cache=True \
    actor_rollout_ref.rollout.gpu_memory_utilization=0.5 \
    actor_rollout_ref.rollout.enable_chunked_prefill=False \
    actor_rollout_ref.rollout.enforce_eager=False \
    actor_rollout_ref.rollout.free_cache_engine=True \
    actor_rollout_ref.rollout.n=5 \
    actor_rollout_ref.ref.log_prob_micro_batch_size_per_gpu=20 \
    actor_rollout_ref.ref.fsdp_config.param_offload=True \
    trainer.critic_warmup=0 \
    'trainer.logger=["console", "wandb"]' \
    trainer.project_name=Qwen2.5-VL-3B-Instruct-virl39k-CPPO \
    trainer.experiment_name=exp1_CPPO \
    trainer.n_gpus_per_node=4 \
    trainer.nnodes=1 \
    trainer.save_freq=31 \
    trainer.test_freq=31 \
    trainer.val_before_train=False \
    trainer.default_local_dir=checkpoints/Qwen2.5-VL-3B-Instruct-virl39k-CPPO/exp1_CPPO \
    trainer.rollout_data_dir=outputs/Qwen2.5-VL-3B-Instruct-virl39k-CPPO/exp1_CPPO/rollout_dump_train \
    trainer.validation_data_dir=outputs/Qwen2.5-VL-3B-Instruct-virl39k-CPPO/exp1_CPPO/rollout_dump_val \
    trainer.total_epochs=2

    In 'ppo_trainer': Could not find 'data/legacy_data'

Config search path:
provider=hydra, path=pkg://hydra.conf
provider=main, path=file:///home/jeehye/0214/cppo/verl/recipe/cppo/config
provider=hydra.searchpath in main, path=file://verl/trainer/config
provider=schema, path=structured://

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

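Running the command above fails with this error: Hydra cannot find the 'data/legacy_data' config in any of the search paths listed. Could you please provide the legacy_data.yaml file?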
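
For anyone hitting the same error, here is a minimal diagnostic sketch in Python that checks which of the file-based search paths from the message actually contain data/legacy_data.yaml. The helper name find_config is hypothetical and not part of Hydra or verl. The sketch assumes that a relative entry such as file://verl/trainer/config is resolved against the directory the command is launched from. It skips the pkg:// and structured:// entries because they are not plain directories.

```python
from pathlib import Path


def find_config(search_paths, config_name, cwd="."):
    """Return the file:// search-path directories that contain <config_name>.yaml.

    Hypothetical helper for illustration only; it is not a Hydra or verl API.
    """
    hits = []
    for entry in search_paths:
        if not entry.startswith("file://"):
            continue  # pkg:// and structured:// entries are not plain directories
        root = Path(entry[len("file://"):])
        if not root.is_absolute():
            # Assumption: relative entries are resolved against the launch directory.
            root = Path(cwd) / root
        if (root / f"{config_name}.yaml").is_file():
            hits.append(str(root))
    return hits


# The two file-based entries copied from the error message above.
SEARCH_PATHS = [
    "file:///home/jeehye/0214/cppo/verl/recipe/cppo/config",
    "file://verl/trainer/config",
]

if __name__ == "__main__":
    # An empty list means neither directory holds data/legacy_data.yaml.
    print(find_config(SEARCH_PATHS, "data/legacy_data"))
```

If the list is empty when run from the launch directory, either the file is missing from the checkout or the relative search path points somewhere other than the intended verl/trainer/config directory.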
