GitHub - ShaoZhenLiu/RPT: Reinforcement Pretraining, also known as RL in Pretrain; a failed attempt

An attempt at RL in Pretrain built on verl, though applied to a VLM.

This repository also tried adding an SFT loss, which turned out to degrade performance 🤡

I will come back to the code later, but for now it stays as is.

examples
recipe
scripts
tests
verl
.gitattributes
.gitignore
.pre-commit-config.yaml
CONTRIBUTING.md
LICENSE
Notice.txt
README.md
pyproject.toml
requirements-cuda.txt
requirements.txt
requirements_sglang.txt
setup.py
