Conversation
nit: can we keep this file as it is and move all MLflow-related examples to the mlflow folder?
It seems that the new notebook has better training results, so it is worth keeping the new one :)
example/rlhf/mlflow/demo_rl.py
Outdated
nit: is this file supposed to be here? I do not see any MLflow-related logic.
nit: same for this file. Are you planning to add MLflow to it in the future and just making a copy of the current training script first?
from pykoi.chat import QuestionAnswerDatabase
from pykoi.rlhf import RLHFConfig
from pykoi.rlhf import SupervisedFinetuning
import mlflow
All corrected. But please hold off on approving and merging until we meet. There are some issues that we can discuss in person.
This is the final version. All the added scripts have been tested.
# data
*.csv
!example/rlhf/mlflow/input_rw/ranking.csv
!example/rlhf/ranking.csv
Maybe these two CSV files can be kept for future users?
They are kept for the demo.
(Reply to @goldmermaid's comment on .gitignore of Oct 7, 2023, 4:50 PM.)
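To illustrate why the two `ranking.csv` files stay tracked while every other CSV is ignored, here is a rough Python sketch of the "last matching rule wins" behaviour of the `.gitignore` lines quoted above. It is a simplified model and not Git's actual matcher: the helper name `is_ignored` is hypothetical, and it only handles plain globs plus `!` negation.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# The three rules from the .gitignore hunk discussed in this thread.
RULES = [
    "*.csv",
    "!example/rlhf/mlflow/input_rw/ranking.csv",
    "!example/rlhf/ranking.csv",
]


def is_ignored(path, rules=RULES):
    """Simplified model: the last rule that matches decides the outcome."""
    ignored = False
    for rule in rules:
        negated = rule.startswith("!")
        pattern = rule[1:] if negated else rule
        # Assumption of this sketch: a pattern without "/" is compared with
        # the file name only, any other pattern with the whole relative path.
        target = path if "/" in pattern else PurePosixPath(path).name
        if fnmatchcase(target, pattern):
            ignored = not negated
    return ignored


for candidate in ("data/train.csv", "example/rlhf/ranking.csv"):
    print(candidate, "ignored" if is_ignored(candidate) else "tracked")
```

Running `git check-ignore -v <path>` in the repository is the authoritative way to confirm which rule applies to a given file.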
example/rlhf/demo_rl.ipynb
Outdated
AttributeError                            Traceback (most recent call last)
/home/ubuntu/pykoi/example/rlhf/demo_rl.ipynb Cell 7 line 1
      1 from accelerate import notebook_launcher
      3 config = RLHFConfig(base_model_path="elinas/llama-7b-hf-transformers-4.29", # "elinas/llama-7b-hf-transformers-4.29",
      4     dataset_type="local_db",
      5     reward_model_path="goldmermaid/rlhf_reward_model",
    (...)
      9
     10 )
---> 11 rlhf_step3_rl = RL(config)
     12 rlhf_step3_rl.train("./models/rlhf_step3_rl", num_processes=1)
/home/ubuntu/pykoi/example/rlhf/demo_rl.ipynb Cell 7 line 9
      5 self.num_proc = self._rlhf_config.num_workers if not self._rlhf_config.streaming else None
      6 set_seed(rlhf_config.seed) ## TODO: how to set seed properly in __init__?
      8 self.ppo_config=PPOConfig(
----> 9     steps=self._rlhf_config.total_ppo_epochs,
     10     model_name=self._rlhf_config.base_model_path,
     11     learning_rate=self._rlhf_config.learning_rate,
     12     batch_size=self._rlhf_config.ppo_batch_size,
     13     mini_batch_size=self._rlhf_config.mini_batch_size,
     14     gradient_accumulation_steps=self._rlhf_config.gradient_accumulation_steps,
     15     optimize_cuda_cache=True,
     16     early_stopping=self._rlhf_config.early_stopping,
     17     target_kl=self._rlhf_config.target_kl,
     18     ppo_epochs=self._rlhf_config.ppo_epochs,
     19     seed=self._rlhf_config.seed,
     20     init_kl_coef=self._rlhf_config.init_kl_coef,
     21     adap_kl_ctrl=self._rlhf_config.adap_kl_ctrl,
     22     # accelerator_kwargs=self._rlhf_config.accelerator_kwargs,
     23 )
     25 ## Load the base model and tokenizer and define the PPO Trainer for RL
     26 self.base_tokenizer = self.create_tokenizer(rlhf_config.base_model_path)
AttributeError: 'RLHFConfig' object has no attribute 'total_ppo_epochs'
It seems that there is still an error here. Can you clean up the output and write a to-do note?
Alternatively, can we remove this step 3 rl notebook example? For step 3, we use the py file, not the notebook file, as the example.
In the mlflow folder, I removed the notebook for step 3 rl.
Error commented. Added a to-do comment in that block.
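For the to-do note, one possible way to surface this kind of mismatch before training starts is to check the config for the attributes the trainer reads. The sketch below is only an illustration: `DemoConfig` is a hypothetical stand-in and not pykoi's `RLHFConfig`, its default values are made up, and the required names are simply a few of the attributes that appear in the traceback.

```python
from dataclasses import dataclass


@dataclass
class DemoConfig:
    """Hypothetical stand-in for a config object; not pykoi's RLHFConfig."""
    base_model_path: str = "elinas/llama-7b-hf-transformers-4.29"
    learning_rate: float = 1e-5  # illustrative value only
    ppo_epochs: int = 4          # illustrative value only


# A few of the attribute names that the traceback shows being read.
REQUIRED = ("base_model_path", "learning_rate", "ppo_epochs", "total_ppo_epochs")


def missing_fields(config, required=REQUIRED):
    """Return the required attribute names that the config does not define."""
    return [name for name in required if not hasattr(config, name)]


missing = missing_fields(DemoConfig())
print(missing)  # → ['total_ppo_epochs']
```

A check like this turns the late `AttributeError` into an early, explicit list of what the notebook's config is missing.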
Or can we remove this step 3 RL notebook example? For step 3 we only use the .py file as the example. In the mlflow folder, I removed this notebook.
On Oct 7, 2023, 4:54 PM, goldmermaid wrote:
In example/rlhf/demo_rl.ipynb:
"output_type": "error",
"traceback": [
- "The Kernel crashed while executing code in the the current cell or a previous cell. Please review the code in the cell(s) to identify a possible cause of the failure. Click <a href='https://aka.ms/vscodeJupyterKernelCrash'>here</a> for more info. View Jupyter <a href='command:jupyter.viewOutput'>log</a> for further details."
+ "---------------------------------------------------------------------------",
+ "AttributeError                            Traceback (most recent call last)",
+ (two traceback frames, identical to the AttributeError traceback quoted earlier in this thread)
+ "AttributeError: 'RLHFConfig' object has no attribute 'total_ppo_epochs'"
It seems that there is still an error here. Can you clean up the output and write a to-do note?
goldmermaid
left a comment
Love the MLflow example! @larryyin Please make the minor changes above.
Added draft example code to train with MLflow. It ran into some GPU issues. We can discuss them when we meet, maybe on Friday.
This is not the final version, just a point for review.