Description
Thanks for your great work.
I followed the instructions in the README and ran "bash scripts/gpt2/minillm/train_base_xl.sh". The student model is init-gpt2-base, which is provided in the README. The Rouge-L I got at the very beginning of training was already higher than the value reported in the paper (24.6), which seems strange. You can see some of the Rouge-L results below:
eval | rougeL: 25.100 | exact_match: 3.200 | rev_kl: 1.894 | lens: 69.771
train | data_epochs 0/10 | inner iter: 3/ 8 | ppo epoch: 0/ 4 | global iter: 100/ 5000| tot_loss: 3.7369 | rl_loss: 3.7369 | pt_loss: 0.0000 | pg_loss: 1.3049 | reg_loss: 2.4320 | reward: -1.7812 | rev_kl: 2.5176 | stu_lens: 67.8750 | mixed_lens: 55.2188 | lr: 5.0000e-06 | scale: 2048.00 | time: 1.298 | step time: 0.000
...
eval | rougeL: 25.650 | exact_match: 2.900 | rev_kl: 1.713 | lens: 70.656
train | data_epochs 0/10 | inner iter: 7/ 8 | ppo epoch: 0/ 4 | global iter: 200/ 5000| tot_loss: 2.8755 | rl_loss: 2.8755 | pt_loss: 0.0000 | pg_loss: 0.9560 | reg_loss: 1.9194 | reward: -1.4756 | rev_kl: 1.9214 | stu_lens: 92.5000 | mixed_lens: 70.6250 | lr: 5.0000e-06 | scale: 2048.00 | time: 1.283 | step time: 0.000
...
eval | rougeL: 26.270 | exact_match: 3.600 | rev_kl: 1.603 | lens: 65.790
train | data_epochs 0/10 | inner iter: 3/ 8 | ppo epoch: 1/ 4 | global iter: 300/ 5000| tot_loss: 2.2480 | rl_loss: 2.2480 | pt_loss: 0.0000 | pg_loss: 0.6949 | reg_loss: 1.5532 | reward: -1.4756 | rev_kl: 2.4121 | stu_lens: 42.1250 | mixed_lens: 61.0625 | lr: 5.0000e-06 | scale: 2048.00 | time: 1.250 | step time: 0.000
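If I read the training loop correctly, the shape of these logs is at least consistent with my setup: with --num-rollouts 256, a per-GPU batch size of 16, and 2 GPUs, each rollout buffer should yield 256 / (16 × 2) = 8 minibatches, which matches the "inner iter: x/ 8" entries above. So the run itself looks healthy; it is only the absolute Rouge-L values that are higher than expected.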
I trained it on two NVIDIA 3090 GPUs, and the contents of "scripts/gpt2/minillm/train_base_xl.sh" are as follows:
#! /bin/bash
MASTER_ADDR=localhost
MASTER_PORT=${2-2012}
NNODES=1
NODE_RANK=0
GPUS_PER_NODE=${3-2}
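# (note: ${2-2012} and ${3-2} are bash default expansions: positional
# arguments 2 and 3 override MASTER_PORT and GPUS_PER_NODE when supplied)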
DISTRIBUTED_ARGS="--nproc_per_node $GPUS_PER_NODE
--nnodes $NNODES
--node_rank $NODE_RANK
--master_addr $MASTER_ADDR
--master_port $MASTER_PORT"
# model
BASE_PATH=${1-"/workspace/codes/minillm"}
# CKPT_NAME="gpt2-base"
# CKPT="/workspace/codes/minillm/checkpoints/gpt2-base/"
CKPT_NAME="init-gpt2-120M"
CKPT="/workspace/codes/minillm/checkpoints/init-gpt2-120M"
TEACHER_CKPT_NAME="teacher-gpt2-1.5B"
TEACHER_CKPT="/workspace/codes/minillm/checkpoints/teacher-gpt2-1.5B"
# data
PROMPT_DATA_DIR="/workspace/codes/minillm/processed_data/dolly/prompt/gpt2/"
# runtime
SAVE_PATH="/workspace/codes/minillm/results/gpt2/train/minillm/"
# hp
GRAD_ACC=1
BATCH_SIZE=16
CHUNK_SIZE=16
OPTS=""
# model
OPTS+=" --base-path ${BASE_PATH}"
OPTS+=" --model-path ${CKPT}"
OPTS+=" --teacher-model-path ${TEACHER_CKPT}"
OPTS+=" --ckpt-name ${CKPT_NAME}"
OPTS+=" --teacher-ckpt-name ${TEACHER_CKPT_NAME}"
OPTS+=" --n-gpu ${GPUS_PER_NODE}"
OPTS+=" --n-nodes ${NNODES}"
OPTS+=" --teacher-model-fp16"
OPTS+=" --gradient-checkpointing"
# data
OPTS+=" --prompt-data-dir ${PROMPT_DATA_DIR}"
OPTS+=" --lm-data-dir ${LM_DATA_DIR}"
OPTS+=" --dev-num 1000"
OPTS+=" --num-workers 16"
# hp
OPTS+=" --epochs 10"
OPTS+=" --total-iters 5000"
OPTS+=" --kd-ratio 0.5"
OPTS+=" --batch-size ${BATCH_SIZE}"
OPTS+=" --lr 5e-6"
OPTS+=" --lr-min 5e-6"
OPTS+=" --gradient-accumulation-steps ${GRAD_ACC}"
OPTS+=" --max-length 512"
OPTS+=" --max-prompt-length 256"
OPTS+=" --warmup-iters 100"
# runtime
OPTS+=" --save ${SAVE_PATH}"
OPTS+=" --seed 10"
OPTS+=" --seed-ppo 42"
OPTS+=" --seed-lm 7"
OPTS+=" --save-interval 500"
OPTS+=" --eval-interval 100"
OPTS+=" --log-interval 16"
OPTS+=" --mid-log-num 1"
# ppo
OPTS+=" --type minillm"
OPTS+=" --ppo-epochs 4"
OPTS+=" --num-rollouts 256"
OPTS+=" --chunk-size ${CHUNK_SIZE}"
# minillm
OPTS+=" --length-norm"
OPTS+=" --single-step-reg"
OPTS+=" --teacher-mixed-alpha 0.2"
# reward
OPTS+=" --reward-scaling 0.5"
OPTS+=" --cliprange-reward 100"
# gen
OPTS+=" --do-sample"
OPTS+=" --top-k 0"
OPTS+=" --top-p 1.0"
OPTS+=" --temperature 1.0"
# deepspeed
OPTS+=" --deepspeed"
OPTS+=" --deepspeed_config ${BASE_PATH}/configs/deepspeed/ds_config_zero1_fp16.json"
export NCCL_DEBUG=""
export WANDB_DISABLED=True
export TF_CPP_MIN_LOG_LEVEL=3
export PYTHONPATH=${BASE_PATH}
CMD="torchrun ${DISTRIBUTED_ARGS} ${BASE_PATH}/train_minillm.py
echo ${CMD}
echo "PYTHONPATH=${PYTHONPATH}"
mkdir -p ${SAVE_PATH}
${CMD}
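For completeness: the script takes its base path, master port, and GPU count as positional arguments (the ${1-...}, ${2-...}, ${3-...} defaults above), so running it with no arguments, as I did, is equivalent to this invocation with the defaults spelled out:

bash scripts/gpt2/minillm/train_base_xl.sh /workspace/codes/minillm 2012 2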
Do you have any clues why I got these strange Rouge-L results? Thanks a lot for your help.