Llama 1 7b MMLU results largely diverge from reported #291

Open · Edenzzzz opened this issue Apr 23, 2024 · 0 comments

Edenzzzz commented Apr 23, 2024

Hi,
I took the hyperparameters from the paper but got only 32.1 MMLU accuracy. Could you point out what might be wrong here? I've also attached the training logs. Thanks!

python qlora.py \
    --model_name_or_path huggyllama/llama-7b \
    --use_auth \
    --output_dir /fly/results/qlora \
    --logging_steps 10 \
    --save_strategy steps \
    --data_seed 42 \
    --save_steps 500 \
    --save_total_limit 40 \
    --evaluation_strategy steps \
    --eval_dataset_size 1024 \
    --max_eval_samples 1000 \
    --per_device_eval_batch_size 1 \
    --max_new_tokens 32 \
    --dataloader_num_workers 1 \
    --group_by_length \
    --logging_strategy steps \
    --remove_unused_columns False \
    --do_train \
    --do_eval \
    --do_mmlu_eval \
    --lora_r 64 \
    --lora_alpha 16 \
    --lora_modules all \
    --double_quant \
    --quant_type nf4 \
    --bf16 \
    --bits 16 \
    --warmup_ratio 0.03 \
    --lr_scheduler_type constant \
    --gradient_checkpointing \
    --dataset alpaca \
    --source_max_len 16 \
    --target_max_len 512 \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 16 \
    --max_steps 1875 \
    --eval_steps 187 \
    --learning_rate 0.0002 \
    --adam_beta2 0.999 \
    --max_grad_norm 0.3 \
    --lora_dropout 0.1 \
    --weight_decay 0.0 \
    --seed 0 \
    --mmlu_split test
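
One thing worth checking against the paper's setup: the command passes `--double_quant`, `--quant_type nf4`, and `--bf16`, but also `--bits 16`, so the base model is presumably kept in 16-bit rather than quantized, whereas the QLoRA paper's MMLU numbers are for a 4-bit NF4 base model. For reference, below is a minimal sketch of the 4-bit load those flags otherwise describe, assuming the standard transformers + bitsandbytes API (an illustration, not necessarily qlora.py's exact internals):

    # Hedged sketch: a 4-bit NF4 double-quantized base-model load via the
    # standard transformers + bitsandbytes API. Assumption: this mirrors
    # what qlora.py configures when --bits 4 is passed.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,                      # --bits 4 (command above uses --bits 16)
        bnb_4bit_quant_type="nf4",              # --quant_type nf4
        bnb_4bit_use_double_quant=True,         # --double_quant
        bnb_4bit_compute_dtype=torch.bfloat16,  # --bf16
    )

    model = AutoModelForCausalLM.from_pretrained(
        "huggyllama/llama-7b",
        quantization_config=bnb_config,
        device_map="auto",
    )

Whether this alone explains the 32.1 accuracy is unclear, but `--bits 16` is the most visible divergence from the paper's 4-bit setting.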
[attached image: training logs]
@artidoro
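
For context on how a figure like 32.1 is produced: MMLU accuracy for causal LMs is commonly scored by comparing the model's next-token logits for the four answer letters and taking the argmax. A minimal sketch of that convention (an illustration only, not necessarily the exact scoring inside qlora.py's `--do_mmlu_eval`; the helper `mmlu_predict` is hypothetical):

    # Hedged sketch of letter-logit MMLU scoring (a common convention; the
    # helper name mmlu_predict is hypothetical, not taken from qlora.py).
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")
    model = AutoModelForCausalLM.from_pretrained(
        "huggyllama/llama-7b", torch_dtype=torch.bfloat16, device_map="auto"
    )

    CHOICES = ["A", "B", "C", "D"]
    # Token id of each answer letter (last id in case of multi-token pieces).
    choice_ids = [tokenizer(c, add_special_tokens=False).input_ids[-1] for c in CHOICES]

    def mmlu_predict(prompt: str) -> str:
        """Pick the answer letter with the highest next-token logit."""
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        with torch.no_grad():
            next_logits = model(**inputs).logits[0, -1]
        letter_scores = torch.tensor([next_logits[i].item() for i in choice_ids])
        return CHOICES[int(letter_scores.argmax())]

    # Accuracy is the fraction of test questions where the predicted
    # letter matches the answer key.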
Edenzzzz reopened this on Apr 26, 2024
Edenzzzz reopened this on May 29, 2024
Edenzzzz changed the title from "Llama 2 7b MMLU results largely diverge from reported" to "Llama 1 7b MMLU results largely diverge from reported" on May 30, 2024