
Add single mmlu config for lighteval suite #61

Open
lewtun opened this issue Feb 27, 2024 · 1 comment
Labels
feature request

Comments

@lewtun
Member

lewtun commented Feb 27, 2024

Currently it seems that to run MMLU with the lighteval suite, one needs to specify all of the subsets individually, as is done for the leaderboard task set here.
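
For context, "specifying all the subsets individually" means one task entry per MMLU subset. A minimal sketch of what that list looks like is below, assuming the subset names follow the mmlu:<subject> convention used in the leaderboard task set (the linked file is the authoritative source for the exact names and few-shot settings):

    lighteval|mmlu:abstract_algebra|5|0
    lighteval|mmlu:anatomy|5|0
    lighteval|mmlu:astronomy|5|0

and so on for all 57 subjects.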

Is it possible to group these together so that one can just run something like this:

accelerate launch --multi_gpu --num_processes=8 run_evals_accelerate.py \
    --tasks="lighteval|mmlu|5|0" \
    --model_args "pretrained=Qwen/Qwen1.5-0.5B-Chat" \
    --output_dir "./scratch/evals/" --override_batch_size 1

Or do you recommend using one of the other suites like helm or original for this task?

@clefourrier
Member

At the moment it's not possible; however, if you run a task with many subsets (using a config file), the score table should display the average at the task level.
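
For example (a sketch, not tested: it assumes --tasks also accepts the path to a file with one task spec per line, as in the repository's example task lists, and mmlu_subsets.txt is a hypothetical file containing the per-subset entries shown above):

accelerate launch --multi_gpu --num_processes=8 run_evals_accelerate.py \
    --tasks ./mmlu_subsets.txt \
    --model_args "pretrained=Qwen/Qwen1.5-0.5B-Chat" \
    --output_dir "./scratch/evals/" --override_batch_size 1

The score table should then list each mmlu subset separately plus the task-level average mentioned above.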

If you want to get results comparable to the Open LLM Leaderboard, you'll need to use lighteval (you can take a look at the differences between the 3 versions here).

clefourrier added the feature request label on Feb 27, 2024