
Add single mmlu config for lighteval suite #61

Open
lewtun opened this issue Feb 27, 2024 · 1 comment
Labels
feature request

Comments

@lewtun
Member

lewtun commented Feb 27, 2024

Currently it seems that to run MMLU with the lighteval suite, one needs to specify all of the subsets individually, as is done for the leaderboard task set here.
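
For context, "specifying all the subsets individually" means one task entry per MMLU subset. A minimal sketch of what that list looks like is below, assuming the subset names follow the mmlu:<subject> convention used in the leaderboard task set (the linked file is the authoritative source for the exact names and few-shot settings):

    lighteval|mmlu:abstract_algebra|5|0
    lighteval|mmlu:anatomy|5|0
    lighteval|mmlu:astronomy|5|0

and so on for all 57 subjects.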

Is it possible to group these together so that one can just run something like this:

accelerate launch --multi_gpu --num_processes=8 run_evals_accelerate.py \
    --tasks="lighteval|mmlu|5|0" \
    --model_args "pretrained=Qwen/Qwen1.5-0.5B-Chat" \
    --output_dir "./scratch/evals/" --override_batch_size 1

Or do you recommend using one of the other suites like helm or original for this task?

@clefourrier
Member

At the moment it's not possible; however, if you run a task with many subsets (using a config file), the score table should display the average at the task level.
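
For example (a sketch, not tested: it assumes --tasks also accepts the path to a file with one task spec per line, as in the repository's example task lists, and mmlu_subsets.txt is a hypothetical file containing the per-subset entries shown above):

accelerate launch --multi_gpu --num_processes=8 run_evals_accelerate.py \
    --tasks ./mmlu_subsets.txt \
    --model_args "pretrained=Qwen/Qwen1.5-0.5B-Chat" \
    --output_dir "./scratch/evals/" --override_batch_size 1

The score table should then list each mmlu subset separately plus the task-level average mentioned above.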

If you want to get results comparable to the Open LLM Leaderboard, you'll need to use lighteval (you can take a look at the differences between the 3 versions here).

clefourrier added the feature request label on Feb 27, 2024