Add HumanEval and HumanEval+ #63

lewtun · 2024-02-27T11:27:06Z

The HumanEval and HumanEval+ benchmarks are staples for benchmarking the code capabilities of base LLMs. It would be nice to include them in lighteval so one doesn't have to switch to another framework like BigCode's.

References:

HumanEval: https://github.com/openai/human-eval
HumanEval+: https://arxiv.org/abs/2305.01210
Implementation: https://github.com/evalplus/evalplus?tab=readme-ov-file
BigCode eval harness: https://github.com/bigcode-project/bigcode-evaluation-harness/tree/main

0-hero · 2024-04-01T04:11:29Z

+1, would be nice to have

clefourrier added the "feature request" and "new task" labels and removed the "feature request" label · Feb 27, 2024

clefourrier mentioned this issue Mar 27, 2024

human eval run #130

Closed
