clefourrier
📘
probably reading

Highlights

  • Pro

Organizations

@huggingface


Visualization of cache-optimized matrix multiplication

Python 98 8 Updated Jun 13, 2019

A comprehensive set of LLM benchmark scores and provider prices.

JavaScript 80 4 Updated Jan 2, 2025

Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!

Jupyter Notebook 964 59 Updated Jan 7, 2025

Benchmarking Benchmark Leakage in Large Language Models

JavaScript 47 3 Updated May 20, 2024

aider is AI pair programming in your terminal

Python 25,338 2,335 Updated Jan 22, 2025

The central repo for Creole based NLU and NLG work

HTML 16 4 Updated May 28, 2024

Make awesome display tables using Python.

Python 2,002 75 Updated Jan 22, 2025

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Python 15,372 2,644 Updated Dec 18, 2024

Creative Commons Licenses for GitHub

566 304 Updated Dec 10, 2024

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

Python 35,421 2,672 Updated Jan 22, 2025

Hallucinations (Confabulations) Document-Based Benchmark for RAG

HTML 65 2 Updated Jan 21, 2025

An extremely fast Python package and project manager, written in Rust.

Rust 36,671 995 Updated Jan 22, 2025

Machine Learning Engineering Open Book

Python 12,487 767 Updated Jan 22, 2025

BigCodeBench: Benchmarking Code Generation Towards AGI

Python 276 30 Updated Jan 22, 2025

Website for hosting the Open Foundation Models Cheat Sheet.

JavaScript 263 19 Updated Jun 26, 2024

datasets from the paper "Towards Understanding Sycophancy in Language Models"

Jupyter Notebook 67 7 Updated Oct 25, 2023
Python 136 7 Updated Sep 10, 2023
Python 2,209 188 Updated Jan 8, 2025

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends

Python 985 124 Updated Jan 22, 2025

Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM

Jupyter Notebook 1,368 165 Updated Dec 17, 2024

A benchmark to evaluate language models on questions I've previously asked them to solve.

Python 957 70 Updated Nov 4, 2024

Doing simple retrieval from LLM models at various context lengths to measure accuracy

Python 99 1 Updated Apr 4, 2024

😱 Falsehoods Programmers Believe in

25,150 586 Updated Nov 6, 2024

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Jupyter Notebook 43,732 4,662 Updated Jan 18, 2025

A natural language interface for computers

Python 57,917 4,967 Updated Jan 18, 2025

Scrape and export data from the Open LLM Leaderboard.

Python 42 3 Updated Dec 17, 2024

List of papers on hallucination detection in LLMs.

745 60 Updated Dec 19, 2024

ChatGPT and Bing AI prompt curation

838 81 Updated Mar 6, 2024

This repo includes ChatGPT prompt curation to use ChatGPT and other LLM tools better.

HTML 118,453 15,999 Updated Jan 14, 2025