# gpt-2

This is my reimplementation of the forward and backward passes of GPT-2, written in standard C11 with no dependencies beyond the C POSIX library. My only goal with this project was to gain a low-level understanding of how transformer models work; the code is not production-ready.

Derivations can be found in notes.pdf. To compile and run, execute build.sh. For my reference implementation in Python, see ref.py; I used it to verify the results from the C code, and it was originally validated against Andrej Karpathy's nanoGPT project. Both implementations require a model.safetensors file in the assets directory. This file contains the model's parameter values and can be downloaded from HuggingFace.
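Concretely, setup might look like the following. The exact HuggingFace URL is an assumption (the `openai-community/gpt2` repository hosts a `model.safetensors`); adjust it if you use a different mirror:

```shell
# Fetch the GPT-2 weights into the assets directory.
# NOTE: the URL below is an assumed location, not taken from this README.
mkdir -p assets
curl -L -o assets/model.safetensors \
  "https://huggingface.co/openai-community/gpt2/resolve/main/model.safetensors"

# Compile and run, using the script named in the README.
./build.sh
```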