This is my reimplementation of the forward and backward pass of GPT-2, written in standard C11 with no dependencies other than the C POSIX library. My only intention with this project was to gain a low-level understanding of how transformer models work. The code is not production-ready.
Derivations can be found in `notes.pdf`. To compile and run, execute `build.sh`. For my reference implementation in Python, see `ref.py`; I used it to verify the results I got in C, and it was originally compared against Andrej Karpathy's nanoGPT project. For both implementations, you must have a `model.safetensors` file in the `assets` directory. This file contains the parameter values of the model and can be downloaded from HuggingFace, as sketched below.
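One way to fetch the weights is via the `huggingface_hub` Python package; this is a minimal sketch, assuming `pip install huggingface_hub` and the standard GPT-2 (124M) repository `openai-community/gpt2` on HuggingFace:

```python
# Sketch: download the GPT-2 (124M) weights into assets/.
# Assumes the huggingface_hub package and the openai-community/gpt2 repo.
from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="openai-community/gpt2",  # standard GPT-2 124M checkpoint
    filename="model.safetensors",     # parameter file both implementations load
    local_dir="assets",               # directory where the code expects it
)
```

Downloading `model.safetensors` manually from the model page on HuggingFace and placing it in `assets` works just as well.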