
Metal qmatmul mat-mat product #39

Merged
merged 8 commits into main from metal_qmatmul_mm on Nov 14, 2024

Conversation

EricLBuehler (Owner)

Before, we were only using an mv (matrix-vector) qmatmul kernel, whose cost scaled linearly with bs * seqlen. This PR uses an mm (matrix-matrix) kernel instead, which should speed things up!
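For context, a minimal Rust sketch of the kind of dispatch this change enables: pick between a mat-vec and a mat-mat quantized matmul path based on how many output rows (bs * seqlen) the call produces. The launcher functions and the row threshold below are hypothetical placeholders, not the project's actual Metal API; the commit history suggests the PR also experimented with always using the mm path and with mirroring llama.cpp's heuristic.

```rust
// Sketch only: illustrates the mv-vs-mm dispatch idea described above.
// `launch_qmv_kernel` / `launch_qmm_kernel` and MM_ROW_THRESHOLD are
// illustrative assumptions, not the real kernel entry points.

/// Illustrative threshold: with only a handful of rows (e.g. single-token
/// decoding), the mv kernel tends to win; for prompt processing with many
/// rows, the mm kernel amortizes the dequantization work across rows.
const MM_ROW_THRESHOLD: usize = 4;

fn qmatmul_dispatch(bs: usize, seqlen: usize) {
    let n_rows = bs * seqlen;
    if n_rows >= MM_ROW_THRESHOLD {
        // One mat-mat kernel launch covering all rows at once.
        launch_qmm_kernel(n_rows);
    } else {
        // Fall back to the mat-vec kernel, whose cost grows linearly in n_rows.
        launch_qmv_kernel(n_rows);
    }
}

// Placeholder launchers so the sketch compiles on its own.
fn launch_qmm_kernel(n_rows: usize) {
    println!("mm kernel: {n_rows} rows in one launch");
}

fn launch_qmv_kernel(n_rows: usize) {
    println!("mv kernel: {n_rows} row(s)");
}

fn main() {
    qmatmul_dispatch(1, 1);   // decoding a single token -> mv path
    qmatmul_dispatch(2, 512); // prompt processing -> mm path
}
```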

EricLBuehler merged commit 6be03dd into main on Nov 14, 2024
8 of 11 checks passed
EricLBuehler deleted the metal_qmatmul_mm branch on November 14, 2024 at 16:40
EricLBuehler added a commit that referenced this pull request Nov 14, 2024
* Test passes

* All tests pass

* Now all the tests really pass

* Try out always using mm

* Mirror llama.cpp metric

* Mirror llama.cpp metric

* Update test