Related work: Prompt lookup decoding #45
https://github.com/apoorvumang/prompt-lookup-decoding

This method was recently merged into Hugging Face transformers and also uses n-grams (found in the input prompt) to accelerate decoding.

Comments
Interesting. Could you point to the merged PR? Does it support batching? This method has a similar idea (copy from the input, no Jacobi): https://github.com/alipay/PainlessInferenceAcceleration
Here's the PR: huggingface/transformers#27775. From a cursory glance at the PR, it seems like it supports batching.
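For context, here is a rough usage sketch of the merged feature, assuming a transformers release that includes that PR. The model choice, prompt, and token counts are arbitrary placeholders, and `prompt_lookup_num_tokens` is my understanding of the knob the PR adds to `generate()` (worth double-checking against the PR itself):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model choice is an arbitrary placeholder; any causal LM should do.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Repetitive prompts are where prompt lookup shines: draft tokens are
# copied from earlier n-gram matches in the input itself.
prompt = "def hello():\n    print('hello')\n\ndef hello_world():\n    print('hello"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=40,
    prompt_lookup_num_tokens=10,  # enables prompt lookup decoding (per the PR)
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```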
I have also noticed these two methods.
Lookahead decoding takes its n-grams from prior lookahead decoding steps (the Jacobi trajectories); prompt lookup decoding takes its n-grams from the prompt.
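To make the copy-from-input idea concrete, here is a minimal sketch of a prompt-lookup candidate step. The function name, signature, and back-off strategy are illustrative, not taken from either codebase:

```python
import torch

def find_candidate_tokens(input_ids: torch.Tensor,
                          max_ngram_size: int = 3,
                          num_pred_tokens: int = 10) -> torch.Tensor:
    """Match the trailing n-gram of input_ids against earlier positions in the
    sequence and return the tokens that followed the match as draft candidates."""
    seq_len = input_ids.size(0)
    # Prefer longer n-grams, backing off to shorter ones if nothing matches.
    for ngram_size in range(min(max_ngram_size, seq_len - 1), 0, -1):
        ngram = input_ids[-ngram_size:]
        # Scan earlier positions from most recent to oldest, excluding the
        # trailing n-gram itself.
        for start in range(seq_len - ngram_size - 1, -1, -1):
            if torch.equal(input_ids[start:start + ngram_size], ngram):
                end = start + ngram_size
                # Copy up to num_pred_tokens continuation tokens as the draft;
                # the target model then verifies them in a single forward pass.
                return input_ids[end:end + num_pred_tokens]
    # No match: return an empty draft and fall back to ordinary decoding.
    return input_ids.new_empty(0)
```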
It doesn't :/ huggingface/transformers#27775 (comment)
Interesting. As that comment also suggests, it seems like PLD could support batching in theory; it's just the current implementation that doesn't.
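For illustration, one way batching could work in principle, reusing the hypothetical `find_candidate_tokens` from the sketch above: run the lookup independently per row of a padded batch.

```python
import torch

def batched_candidates(input_ids: torch.Tensor,
                       attention_mask: torch.Tensor) -> list[torch.Tensor]:
    """Per-row prompt lookup over a padded batch.

    find_candidate_tokens is the single-sequence helper sketched earlier
    in this thread; each sequence proposes its own draft independently.
    """
    drafts = []
    for ids, mask in zip(input_ids, attention_mask):
        # Strip padding before matching so pad tokens never appear in a draft.
        drafts.append(find_candidate_tokens(ids[mask.bool()]))
    return drafts
```

The catch is that the drafts come out ragged (one per row, possibly empty), so the batched verification pass has to handle per-row drafts of different lengths, which is presumably why the initial implementation is batch-size-1 only.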
Lookahead decoding was also mentioned here: https://github.com/SafeAILab/EAGLE