[willing to PR] Add Lookahead speculative decoding #2772

Open
2 tasks done
jjjjohnson opened this issue Jan 7, 2025 · 1 comment
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@jjjjohnson
Contributor

Checklist

Motivation

N-gram-based speculative decoding is very effective in retrieval-augmented generation (RAG). Generating draft tokens is relatively cheap compared to EAGLE, so it has great potential for accelerating token generation in RAG. Ant Group has proposed a trie-based retrieval and verification mechanism. I want to add it to SGLang.
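To make the idea concrete, here is a rough toy sketch of trie-based draft retrieval followed by prefix verification. It is only an illustration under simplifying assumptions, not the Lookahead paper's algorithm and not SGLang code: the names `NGramTrie`, `draft`, and `accept_prefix` are hypothetical, tokens are plain integers, only a single greedy branch is drafted, and the target model's output is passed in as a list instead of coming from a real forward pass.

```python
class NGramTrie:
    """Toy token trie: every path from the root is an n-gram seen in the context."""

    def __init__(self, max_depth=8):
        self.max_depth = max_depth
        self.root = {}  # token -> [count, children]

    def insert(self, tokens):
        # Index every window of up to max_depth tokens starting at each position.
        for start in range(len(tokens)):
            node = self.root
            for tok in tokens[start:start + self.max_depth]:
                entry = node.setdefault(tok, [0, {}])
                entry[0] += 1
                node = entry[1]

    def draft(self, prefix, max_tokens=4):
        # Walk the trie along the prefix; no match means no draft.
        node = self.root
        for tok in prefix:
            if tok not in node:
                return []
            node = node[tok][1]
        # Then greedily follow the most frequent child at each level.
        out = []
        while node and len(out) < max_tokens:
            tok, (_count, children) = max(node.items(), key=lambda kv: kv[1][0])
            out.append(tok)
            node = children
        return out


def accept_prefix(draft_tokens, target_tokens):
    """Keep draft tokens up to the first disagreement with the target model."""
    accepted = []
    for d, t in zip(draft_tokens, target_tokens):
        if d != t:
            break
        accepted.append(d)
    return accepted


# Hypothetical usage: index the retrieved context, then draft from the last tokens.
trie = NGramTrie(max_depth=4)
trie.insert([10, 11, 12, 13, 14])
proposal = trie.draft([10, 11], max_tokens=2)
```

In a RAG setting the generated answer often copies spans from the retrieved documents, so a trie built from the prompt tends to produce drafts that the target model accepts, and building it needs no extra draft model.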

Related resources

Lookahead: An Inference Acceleration Framework for Large Language Model with Lossless Generation Accuracy

@zhyncs zhyncs added good first issue Good for newcomers enhancement New feature or request labels Jan 7, 2025
@jjjjohnson
Contributor Author

@zhyncs #2790
