Support XGrammar backend as an alternative to Outlines #2900

Open
2016bgeyer opened this issue Jan 10, 2025 · 0 comments

Feature request

Support XGrammar as an alternative to Outlines for the structured-output generation backend.

Motivation

According to its authors' benchmarks, XGrammar is much faster than existing structured generation solutions such as Outlines at generating structured output.

Blog post:
https://blog.mlc.ai/2024/11/22/achieving-efficient-flexible-portable-structured-generation-with-xgrammar

"As shown in Figure 1, XGrammar outperforms existing structured generation solutions by up to 3.5x on the JSON schema workload and more than 10x on the CFG workload. Notably, the gap in CFG-guided generation is larger. This is because many JSON schema specifications can be expressed as regular expressions, bringing more optimizations that are not directly applicable to CFGs."
[Image: Figure 1 from the XGrammar blog post]
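
To make the point in the quote about JSON schemas and regular expressions concrete, here is a minimal sketch using only the Python standard library. It is not how XGrammar or Outlines actually compile schemas; the helper `flat_schema_to_regex` and the `TYPE_PATTERNS` table are hypothetical names made up for illustration, and the sketch assumes a flat object with every property required, keys in declared order, no whitespace, and no escape sequences inside strings.

```python
import json
import re

# Simplified regex fragments for a few JSON scalar types.
# Assumption: strings contain no escape sequences.
TYPE_PATTERNS = {
    "string": r'"[^"\\]*"',
    "integer": r"-?(?:0|[1-9][0-9]*)",
    "boolean": r"(?:true|false)",
}


def flat_schema_to_regex(schema: dict) -> str:
    """Build a regex for a flat object schema (hypothetical helper).

    Every property is treated as required and must appear in the
    order it is declared, with no whitespace between tokens.
    """
    parts = []
    for name, spec in schema["properties"].items():
        key = re.escape(json.dumps(name))  # quoted JSON key, regex-escaped
        parts.append(f"{key}:{TYPE_PATTERNS[spec['type']]}")
    return r"\{" + ",".join(parts) + r"\}"


schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
}
pattern = re.compile(flat_schema_to_regex(schema))

valid = pattern.fullmatch('{"name":"Ada","age":36}') is not None
invalid = pattern.fullmatch('{"name":"Ada","age":"36"}') is not None
print(valid, invalid)
```

Recursive schemas (nested objects of unbounded depth, for example) cannot be flattened this way, which is where a context-free grammar engine is needed.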

"Figure 2 shows end-to-end inference performance on LLM serving tasks. We can find the trend again that the gap on CFG-guided settings is larger, and the gap grows on larger batch sizes. This is because the GPU throughput is higher on larger batch sizes, putting greater pressure on the grammar engine running on CPUs. Note that the main slowdown of vLLM comes from its structured generation engine, which can be potentially eliminated by integrating with XGrammar. In all cases, XGrammar enables high-performance generation in both settings without compromising flexibility and efficiency."
[Image: Figure 2 from the XGrammar blog post]

Paper / Technical Report from XGrammar:
https://arxiv.org/abs/2411.15100

Your contribution

XGrammar Repo:
https://github.com/mlc-ai/xgrammar

SGLang Repo:
https://github.com/sgl-project/sglang/

SGLang Docs on Structured Output generation including using XGrammar:
https://sgl-project.github.io/backend/openai_api_completions.html#Structured-Outputs-(JSON,-Regex,-EBNF)

Note: those docs state that XGrammar does not support regular expressions.
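
Given the regex limitation noted above, one possible shape for this feature is a backend selector that prefers XGrammar and falls back to Outlines for constraint types XGrammar cannot handle. The sketch below is only an illustration of that idea: `ConstraintKind`, `XGRAMMAR_SUPPORTED`, and `select_backend` are hypothetical names, not existing APIs of this project, XGrammar, or Outlines, and the supported set is an assumption based solely on the note above.

```python
from enum import Enum


class ConstraintKind(Enum):
    """Kinds of structured-output constraint a request may carry (hypothetical)."""

    JSON_SCHEMA = "json_schema"
    EBNF = "ebnf"
    REGEX = "regex"


# Assumption: XGrammar handles JSON schema and EBNF constraints but not
# regex, per the SGLang docs note referenced in this issue.
XGRAMMAR_SUPPORTED = {ConstraintKind.JSON_SCHEMA, ConstraintKind.EBNF}

KNOWN_BACKENDS = ("xgrammar", "outlines")


def select_backend(kind: ConstraintKind, preferred: str = "xgrammar") -> str:
    """Return the backend name to use for a constraint kind.

    Falls back to Outlines when XGrammar is preferred but the
    constraint kind is outside the assumed supported set.
    """
    if preferred not in KNOWN_BACKENDS:
        raise ValueError(f"unknown backend: {preferred!r}")
    if preferred == "xgrammar" and kind not in XGRAMMAR_SUPPORTED:
        return "outlines"
    return preferred


print(select_backend(ConstraintKind.JSON_SCHEMA))
print(select_backend(ConstraintKind.REGEX))
```

Keeping Outlines as the fallback would let the new backend be opt-in without dropping regex-constrained generation.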
