Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Constrained decoding #103

Merged
merged 19 commits into from
Apr 11, 2024

Conversation

lucasavila00
Copy link
Contributor

@lucasavila00 lucasavila00 commented Apr 10, 2024

The public interface is the same used by TGI.

A grammar parameter that can have type "regex" or "yacc". (TGI has regex and json-schema)

Example 1 - Token healing

curl http://localhost:1234/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer EMPTY" \
-d '{
"model": "",
"grammar": {
    "type": "regex",
    "value": "Sure! (?s:.)*"
},
"messages": [
{
    "role": "user",
    "content": "Tell me a joke."
}
]
}'

Example 2 - Markdown List

curl http://localhost:1234/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer EMPTY" \
-d '{
"model": "",
"grammar": {
    "type": "regex",
    "value": "(- [^\n]*\n)+(- [^\n]*)(\n\n)?"
},
"messages": [
{
    "role": "user",
    "content": "Write a list of jokes. Return a markdown list where each item is a joke."
}
]
}'

Copy link

github-actions bot commented Apr 10, 2024

Code Metrics Report
  ───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Rust                        47     16623     1176       706    14741        809
───────────────────────────────────────────────────────────────────────────────
Total                       47     16623     1176       706    14741        809
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop 55,563
Estimated Schedule Effort 10.201465 months
Estimated People Required 3.967364
───────────────────────────────────────────────────────────────────────────────
Processed 566260 bytes, 0.566 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────
  

@lucasavila00
Copy link
Contributor Author

lucasavila00 commented Apr 10, 2024

@EricLBuehler What do you think about this approach?

The Windows build fails because the linker complains about missing functions that exist just for WASM. I'm not sure how to deal with this...

Also, I'm not a fan of using a library from Git. Should we vendor the code or something?

I haven't tried optimizing it yet, but on my system, this creates no overhead.

But it works 😃

@EricLBuehler
Copy link
Owner

This looks really great, thank you so much for adding it! However, I am concerned about the build failures on Windows, which I do want to support. An AICI issue: microsoft/aici#42 says that they do not support native Windows, which is a problem.

Perhaps we could support using aici_abi on non-Windows platforms, but use kalosm_sample's RegexParser otherwise or on all platforms for consistency? This is really great, and I wish that Windows support was available. Do you think you could modify it to use RegexParser? For example, they run the parser here and modify the logits.

@lucasavila00
Copy link
Contributor Author

This looks really great, thank you so much for adding it! However, I am concerned about the build failures on Windows, which I do want to support. An AICI issue: microsoft/aici#42 says that they do not support native Windows, which is a problem.

I tried vendoring the code. It's 2k lines and by vendoring I was able to remove ~20 transient dependencies too.

Perhaps we could support using aici_abi on non-Windows platforms, but use kalosm_sample's RegexParser otherwise or on all platforms for consistency? This is really great, and I wish that Windows support was available. Do you think you could modify it to use RegexParser? For example, they run the parser here and modify the logits.

The parser is in "string-space" and we need to operate on "token-space". This would require a lot of glue code to do this conversion. I hope to make AICI code work for us, if it doesn't then I look into RegexParser

@EricLBuehler
Copy link
Owner

Thank you for pointing that out. Looks like the Windows build is passing; I think we are probably close to a merge once we do some testing, and you mark it as ready!

@EricLBuehler EricLBuehler added new feature New feature or request models Additions to model or architectures processing Processing related to the model labels Apr 11, 2024
@lucasavila00 lucasavila00 marked this pull request as ready for review April 11, 2024 01:41
@lucasavila00
Copy link
Contributor Author

@EricLBuehler I think this is ready for review and merge.

There is work left to do on optimization, and I haven't tested the yacc implementation yet.

But since the MR is already big, it might be a good point to merge.

mistralrs-core/src/engine/mod.rs Outdated Show resolved Hide resolved
@EricLBuehler
Copy link
Owner

I wrote some comments regarding error handling, other than that this looks excellent. Could you please add the example requests above into the examples/ directory?

@lucasavila00
Copy link
Contributor Author

I added a Yacc example and tested that it works as expected.

@EricLBuehler EricLBuehler merged commit 370463c into EricLBuehler:master Apr 11, 2024
11 checks passed
@EricLBuehler
Copy link
Owner

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
models Additions to model or architectures new feature New feature or request processing Processing related to the model
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants