feat: Extend response parameters support to BLS in python backend #395
Conversation
A minor readme comment otherwise LGTM. Nice work on your first feature PR!
Co-authored-by: Jacky <[email protected]>
@kthui Thank you so much for sharing context and knowledge on this PR and beyond! I couldn't have done it without them.
Lgtm as well, nice work 🚀
What does the PR do?
Add support for setting response parameters on regular and decoupled Python backend model responses when using BLS.
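For context, a minimal sketch of how a BLS model could consume these parameters after this change, using the documented pb_utils BLS API. The model name "composed_model", the tensor names "IN"/"OUT", and the assumption that InferenceResponse.parameters() returns a JSON string are illustrative only, not taken from this PR; see the tests in the related server PR for the authoritative usage.

```python
# Hypothetical BLS model illustrating the feature. Names such as
# "composed_model" and "IN"/"OUT" are placeholders, and the JSON-string
# return type of parameters() is an assumption of this sketch.
import json

import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            # Issue a BLS request to a composed model that sets
            # response parameters on its responses.
            bls_request = pb_utils.InferenceRequest(
                model_name="composed_model",
                requested_output_names=["OUT"],
                inputs=[pb_utils.Tensor("IN", np.array([1], dtype=np.int32))],
            )
            bls_response = bls_request.exec()
            if bls_response.has_error():
                raise pb_utils.TritonModelException(
                    bls_response.error().message())

            # With this PR, parameters set by the composed model are now
            # visible on the BLS response (assumed serialized as JSON here).
            params = json.loads(bls_response.parameters() or "{}")

            out = pb_utils.get_output_tensor_by_name(bls_response, "OUT")
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses
```

In the decoupled case, exec(decoupled=True) returns an iterable of responses instead of a single response, and each response can carry its own parameters.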
Checklist
<commit_type>: <Title>
Commit Type:
Check the conventional commit type box here and add the label to the GitHub PR.
Related PRs:
triton-inference-server/server#7987
Where should the reviewer start?
Start with the test cases in the related server PR linked above to understand the context and usage. The major changes in this PR are confined to request_executor.cc.
Test plan:
New tests are added to the related server PR.
Caveats:
N/A
Background
N/A
Related Issues:
Relates to https://jirasw.nvidia.com/browse/DLIS-7520