Fix TGI (Text Generation Inference) Endpoint Inference and TGI JSON Grammar Generation
#502
+25
−6
Description
While implementing a custom task using `lighteval`, I needed to use constrained grammar generation with TGI, and it seems that the TGI integration is not up to date and not working.

Fixes for TGI Endpoint Inference
- The `/info` route of TGI `3.0.1` doesn't always return required fields such as `model_dtype`, so it was set to `None` by default if not found.
- `AsyncClient` from TGI has a `generate` function that expects multiple parameters and not a structure.
- `do_sample`, `return_full_text` and `watermark` parameters are set to `False` by default, since they come from `huggingface_hub`, which accepts `None` as a default value, but TGI doesn't accept it.
- [...] `_async_process_request` anyway, and maybe this should be fixed in another PR. Same for `adapter_id` for LoRA heads.
- `ModelClient`'s usage has been fixed to use the `config: TGIModelConfig` by default instead of named parameters.

Fixes for TGI JSON Grammar Generation
- `text_generation` updated to `0.7.0`.
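As a rough illustration of the fixes listed above, here is a minimal sketch. It is not the code from this PR: the helper names `read_model_dtype` and `build_generate_kwargs` are hypothetical, and the grammar dictionary simply mirrors the `{"type": "json", "value": ...}` shape that TGI's HTTP API is assumed to accept. With the `text_generation` client, the grammar would likely need to be wrapped in that library's own grammar type instead of a plain dict.

```python
from typing import Any, Dict, Optional


def read_model_dtype(info: Dict[str, Any]) -> Optional[str]:
    """Return `model_dtype` from a TGI `/info` payload, or None if it is missing."""
    # Hypothetical helper: `.get` avoids a KeyError when the field is absent.
    return info.get("model_dtype")


def build_generate_kwargs(
    prompt: str,
    max_new_tokens: int,
    do_sample: Optional[bool] = None,
    return_full_text: Optional[bool] = None,
    watermark: Optional[bool] = None,
    json_schema: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build flat keyword arguments for a `generate`-style call.

    Hypothetical helper: the boolean flags may arrive as None (as the
    description says of `huggingface_hub`), so they are coerced to False.
    """
    kwargs: Dict[str, Any] = {
        "prompt": prompt,
        "max_new_tokens": max_new_tokens,
        "do_sample": bool(do_sample),  # None becomes False
        "return_full_text": bool(return_full_text),
        "watermark": bool(watermark),
    }
    if json_schema is not None:
        # Assumed shape of a JSON grammar constraint in TGI's HTTP API.
        kwargs["grammar"] = {"type": "json", "value": json_schema}
    return kwargs


if __name__ == "__main__":
    schema = {"type": "object", "properties": {"answer": {"type": "string"}}}
    print(read_model_dtype({"model_id": "some-model"}))
    print(build_generate_kwargs("Question: ...", 64, json_schema=schema))
```

The resulting dictionary is meant to be unpacked into the client call as individual keyword arguments rather than passed as a single structure, which matches the second point in the endpoint fixes.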
Environment
Command
Dependencies
`model_config_path` argument for TGI `tgi.yaml`:

Test Results
It works as can be seen from the logs.
TGI Logs with JSON Grammar Generation
Lighteval Logs
Note: I have anonymized parts of the logs