-
Notifications
You must be signed in to change notification settings - Fork 651
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhancement: Support for HuggingFace Self-Served TGI Serving with Updated Dependencies and Improved AI Handler #472
Conversation
PR Analysis(review updated until commit 5d9c767)
PR Feedback
|
/improve --pr_code_suggestions.summarize=true |
PR Code Suggestions💡 Suggestion: Avoid hardcoding values directly in the code. Instead, consider defining them as constants at the top of your file or in a separate config file. This will make it easier to manage and modify these values in the future. File: pr_agent/algo/ai_handler.py (74-79) Example code:Existing code: @retry(
exceptions=(APIError, TryAgain, AttributeError, RateLimitError),
tries=OPENAI_RETRIES,
delay=2,
backoff=2,
jitter=(1, 3),
) Improved code: OPENAI_RETRY_DELAY = 2
OPENAI_RETRY_BACKOFF = 2
OPENAI_RETRY_JITTER = (1, 3)
@retry(
exceptions=(APIError, TryAgain, AttributeError, RateLimitError),
tries=OPENAI_RETRIES,
delay=OPENAI_RETRY_DELAY,
backoff=OPENAI_RETRY_BACKOFF,
jitter=OPENAI_RETRY_JITTER,
) 💡 Suggestion: It's a good practice to use f-strings for formatting strings in Python. It's more readable and efficient than the traditional string formatting methods. File: pr_agent/algo/ai_handler.py (135-142) Example code:Existing code: get_logger().info(
"AI response",
response=resp,
messages=messages,
finish_reason=finish_reason,
model=model,
usage=usage,
) Improved code: get_logger().info(
f"AI response: {resp}, messages: {messages}, finish_reason: {finish_reason}, model: {model}, usage: {usage}"
) 💡 Suggestion: It's better to use the dict.get() method instead of dict[key] for dictionary access. This will prevent KeyError exceptions if the key is not present in the dictionary. File: pr_agent/algo/utils.py (373-373) Example code:Existing code: max_tokens_model = MAX_TOKENS.get(model, 4000) Improved code: max_tokens_model = MAX_TOKENS.get(model, 4000) 💡 Suggestion: It's recommended to pin the versions of your dependencies. This can help to avoid unexpected breaks due to updates in the dependencies. File: requirements.txt (25-25) Example code:Existing code: litellm @ git+https://github.com/Codium-ai/litellm.git Improved code: litellm @ git+https://github.com/Codium-ai/litellm.git@<commit-hash> |
/add_docs |
Persistent review updated to latest commit 5d9c767 |
PR Description updated to latest commit (5d9c767) |
1 similar comment
PR Description updated to latest commit (5d9c767) |
Persistent review updated to latest commit 5d9c767 |
1 similar comment
Persistent review updated to latest commit 5d9c767 |
PR Code Suggestions💡 Suggestion: Instead of using the File: pr_agent/algo/ai_handler.py (30-60) Example code:Existing code: if get_settings().get("OPENAI.KEY", None):
openai.api_key = get_settings().openai.key
litellm.openai_key = get_settings().openai.key
if get_settings().get("HUGGINGFACE.KEY", None):
litellm.huggingface_key = get_settings().huggingface.key
if get_settings().get("HUGGINGFACE.API_BASE", None) and 'huggingface' in get_settings().config.model:
litellm.api_base = get_settings().huggingface.api_base
self.api_base = get_settings().huggingface.api_base Improved code: settings = get_settings()
if settings.get("OPENAI.KEY", None):
openai.api_key = settings.openai.key
litellm.openai_key = settings.openai.key
if settings.get("HUGGINGFACE.KEY", None):
litellm.huggingface_key = settings.huggingface.key
if settings.get("HUGGINGFACE.API_BASE", None) and 'huggingface' in settings.config.model:
litellm.api_base = settings.huggingface.api_base
self.api_base = settings.huggingface.api_base 💡 Suggestion: Instead of concatenating strings using '+', consider using f-strings for better readability. File: pr_agent/algo/ai_handler.py (112-115) Example code:Existing code: system = get_settings().get("CONFIG.MODEL_SYSTEM_PREFIX", "") + system + \
get_settings().get("CONFIG.MODEL_SYSTEM_SUFFIX", "")
user = get_settings().get("CONFIG.MODEL_USER_PREFIX", "") + user + \
get_settings().get("CONFIG.MODEL_USER_SUFFIX", "") Improved code: system = f"{get_settings().get('CONFIG.MODEL_SYSTEM_PREFIX', '')}{system}{get_settings().get('CONFIG.MODEL_SYSTEM_SUFFIX', '')}"
user = f"{get_settings().get('CONFIG.MODEL_USER_PREFIX', '')}{user}{get_settings().get('CONFIG.MODEL_USER_SUFFIX', '')}" 💡 Suggestion: Instead of using the File: pr_agent/algo/utils.py (378-380) Example code:Existing code: settings = get_settings()
max_tokens_model = MAX_TOKENS.get(model, 4000)
if settings.config.max_model_tokens:
max_tokens_model = min(settings.config.max_model_tokens, max_tokens_model) Improved code: settings = get_settings()
max_tokens_model = MAX_TOKENS.get(model, 4000)
if settings.config.max_model_tokens:
max_tokens_model = min(settings.config.max_model_tokens, max_tokens_model) 💡 Suggestion: Keep the dependencies in the requirements.txt file sorted alphabetically. This makes it easier to find a dependency and see its version at a glance. File: requirements.txt (5-25) Example code:Existing code: openai==1.3.5
...
litellm==1.7.1 Improved code: litellm==1.7.1
...
openai==1.3.5 |
PR Description updated to latest commit (5d9c767) |
PR Code Suggestions💡 [enhancement] It seems that the File: pr_agent/algo/ai_handler.py (29-60) Example code:Existing code: self.api_base = None
...
if get_settings().get("HUGGINGFACE.API_BASE", None) and 'huggingface' in get_settings().config.model:
litellm.api_base = get_settings().huggingface.api_base
self.api_base = get_settings().huggingface.api_base Improved code: if get_settings().get("HUGGINGFACE.API_BASE", None) and 'huggingface' in get_settings().config.model:
litellm.api_base = get_settings().huggingface.api_base 💡 [maintainability] The File: pr_agent/algo/ai_handler.py (81-147) Example code:Existing code: async def chat_completion(
self, model: str, system: str, user: str, temperature: float = 0.2
):
... Improved code: async def chat_completion(
self, model: str, system: str, user: str, temperature: float = 0.2
):
system, user = self._add_prefix_suffix(system, user)
messages = self._create_messages(system, user)
response = await self._get_completion_response(model, messages, temperature)
self._log_response(response, messages, model)
return response["choices"][0]["message"]["content"], response["choices"][0]["finish_reason"] 💡 [enhancement] The File: pr_agent/algo/utils.py (378-378) Example code:Existing code: max_tokens_model = MAX_TOKENS.get(model, 4000) Improved code: if model not in MAX_TOKENS:
get_logger().warning(f"Model {model} not found in MAX_TOKENS dictionary. Defaulting to 4000 tokens.")
max_tokens_model = MAX_TOKENS.get(model, 4000) |
/improve |
PR Code Suggestions💡 Suggestion: Instead of using the File: pr_agent/algo/ai_handler.py (30-60) Example code:Existing code: if get_settings().get("OPENAI.KEY", None):
openai.api_key = get_settings().openai.key
litellm.openai_key = get_settings().openai.key
if get_settings().get("HUGGINGFACE.KEY", None):
litellm.huggingface_key = get_settings().huggingface.key
if get_settings().get("HUGGINGFACE.API_BASE", None) and 'huggingface' in get_settings().config.model:
litellm.api_base = get_settings().huggingface.api_base
self.api_base = get_settings().huggingface.api_base Improved code: settings = get_settings()
if settings.get("OPENAI.KEY", None):
openai.api_key = settings.openai.key
litellm.openai_key = settings.openai.key
if settings.get("HUGGINGFACE.KEY", None):
litellm.huggingface_key = settings.huggingface.key
if settings.get("HUGGINGFACE.API_BASE", None) and 'huggingface' in settings.config.model:
litellm.api_base = settings.huggingface.api_base
self.api_base = settings.huggingface.api_base 💡 Suggestion: Consider using f-strings for string concatenation for better readability. File: pr_agent/algo/ai_handler.py (112-115) Example code:Existing code: system = get_settings().get("CONFIG.MODEL_SYSTEM_PREFIX", "") + system + \
get_settings().get("CONFIG.MODEL_SYSTEM_SUFFIX", "")
user = get_settings().get("CONFIG.MODEL_USER_PREFIX", "") + user + \
get_settings().get("CONFIG.MODEL_USER_SUFFIX", "") Improved code: system = f"{get_settings().get('CONFIG.MODEL_SYSTEM_PREFIX', '')}{system}{get_settings().get('CONFIG.MODEL_SYSTEM_SUFFIX', '')}"
user = f"{get_settings().get('CONFIG.MODEL_USER_PREFIX', '')}{user}{get_settings().get('CONFIG.MODEL_USER_SUFFIX', '')}" 💡 Suggestion: Consider using the File: pr_agent/algo/utils.py (378-378) Example code:Existing code: max_tokens_model = MAX_TOKENS[model] Improved code: max_tokens_model = MAX_TOKENS.get(model, 4000) 💡 Suggestion: Consider maintaining the order of dependencies in the requirements file. This can help in identifying the changes made to the dependencies more easily. File: requirements.txt (5-25) Example code:Existing code: openai==1.3.5
...
litellm==1.7.1 Improved code: openai==1.3.5
litellm==1.7.1
... |
except Exception as e: | ||
get_logger().error("Unknown error during OpenAI inference: ", e) | ||
raise TryAgain from e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Consider using a more specific exception instead of the generic Exception
in the chat_completion
method. This will make the error handling more precise and prevent catching and handling unexpected exceptions that might need different handling logic. [best practice]
except Exception as e: | |
get_logger().error("Unknown error during OpenAI inference: ", e) | |
raise TryAgain from e | |
except SpecificException as e: # Replace SpecificException with the specific exception you want to catch | |
get_logger().error("Specific error during OpenAI inference: ", e) | |
raise TryAgain from e |
@retry( | ||
exceptions=(APIError, TryAgain, AttributeError, RateLimitError), | ||
tries=OPENAI_RETRIES, | ||
delay=2, | ||
backoff=2, | ||
jitter=(1, 3), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: It's a good practice to avoid using magic numbers directly in the code. The numbers used in the retry
decorator in the chat_completion
method (like 2
, 1
, 3
) could be replaced with well-named constants to improve code readability and maintainability. [best practice]
@retry( | |
exceptions=(APIError, TryAgain, AttributeError, RateLimitError), | |
tries=OPENAI_RETRIES, | |
delay=2, | |
backoff=2, | |
jitter=(1, 3), | |
) | |
DELAY = 2 | |
BACKOFF = 2 | |
JITTER = (1, 3) | |
@retry( | |
exceptions=(APIError, TryAgain, AttributeError, RateLimitError), | |
tries=OPENAI_RETRIES, | |
delay=DELAY, | |
backoff=BACKOFF, | |
jitter=JITTER, | |
) |
@@ -375,7 +375,7 @@ def get_user_labels(current_labels: List[str] = None): | |||
|
|||
def get_max_tokens(model): | |||
settings = get_settings() | |||
max_tokens_model = MAX_TOKENS[model] | |||
max_tokens_model = MAX_TOKENS.get(model, 4000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Consider adding a default value for MAX_TOKENS
in the get_max_tokens
function. This will make the function more robust in case the MAX_TOKENS
dictionary doesn't contain the model as a key. [enhancement]
max_tokens_model = MAX_TOKENS.get(model, 4000) | |
DEFAULT_MAX_TOKENS = 4000 | |
max_tokens_model = MAX_TOKENS.get(model, DEFAULT_MAX_TOKENS) |
requirements.txt
Outdated
@@ -23,3 +22,4 @@ pinecone-client | |||
pinecone-datasets @ git+https://github.com/mrT23/pinecone-datasets.git@main | |||
loguru==0.7.2 | |||
google-cloud-aiplatform==1.35.0 | |||
litellm==1.7.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: It's a good practice to pin the versions of your dependencies to ensure your project does not break due to updates in the dependencies. Consider pinning the version of litellm
in your requirements.txt
file. [best practice]
litellm==1.7.1 | |
litellm==1.7.1 # Pin the version to ensure compatibility |
/analyze |
PR AnalysisHere is a components analysis of the PR, as well as a change summary for each component:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PTAL
Type
Enhancement
Description
This PR introduces support for HuggingFace self-served TGI serving and makes several enhancements to the AI Handler. The main changes include:
api_base
to theAIHandler
class.chat_completion
method to includeapi_base
in theacompletion
call.chat_completion
method to add prefixes and suffixes to system and user messages based on the configuration.MAX_TOKENS
dictionary inget_max_tokens
function.openai
version from0.27.8
to1.3.5
andlitellm
version from0.12.5
to1.7.1
in therequirements.txt
file.TryAgain
fromopenai.error
totenacity
.PR changes walkthrough
1 files
ai_handler.py
pr_agent/algo/ai_handler.py
The
AIHandler
class has been updated to include a newattribute
api_base
. Thechat_completion
method has beenmodified to include
api_base
in theacompletion
call andto add prefixes and suffixes to system and user messages
based on the configuration. The import of
TryAgain
hasbeen replaced from
openai.error
totenacity
.1 files
utils.py
pr_agent/algo/utils.py
The
get_max_tokens
function has been updated to handle thecase where the model is not found in the
MAX_TOKENS
dictionary.
1 files
requirements.txt
requirements.txt
The
openai
version has been updated from0.27.8
to1.3.5
andlitellm
version from0.12.5
to1.7.1
.