When I ask a question that I know has an answer in the knowledge base, it works. When I ask a question that has no answer in the knowledge base, it gives an answer like: "The provided text does not contain information about the xxx. However, I can provide a general overview of the key principles of cloud computing."
During the process it called `search_knowledge_base(query=xxx)` just once.
But when I use a custom RAG tool like
```python
class CustomToolkit(Toolkit):
    def __init__(self):
        super().__init__(name="xxx")
        # custom_search_endpoint is a method of this class (body not shown)
        self.register(self.custom_search_endpoint)
```
`custom_search_endpoint()` returns a plain-text paragraph, but that did not seem to work, so I wrote an adapter that mimics the result of your function:
```python
def search_knowledge_base(self, query: str) -> str:
    """Use this function to search the knowledge base for information about a query.

    Args:
        query: The query to search for.

    Returns:
        str: A string containing the response from the knowledge base.
    """
    # Get the relevant documents from the knowledge base
    retrieval_timer = Timer()
    retrieval_timer.start()
    docs_from_knowledge = self.get_relevant_docs_from_knowledge(query=query)
    if docs_from_knowledge is not None:
        references = MessageReferences(
            query=query, references=docs_from_knowledge, time=round(retrieval_timer.elapsed, 4)
        )
        # Add the references to the run_response
        if self.run_response.extra_data is None:
            self.run_response.extra_data = RunResponseExtraData()
        if self.run_response.extra_data.references is None:
            self.run_response.extra_data.references = []
        self.run_response.extra_data.references.append(references)
    retrieval_timer.stop()
    logger.debug(f"Time to get references: {retrieval_timer.elapsed:.4f}s")

    if docs_from_knowledge is None:
        return "No documents found"
    return self.convert_documents_to_string(docs_from_knowledge)
```
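For illustration, here is a minimal sketch of how a plain-text endpoint could mirror the same return contract (always a string, with "No documents found" when nothing comes back) while also capping how much text goes back to the model. `make_text_adapter`, `NO_DOCS`, and `max_chars` are hypothetical names I made up for this sketch; they are not part of the library.

```python
from typing import Callable, Optional

# Same sentinel string that search_knowledge_base returns when nothing is found.
NO_DOCS = "No documents found"


def make_text_adapter(
    endpoint: Callable[[str], Optional[str]], max_chars: int = 4000
) -> Callable[[str], str]:
    """Wrap a plain-text endpoint so it always returns a non-empty, bounded string."""

    def search(query: str) -> str:
        text = endpoint(query)
        # Treat None and whitespace-only results as "nothing found".
        if text is None or not text.strip():
            return NO_DOCS
        # Cap the size of the result so one call cannot flood the prompt.
        return text.strip()[:max_chars]

    return search
```

The character cap is a crude stand-in for a token limit; a real tokenizer would give a tighter bound.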
When I launch it with `typer.run(test)` and the question is related to the knowledge base, I get a normal answer. However, if the question is less related, the agent calls `custom_search_endpoint` several times, each time with a slightly different query (similar, and in some cases duplicated), and the run ends with a "too many tokens" error from the LLM. Shouldn't it return an answer like "The provided text does not contain information about the xxx. However, I can provide a general overview of the key principles of xxx." instead of repeatedly calling the tool?
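One workaround I can imagine, sketched under assumptions: wrap the search function so that duplicate queries, and any call beyond a fixed budget, get a short explicit "stop searching" message instead of more retrieved text. `BudgetedSearch`, `max_calls`, and `fake_search` are hypothetical names for this sketch, not library API, and whether the model actually obeys the message depends on the model.

```python
from typing import Callable, Set


class BudgetedSearch:
    """Wrap a search callable with a call budget and duplicate-query detection."""

    STOP_MESSAGE = (
        "No further information is available in the knowledge base. "
        "Do not search again; answer with what you already have."
    )

    def __init__(self, search_fn: Callable[[str], str], max_calls: int = 2) -> None:
        self.search_fn = search_fn
        self.max_calls = max_calls
        self.calls = 0
        self.seen: Set[str] = set()

    @staticmethod
    def _normalize(query: str) -> str:
        # Lowercase and collapse whitespace so trivially repeated queries match.
        return " ".join(query.lower().split())

    def reset(self) -> None:
        # Call this at the start of each new user question.
        self.calls = 0
        self.seen.clear()

    def search(self, query: str) -> str:
        key = self._normalize(query)
        # Refuse repeats and anything past the budget without hitting the endpoint.
        if key in self.seen or self.calls >= self.max_calls:
            return self.STOP_MESSAGE
        self.seen.add(key)
        self.calls += 1
        result = self.search_fn(query)
        return result if result.strip() else "No documents found"


def fake_search(query: str) -> str:
    # Stand-in for custom_search_endpoint(); knows only about cloud computing.
    if "cloud" in query.lower():
        return "Cloud computing delivers resources on demand."
    return ""
```

This only catches exact repeats after normalization; reworded queries still count against `max_calls`, which is what bounds the total amount of text sent to the LLM.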
Kkkassini changed the title from "Custom RAG tool kept being called and ended up getting too many tokens error" to "Custom RAG tool kept being called and as a result llm getting too many tokens error" on Jan 16, 2025
When using the provided RAG everything looks good, e.g.:
```python
from typing import List, Optional

# Library imports, db_url and embedder are defined elsewhere (not shown)
knowledge_base = PDFKnowledgeBase(
    path="/xxx/",
    vector_db=PgVector2(schema="xx", collection="xx", db_url=db_url, embedder=embedder),
)
model = OpenAIChat(
    id="casperhansen/llama-3.3-70b-instruct-awq",
    base_url="xxx",
    api_key="xxx",
    model="openai/casperhansen/llama-3.3-70b-instruct-awq",
    temperature=0.3,
)
knowledge_base.load(upsert=True)
storage = PgAssistantStorage(table_name="pdf_assistant", db_url=db_url)


def pdf_assistant(new: bool = False, user: str = "user"):
    run_id: Optional[str] = None
    if not new:
        # Resume the most recent run for this user, if any
        existing_run_ids: List[str] = storage.get_all_run_ids(user)
        if len(existing_run_ids) > 0:
            run_id = existing_run_ids[0]
```
But when I use a custom RAG tool like
```python
class CustomToolkit(Toolkit):
    def __init__(self):
        super().__init__(name="xxx")
        self.register(self.custom_search_endpoint)


def test():
    agent = Agent(
        model=model,
        tools=[CustomToolkit()],
        show_tool_calls=True,
        add_context=False,
        markdown=False,
    )
```