GraphRAG performance enhancements #924
Conversation
Signed-off-by: Rita Brugarolas <[email protected]>
for more information, see https://pre-commit.ci
…o directly run build communities Signed-off-by: Rita Brugarolas <[email protected]>
for more information, see https://pre-commit.ci
Signed-off-by: Rita Brugarolas <[email protected]>
Signed-off-by: Rita Brugarolas <[email protected]>
for more information, see https://pre-commit.ci
Signed-off-by: Rita Brugarolas <[email protected]>
Signed-off-by: rbrygaro <[email protected]>
for more information, see https://pre-commit.ci
There are a lot of dataprep backends, and neo4j/llama_index is not the default one used in the docker compose files and Helm charts. Do the ones used by default also have a similar bottleneck?
@eero-t thanks for commenting. Although there are other dataprep backends, their functionality is very different from this microservice. This dataprep models the Microsoft GraphRAG (https://github.com/microsoft/graphrag) dataprep backend, which performs entity/relationship extraction (using an LLM), builds a graph, clusters the graph nodes to generate communities, and generates community summaries; those are later retrieved by the retriever, which answers a query by producing a final answer from partial answers. For "toy datasets" these bottlenecks are negligible, but for large datasets the slowdown is significant.
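For orientation only, a minimal sketch of that dataprep flow using llama-index's property-graph API (this is not the microservice code; LLM/embedding configuration is omitted, credentials and URLs are placeholders, and parameter names may vary by llama-index version):

```python
# Illustrative GraphRAG-style dataprep flow with llama-index (not the PR's actual code).
from llama_index.core import Document, PropertyGraphIndex
from llama_index.graph_stores.neo4j import Neo4jPropertyGraphStore

# Neo4j-backed property graph store (placeholder credentials/URL).
graph_store = Neo4jPropertyGraphStore(
    username="neo4j", password="password", url="bolt://localhost:7687"
)

# 1) LLM-based entity/relationship extraction and graph construction.
#    An LLM and embedding model must be configured (omitted here).
index = PropertyGraphIndex.from_documents(
    [Document(text="... raw corpus text ...")],
    property_graph_store=graph_store,
)

# 2) Community detection and summarization follow: the graph nodes are clustered
#    into communities and an LLM summarizes each one; at query time the retriever
#    turns the relevant summaries into partial answers and reduces them into a
#    final answer.
```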
Signed-off-by: rbrygaro <[email protected]>
Signed-off-by: rbrygaro <[email protected]>
Signed-off-by: rbrygaro <[email protected]>
Signed-off-by: rbrygaro <[email protected]>
for more information, see https://pre-commit.ci
Thanks @rbrugaro
My comments and change requests are mostly generic; feel free to challenge me 😄
Hi @rbrugaro, could you please resolve the comments? Thanks.
Do you mean the model initialization of the TGI/TEI service? If yes, I suppose the LLM model initialization happens inside the TGI/TEI service, so it should be performed only once. Both dataprep and retrieval only refer to that service instead of creating a local one. Is GraphRAG enabled in ChatQnA? Does the dataprep (GraphRAG) share the same TGI/TEI service (docker container) with the ChatQnA LLM?
@xiguiw sorry for the confusion and thanks for the comments. I updated the PR description, maybe it is clearer now. I was referring to the initialization of the GraphStore. The inference microservices are deployed and initialized only once; the dataprep and retriever code just sends endpoint requests to them through the llama-index API.
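A minimal sketch of the pattern being described, assuming llama-index's Neo4jPropertyGraphStore (the refresh_schema flag and other parameter names are assumptions and may differ by version; handler names are illustrative):

```python
# Illustrative only: create the graph store once at startup and reuse it per request.
from llama_index.graph_stores.neo4j import Neo4jPropertyGraphStore

# Done once at module import / service startup, not inside the request handlers.
graph_store = Neo4jPropertyGraphStore(
    username="neo4j",
    password="password",
    url="bolt://localhost:7687",  # placeholder endpoint
    refresh_schema=False,         # assumed flag: skip costly schema refreshes on large graphs
)

def handle_dataprep_request(payload: dict) -> None:
    # Hypothetical handler: reuses the shared store instead of constructing
    # a new one on every call, avoiding repeated connection/schema setup.
    ...

def handle_retrieval_request(query: str) -> None:
    # Hypothetical handler: likewise reuses the shared store.
    ...
```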
LGTM
Signed-off-by: rbrygaro <[email protected]>
Verified test_retrievers_neo4j.sh OK. Dataprep still pre-refactor (not yet merged in main). Signed-off-by: rbrygaro <[email protected]>
for more information, see https://pre-commit.ci
Signed-off-by: rbrygaro <[email protected]>
lgtm
Issue: When the property graph store gets filled (~12K nodes, 15K relationships), insertion time in dataprep gets slow. Extraction + insertion starts at ~30 sec and, once the store is filled (~12K nodes, 15K relationships), grows to ~800 sec. The perf bottleneck is this Cypher call in llama-index used to do the node upsert: https://github.com/run-llama/llama_index/blob/795bebc2bad31db51b854a5c062bedca42397630/llama-index-integrations/graph_stores/llama-index-graph-stores-neo4j/llama_index/graph_stores/neo4j/neo4j_property_graph.py#L334

Performance optimizations in this PR:
1. Move the neo4j GraphStore initialization out of the dataprep and retrieve functions so it is only performed once, at the beginning.
2. Disable schema_refresh of the neo4j graph when not necessary, because for a large graph this is very slow.
3. Switch to the OpenAILike class from llama-index to work with vllm or tgi endpoints without code changes (only docker compose.yaml changes).
4. Add concurrency and batching for generating community summaries and generating answers from summaries.

---------

Signed-off-by: Rita Brugarolas <[email protected]>
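As a rough illustration of optimizations 3 and 4 above (not the PR's actual code), an OpenAI-compatible client pointed at a vLLM or TGI endpoint plus bounded-concurrency community summarization could look like the following; the endpoint URL, model name, and helper names are placeholders:

```python
# Illustrative sketch: OpenAI-compatible client + concurrent community summarization.
import asyncio
from llama_index.llms.openai_like import OpenAILike

# One client works against either a vLLM or a TGI OpenAI-compatible endpoint;
# switching backends is a compose.yaml change, not a code change.
llm = OpenAILike(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder: whatever the endpoint serves
    api_base="http://vllm-service:8008/v1",       # placeholder endpoint URL
    api_key="fake",
    is_chat_model=True,
)

async def summarize_community(text: str, sem: asyncio.Semaphore) -> str:
    # Bound the number of in-flight LLM requests instead of summarizing serially.
    async with sem:
        resp = await llm.acomplete(f"Summarize the following community:\n{text}")
        return resp.text

async def summarize_all(communities: list[str], max_concurrency: int = 8) -> list[str]:
    sem = asyncio.Semaphore(max_concurrency)
    return await asyncio.gather(*(summarize_community(c, sem) for c in communities))
```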
Performance optimizations in this PR: