
GraphRAG performance enhancements #924

Merged: 25 commits, Jan 19, 2025

Conversation

@rbrugaro (Collaborator) commented Nov 20, 2024

Issue: as the property graph store fills up (~12K nodes, 15K relationships), insertion time in dataprep degrades.
Extraction + insertion starts at ~30 sec and grows to ~800 sec once the store reaches that size.
The perf bottleneck is this Cypher call that llama-index uses for node upserts:
https://github.com/run-llama/llama_index/blob/795bebc2bad31db51b854a5c062bedca42397630/llama-index-integrations/graph_stores/llama-index-graph-stores-neo4j/llama_index/graph_stores/neo4j/neo4j_property_graph.py#L334

Performance optimizations in this PR (a code sketch follows the list):

  1. Move the neo4j GraphStore initialization out of the dataprep and retrieve functions so it is performed only once, at the beginning.
  2. Disable schema refresh of the neo4j graph when it is not necessary, because it is very slow for large graphs.
  3. Switch to the OpenAILike class from llama-index so vllm or tgi endpoints work without code changes (only docker compose.yaml changes).
  4. Add concurrency and batching when generating community summaries and when generating answers from those summaries.
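
A minimal sketch of items 1-3 (illustrative only: the environment variable names, defaults, and model id are assumptions, not the PR's exact code):

```python
# Sketch only: one-time, module-level initialization (env var names assumed).
import os

from llama_index.graph_stores.neo4j import Neo4jPropertyGraphStore
from llama_index.llms.openai_like import OpenAILike

# (1) Build the graph store once at startup instead of inside every
# dataprep/retrieve call, and (2) skip the schema refresh, which is
# very slow on large graphs.
graph_store = Neo4jPropertyGraphStore(
    username=os.getenv("NEO4J_USERNAME", "neo4j"),
    password=os.getenv("NEO4J_PASSWORD", "password"),
    url=os.getenv("NEO4J_URL", "bolt://localhost:7687"),
    refresh_schema=False,  # avoid the expensive schema refresh
)

# (3) OpenAILike speaks the OpenAI-compatible API that both vLLM and TGI
# expose, so swapping backends is a docker compose change, not a code change.
llm = OpenAILike(
    model=os.getenv("LLM_MODEL_ID", "meta-llama/Meta-Llama-3-8B-Instruct"),
    api_base=os.getenv("LLM_ENDPOINT", "http://localhost:8000/v1"),
    api_key="unused",  # vLLM/TGI ignore it, but the client requires a value
    is_chat_model=True,
)
```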

@joshuayao joshuayao linked an issue Dec 12, 2024 that may be closed by this pull request
@eero-t (Contributor) commented Dec 18, 2024

There are a lot of dataprep backends, and neo4j/llama is not the default one used in the docker compose files and Helm charts.

Do the ones used by default also have a similar bottleneck?

@joshuayao joshuayao added this to the v1.2 milestone Jan 7, 2025
@rbrugaro (Collaborator, Author) commented Jan 7, 2025

> There are a lot of dataprep backends, and neo4j/llama is not the default one used in the docker compose files and Helm charts.
>
> Do the ones used by default also have a similar bottleneck?

@eero-t thanks for commenting. Although there are other dataprep backends, their functionality is very different from this microservice. This dataprep models the Microsoft GraphRAG (https://github.com/microsoft/graphrag) dataprep backend: it performs entity/relationship extraction (using an LLM), builds a graph, clusters the graph nodes to generate communities, and generates community summaries; the retriever later fetches those summaries to answer a query by composing a final answer from partial answers.

For "toy datasets" these bottlenecks are negligible, but for large datasets the slowdown is significant.

@rbrugaro rbrugaro marked this pull request as ready for review January 11, 2025 00:45
@rbrugaro rbrugaro added WIP r1.2 and removed r1.2 labels Jan 11, 2025
@rbrugaro rbrugaro removed the WIP label Jan 13, 2025
@rbrugaro rbrugaro added the r1.2 label Jan 13, 2025
@ashahba (Collaborator) left a comment


Thanks @rbrugaro
My comments and change requests are mostly generic, so feel free to challenge me 😄

@joshuayao (Collaborator) commented

Hi @rbrugaro, could you please resolve the comments? Thanks.

@xiguiw (Collaborator) commented Jan 15, 2025

> Issue: When property graph store gets filled (~12K nodes, 15K relationships) insertion time in dataprep gets slow. Extraction + insertion starts at ~30 sec and once it gets filled grows to ~800 sec. Perf bottleneck this cypher call in llama-index to do node upsert: https://github.com/run-llama/llama_index/blob/795bebc2bad31db51b854a5c062bedca42397630/llama-index-integrations/graph_stores/llama-index-graph-stores-neo4j/llama_index/graph_stores/neo4j/neo4j_property_graph.py#L334
>
> WIP solution:
>
>   1. mode initialization out of detaprep and retrieve function so only performed once
>   2. ...

Do you mean the model initialization of the TGI/TEI service? If yes, I suppose the LLM model initialization happens inside the TGI/TEI service, so the model should be initialized only once; both dataprep and retrieval just refer to that service instead of creating a local one.
The dependency should be captured in the documentation and the docker-compose.yml file.

Is GraphRAG enabled in ChatQnA? Does the dataprep (GraphRAG) share the same TGI/TEI service (docker container) with the ChatQnA LLM?

@rbrugaro (Collaborator, Author) commented

> Do you mean the model initialization of the TGI/TEI service? [...] Does the dataprep (GraphRAG) share the same TGI/TEI service (docker container) with the ChatQnA LLM?

@xiguiw sorry for the confusion, and thanks for the comments. I updated the PR description; hopefully it is clearer now. I was referring to the initialization of the GraphStore. The inference microservices are deployed and initialized just once; the dataprep and retriever code only send endpoint requests to them through the llama-index API.
I am not sure what you mean by GraphRAG and ChatQnA sharing the same docker containers. This yaml defines the deployment: https://github.com/rbrugaro/GenAIComps/blob/GRAG_1.2/comps/dataprep/neo4j/llama_index/neo4j_llama_index.yaml

@lkk12014402 (Collaborator) left a comment


LGTM

lkk12014402 and others added 4 commits January 15, 2025 13:27
Verified test_retrievers_neo4j.sh OK
Dataprep still pre-refactor (not merged yet in main)

Signed-off-by: rbrygaro <[email protected]>
@letonghan (Collaborator) left a comment


lgtm

@chensuyue chensuyue dismissed ashahba’s stale review January 19, 2025 12:58

change request met

@chensuyue chensuyue merged commit db8acd5 into opea-project:main Jan 19, 2025
21 checks passed
smguggen pushed a commit to opea-aws-proserve/GenAIComps that referenced this pull request Jan 23, 2025

Successfully merging this pull request may close these issues.

[Feature] GraphRAG perf improvement