[BUG] Fix CAGRA graph optimization bug #565
Conversation
It seems too severe to abort the optimization process if there is even one invalid node, so I updated the code so that it does not abort as long as the pruned graph can still be generated, even if there are invalid nodes in the initial kNN graph.
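To illustrate the idea, here is a minimal, hypothetical sketch (not the actual cuVS code) of the relaxed check: instead of aborting on the first invalid node, the optimization would only abort if some row of the initial kNN graph no longer has enough valid neighbors to produce a pruned graph of the requested degree. The function and parameter names are illustrative assumptions.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical illustration of the relaxed abort condition: an entry is
// treated as invalid if it is out of range, and a row only causes an abort
// when it cannot supply enough valid neighbors for the output degree.
void check_can_prune(const std::vector<uint32_t>& knn_graph,  // row-major [n_rows x input_degree]
                     std::size_t n_rows,
                     std::size_t input_degree,
                     std::size_t output_degree)
{
  for (std::size_t i = 0; i < n_rows; i++) {
    std::size_t valid = 0;
    for (std::size_t k = 0; k < input_degree; k++) {
      if (knn_graph[i * input_degree + k] < n_rows) { valid++; }
    }
    if (valid < output_degree) {
      throw std::runtime_error("a row has too few valid neighbors to build the pruned graph");
    }
  }
}
```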
@anaruse Can you review the update? It is related to merging the pruned graph and MST optimization edges.
Approving, but waiting to merge until you confirm it's ready @enp1s0
@cjnolet
@anaruse Can you review the code again?
It looks good to me.
… params (#569) This PR updates the default chunk size of the CAGRA graph extension and adds a knob to control the batch size of the CAGRA searches run inside it, for better throughput. The default chunk size was set to 1 in the current implementation because of a potential low-recall problem with large chunk sizes: no edges are made between nodes in the same chunk. However, as I investigated, the low-recall problem rarely occurs even with large chunk sizes.

# Search performance

The performance was measured after applying the bugfix #565.

## degree = 32

![extend-ir0 9-degree32](https://github.com/user-attachments/assets/a5bb2fb6-8c12-49ad-b96a-1b384d79a96b)

(I don't know why the performance is unstable on NYTimes.)

## degree = 64

![extend-ir0 9-degree64](https://github.com/user-attachments/assets/8e926e1c-d772-4682-9419-9cc027f09d3f)

So this PR increases the default chunk size to the number of new dataset vectors for better throughput. It also makes public a knob to control the search batch size in the `extend` function, to balance throughput against memory consumption.

Authors:
- tsuki (https://github.com/enp1s0)
- Corey J. Nolet (https://github.com/cjnolet)

Approvers:
- Corey J. Nolet (https://github.com/cjnolet)
- Tamas Bela Feher (https://github.com/tfeher)

URL: #569
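For context, below is a rough sketch of how the extension API described above might be called from the cuVS C++ API. The struct and field names (`extend_params`, `max_chunk_size`) and the overload used here are assumptions based on this description and may not match the released API exactly; the new search batch-size knob added in #569 is deliberately not named here.

```cpp
#include <cuvs/neighbors/cagra.hpp>

#include <raft/core/device_mdspan.hpp>
#include <raft/core/resources.hpp>

// Sketch only: extend an existing CAGRA index with additional vectors.
// Parameter names are assumptions based on the PR description above.
void extend_index(raft::resources const& res,
                  raft::device_matrix_view<const float, int64_t> additional_dataset,
                  cuvs::neighbors::cagra::index<float, uint32_t>& idx)
{
  cuvs::neighbors::cagra::extend_params params;
  // 0 is assumed to mean "use the default"; per this PR, the default chunk size
  // is now the number of new dataset vectors instead of 1.
  params.max_chunk_size = 0;

  cuvs::neighbors::cagra::extend(res, params, additional_dataset, idx);
}
```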
/merge
The merging process of the pruned and reverse-edge (revedge) graphs is inappropriate, resulting in neighbor index duplication. This PR fixes the issue and adds validation of the graph indices at the end of the `graph::optimize` function.

The simplest way to fix this issue is 1151511, but it ignores the memory space optimization introduced in rapidsai/raft@12480cf. The code changes in this PR apply that memory space optimization to the simplest solution, removing the memory space for the pruned graph (`pruned_graph`).

This PR also fixes a problem where throwing an exception from inside an OpenMP loop does not work as expected.
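To make the last two points concrete, here is a minimal, self-contained sketch (not the actual cuVS code) of a post-optimization validation pass. It checks that every row of the final graph contains only in-range, non-duplicated neighbor indices, and it illustrates the usual way to report an error from inside an OpenMP loop: record it under a critical section and throw once after the parallel region, since an exception cannot propagate out of an OpenMP construct.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <unordered_set>
#include <vector>

// Validate an optimized graph stored row-major as [n_rows x degree].
// Every neighbor must be in range and appear at most once per row.
void validate_graph(const std::vector<uint32_t>& graph,
                    std::size_t n_rows,
                    std::size_t degree)
{
  bool invalid = false;
  std::string message;

#pragma omp parallel for
  for (std::int64_t i = 0; i < static_cast<std::int64_t>(n_rows); i++) {
    std::unordered_set<uint32_t> seen;
    for (std::size_t k = 0; k < degree; k++) {
      const uint32_t neighbor = graph[static_cast<std::size_t>(i) * degree + k];
      if (neighbor >= n_rows || !seen.insert(neighbor).second) {
        // Throwing here would terminate the program because the exception
        // cannot leave the OpenMP region; record the error instead.
#pragma omp critical
        {
          invalid = true;
          message = "invalid or duplicated neighbor index in row " + std::to_string(i);
        }
      }
    }
  }

  // Re-raise the recorded error on the host thread, outside the parallel region.
  if (invalid) { throw std::runtime_error(message); }
}
```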
This is a comparison of search performance between the graph optimization with the bug (branch-25.02) and with the fix (fix-cagra-graph-optimization-bug). Each point represents the search performance for itopk = 32, 64, ..., 512. With this PR, the recall becomes slightly higher when searching with a small itopk size. Because the number of nodes duplicated by this bug is not large (typically fewer than 100 in a 1M-vector dataset), the performance is almost the same when searching with a large itopk size, where the graph is traversed sufficiently.