Avoid limited memory adaptor issue in balanced KMeans #2570
base: branch-25.04
…to fix-sparse-utilities
@csadorf we will also want to make sure we port this over to cuVS, since the kmeans in raft will be getting ported shortly after GTC.
We have a couple of problems with kmeans_balanced here
rmm::mr::managed_memory_resource managed_memory;
rmm::device_async_resource_ref device_memory = resource::get_workspace_resource(handle);
rmm::device_async_resource_ref current_device_resource = rmm::mr::get_current_device_resource();
rmm::device_async_resource_ref workspace_resource =
This is one of those rare cases where we do indeed need to explicitly allocate rmm::mr::managed_memory_resource (the removed TODO comment is actually incorrect).
The need to use managed memory here has nothing to do with the memory limit and the user choice, but rather is a part of the algorithm. We use the managed_memory variable across this file for not-so-big allocations that are accessed by both device and host (see, for example, the build_fine_clusters function above). Hence, using the device-only memory simply breaks the algorithm here.
Addressed.
However, I'm wondering whether we should generally be using make_managed_vector instead, then. @achirkin Was there a specific motivation for the use of rmm::uvectors instead?
No, only historic reasons: the balanced kmeans code arrived earlier than these managed helpers.
Thanks for clarifying.
@cjnolet Considering that this code has moved to cuVS anyways I assume there is no point in refactoring this, is there?
No reason to refactor, we just need to fix any issues that are blocking cuML UMAP ATM
Thanks for the extra comment, LGTM
Keeping this in draft mode until #2541 is merged.
Prepared in rapidsai/cuvs#659.
Approving, as this code is deprecated anyways (and will be removed once cuML is using cuVS for this).
Thanks so much @csadorf!
get_large_workspace_resource instead of get_workspace_resource
Based on and merge after #2541 (diff)
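For readers unfamiliar with why a size-limited workspace can be a problem for large temporary buffers, here is a minimal sketch of the general idea using only the C++17 standard library (std::pmr). It is an analogy, not RMM or RAFT code: limiting_resource and try_allocate are hypothetical names introduced for illustration, and the byte sizes are arbitrary. It shows that an allocation larger than the budget fails when routed through the limited resource, but succeeds when routed to an unrestricted one, which is the kind of situation the switch from get_workspace_resource to get_large_workspace_resource is meant to avoid.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <new>

// Hypothetical stand-in for a memory-limiting adaptor (NOT the RMM class):
// forwards to an upstream resource, but refuses to hand out more than
// `limit` bytes in total.
class limiting_resource : public std::pmr::memory_resource {
 public:
  limiting_resource(std::pmr::memory_resource* upstream, std::size_t limit)
    : upstream_(upstream), limit_(limit) {}

  // Bytes currently allocated through this resource.
  std::size_t used() const { return used_; }

 private:
  void* do_allocate(std::size_t bytes, std::size_t alignment) override {
    // Reject requests that would exceed the remaining budget.
    if (bytes > limit_ - used_) { throw std::bad_alloc(); }
    void* p = upstream_->allocate(bytes, alignment);
    used_ += bytes;
    return p;
  }

  void do_deallocate(void* p, std::size_t bytes, std::size_t alignment) override {
    assert(bytes <= used_);
    upstream_->deallocate(p, bytes, alignment);
    used_ -= bytes;
  }

  bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override {
    return this == &other;
  }

  std::pmr::memory_resource* upstream_;
  std::size_t limit_;
  std::size_t used_{0};
};

// Returns true if `bytes` can be allocated from `mr` (and frees them again).
bool try_allocate(std::pmr::memory_resource& mr, std::size_t bytes) {
  try {
    void* p = mr.allocate(bytes);
    mr.deallocate(p, bytes);
    return true;
  } catch (const std::bad_alloc&) {
    return false;
  }
}
```

With a 1024-byte limiting_resource wrapped around std::pmr::new_delete_resource(), try_allocate succeeds for 512 bytes and fails for 4096 bytes, while the same 4096-byte request succeeds when made directly against the upstream resource.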