Concurrency optimization for native graph loading #2345
Conversation
Please add an entry in the changelog.
Hi @Gankris96, thank you for the PR.
Hi @0ctopus13prime - yes, I am working on getting the benchmarking numbers. Primarily trying to test it on a remote-store-backed index to see the performance gains. Will update with benchmarking numbers soon.
@Gankris96 please fix the failing CIs.
Great, this is what was expected. I think the first query shows the improvement, because that is when all the graph files are downloaded from the remote store; once they are downloaded, the gains drop. This fix is really for the first query, or for queries whose graph files have been evicted from disk.
Could we make our unit and integration tests more robust with the code changes we are making?
1. For the case where isIndexGraphFileOpened() is true, ensure that openVectorIndex does nothing if the index graph file is already opened.
2. Verify that the method extracts the vector file name correctly and proceeds to load the index without errors.
3. Pass an invalid cache key that does not contain a vector file name and verify that the method throws an IllegalStateException with the correct error message.
4. Can we mock the directory.openInput method to return a valid IndexInput and verify that readStream and indexInputWithBuffer are initialized correctly? Can we verify that readStream.seek(0) is called successfully?
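For point 3, a minimal self-contained sketch of the kind of validation being tested. The class name, helper name, and the `<fileName>@<suffix>` key format here are assumptions made purely for illustration; they are not the plugin's actual key format or code:

```java
// Hypothetical sketch of the cache-key validation that point 3 exercises.
// The real NativeMemoryEntryContext derives the vector file name from its
// cache key; this "<fileName>@<suffix>" format is assumed for illustration.
final class VectorFileNameExtractor {
    static String extractVectorFileName(String cacheKey) {
        int at = cacheKey.indexOf('@');
        if (at <= 0) {
            // the test would assert on this exception and its message
            throw new IllegalStateException("No vector file name in cache key: " + cacheKey);
        }
        return cacheKey.substring(0, at);
    }
}
```

A test along these lines would pass a key such as `"badkey"` and assert that an `IllegalStateException` with the expected message is thrown.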
```
@@ -350,7 +352,11 @@ public NativeMemoryAllocation get(NativeMemoryEntryContext<?> nativeMemoryEntryC
        return result;
    }
} else {
    return cache.get(nativeMemoryEntryContext.getKey(), nativeMemoryEntryContext::load);
    // open graphFile before load
    try (nativeMemoryEntryContext) {
```
There could be a case where multiple threads trigger eviction and graph loading concurrently, leading to temporary spikes in memory usage. Can we think of using bounded concurrency for eviction and graph-loading tasks with thread pools?
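The bounded-concurrency idea suggested here could be sketched with a fixed-size pool. This is not the plugin's code; the class and the placeholder load body are illustrative only:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch: cap how many graph files load at once so that
// concurrent eviction + loading cannot spike memory usage unboundedly.
final class BoundedGraphLoader {
    private final ExecutorService pool;

    BoundedGraphLoader(int maxConcurrentLoads) {
        // extra load requests queue up instead of all running at once
        this.pool = Executors.newFixedThreadPool(maxConcurrentLoads);
    }

    Future<String> load(String graphFile) {
        return pool.submit(() -> {
            // placeholder for the real native graph load
            return "loaded:" + graphFile;
        });
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

With a pool of size N, at most N graph files are being materialized off-heap at any moment, which bounds the temporary memory spike the comment describes.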
Will take it up in a separate issue.
This is a fair callout. I think we need to improve our cache operations in general.
I think the problem we have right now is that the cache operations can be async in nature (cleanup, eviction), whereas we use the cache as a 1:1 reference for the off-heap memory in use.
We can create a tracking issue and deal with this separately.
```
return cache.get(nativeMemoryEntryContext.getKey(), nativeMemoryEntryContext::load);
// open graphFile before load
try (nativeMemoryEntryContext) {
    nativeMemoryEntryContext.openVectorIndex();
```
Can we avoid the case where the graph is partially loaded or an error occurs during loading, which ends up leaving the cache in an inconsistent state? Can we ensure atomicity in graph loading and only put the entry in the cache if loading succeeds?
If there is an error in graph loading, then the entry will not be in the cache. What would be the scenario where the cache ends up in an inconsistent state?
@Gankris96 Can we wrap this call behind the same lock-based logic above?
Just to make sure we do not open the same index files concurrently in two different threads.
Wrapping this within a lock still seems to fail some bwc search tests, where we end up getting incorrect results. Even doing so would not really help, because we don't solve the underlying problem of multiple graph files getting loaded at the same time, since the load is not synchronized anymore.
This probably requires revisiting in a separate issue where we refactor the whole cache strategy, imo.
Please create an issue so that we can track it.
I guess the bwc failure was a different issue, unrelated to this. I did add back the locking logic for this as well. It seems to work fine, so we can keep this in.
@Gankris96 can you please fix the CIs
@Vikasht34 @kotwanikunal @navneet1v have updated with some additional locking around
Thanks for addressing the comments, looks good to me.
@navneet1v @jmazanec15 please take a look and approve if all looks good.
Looks good overall. Some comments related to code maintenance.
```
ReentrantLock indexFileLock = indexLocks.computeIfAbsent(key, k -> new ReentrantLock());
indexFileLock.lock();
nativeMemoryEntryContext.openVectorIndex();
indexFileLock.unlock();
```
Can we please have a private method openIndex() here, so this is taken care of as and when the code changes?
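One possible shape for that helper, with unlock moved into a finally block so an exception in openVectorIndex cannot leave the per-key lock held. The names follow the snippet above, but the surrounding class and the Runnable parameter are a sketch, not the actual plugin code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Sketch only: encapsulates the lock/open/unlock sequence in one private
// method. unlock() sits in a finally block so the per-key lock is always
// released, even if opening the vector index throws.
final class IndexOpener {
    private final Map<String, ReentrantLock> indexLocks = new ConcurrentHashMap<>();

    void openIndex(String key, Runnable openVectorIndex) {
        ReentrantLock indexFileLock = indexLocks.computeIfAbsent(key, k -> new ReentrantLock());
        indexFileLock.lock();
        try {
            openVectorIndex.run();
        } finally {
            indexFileLock.unlock();
        }
    }
}
```

Besides centralizing the sequence, this fixes a latent hazard in the inline version: if openVectorIndex() throws between lock() and unlock(), the lock for that key would never be released.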
```
// recheck if another thread already loaded this entry into the cache
result = cache.getIfPresent(key);
if (result != null) {
    accessRecencyQueue.remove(key);
    accessRecencyQueue.addLast(key);
    return result;
}
```
A private method for this as well? There will be an additional null check, but every time get
returns, the access recency should be updated.
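The suggested helper could look roughly like this. The class is a simplified stand-in for the real cache manager (which uses a Guava cache and different field types); it only illustrates the "every hit refreshes recency" invariant:

```java
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;

// Simplified sketch: a cache lookup that always refreshes the key's position
// in the access-recency queue, so callers cannot forget the update.
final class RecencyTrackingCache<V> {
    private final Map<String, V> cache = new HashMap<>();
    private final Deque<String> accessRecencyQueue = new LinkedList<>();

    synchronized V getIfPresentAndTouch(String key) {
        V result = cache.get(key);
        if (result != null) {
            // move the key to the most-recently-used end of the queue
            accessRecencyQueue.remove(key);
            accessRecencyQueue.addLast(key);
        }
        return result;
    }

    synchronized void put(String key, V value) {
        cache.put(key, value);
        accessRecencyQueue.remove(key);
        accessRecencyQueue.addLast(key);
    }

    synchronized String leastRecentlyUsedKey() {
        return accessRecencyQueue.peekFirst();
    }
}
```

The null check is the cost the comment mentions, but it keeps the recency bookkeeping in one place instead of scattered at every call site.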
Overall looks good to me. I think I agree with @shatejas's comments. Please resolve them so that we can ship this change.
Description
Fixes #2265
Refactors the graph load into a 2-step approach, detailed here: #2265 (comment)
This helps move the opening of the indexInput file outside the synchronized block, so that the graph file can be downloaded in parallel even while the graph load and createIndexAllocation remain inside the synchronized block.
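The two-step shape described above can be sketched as follows. Method names echo the PR's terminology, but the class, lock, and string placeholders are illustrative only:

```java
// Sketch of the 2-step load: step 1 (opening/downloading the graph file) runs
// outside the synchronized section, so multiple threads can fetch different
// graph files in parallel; step 2 (native allocation) stays serialized.
final class TwoStepLoad {
    private final Object loadLock = new Object();

    String get(String key) {
        // Step 1: open the vector index file; safe to run concurrently
        String handle = openVectorIndex(key);
        // Step 2: create the index allocation under the lock
        synchronized (loadLock) {
            return load(handle);
        }
    }

    private String openVectorIndex(String key) {
        // placeholder for the file open / remote download
        return "opened:" + key;
    }

    private String load(String handle) {
        // placeholder for the native allocation
        return "allocated:" + handle;
    }
}
```

The gain comes entirely from step 1: the slow, parallelizable I/O no longer sits inside the critical section that serializes all loads.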
Related Issues
Resolves #2265
Check List
Commits are signed per the DCO using --signoff.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.