XorBinaryFuse8 failed to construct on some data, at around 11511 entries #32
Comments
This very much looks like FastFilter/xorfilter#23
I also found other cases where construction requires more retries than I would expect... One way to resolve the issue is to change the constants in calculateSegmentLength and calculateSizeFactor; I'm trying that now.
I found constants that are more reliable...
Test case:
We would then have a maximum of around 30 retries at size 2100 .. 2116 (segment length), but no more than that. This is roughly in the middle of the range for this segment length; we have 9 segments. @lemire I wonder what your opinion is here...
@thomasmueller I tested both the C version as well as the Go version with
We could update our constants.
Ah, I think the Java version is worse because it uses a different "reduce" function (which uses only 32 bits, instead of 64 bits as in the C / C++ version). I will see if this can be fixed. Then I will try to reduce the number of retries.
One of the challenges is that Java still doesn't support an "unsigned 64-bit multiply high" (see https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8188044), so we can't have exactly the same code in Java and C / C++ if we want to keep the current (high) performance. However, I think I found constants that work well. Now I need to write a proper test case and clean this up.
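For readers following along, here is a minimal sketch of what a 64-bit "reduce" could look like on Java versions without a built-in unsigned multiply-high. It is not the code from this repository: the class name `MulHigh` and the method names `reduce64` and `reduce32` are made up for illustration. The idea is to split both operands into 32-bit halves, which gives the same result as the C / C++ 128-bit multiply, at the cost of a few extra multiplications.

```java
final class MulHigh {
    private static final long MASK32 = 0xFFFFFFFFL;

    // Upper 64 bits of the unsigned 128-bit product a * b,
    // computed from four 32x32 -> 64 bit partial products.
    static long unsignedMultiplyHigh(long a, long b) {
        long aLo = a & MASK32, aHi = a >>> 32;
        long bLo = b & MASK32, bHi = b >>> 32;
        long loLo = aLo * bLo;
        long hiLo = aHi * bLo;
        long loHi = aLo * bHi;
        long hiHi = aHi * bHi;
        // Carry out of the middle 32-bit column; the sum is below 3 * 2^32,
        // so it cannot overflow a long.
        long carry = ((loLo >>> 32) + (hiLo & MASK32) + (loHi & MASK32)) >>> 32;
        return hiHi + (hiLo >>> 32) + (loHi >>> 32) + carry;
    }

    // Hypothetical 64-bit reduce: maps a 64-bit hash to [0, n) using all 64 bits.
    static long reduce64(long hash, long n) {
        return unsignedMultiplyHigh(hash, n);
    }

    // Hypothetical 32-bit reduce: only 32 bits of the hash influence the result.
    static int reduce32(int hash, int n) {
        return (int) (((hash & MASK32) * (n & MASK32)) >>> 32);
    }
}
```

With `reduce32`, only 32 bits of the hash decide the position, whereas `reduce64` uses the full hash. On Java 18 and later, `Math.unsignedMultiplyHigh` could replace the hand-written version.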
@thomasmueller The link you offer suggests that this was fixed in Java 18.
@lemire you are right, it is fixed in Java 18 (I initially didn't understand this part, sorry), but I think we should support older versions of Java as well. I made a mistake and used a different hash function to search for better constants... so I need to test again, and it will take a bit longer.
OK, I believe it is now fixed. The main problem was actually this:
That was in no way correct... I replaced it with
But I also changed the segment sizes a bit, because I found that construction then requires fewer retries. Now segments tend to be a bit smaller, especially for small sets. And now an IllegalArgumentException is thrown if construction fails.
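To illustrate the last point, here is a rough sketch of a bounded retry loop that throws an IllegalArgumentException once the attempts are exhausted. It is not the actual construction code: `BoundedRetry`, `findSeed`, `INITIAL_SEED`, and the seed update step are all assumptions made for this example, and the `tryBuild` callback stands in for one construction attempt with a given seed.

```java
import java.util.function.LongPredicate;

final class BoundedRetry {
    // Arbitrary starting seed, chosen only for this sketch.
    static final long INITIAL_SEED = 0x9E3779B97F4A7C15L;

    // Calls tryBuild with successive seeds until it reports success,
    // and returns the seed that worked. Gives up after maxAttempts.
    static long findSeed(LongPredicate tryBuild, int maxAttempts) {
        long seed = INITIAL_SEED;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (tryBuild.test(seed)) {
                return seed;
            }
            // Derive the next seed with a simple LCG step (illustrative choice).
            seed = seed * 6364136223846793005L + 1442695040888963407L;
        }
        throw new IllegalArgumentException(
                "construction failed after " + maxAttempts + " attempts");
    }
}
```

Failing loudly after a fixed number of attempts means the caller sees an exception instead of an endless loop or a silently broken filter.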
@konghuarukhr Let me know if this resolves the problem!
I don't find any bad case now. 👍
I find that it is not fixed; this is another construction failure:
will report:
Originally posted by @konghuarukhr in #31 (comment)