sa_search incompatible with int numpy array
#50
Comments
Could you try casting your input arrays as int8? In randint, just add
Thank you very much for your prompt reply. However, my use case requires the elements of the NumPy array to have a range of approximately 0 to 128k, which far exceeds the range of uint8. If there is a better way to handle this, please let me know. Thank you again.
I think this is an edge case I don't cover.
I can probably fix this in ~12h.
Thank you very much for your help. I am using the repository you developed to process a dataset after tokenization for a large language model. More specifically, it is https://arxiv.org/pdf/2401.17377 (Chapter 3), which seems to mention a similar approach of converting token IDs in the range 0-128k into some kind of numeral system. I wonder if it might be of any reference to you.
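One way to read the "numeral system" idea above is to write each token ID as a fixed number of bytes, so that a byte-oriented search can handle IDs well beyond 255. The sketch below is only an illustration under stated assumptions: the 4-byte width, the helper names `encode_ids` and `find_aligned`, and the use of `bytes.find` as a stand-in for a suffix-array lookup are all hypothetical and are not taken from this repository or from the linked paper. Big-endian encoding is chosen because it keeps byte-wise lexicographic order consistent with numeric order.

```python
import struct

WIDTH = 4  # bytes per token ID (assumption: every ID fits in an unsigned 32-bit integer)


def encode_ids(ids):
    """Encode a sequence of non-negative ints as big-endian, fixed-width bytes."""
    ids = [int(i) for i in ids]  # also accepts NumPy integer scalars
    return struct.pack(f">{len(ids)}I", *ids)


def find_aligned(haystack, needle):
    """Return token positions where `needle` occurs in `haystack`.

    Both arguments are byte strings produced by `encode_ids`. Matches that
    start in the middle of a token are discarded, because they straddle two
    IDs and do not correspond to a real token-level occurrence.
    """
    hits = []
    pos = haystack.find(needle)
    while pos != -1:
        if pos % WIDTH == 0:
            hits.append(pos // WIDTH)
        pos = haystack.find(needle, pos + 1)
    return hits


corpus = encode_ids([70000, 5, 70000, 5, 128000])
print(find_aligned(corpus, encode_ids([70000, 5])))
```

The alignment check matters for any byte-level index: with IDs `[1, 256]`, the bytes of `256` also appear one byte into the encoding of `1`, and only the match at a multiple of `WIDTH` is a genuine occurrence.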
See the new test:
It uses the undocumented […]. I want to leave this issue open, as the best solution would be to reimplement […].
Thank you for your time. I will test the performance difference between this interface and my simple Python implementation :)
Dear project developers,
Hello, I encountered an issue with the sa_search method in combination with numpy. The test code is as follows:
And I get the following error:
It seems that this method does not natively support searching int np.array. Currently, I am using the following alternative solution to achieve this. I hope to get some advice from you.
Lastly, please allow me to express my sincere respect for your valuable time once again.
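For readers who need a reference to compare results against, searching an integer sequence through a suffix array can be sketched in plain Python. This is not the workaround from the post above and not this library's implementation; the function names `build_suffix_array` and `find_occurrences` are hypothetical, and the quadratic construction is only suitable for small test inputs. A NumPy array can be passed after converting it with `.tolist()`.

```python
def build_suffix_array(seq):
    """Naive suffix array: sort start positions by the suffix they begin.

    Illustration only; real implementations use linear-time algorithms.
    """
    seq = list(seq)
    return sorted(range(len(seq)), key=lambda i: seq[i:])


def find_occurrences(seq, sa, pattern):
    """Return the sorted start positions of `pattern` in `seq`.

    Uses two binary searches over the suffix array `sa` to locate the block
    of suffixes whose first len(pattern) elements equal `pattern`.
    """
    seq, pattern = list(seq), list(pattern)
    m = len(pattern)

    # Lower bound: first rank whose truncated suffix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if seq[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first rank whose truncated suffix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if seq[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


tokens = [70000, 5, 70000, 5, 128000]
sa = build_suffix_array(tokens)
print(find_occurrences(tokens, sa, [70000, 5]))
```

Because Python compares lists of integers element by element, no restriction on the value range is needed here, which makes this a convenient correctness check for a faster byte-based search.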