License status? #2
Thank you for this work; it is a great demonstration of the reproducible summation algorithms.
The license of this source code doesn't seem to be specified. Would the author consider making this work available under a BSD / MIT / Apache license?
Thanks

Comments
Any chance of an update here? I'd love to look into porting these algorithms to a GPU, and this looks like a great starting point. Any derived work would remain open source, if it matters.
@maddyscientist - I've moved this onto the MIT License. Note that some code is drawn from https://github.com/peterahrens/ReproBLAS, which uses a University of California license (details). Sorry for the delay. I'd be interested to know how this works out for you :-)
@r-barnes thank you very much. I'll update you if and when I make any progress on this front.
Just to note that I've mostly completed a port to CUDA. I've essentially ported ahrens2020.cpp to CUDA (in ahrens2020_cuda.cu). There's still some optimization left to do, but it's performing reasonably (it gets up to 450 GB/s on a GPU that can reach around 650 GB/s with a naive summation algorithm). Up until now I've kept fairly close to the style of the original code, but I may diverge from it / extend it to include other summation algorithms, e.g., higher-order floating-point expansions, fp128, double-double, etc.
@maddyscientist - Very cool! Thanks for the link.
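For context on the double-double accumulation mentioned above, here is a minimal CUDA sketch of compensated summation built on Knuth's TwoSum, which splits each addition into a rounded result and its exact rounding error. This is illustrative only, not code from this repository or from ReproBLAS; the kernel name, launch parameters, and sample data are all assumptions, and it shows only the error-carrying accumulation step, not the binned, bitwise-reproducible scheme of ahrens2020.cpp.

```cuda
// Minimal sketch of compensated (double-double style) accumulation.
// Compile with plain nvcc (no -use_fast_math, which could break TwoSum).
#include <cstdio>
#include <cuda_runtime.h>

// Error-free transformation (Knuth's TwoSum): s + err == a + b exactly.
__host__ __device__ inline double2 two_sum(double a, double b) {
    double s   = a + b;
    double bp  = s - a;                       // portion of b absorbed into s
    double err = (a - (s - bp)) + (b - bp);   // exact rounding error of a + b
    return make_double2(s, err);
}

// Each block reduces its slice into a (hi, lo) partial sum, where lo
// accumulates the rounding errors that hi alone would lose.
// Assumes a power-of-two blockDim.x of at most 256.
__global__ void dd_partial_sums(const double *x, int n, double2 *out) {
    double hi = 0.0, lo = 0.0;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x) {
        double2 s = two_sum(hi, x[i]);
        hi = s.x;
        lo += s.y;                            // carry the rounding error
    }
    // Fold per-thread results with a shared-memory tree reduction.
    __shared__ double shi[256], slo[256];
    shi[threadIdx.x] = hi;
    slo[threadIdx.x] = lo;
    __syncthreads();
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (threadIdx.x < stride) {
            double2 s = two_sum(shi[threadIdx.x], shi[threadIdx.x + stride]);
            shi[threadIdx.x] = s.x;
            slo[threadIdx.x] += s.y + slo[threadIdx.x + stride];
        }
        __syncthreads();
    }
    if (threadIdx.x == 0) out[blockIdx.x] = make_double2(shi[0], slo[0]);
}

int main() {
    const int n = 1 << 20, threads = 256, blocks = 64;
    double *x;
    double2 *partial;
    cudaMallocManaged(&x, n * sizeof(double));
    cudaMallocManaged(&partial, blocks * sizeof(double2));
    for (int i = 0; i < n; ++i) x[i] = 1.0 / (i + 1);   // sample data
    dd_partial_sums<<<blocks, threads>>>(x, n, partial);
    cudaDeviceSynchronize();
    double hi = 0.0, lo = 0.0;                          // final fold on host
    for (int b = 0; b < blocks; ++b) {
        double2 s = two_sum(hi, partial[b].x);
        hi = s.x;
        lo += s.y + partial[b].y;
    }
    printf("sum = %.17g (correction %.3g)\n", hi + lo, lo);
    cudaFree(x);
    cudaFree(partial);
    return 0;
}
```

The key property is that TwoSum is exact, so the accumulated lo term recovers whatever rounding error the hi term discards; on its own, though, the result still depends on summation order, which is exactly the problem the binned ReproBLAS-style algorithms in this repository solve.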