Limiting Factors in a Dot Product Calculation | Richard Startin’s Blog
The dot product is a simple calculation that reduces two vectors to the sum of their element-wise products. It has a variety of applications and is used heavily in neural networks, linear regression, and search. What are the constraints on its computational performance? Because the calculation is computationally simple and streams through its inputs, the limiting factor in efficient code should be memory bandwidth. This is a good opportunity to look at the raw performance that will be made available by the vector API when it’s released.
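For readers who want something concrete, here is a minimal sketch of the calculation in plain Java. It is not the benchmark code from the post and does not use the vector API; the class name `DotProduct` and the method names `scalar` and `unrolled` are made up for illustration. The second method uses four independent accumulators, a common manual technique for breaking the dependency chain on a single sum.

```java
// Hypothetical illustration only: not the blog post's benchmark code.
final class DotProduct {

    private DotProduct() {}

    // Baseline: multiply element-wise and accumulate into a single sum.
    static float scalar(float[] left, float[] right) {
        checkLengths(left, right);
        float sum = 0f;
        for (int i = 0; i < left.length; i++) {
            sum += left[i] * right[i];
        }
        return sum;
    }

    // Same result in exact arithmetic, but with four independent accumulators
    // so that consecutive additions do not all depend on one running sum.
    static float unrolled(float[] left, float[] right) {
        checkLengths(left, right);
        float s0 = 0f, s1 = 0f, s2 = 0f, s3 = 0f;
        int bound = left.length & ~3; // largest multiple of 4 <= length
        int i = 0;
        for (; i < bound; i += 4) {
            s0 += left[i] * right[i];
            s1 += left[i + 1] * right[i + 1];
            s2 += left[i + 2] * right[i + 2];
            s3 += left[i + 3] * right[i + 3];
        }
        float sum = s0 + s1 + s2 + s3;
        // Handle the remaining 0 to 3 elements.
        for (; i < left.length; i++) {
            sum += left[i] * right[i];
        }
        return sum;
    }

    private static void checkLengths(float[] left, float[] right) {
        if (left.length != right.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
    }
}
```

Because floating-point addition is not associative, the two methods can differ in the last few bits on real data. Either way, each element is read once and used once, which is why memory bandwidth is expected to become the limit once the arithmetic is efficient.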
Hi Richard,
This is a really nice blog. In fact, all the posts are insightful. I have tried to run the same tests, but I am not getting as large a performance boost as you are. Please have a look at this Stack Overflow question.
Thank you!
Hi @nitirajrathore - I ran the benchmarks approximately 2 years ago. The vector API is still not released, and there are bugs from time to time. To be so much slower than the vector implementations, you might have hit a bug with vector box elimination, or maybe you ran into downclocking. The only way to be sure is to profile, so I can't give a definitive answer.
As a general point, I'm not the best person to ask about this (I just ran some benchmarks), and StackOverflow isn't the best place either. You can write to the people working on this project at the panama-dev mailing list.
https://richardstartin.github.io/posts/vector-api-dot-product