Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[test] Limit the number of launched threads in shift_left_right.pass with the OMP backend to avoid timeouts #1928

Merged
merged 2 commits into from
Nov 1, 2024

Conversation

mmichel11
Copy link
Contributor

We have recently seen timeouts in CI in shift_left_right.pass when running on CPUs with large core counts using the OMP backend. This has occurred mostly with debug builds but also in a small number of release builds. Due to the small problem sizes in our tests, the grain size per thread is too small when a large number of threads are launched resulting in significant performance loss due to excessive parallelism and cross-socket traffic.

To resolve this issue, we cap the number of threads to 32 in shift_left_right.pass which enables the tests to pass in a reasonable time while also launching enough threads to assess the correctness of the parallel implementation.

Signed-off-by: Matthew Michel <[email protected]>
Copy link
Contributor

@danhoeflinger danhoeflinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love that this is necessary. In other words, The better fix is probably to improve our implementation to better chunk the data with a larger chunk size so that we can handle an appropriate number of threads to match the hardware. Or, if limiting the threads is truly the best option, then our implementation should probably control that internally.

If we see this issue with performance, then users would as well.

In the short term, I'm OK with making this change to stop the timeouts, if we also add a github issue to explore this further with a fix to the implementation, rather than the test.

Functionally, the code looks fine to me.

@mmichel11
Copy link
Contributor Author

I don't love that this is necessary. In other words, The better fix is probably to improve our implementation to better chunk the data with a larger chunk size so that we can handle an appropriate number of threads to match the hardware. Or, if limiting the threads is truly the best option, then our implementation should probably control that internally.

If we see this issue with performance, then users would as well.

In the short term, I'm OK with making this change to stop the timeouts, if we also add a github issue to explore this further with a fix to the implementation, rather than the test.

Functionally, the code looks fine to me.

I agree this is our issue to fix and opened which #1929 uses shift_left_right.pass as an example. I strongly suspect this performance issue extends beyond this algorithm. I think we need some benchmarking to measure the impact, so I opened this as a general problem with the OpenMP backend to understand any broader impact and address these issues.

Copy link
Contributor

@danhoeflinger danhoeflinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK with merging this for now given the open issue.

@mmichel11 mmichel11 merged commit bdc037b into main Nov 1, 2024
22 checks passed
@mmichel11 mmichel11 deleted the dev/mmichel11/fix_shift_left_right_timeout branch November 1, 2024 20:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants