Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct private memory allocations in the SCLA path #150

Merged
merged 11 commits into from
Aug 16, 2024

Conversation

AD2605
Copy link
Contributor

@AD2605 AD2605 commented Aug 14, 2024

Corrects the allocation size of the private arrays in the SCLA path, which was otherwise leading to spills and hence poor performance even when using SCLAs.

Also brings in a minor fix in the benchmarks. Turns out we were only reporting the last value instead of the average.

tag @victor-eds

tag @Rbiessy @hjabird

Already Merges: #149

Checklist

Tick if relevant:

  • [] New files have a copyright
  • [] New headers have an include guards
  • API is documented with Doxygen
  • New functionalities are tested
  • Tests pass locally
  • Files are clang-formatted

hjabird and others added 8 commits August 12, 2024 13:00
…gle benchmark version

* The Google Benchmark code does not work if CMAKE_CXX_FLAGS include -fsycl due to conflicting
  C++ version requirements.
* This removes the external CXX flags for the Google Benchmark library build.
* Also update Google Benchmark version tag.
@AD2605
Copy link
Contributor Author

AD2605 commented Aug 14, 2024

@victor-eds , if you could post the log of passing tests, and speed-up observed using the SCLA path 🙏🏻

@AD2605 AD2605 requested review from Rbiessy and hjabird August 14, 2024 16:54
src/portfft/common/workgroup.hpp Outdated Show resolved Hide resolved
src/portfft/common/workgroup.hpp Show resolved Hide resolved
test/bench/portfft/launch_bench.hpp Outdated Show resolved Hide resolved
test/bench/portfft/launch_bench.hpp Outdated Show resolved Hide resolved
src/portfft/dispatcher/workgroup_dispatcher.hpp Outdated Show resolved Hide resolved
src/portfft/dispatcher/global_dispatcher.hpp Outdated Show resolved Hide resolved
src/portfft/dispatcher/global_dispatcher.hpp Show resolved Hide resolved
@victor-eds
Copy link

I ran the tests both on PVC and an iGPU earlier this week and they worked. Performance wise, PVC:

image

We see one of the benchmarks benefits with a 1.25x speedup thanks to this patch.

Copy link
Collaborator

@Rbiessy Rbiessy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch, this looks good to me overall. We're not sure we will keep the changes related to GBench. I've triggered the CI pipeline so they are useful for now.

test/bench/portfft/launch_bench.hpp Outdated Show resolved Hide resolved
@AD2605
Copy link
Contributor Author

AD2605 commented Aug 16, 2024

We're not sure we will keep the changes related to GBench

Oh you mean #149 ?

@Rbiessy
Copy link
Collaborator

Rbiessy commented Aug 16, 2024

Oh you mean #149 ?

Yes you may want to revert this change if you want to merge this PR soon.

hjabird
hjabird previously approved these changes Aug 16, 2024
…date Google benchmark version"

This reverts commit 99184a2.
@AD2605
Copy link
Contributor Author

AD2605 commented Aug 16, 2024

Oh you mean #149 ?

Yes you may want to revert this change if you want to merge this PR soon.

Done

@Rbiessy Rbiessy merged commit f29d8e7 into main Aug 16, 2024
1 check passed
@Rbiessy Rbiessy deleted the atharva/correct_wi_temp_allocations branch August 16, 2024 14:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants