Correct private memory allocations in the SCLA path #150

AD2605 · 2024-08-14T16:50:36Z

Corrects the allocation size of the private arrays in the SCLA path. The incorrect sizes were causing register spills and hence poor performance, even when using SCLAs.

Also brings in a minor fix to the benchmarks: we were only reporting the last value instead of the average.
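To illustrate the benchmark fix, here is a minimal standalone sketch of the difference between reporting the last per-iteration value and reporting the average. It is not the portFFT benchmark code: the helper names `report_last` and `report_average` are hypothetical, and the sketch assumes the per-iteration measurements are collected in a non-empty vector.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: returns only the final measurement, which is
// effectively what gets reported when a counter is overwritten on
// every iteration.
double report_last(const std::vector<double>& per_iteration) {
  assert(!per_iteration.empty());
  return per_iteration.back();
}

// Hypothetical helper: returns the arithmetic mean over all iterations.
double report_average(const std::vector<double>& per_iteration) {
  assert(!per_iteration.empty());
  const double sum =
      std::accumulate(per_iteration.begin(), per_iteration.end(), 0.0);
  return sum / static_cast<double>(per_iteration.size());
}
```

With made-up per-iteration values `{100.0, 110.0, 90.0, 140.0}`, `report_last` returns 140.0 while `report_average` returns 110.0, so a single fast or slow final iteration can noticeably skew the reported figure.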

tag @victor-eds

tag @Rbiessy @hjabird

Already Merges: #149

Checklist

Tick if relevant:

[] New files have a copyright
[] New headers have include guards
API is documented with Doxygen
New functionalities are tested
Tests pass locally
Files are clang-formatted

…gle benchmark version
* The Google Benchmark code does not work if CMAKE_CXX_FLAGS include -fsycl due to conflicting C++ version requirements.
* This removes the external CXX flags for the Google Benchmark library build.
* Also update Google Benchmark version tag.


AD2605 · 2024-08-14T16:50:58Z

@victor-eds, could you post the log of passing tests and the speed-up observed using the SCLA path? 🙏🏻

src/portfft/common/workgroup.hpp

test/bench/portfft/launch_bench.hpp

src/portfft/dispatcher/workgroup_dispatcher.hpp

src/portfft/dispatcher/global_dispatcher.hpp

victor-eds · 2024-08-16T08:48:45Z

I ran the tests on both PVC and an iGPU earlier this week and they worked. Performance-wise, on PVC:

One of the benchmarks sees a 1.25x speedup thanks to this patch.

Rbiessy

Thanks for the patch, this looks good to me overall. We're not sure we will keep the changes related to GBench. I've triggered the CI pipeline so they are useful for now.

test/bench/portfft/launch_bench.hpp

src/portfft/dispatcher/workgroup_dispatcher.hpp

AD2605 · 2024-08-16T13:28:56Z

> We're not sure we will keep the changes related to GBench

Oh you mean #149 ?

Rbiessy · 2024-08-16T13:34:10Z

> Oh you mean #149 ?

Yes, you may want to revert this change if you want to merge this PR soon.


AD2605 · 2024-08-16T13:41:21Z

> > Oh you mean #149 ?
>
> Yes you may want to revert this change if you want to merge this PR soon.

Done

hjabird and others added 8 commits August 12, 2024 13:00

correct wi_temps allocation

971b0b1

further reduce scla allocations in subgroup and workgroup impl

8b5dbaf

set missed spec constants in global implementation

cdfb617

Merge remote-tracking branch 'origin/hjab/update_benchmark_cmake' int…

68b6326

…o atharva/correct_wi_temp_allocations

report average instead of last counter value in benchmarks

a97c234

update doxygens in workgroup.hpp

8ae0ac3

remove commented code

493a4ca

AD2605 requested review from Rbiessy and hjabird August 14, 2024 16:54

hjabird reviewed Aug 14, 2024


review comments 1

9056e3f

Rbiessy reviewed Aug 16, 2024


test/bench/portfft/launch_bench.hpp (outdated)

src/portfft/dispatcher/workgroup_dispatcher.hpp

use benchmark::Counter to get average flop and throughput

f784ded

hjabird previously approved these changes Aug 16, 2024


Revert "[CMake] Enable CXXFLAGS to include -fsycl with benchmarks; Up…

a12747e

…date Google benchmark version" This reverts commit 99184a2.

AD2605 dismissed hjabird’s stale review via a12747e August 16, 2024 13:40

hjabird approved these changes Aug 16, 2024


Rbiessy approved these changes Aug 16, 2024


Rbiessy merged commit f29d8e7 into main Aug 16, 2024
1 check passed

Rbiessy deleted the atharva/correct_wi_temp_allocations branch August 16, 2024 14:59
