fix: revert acl_winograd_convolution to stateful #2357

Merged: 1 commit, Jan 9, 2025


Sqvid
Contributor

@Sqvid Sqvid commented Jan 8, 2025

Description

Partially reverts commit 16d6dd4: "cpu: aarch64: Enable stateless ACL depthwise convolution"

Reverts commit 03db3e4: "cpu: aarch64: Call stateless ACL API from winograd convolution"

Reverts commit 513f882: "cpu: aarch64: hot fix for aux tensor management of stateless gemm-conv and winograd conv without lock."

Fixes #2324
Fixes #2303

Checklist

General

  • Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?
  • Have you formatted the code using clang-format?

Performance improvements

On a Neoverse V1 machine:

Current:

$ ./build-upstream/tests/benchdnn/benchdnn --max-ms-per-prb=3e3 --mode=P --conv --reset --allow-enum-tags-only=0 --engine=cpu --dir=FWD_I --alg=WINO --dt=f32:f32:f32 --stag=acdb --wtag=any --dtag=acdb --attr-scratchpad=user mb1_ic512oc512_ih7oh7kh3sh1dh0ph1_iw6ow6kw3sw1dw0pw1
Output template: perf,%engine%,%impl%,%name%,%prb%,%Gops%,%+ctime%,%-time%,%-Gflops%,%0time%,%0Gflops%
perf,cpu,wino:acl,,--mode=P --conv --allow-enum-tags-only=false --dir=FWD_I --stag=acdb --dtag=acdb --alg=wino --attr-scratchpad=user mb1ic512ih7iw6oc512oh7ow6kh3kw3ph1pw1,0.159384,1.25562,140.688,1.13288,147.603,1.07981
tests:1 passed:1 skipped:0 mistrusted:0 unimplemented:0 invalid_arguments:0 failed:0 listed:0
total perf: min(ms):140.688 avg(ms):147.603
total: 3.32s; fill: 0.01s (0%);

After the revert:

$ ./build/tests/benchdnn/benchdnn --max-ms-per-prb=3e3 --mode=P --conv --reset --allow-enum-tags-only=0 --engine=cpu --dir=FWD_I --alg=WINO --dt=f32:f32:f32 --stag=acdb --wtag=any --dtag=acdb --attr-scratchpad=user mb1_ic512oc512_ih7oh7kh3sh1dh0ph1_iw6ow6kw3sw1dw0pw1
Output template: perf,%engine%,%impl%,%name%,%prb%,%Gops%,%+ctime%,%-time%,%-Gflops%,%0time%,%0Gflops%
perf,cpu,wino:acl,,--mode=P --conv --allow-enum-tags-only=false --dir=FWD_I --stag=acdb --dtag=acdb --alg=wino --attr-scratchpad=user mb1ic512ih7iw6oc512oh7ow6kh3kw3ph1pw1,0.159384,1.66626,2.6106,61.0526,2.75155,57.925
tests:1 passed:1 skipped:0 mistrusted:0 unimplemented:0 invalid_arguments:0 failed:0 listed:0
total perf: min(ms):2.6106 avg(ms):2.75155
total: 3.26s; fill: 0.01s (0%);
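
For anyone who wants to check the numbers, here is a minimal Python sketch (not part of this PR) that splits the two `perf` lines above according to the printed output template and compares the timings. It assumes, based on the matching `total perf` lines, that `%-time%` is the minimum and `%0time%` the average time in milliseconds. The helper name `parse_perf_line` is hypothetical.

```python
# Field order is taken from the "Output template" line printed by benchdnn above.
TEMPLATE = ("perf,%engine%,%impl%,%name%,%prb%,%Gops%,%+ctime%,"
            "%-time%,%-Gflops%,%0time%,%0Gflops%")

# The two perf lines reported in this PR description.
BEFORE = (
    "perf,cpu,wino:acl,,--mode=P --conv --allow-enum-tags-only=false "
    "--dir=FWD_I --stag=acdb --dtag=acdb --alg=wino --attr-scratchpad=user "
    "mb1ic512ih7iw6oc512oh7ow6kh3kw3ph1pw1,"
    "0.159384,1.25562,140.688,1.13288,147.603,1.07981"
)
AFTER = (
    "perf,cpu,wino:acl,,--mode=P --conv --allow-enum-tags-only=false "
    "--dir=FWD_I --stag=acdb --dtag=acdb --alg=wino --attr-scratchpad=user "
    "mb1ic512ih7iw6oc512oh7ow6kh3kw3ph1pw1,"
    "0.159384,1.66626,2.6106,61.0526,2.75155,57.925"
)


def parse_perf_line(line, template=TEMPLATE):
    """Map one benchdnn perf line onto the field names of its template.

    Hypothetical helper: assumes no field contains a comma, which holds
    for the two lines above.
    """
    keys = [k.strip("%") for k in template.split(",")]
    values = line.strip().split(",")
    if len(values) != len(keys):
        raise ValueError(f"expected {len(keys)} fields, got {len(values)}")
    # Drop the leading literal "perf" marker.
    return dict(zip(keys[1:], values[1:]))


before = parse_perf_line(BEFORE)
after = parse_perf_line(AFTER)

# Ratio of stateless (current) time to stateful (after the revert) time.
speedup_min = float(before["-time"]) / float(after["-time"])
speedup_avg = float(before["0time"]) / float(after["0time"])

print(f"impl: {before['impl']}")
print(f"min time: {before['-time']} ms -> {after['-time']} ms ({speedup_min:.1f}x)")
print(f"avg time: {before['0time']} ms -> {after['0time']} ms ({speedup_avg:.1f}x)")
```

With the values above, both the min and avg ratios come out at roughly 54x in favour of the stateful implementation for this one shape.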

Signed-off-by: Siddhartha Menon <[email protected]>
@github-actions github-actions bot added the platform:cpu-aarch64 Codeowner: @oneapi-src/onednn-cpu-aarch64 label Jan 8, 2025
@theComputeKid
Member

@alvoron the code freeze for oneDNN is at the end of the week, so we wanted to get a stable version in before then; this is the best we can do for now. We will try to test it a bit more and then get it in by Friday. Sorry for the slowdown.

@theComputeKid theComputeKid added this to the v3.7 milestone Jan 8, 2025
@Sqvid Sqvid marked this pull request as ready for review January 9, 2025 11:14
@Sqvid Sqvid requested review from a team as code owners January 9, 2025 11:14
@theComputeKid theComputeKid self-requested a review January 9, 2025 11:15
@Radu2k
Contributor

Radu2k commented Jan 9, 2025

@dzarukin @mgouicem Could someone please review? Ideally, we would like to see this in v3.7.

@dzarukin
Contributor

dzarukin commented Jan 9, 2025

In most cases it's best to create a dedicated revert commit for each commit being reverted. That makes it easier to track changes and to re-apply patches in the future. Mixing everything into one commit doesn't help with that flow.
Feel free to skip it this time, but please follow this practice in the future. Thanks.

@Radu2k
Contributor

Radu2k commented Jan 9, 2025

> In most cases it's most beneficial to create a dedicated revert commit to the one that it reverts. It's easier to track changes this way and re-apply patches in the future. Mixing all in one doesn't help with such flow. Feel free to skip it this time but please refer to this practice in the future. Thanks.

Thanks @mgouicem @dzarukin, noted.

I assume that three separate revert commits would have been ideal: one each for 03db3e4 and 513f882, and a partial one for 16d6dd4.

@theComputeKid theComputeKid merged commit 73c2053 into oneapi-src:main Jan 9, 2025
18 of 20 checks passed