Device Agnostic Pipeline #140
What's the purpose of this PR?
Is it to have an example a user can try out on any HW and get decent performance?
We don't expect anyone to really use this for anything, right?
The GEMM isn't really device agnostic, is it? It's more that any GPU a user is likely to run on has the features assumed by the SM_70 pipeline.
If so, would it be sufficient to add only the example, without the new builder, and directly use the SM_70 builder as a baseline for most current GPUs?
examples/sycl/device_agnostic/device_agnostic_collective_builder.cpp
include/cutlass/gemm/collective/builders/device_agnostic_mma_builder.inl
The SM70 mainloop is device agnostic: it implements a tiled GEMM algorithm, with data blocked in shared memory and registers. With us passing the UniversalCopy and UniversalMMA, it becomes a truly device agnostic GEMM.
SM70 does not have a collective builder. Also, I believe the idea is that the API accepts something like a |
…e_agnostic_pipeline
…pipeline # Conflicts: # examples/CMakeLists.txt
46b102a to 044bcf7
A couple of issues with the examples, but otherwise looks good I think 👍
include/cutlass/epilogue/collective/builders/device_agnostic_builder.inl
examples/sycl/device_agnostic/device_agnostic_collective_builder.cpp
This pipeline serves as an entry point for others in the community to build their own backends and set up their infrastructure. We don’t expect performance from it, just functionality, and we are not planning to support more functionality than what is in this PR: just a basic GEMM with the default epilogue.
LGTM 👍
Adds CollectiveMMA and a CollectiveBuilder API for the device agnostic pipeline. This piggybacks off of the SM_70 2 stage gemm pipeline, with blocking in SMem and RMem, to get somewhat performant gemm on any device.
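
For readers unfamiliar with the blocking in SMem and RMem mentioned above, here is a minimal host-side sketch of the idea in plain C++. It is not the CUTLASS code from this PR: the function name `tiled_gemm` and the tile sizes are hypothetical, the local tile buffers only stand in for shared memory, the small accumulator array only stands in for registers, and the 2 stage overlap of loads and compute is not modelled.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical tile sizes, chosen only for illustration.
constexpr std::size_t kBlockM = 8, kBlockN = 8, kBlockK = 4;  // stand-in for the SMem tile
constexpr std::size_t kRegM = 2, kRegN = 2;                   // stand-in for the RMem tile
static_assert(kBlockM % kRegM == 0 && kBlockN % kRegN == 0,
              "register tile must evenly divide the block tile");

// Computes C (M x N) = A (M x K) * B (K x N); all matrices are row-major.
std::vector<float> tiled_gemm(const std::vector<float>& A,
                              const std::vector<float>& B,
                              std::size_t M, std::size_t N, std::size_t K) {
  std::vector<float> C(M * N, 0.0f);
  std::vector<float> tileA(kBlockM * kBlockK), tileB(kBlockK * kBlockN);

  for (std::size_t m0 = 0; m0 < M; m0 += kBlockM) {
    for (std::size_t n0 = 0; n0 < N; n0 += kBlockN) {
      for (std::size_t k0 = 0; k0 < K; k0 += kBlockK) {
        // First level of blocking: stage one tile of A and one of B into
        // local buffers, zero-padding anything outside the matrix bounds.
        for (std::size_t i = 0; i < kBlockM; ++i)
          for (std::size_t k = 0; k < kBlockK; ++k)
            tileA[i * kBlockK + k] =
                (m0 + i < M && k0 + k < K) ? A[(m0 + i) * K + (k0 + k)] : 0.0f;
        for (std::size_t k = 0; k < kBlockK; ++k)
          for (std::size_t j = 0; j < kBlockN; ++j)
            tileB[k * kBlockN + j] =
                (k0 + k < K && n0 + j < N) ? B[(k0 + k) * N + (n0 + j)] : 0.0f;

        // Second level of blocking: each micro-tile accumulates its partial
        // products in a small local array before touching C.
        for (std::size_t i0 = 0; i0 < kBlockM; i0 += kRegM) {
          for (std::size_t j0 = 0; j0 < kBlockN; j0 += kRegN) {
            float acc[kRegM][kRegN] = {};
            for (std::size_t k = 0; k < kBlockK; ++k)
              for (std::size_t i = 0; i < kRegM; ++i)
                for (std::size_t j = 0; j < kRegN; ++j)
                  acc[i][j] += tileA[(i0 + i) * kBlockK + k] *
                               tileB[k * kBlockN + (j0 + j)];
            // Write the partial sums back, skipping padded rows and columns.
            for (std::size_t i = 0; i < kRegM; ++i)
              for (std::size_t j = 0; j < kRegN; ++j) {
                const std::size_t r = m0 + i0 + i, c = n0 + j0 + j;
                if (r < M && c < N) C[r * N + c] += acc[i][j];
              }
          }
        }
      }
    }
  }
  return C;
}
```

Calling `tiled_gemm(A, B, M, N, K)` with row-major inputs returns the product for any sizes, including ones that are not multiples of the tile sizes. Nothing in the loop structure depends on hardware-specific copy or MMA instructions, which is the property the discussion above attributes to the SM70 mainloop when it is given generic copy and multiply-add operations.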