Device Agnostic Pipeline #140

Merged

Collaborator

@AD2605 AD2605 commented Oct 2, 2024

Adds a CollectiveMMA and a CollectiveBuilder API for a device-agnostic pipeline.
This piggybacks off the SM_70 2-stage GEMM pipeline, which blocks data in shared memory (SMem) and registers (RMem), to get a somewhat performant GEMM on any device.

@AD2605 AD2605 marked this pull request as ready for review October 9, 2024 14:30
@AD2605 AD2605 changed the title from "[DRAFT] Device Agnostic Pipeline" to "Device Agnostic Pipeline" Oct 10, 2024
Collaborator

@rolandschulz rolandschulz left a comment

What's the purpose of this PR?
Is it to have an example a user can try out on any HW and get decent performance?
We don't expect anyone to really use this for anything, right?
The GEMM isn't really device agnostic, is it? It's more that any GPU a user would likely run on has the features assumed by the SM_70 pipeline.
If so, would it be sufficient to add only the example, without the new builder, and instead use the SM_70 builder directly as a baseline for most current GPUs?

@AD2605
Collaborator Author

AD2605 commented Oct 17, 2024

The GEMM isn't really device agnostic, is it? It's more that any GPU a user would likely run on has the features assumed by the SM_70 pipeline.

The SM70 mainloop is device agnostic: it implements a tiled GEMM algorithm, with data blocked in shared memory and registers. If we pass it UniversalCopy and UniversalMMA, it becomes a truly device-agnostic GEMM.
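To make the "tiled GEMM with blocking in shared memory and registers" idea concrete, here is a minimal host-side C++ sketch. It is not the code from this PR and does not use CuTe or SYCL: the tile sizes, the function name `tiled_gemm`, and the plain arrays standing in for shared memory and registers are all assumptions made for illustration, and the 2-stage software pipelining of the real mainloop is not modelled.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile sizes, chosen only for illustration.
constexpr std::size_t kBlockM = 8, kBlockN = 8, kBlockK = 4;  // "shared memory" tile
constexpr std::size_t kRegM = 2, kRegN = 2;                   // "register" tile

// Computes C += A * B for row-major A (MxK), B (KxN), C (MxN).
void tiled_gemm(const std::vector<float>& A, const std::vector<float>& B,
                std::vector<float>& C, std::size_t M, std::size_t N, std::size_t K) {
  assert(A.size() == M * K && B.size() == K * N && C.size() == M * N);
  for (std::size_t m0 = 0; m0 < M; m0 += kBlockM) {
    for (std::size_t n0 = 0; n0 < N; n0 += kBlockN) {
      for (std::size_t k0 = 0; k0 < K; k0 += kBlockK) {
        // Clamp the tile extents at the matrix edges.
        const std::size_t bm = std::min(kBlockM, M - m0);
        const std::size_t bn = std::min(kBlockN, N - n0);
        const std::size_t bk = std::min(kBlockK, K - k0);

        // Step 1: copy one tile of A and one tile of B into local buffers.
        // On a GPU these buffers would live in shared memory.
        std::array<float, kBlockM * kBlockK> sA{};
        std::array<float, kBlockK * kBlockN> sB{};
        for (std::size_t i = 0; i < bm; ++i)
          for (std::size_t k = 0; k < bk; ++k)
            sA[i * kBlockK + k] = A[(m0 + i) * K + (k0 + k)];
        for (std::size_t k = 0; k < bk; ++k)
          for (std::size_t j = 0; j < bn; ++j)
            sB[k * kBlockN + j] = B[(k0 + k) * N + (n0 + j)];

        // Step 2: split the block tile into small sub-tiles whose
        // accumulators would sit in registers on a GPU.
        for (std::size_t i0 = 0; i0 < bm; i0 += kRegM) {
          for (std::size_t j0 = 0; j0 < bn; j0 += kRegN) {
            float acc[kRegM][kRegN] = {};
            for (std::size_t k = 0; k < bk; ++k)
              for (std::size_t i = 0; i < kRegM && i0 + i < bm; ++i)
                for (std::size_t j = 0; j < kRegN && j0 + j < bn; ++j)
                  acc[i][j] += sA[(i0 + i) * kBlockK + k] * sB[k * kBlockN + (j0 + j)];

            // Step 3: add the partial sums for this K-slice into C.
            for (std::size_t i = 0; i < kRegM && i0 + i < bm; ++i)
              for (std::size_t j = 0; j < kRegN && j0 + j < bn; ++j)
                C[(m0 + i0 + i) * N + (n0 + j0 + j)] += acc[i][j];
          }
        }
      }
    }
  }
}
```

Nothing in this loop nest depends on a particular architecture, which is the sense in which the algorithm is device agnostic. The hardware-specific part is only how the copies and multiply-adds are issued.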

If so, would it be sufficient to add only the example, without the new builder, and instead use the SM_70 builder directly as a baseline for most current GPUs?

SM70 does not have a collective builder. Also, I believe the idea is that the API accepts something like a DeviceAgnostic arch, rather than relying on the user to understand that the SM70 pipeline could be turned into a device-agnostic one.
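As a rough illustration of that API idea, the sketch below shows how a builder could be selected by an architecture tag through template specialization. Every name in it (`sketch::DeviceAgnosticArch`, `sketch::TwoStageMainloop`, `sketch::CollectiveBuilder`) is hypothetical and chosen for this example. It is not the builder added in this PR, whose template parameters are not shown in this thread.

```cpp
#include <type_traits>

namespace sketch {  // every name in this namespace is hypothetical

// Tag the user passes to ask for the portable pipeline.
struct DeviceAgnosticArch {};

// Stand-in for the 2-stage mainloop that blocks in shared memory and registers.
struct TwoStageMainloop {
  static constexpr int stages = 2;
};

// Primary template is left undefined: an arch tag without a
// specialization fails at compile time instead of silently picking a default.
template <class ArchTag, class Element>
struct CollectiveBuilder;

// The user only names the tag. The mapping onto the SM70-style
// mainloop is an implementation detail hidden inside the builder.
template <class Element>
struct CollectiveBuilder<DeviceAgnosticArch, Element> {
  using ElementType = Element;
  using Mainloop = TwoStageMainloop;
};

}  // namespace sketch

// Usage: the caller never mentions SM70.
using Builder = sketch::CollectiveBuilder<sketch::DeviceAgnosticArch, float>;
static_assert(std::is_same_v<Builder::Mainloop, sketch::TwoStageMainloop>,
              "the tag selects the two-stage mainloop");
```

The design point is that the choice of mainloop is encoded in the builder rather than in user code, so a user does not need to know which existing pipeline happens to be portable.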

@aacostadiaz aacostadiaz force-pushed the atharva/device_agnostic_pipeline branch from 46b102a to 044bcf7 Compare December 10, 2024 18:06
Collaborator

@joeatodd joeatodd left a comment

A couple of issues with the examples, but otherwise looks good I think 👍

@aacostadiaz
Collaborator

What's the purpose of this PR? Is it to have an example a user can try out on any HW and get decent performance? We don't expect anyone to really use this for anything, right? The GEMM isn't really device agnostic, is it? It's more that any GPU a user would likely run on has the features assumed by the SM_70 pipeline. If so, would it be sufficient to add only the example, without the new builder, and instead use the SM_70 builder directly as a baseline for most current GPUs?

This pipeline serves as an entry point for others in the community to build their own backends and set up their infrastructure. We don’t expect performance from it, just functionality, and we are not planning to support more functionality than what is in this PR: just a basic GEMM with the default epilogue.

Collaborator

@joeatodd joeatodd left a comment

LGTM 👍

@mehdi-goli mehdi-goli merged commit f14d683 into codeplaysoftware:sycl-develop Dec 11, 2024
5 checks passed
5 participants