Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BLAS] Upstreaming portBLAS as GENERIC SYCL BLAS implementation #1

Closed
wants to merge 500 commits into from

Conversation

s-Nick
Copy link
Contributor

@s-Nick s-Nick commented Dec 13, 2024

This PR upstreams the long work done with portBLAS as generic SYCL implementation of BLAS routines.

Everything is added under the onemath/sycl/blas directory and meant to be used as header-only library.
An example of the current usage is the oneMath project where portBLAS is the generic backend for BLAS.

Ben Tracy and others added 30 commits December 10, 2019 12:11
Integer constants are deduced to int which can conflict with index_t
and fail the template type deduction.
* Call _scal when alpha is zero in GEMM

When alpha is zero in GEMM, the operation is just `C = beta * C`, so we
can just call `_scal` to scale the output matrix without computing the
matrix multiply.

* Add tests for alpha = 0

* Set values to zero when scaling by zero
* Add FindClara.cmake
* Move external headers around for a FHS-friendly layout
* Clean up auto tuner entry point

* Reduce device copies required in tuner

* Change auto tuner to use x-macro configs

* Handle errors better in auto tuner

* Add error to header when macro undefined

* Rename generated file
* Add assert to gemm_local.hpp for non square tiles

* Fix incorrect assert string in gemm_no_local.hpp
* Generate kernels in separate TUs

* Set include dirs for targets rather than globally

* Remove explicit link to libm
* Reorder cmake commands to prevent overwriting options

The SYCL-BLAS cmake scripts are very sensitive to the order in which
options are declared and used. Recently this causes link failures when
cmake is run with default options, as the `GEMM_TALL_SKINNY_SUPPORT`
option is used before it has been declared to have a default of `ON`.
This means that the first time the library object targets are generated
they do not include any tall skinny GEMM kernels, however when the host
launcher object is compiled the option has been defaulted on, so the
host expects these kernels to be available.
The OpenBLAS submodule is not used in this repo, so we can remove it.
The cblas header was moved in 742a6d7 but the path used in CMake was not
updated, causing build problems on platforms which do not have a cblas
header in the system include directories.

Co-authored-by: Ben Tracy <[email protected]>
By using a constexpr variable with the correct value, we can ensure that
the template deduction won't conflict.
* Update OpenBLAS section of readme

* Reflow text
Rather than only building one or the other, provide targets to build
both types of ARM Compute Library benchmarks.

Co-authored-by: Mehdi Goli <[email protected]>
…types (#230)

Adds the first vectorized version of the no_local GEMM kernel as a seperate file, also adds an enum to GEMM configurations to allow selecting the type of vectorization to use. This allows us more options for picking the best kernel configuration for different backends.

This partial vectorization of the no_local kernel is only vectorizable when A is not transposed and B is transposed.
A user might want to provide a buffer of `const` values as an input of
GEMV, however the current kernel will infer that the value to use for all
inermediates should then be `const`. By removing the `const` from the type
we can use `const` inputs and have mutable intermediates.
* Add support for IMGDNN and PowerVR GX6250
It is a common pattern to call routines like GEMM and GEMV with
uninitialized data in the output buffer when specifying beta to be zero.
This can cause problems if the uninitialized data happens to contain bit
patterns that correspond to NaN values, which can then get incorrectly
propagated to the output, as `0 * NaN == NaN` and not `0` as required.

To avoid this, provide an alternative final kernel that avoids loading
and multiplying the output by zero.
When `alpha == 0`, GEMM can just scale the matrix.
However, this currently doesn't take the `batch_size` into account,
leading to wrong results.

This PR fixes this issue and adds tests for the new cases.
It also updates existing tests by changing `alpha` and `beta`
to something other than `1` -
the issue this PR is fixing doesn't present itself when `beta == 1`.
Fixes compilation error for CLBLAST benchmark.
* Allow generic types

Right now only `DOUBLE_SUPPORT` is possible,
and there is a lot of manual intervention in benchmarks and tests.

This patch generalizes support for types
by using `BLAS_DATA_TYPES` variable.
For example, instead of specifying `-DDOUBLE_SUPPORT=ON` to CMake,
the user would specify `-DBLAS_DATA_TYPES="float;double"`.
The old flag is still there for backwards compatibility.

Note however that the reference BLAS
doesn't support generic types yet.
That is a more difficult problem
because system BLAS requires either `float*` or `double*` to data.

1. `BLAS_DATA_TYPES` CMake variable
    * Passed in as a CMake list
    * `float` is a mandatory type
2. Introduced mapping of CMake string types to C++ types
    * Allows the user to pass in a simpler representation
    * Done by `to_cpp_type`
    * No special types supported at the moment,
      but it allows e.g. cl::sycl::half
3. Replaced `DOUBLE_SUPPORT` C++ macro
  with `BLAS_DATA_TYPE_DOUBLE`
4. Generic way to register benchmarks
    * `BLAS_REGISTER_BENCHMARK`
5. Generic way to register tests
    * `BLAS_REGISTER_TEST`
    * Requires `combination_t` to be templated,
      so some types had to be made into templates
6. `TypeDispatcher` replaced with `blas_system_function`
    * More generic, separates function retrieval from the call
    * Not enough to actually allow different types,
      this is only the foundation
7. Conditionally instantiate policy handler
8. Minor fixes
    * System `dot` can take `y` as a constant
    * Renamed `a` and `b` in `gemv` and `ger` to `x` and `y`
    * Some clang-format

* rotg fix

* Assume register_benchmark exists

* Use MAKE_C_IDENTIFIER

* Update benchmark/common_utils.hpp

Typo

Co-authored-by: Duncan McBain <[email protected]>

* Replaced to_cpp_type with sanitize_file_name

Co-authored-by: Duncan McBain <[email protected]>
Co-authored-by: Mehdi Goli <[email protected]>
Because the generators can be called in parallel
there is a race condition during the build
when os.makedirs tries to create a folder that already exists.
It's safer to try to perform the operation
and ignore the exception if folder already exists.

Co-authored-by: Mehdi Goli <[email protected]>
GitHub Actions and others added 15 commits May 31, 2024 14:14
* Update namespace to sycl2020

Update included headers to sycl2020 standard

* Update device default selector to SYCL2020

* Remove SYCL_VERSION check

Removing all pragma that checks that sycl version used by compiler is
appropriate

* Update access::mode to access_mode
* Update ::target to new namespace and target
* Remove github workflow

Current workflow crash when bulding the docker due to outdated
configuration. Removing workflow entirely since it is not necessary for
now.

* Remove dependabot and .github directory
* Add runtime support check  for discrete Intel GPUs on txsv operators

* Enabling txsv operators for iGPU with DEFAULT TUNING TARGET

Since the support for txsv is checked before calling the implementation,
exception for Arc and GPU Max is raised before so it is possible to have
a default configuration working on iGPUs.

---------

Signed-off-by: nscipione <[email protected]>
Enable OpenSSF Scorecard workflow, add badge to README file
Remove deprecated "Build and Test" badge from README file
Bumps the github-actions group with 4 updates: [actions/checkout](https://github.com/actions/checkout), [ossf/scorecard-action](https://github.com/ossf/scorecard-action), [actions/upload-artifact](https://github.com/actions/upload-artifact) and [github/codeql-action](https://github.com/github/codeql-action).


Updates `actions/checkout` from 4.1.6 to 4.1.7
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@a5ac7e5...692973e)

Updates `ossf/scorecard-action` from 2.3.3 to 2.4.0
- [Release notes](https://github.com/ossf/scorecard-action/releases)
- [Changelog](https://github.com/ossf/scorecard-action/blob/main/RELEASE.md)
- [Commits](ossf/scorecard-action@dc50aa9...62b2cac)

Updates `actions/upload-artifact` from 4.3.3 to 4.3.4
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](actions/upload-artifact@6546280...0b2256b)

Updates `github/codeql-action` from 3.25.8 to 3.25.15
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](github/codeql-action@2e230e8...afb54ba)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
- dependency-name: ossf/scorecard-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <[email protected]>
This patch substitutes the group_broadcast of two scalar indexes with the broadcast of a single vector value.
* Update rotmg implementation

* Add conditional check on mem_type in use

* Add dependencies to copy_y1 operation

* Add new interface to library and add related tests.
Bumps the github-actions group with 2 updates: [actions/upload-artifact](https://github.com/actions/upload-artifact) and [github/codeql-action](https://github.com/github/codeql-action).


Updates `actions/upload-artifact` from 4.3.4 to 4.4.0
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](actions/upload-artifact@0b2256b...5076954)

Updates `github/codeql-action` from 3.25.15 to 3.26.6
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](github/codeql-action@afb54ba...4dd1613)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <[email protected]>
Bumps the github-actions group with 2 updates: [actions/checkout](https://github.com/actions/checkout) and [github/codeql-action](https://github.com/github/codeql-action).


Updates `actions/checkout` from 4.1.7 to 4.2.0
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@692973e...d632683)

Updates `github/codeql-action` from 3.26.6 to 3.26.10
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](github/codeql-action@4dd1613...e2b3eaf)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <[email protected]>
@s-Nick s-Nick requested a review from Rbiessy December 13, 2024 10:21
@s-Nick
Copy link
Contributor Author

s-Nick commented Dec 13, 2024

Hi @rodburns,
I am adding the new generic SYCL BLAS implementation and a few files to handle copyright, conduct, licensing and all this sort of things. I am not an expert in those and I might have missed something while updating them.
Could you please review them? I'll be happy to apply all the changes needed.

.github/dependabot.yml Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
onemath/sycl/blas/CMakeLists.txt Show resolved Hide resolved
onemath/sycl/blas/CMakeLists.txt Outdated Show resolved Hide resolved
onemath/sycl/blas/CMakeLists.txt Outdated Show resolved Hide resolved
onemath/sycl/blas/README.md Outdated Show resolved Hide resolved
@Rbiessy Rbiessy requested a review from rodburns December 13, 2024 14:00
Copy link
Contributor

@rodburns rodburns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who are the maintainers?
There should be a MAINTAINERS file to show who is responsible for the code.

Get Started
It's not clear immediately to me from either this repo, or the oneMath repo how to use this as a backend.
Can we have something that links to the oneMath repo section for backends or the documentation directly? https://uxlfoundation.github.io/oneMath/building_the_project_with_dpcpp.html

Documentation for this repo is not linked from the README, can we make it clearer that documentation exists and what it is? https://github.com/uxlfoundation/generic-sycl-components/tree/nscipione/pb_upstream/onemath/sycl/blas/doc

Other changes outside this repo:
oneMath README needs to be updated to link to the BLAS repo instead of portBLAS
The docs e.g. https://uxlfoundation.github.io/oneMath/building_the_project_with_dpcpp.html need to be updated from portBLAS.

onemath/sycl/blas/README.md Outdated Show resolved Hide resolved
@Rbiessy
Copy link
Contributor

Rbiessy commented Dec 16, 2024

Other changes outside this repo:
oneMath README needs to be updated to link to the BLAS repo instead of portBLAS
The docs e.g. https://uxlfoundation.github.io/oneMath/building_the_project_with_dpcpp.html need to be updated from portBLAS.

We have a circular dependency as we need to merge this PR and make this repository public before we can suggest to link to it in oneMath. We already have a note to update the existing oneMath PR: uxlfoundation/oneMath#618 (comment)

@s-Nick s-Nick force-pushed the nscipione/pb_upstream branch from e88a9a4 to 4362f05 Compare December 16, 2024 15:36
@s-Nick
Copy link
Contributor Author

s-Nick commented Dec 16, 2024

Thank you for your comment @rodburns.
I added a MAINTAINERS.MD file in cf64f2e
For what concern the blas README both changes requested are in db42352. Currently, links might be broken due to the PR open on the other repository (as mentioned by @Rbiessy above)

onemath/sycl/blas/README.md Outdated Show resolved Hide resolved
onemath/sycl/blas/README.md Outdated Show resolved Hide resolved
Copy link
Contributor

@rodburns rodburns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved with the minor changes.

README.md Outdated Show resolved Hide resolved
Cleaning repository to all portBLAS references.
Update documentation.
Remove tests, samples and benchmarks.
Allow only header-only compilation.

Signed-off-by: nscipione <[email protected]>
@s-Nick s-Nick closed this Dec 17, 2024
@s-Nick s-Nick force-pushed the nscipione/pb_upstream branch from 369ead3 to c52d35e Compare December 17, 2024 14:16
@Rbiessy
Copy link
Contributor

Rbiessy commented Dec 17, 2024

The PR was closed due to a technical GitHub difficulty with the commit history. We were able to push to the main branch instead and preserve the commit history.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.