-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BLAS] Upstreaming portBLAS as GENERIC SYCL BLAS implementation #1
Conversation
Integer constants are deduced to int which can conflict with index_t and fail the template type deduction.
* Call _scal when alpha is zero in GEMM When alpha is zero in GEMM, the operation is just `C = beta * C`, so we can just call `_scal` to scale the output matrix without computing the matrix multiply. * Add tests for alpha = 0 * Set values to zero when scaling by zero
* Add FindClara.cmake * Move external headers around for a FHS-friendly layout
* Clean up auto tuner entry point * Reduce device copies required in tuner * Change auto tuner to use x-macro configs * Handle errors better in auto tuner * Add error to header when macro undefined * Rename generated file
* Add assert to gemm_local.hpp for non square tiles * Fix incorrect assert string in gemm_no_local.hpp
* Generate kernels in separate TUs * Set include dirs for targets rather than globally * Remove explicit link to libm
* Reorder cmake commands to prevent overwriting options The SYCL-BLAS cmake scripts are very sensitive to the order in which options are declared and used. Recently this causes link failures when cmake is run with default options, as the `GEMM_TALL_SKINNY_SUPPORT` option is used before it has been declared to have a default of `ON`. This means that the first time the library object targets are generated they do not include any tall skinny GEMM kernels, however when the host launcher object is compiled the option has been defaulted on, so the host expects these kernels to be available.
The OpenBLAS submodule is not used in this repo, so we can remove it.
The cblas header was moved in 742a6d7 but the path used in CMake was not updated, causing build problems on platforms which do not have a cblas header in the system include directories. Co-authored-by: Ben Tracy <[email protected]>
By using a constexpr variable with the correct value, we can ensure that the template deduction won't conflict.
Co-authored-by: Mehdi Goli <[email protected]>
* Update OpenBLAS section of readme * Reflow text
Rather than only building one or the other, provide targets to build both types of ARM Compute Library benchmarks. Co-authored-by: Mehdi Goli <[email protected]>
…types (#230) Adds the first vectorized version of the no_local GEMM kernel as a seperate file, also adds an enum to GEMM configurations to allow selecting the type of vectorization to use. This allows us more options for picking the best kernel configuration for different backends. This partial vectorization of the no_local kernel is only vectorizable when A is not transposed and B is transposed.
A user might want to provide a buffer of `const` values as an input of GEMV, however the current kernel will infer that the value to use for all inermediates should then be `const`. By removing the `const` from the type we can use `const` inputs and have mutable intermediates.
* Add support for IMGDNN and PowerVR GX6250
It is a common pattern to call routines like GEMM and GEMV with uninitialized data in the output buffer when specifying beta to be zero. This can cause problems if the uninitialized data happens to contain bit patterns that correspond to NaN values, which can then get incorrectly propagated to the output, as `0 * NaN == NaN` and not `0` as required. To avoid this, provide an alternative final kernel that avoids loading and multiplying the output by zero.
When `alpha == 0`, GEMM can just scale the matrix. However, this currently doesn't take the `batch_size` into account, leading to wrong results. This PR fixes this issue and adds tests for the new cases. It also updates existing tests by changing `alpha` and `beta` to something other than `1` - the issue this PR is fixing doesn't present itself when `beta == 1`.
Fixes compilation error for CLBLAST benchmark.
* Allow generic types Right now only `DOUBLE_SUPPORT` is possible, and there is a lot of manual intervention in benchmarks and tests. This patch generalizes support for types by using `BLAS_DATA_TYPES` variable. For example, instead of specifying `-DDOUBLE_SUPPORT=ON` to CMake, the user would specify `-DBLAS_DATA_TYPES="float;double"`. The old flag is still there for backwards compatibility. Note however that the reference BLAS doesn't support generic types yet. That is a more difficult problem because system BLAS requires either `float*` or `double*` to data. 1. `BLAS_DATA_TYPES` CMake variable * Passed in as a CMake list * `float` is a mandatory type 2. Introduced mapping of CMake string types to C++ types * Allows the user to pass in a simpler representation * Done by `to_cpp_type` * No special types supported at the moment, but it allows e.g. cl::sycl::half 3. Replaced `DOUBLE_SUPPORT` C++ macro with `BLAS_DATA_TYPE_DOUBLE` 4. Generic way to register benchmarks * `BLAS_REGISTER_BENCHMARK` 5. Generic way to register tests * `BLAS_REGISTER_TEST` * Requires `combination_t` to be templated, so some types had to be made into templates 6. `TypeDispatcher` replaced with `blas_system_function` * More generic, separates function retrieval from the call * Not enough to actually allow different types, this is only the foundation 7. Conditionally instantiate policy handler 8. Minor fixes * System `dot` can take `y` as a constant * Renamed `a` and `b` in `gemv` and `ger` to `x` and `y` * Some clang-format * rotg fix * Assume register_benchmark exists * Use MAKE_C_IDENTIFIER * Update benchmark/common_utils.hpp Typo Co-authored-by: Duncan McBain <[email protected]> * Replaced to_cpp_type with sanitize_file_name Co-authored-by: Duncan McBain <[email protected]> Co-authored-by: Mehdi Goli <[email protected]>
Because the generators can be called in parallel there is a race condition during the build when os.makedirs tries to create a folder that already exists. It's safer to try to perform the operation and ignore the exception if folder already exists. Co-authored-by: Mehdi Goli <[email protected]>
* Update namespace to sycl2020 Update included headers to sycl2020 standard * Update device default selector to SYCL2020 * Remove SYCL_VERSION check Removing all pragma that checks that sycl version used by compiler is appropriate * Update access::mode to access_mode * Update ::target to new namespace and target
* Remove github workflow Current workflow crash when bulding the docker due to outdated configuration. Removing workflow entirely since it is not necessary for now. * Remove dependabot and .github directory
* Add runtime support check for discrete Intel GPUs on txsv operators * Enabling txsv operators for iGPU with DEFAULT TUNING TARGET Since the support for txsv is checked before calling the implementation, exception for Arc and GPU Max is raised before so it is possible to have a default configuration working on iGPUs. --------- Signed-off-by: nscipione <[email protected]>
Enable OpenSSF Scorecard workflow, add badge to README file Remove deprecated "Build and Test" badge from README file
Bumps the github-actions group with 4 updates: [actions/checkout](https://github.com/actions/checkout), [ossf/scorecard-action](https://github.com/ossf/scorecard-action), [actions/upload-artifact](https://github.com/actions/upload-artifact) and [github/codeql-action](https://github.com/github/codeql-action). Updates `actions/checkout` from 4.1.6 to 4.1.7 - [Release notes](https://github.com/actions/checkout/releases) - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md) - [Commits](actions/checkout@a5ac7e5...692973e) Updates `ossf/scorecard-action` from 2.3.3 to 2.4.0 - [Release notes](https://github.com/ossf/scorecard-action/releases) - [Changelog](https://github.com/ossf/scorecard-action/blob/main/RELEASE.md) - [Commits](ossf/scorecard-action@dc50aa9...62b2cac) Updates `actions/upload-artifact` from 4.3.3 to 4.3.4 - [Release notes](https://github.com/actions/upload-artifact/releases) - [Commits](actions/upload-artifact@6546280...0b2256b) Updates `github/codeql-action` from 3.25.8 to 3.25.15 - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](github/codeql-action@2e230e8...afb54ba) --- updated-dependencies: - dependency-name: actions/checkout dependency-type: direct:production update-type: version-update:semver-patch dependency-group: github-actions - dependency-name: ossf/scorecard-action dependency-type: direct:production update-type: version-update:semver-minor dependency-group: github-actions - dependency-name: actions/upload-artifact dependency-type: direct:production update-type: version-update:semver-patch dependency-group: github-actions - dependency-name: github/codeql-action dependency-type: direct:production update-type: version-update:semver-patch dependency-group: github-actions ... Signed-off-by: dependabot[bot] <[email protected]>
This patch substitutes the group_broadcast of two scalar indexes with the broadcast of a single vector value.
* Update rotmg implementation * Add conditional check on mem_type in use * Add dependencies to copy_y1 operation * Add new interface to library and add related tests.
Bumps the github-actions group with 2 updates: [actions/upload-artifact](https://github.com/actions/upload-artifact) and [github/codeql-action](https://github.com/github/codeql-action). Updates `actions/upload-artifact` from 4.3.4 to 4.4.0 - [Release notes](https://github.com/actions/upload-artifact/releases) - [Commits](actions/upload-artifact@0b2256b...5076954) Updates `github/codeql-action` from 3.25.15 to 3.26.6 - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](github/codeql-action@afb54ba...4dd1613) --- updated-dependencies: - dependency-name: actions/upload-artifact dependency-type: direct:production update-type: version-update:semver-minor dependency-group: github-actions - dependency-name: github/codeql-action dependency-type: direct:production update-type: version-update:semver-minor dependency-group: github-actions ... Signed-off-by: dependabot[bot] <[email protected]>
Bumps the github-actions group with 2 updates: [actions/checkout](https://github.com/actions/checkout) and [github/codeql-action](https://github.com/github/codeql-action). Updates `actions/checkout` from 4.1.7 to 4.2.0 - [Release notes](https://github.com/actions/checkout/releases) - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md) - [Commits](actions/checkout@692973e...d632683) Updates `github/codeql-action` from 3.26.6 to 3.26.10 - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](github/codeql-action@4dd1613...e2b3eaf) --- updated-dependencies: - dependency-name: actions/checkout dependency-type: direct:production update-type: version-update:semver-minor dependency-group: github-actions - dependency-name: github/codeql-action dependency-type: direct:production update-type: version-update:semver-patch dependency-group: github-actions ... Signed-off-by: dependabot[bot] <[email protected]>
Hi @rodburns, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Who are the maintainers?
There should be a MAINTAINERS file to show who is responsible for the code.
Get Started
It's not clear immediately to me from either this repo, or the oneMath repo how to use this as a backend.
Can we have something that links to the oneMath repo section for backends or the documentation directly? https://uxlfoundation.github.io/oneMath/building_the_project_with_dpcpp.html
Documentation for this repo is not linked from the README, can we make it clearer that documentation exists and what it is? https://github.com/uxlfoundation/generic-sycl-components/tree/nscipione/pb_upstream/onemath/sycl/blas/doc
Other changes outside this repo:
oneMath README needs to be updated to link to the BLAS repo instead of portBLAS
The docs e.g. https://uxlfoundation.github.io/oneMath/building_the_project_with_dpcpp.html need to be updated from portBLAS.
We have a circular dependency as we need to merge this PR and make this repository public before we can suggest to link to it in oneMath. We already have a note to update the existing oneMath PR: uxlfoundation/oneMath#618 (comment) |
e88a9a4
to
4362f05
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved with the minor changes.
Cleaning repository to all portBLAS references. Update documentation. Remove tests, samples and benchmarks. Allow only header-only compilation. Signed-off-by: nscipione <[email protected]>
369ead3
to
c52d35e
Compare
The PR was closed due to a technical GitHub difficulty with the commit history. We were able to push to the main branch instead and preserve the commit history. |
This PR upstreams the long work done with portBLAS as generic SYCL implementation of BLAS routines.
Everything is added under the
onemath/sycl/blas
directory and meant to be used as header-only library.An example of the current usage is the oneMath project where portBLAS is the generic backend for BLAS.