Disable cuBLAS dot wrapper (#2206)
(not deleted, just guarded with #if 0 and explanatory comments)

The cuBLAS dot performs significantly worse than our native implementation on V100 with versions 11.2, 11.8 and 12.0.
This was measured in the dot perf test with a warm-up call.

trilinos/Trilinos#12982 was a symptom of this.
brian-kelley authored May 22, 2024
1 parent 3414c91 commit 6204151
Showing 2 changed files with 10 additions and 0 deletions.
6 changes: 6 additions & 0 deletions blas/tpls/KokkosBlas1_dot_tpl_spec_avail.hpp
@@ -89,9 +89,15 @@ KOKKOSBLAS1_DOT_TPL_SPEC_AVAIL_BLAS(Kokkos::complex<float>, Kokkos::LayoutLeft,
KOKKOSBLAS1_DOT_TPL_SPEC(Kokkos::complex<double>, LAYOUT, EXECSPACE, MEMSPACE)

#ifdef KOKKOSKERNELS_ENABLE_TPL_CUBLAS
// Note BMK: CUBLAS dot is consistently slower than our native dot
// (measured 11.2, 11.8, 12.0 using perf test, and all are similar)
// If a future version improves performance, re-enable it here and
// in the tpl_spec_decl file.
#if 0
KOKKOSBLAS1_DOT_TPL_SPEC_AVAIL(Kokkos::LayoutLeft, Kokkos::Cuda,
Kokkos::CudaSpace)
#endif
#endif

#ifdef KOKKOSKERNELS_ENABLE_TPL_ROCBLAS
KOKKOSBLAS1_DOT_TPL_SPEC_AVAIL(Kokkos::LayoutLeft, Kokkos::HIP,
4 changes: 4 additions & 0 deletions blas/tpls/KokkosBlas1_dot_tpl_spec_decl.hpp
@@ -101,6 +101,9 @@ KOKKOSBLAS1_DOT_TPL_SPEC_DECL_BLAS_EXT(false)

// cuBLAS
#ifdef KOKKOSKERNELS_ENABLE_TPL_CUBLAS
// Disabled because native has better performance.
// See tpl_spec_avail file for more details
#if 0
#include <KokkosBlas_tpl_spec.hpp>

namespace KokkosBlas {
@@ -174,6 +177,7 @@ KOKKOSBLAS1_DOT_TPL_SPEC_DECL_CUBLAS_EXT(false)
} // namespace Impl
} // namespace KokkosBlas
#endif
#endif

// rocBLAS
#ifdef KOKKOSKERNELS_ENABLE_TPL_ROCBLAS
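
Note on the mechanism: the diff guards both the availability macro and the declaration with `#if 0`, so no cuBLAS specialization is visible for `Kokkos::Cuda` and dispatch falls back to the native kernel. The sketch below illustrates that general trait-plus-macro pattern in plain C++. It is a simplified stand-in, not the Kokkos Kernels source: `DEMO_DOT_TPL_SPEC_AVAIL`, `dot_backend`, and the tag structs are hypothetical names invented for this illustration, and the real trait takes different template parameters.

```cpp
#include <string>

// Hypothetical stand-ins for layout / memory-space tags (NOT the real Kokkos types).
struct LayoutLeft {};
struct HostSpaceTag {};
struct CudaSpaceTag {};

// Primary template: by default, no third-party (TPL) implementation is available.
template <class Layout, class MemSpace>
struct dot_tpl_spec_avail {
  static constexpr bool value = false;
};

// Simplified analogue of an "avail" macro: a full specialization that
// marks one (layout, memory space) combination as TPL-backed.
#define DEMO_DOT_TPL_SPEC_AVAIL(LAYOUT, MEMSPACE) \
  template <>                                     \
  struct dot_tpl_spec_avail<LAYOUT, MEMSPACE> {   \
    static constexpr bool value = true;           \
  };

// A host-side wrapper stays enabled.
DEMO_DOT_TPL_SPEC_AVAIL(LayoutLeft, HostSpaceTag)

// The device-side wrapper is kept in the file but compiled out,
// mirroring the "#if 0" guard used in the commit.
#if 0
DEMO_DOT_TPL_SPEC_AVAIL(LayoutLeft, CudaSpaceTag)
#endif

// Dispatch helper: reports which path the trait selects.
template <class Layout, class MemSpace>
std::string dot_backend() {
  return dot_tpl_spec_avail<Layout, MemSpace>::value ? "tpl" : "native";
}

// Compile-time checks of the behavior described above.
static_assert(dot_tpl_spec_avail<LayoutLeft, HostSpaceTag>::value,
              "host combination should use the TPL path");
static_assert(!dot_tpl_spec_avail<LayoutLeft, CudaSpaceTag>::value,
              "guarded combination should fall back to native");
```

In this sketch, `dot_backend<LayoutLeft, CudaSpaceTag>()` returns `"native"` while the guard is in place; changing `#if 0` to `#if 1` would flip it to `"tpl"`, which is the kind of re-enabling the code comment in the avail file describes.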