ocl: pointer-arithmetic for device-pointers
* Implemented pointer-arithmetic for device-pointers using Intel's USM as well as fallback code.
* Fallback to main-thread's stream (c_dbcsr_acc_opencl_stream_default).
* Fixed c_dbcsr_acc_opencl_stream_default and reduced one level of indirection.
* Reworked entire memory allocation (determining offsets).
* Consolidated compile-time decisions about LIBXSMM_VERSION_NUMBER.
* Removed runtime decisions accounting for pooled allocations.
* Removed support for performance estimation and suitability.
* Support older LIBXSMM (pooled memory allocations).
* Set ACC_OPENCL_ATOMIC_KIND to sequentially consistent; set ACC_OPENCL_NLOCKS=1.
* Complemented ACC_OPENCL_NLOCKS with environment variable.
* Introduced ACC_OPENCL_OMPLOCKS, ACC_OPENCL_MEM_DEBUG, ACC_OPENCL_EVENT_FLUSH.
* Implemented behavior of c_dbcsr_acc_opencl_stream_default already in c_dbcsr_acc_opencl_stream.
* Cache active device-ID to avoid determining context/properties (c_dbcsr_acc_set_active_device).
* Support event chain (dependency), improved error handling (c_dbcsr_acc_stream_wait_event).
* Support event chain (dependency), improved error handling (c_dbcsr_acc_event_record).
* Introduced lock-arguments (internal, e.g., c_dbcsr_acc_opencl_set_active_device).
* Consolidated domain-locks into c_dbcsr_acc_opencl_config.
* Made build-log available (c_dbcsr_acc_opencl_kernel).
* Reworked stream-registry and stream-info facility.
* Consolidated tuned parameters, and updated tuned parameters.
* Use "int" instead of "cl_int" when taking the return-code.
* Consistently use EXIT_SUCCESS instead of CL_SUCCESS.
* Removed support for ACC_OPENCL_OVERMALLOC.
* Removed support for per-thread device.
* Removed ACC_OPENCL_EVENT_BARRIER.
* Introduced ACC_OPENCL_MEM_TLS (disabled).
* Simplified c_dbcsr_acc_opencl_memset.
* Support ACC_OPENCL_STREAM_NULL in event facility.
* Introduced assertion (dbcsr_acc_devmem.F).
* Fixed using size_t as kernel argument.
* Introduced UNROLL_AUTO.
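
The headline change (pointer-arithmetic for device-pointers, with a fallback when USM is unavailable) can be illustrated with a minimal sketch. With USM, a device pointer is a plain address, so `base + offset` is directly usable. Without it, one possible fallback is to keep a registry of allocations and resolve an offset pointer back to its owning allocation plus a byte offset. The names `devmem_info_t` and `devmem_find` are hypothetical and not taken from the commit; the actual code in acc_opencl.c is not rendered on this page and may work differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical registry entry, one per device allocation (assumption:
 * each allocation is identified by a base value and a size in bytes). */
typedef struct {
  uintptr_t base; /* address or handle value identifying the allocation */
  size_t size;    /* size of the allocation in bytes */
} devmem_info_t;

/* Find the allocation that contains `ptr` and report the byte offset into it.
 * Returns the index of the matching entry, or -1 if no entry contains `ptr`. */
int devmem_find(const devmem_info_t* reg, int n, uintptr_t ptr, size_t* offset) {
  int i;
  assert(NULL != reg || 0 == n);
  for (i = 0; i < n; ++i) {
    /* Half-open range check: base <= ptr < base + size. */
    if (reg[i].base <= ptr && (ptr - reg[i].base) < reg[i].size) {
      if (NULL != offset) *offset = (size_t)(ptr - reg[i].base);
      return i;
    }
  }
  return -1;
}
```

A caller would pass the resolved allocation and offset separately to whatever API expects a handle plus offset. The linear scan is only for illustration; a real registry would likely need locking and a faster lookup.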
hfp committed Feb 26, 2024
1 parent ef62505 commit fed2519
Showing 19 changed files with 1,828 additions and 2,005 deletions.
1 change: 1 addition & 0 deletions src/CMakeLists.txt
@@ -192,6 +192,7 @@ if (USE_SMM MATCHES "libxsmm")
target_link_libraries(dbcsr PRIVATE PkgConfig::LIBXSMMEXT)
endif ()
target_link_libraries(dbcsr PRIVATE PkgConfig::LIBXSMM)
+target_link_libraries(dbcsr PRIVATE ${BLAS_LIBRARIES})
endif ()

if (BLAS_LIBRARIES MATCHES "mkl_")
2 changes: 1 addition & 1 deletion src/acc/acc_bench_smm.c
@@ -499,7 +499,7 @@ int main(int argc, char* argv[]) {
if (maxdiff < epsilon && NULL != file) maxdiff = epsilon;
if (0 < epsilon) {
if (LIBXSMM_NOTNAN(diff.v_tst)) {
-PRINTF(" (%g != %g)\n", diff.v_ref, diff.v_tst);
+PRINTF(" (|%g-%g|=%g)\n", diff.v_ref, diff.v_tst, fabs(diff.v_ref - diff.v_tst));
}
else {
PRINTF(" (%g)\n", diff.v_tst);
6 changes: 5 additions & 1 deletion src/acc/dbcsr_acc_devmem.F
@@ -10,7 +10,7 @@
MODULE dbcsr_acc_devmem
!! Accelerator support
#if defined (__DBCSR_ACC)
-USE ISO_C_BINDING, ONLY: C_INT, C_SIZE_T, C_PTR, C_LOC, C_NULL_PTR
+USE ISO_C_BINDING, ONLY: C_INT, C_SIZE_T, C_PTR, C_LOC, C_NULL_PTR, C_ASSOCIATED
#endif
USE dbcsr_kinds, ONLY: int_4, &
int_4_size, &
@@ -255,6 +255,10 @@ FUNCTION acc_devmem_allocated(this) RESULT(res)
LOGICAL :: res
!! true if device memory is allocated, false otherwise

+#if defined (__DBCSR_ACC)
+DBCSR_ASSERT(C_ASSOCIATED(this%cptr) .OR. this%size_in_bytes <= 0)
+#endif
+
res = this%size_in_bytes >= 0
END FUNCTION acc_devmem_allocated
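
The assertion added in this hunk states an invariant: a positive size requires an associated device pointer, while a null pointer is only acceptable for sizes of zero or less. The same check can be sketched in C as follows. The struct `acc_devmem_t` is a hypothetical mirror of the Fortran type (only the field names `cptr` and `size_in_bytes` come from the diff), and the integer width is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C mirror of the Fortran handle; field names follow the diff. */
typedef struct {
  void* cptr;              /* device pointer, NULL when not associated */
  long long size_in_bytes; /* per the hunk, a negative value reads as "not allocated" */
} acc_devmem_t;

/* Mirrors the hunk: check the invariant first, then report the state. */
int acc_devmem_allocated(const acc_devmem_t* mem) {
  /* Equivalent of C_ASSOCIATED(cptr) .OR. size_in_bytes <= 0. */
  assert(NULL != mem->cptr || mem->size_in_bytes <= 0);
  return mem->size_in_bytes >= 0;
}
```

Note that a zero-sized handle with a null pointer passes the assertion and still counts as allocated, which matches the `>= 0` test in the Fortran function.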

2 changes: 1 addition & 1 deletion src/acc/opencl/Makefile
@@ -136,7 +136,7 @@ ifneq (0,$(DBG))
CFLAGS += -O0
endif
else
-CFLAGS += -O2 -DNDEBUG -DNDBGDEV
+CFLAGS += -O2 -DNDEBUG
SYM := 0
endif
ifneq (0,$(SYM))
1,025 changes: 506 additions & 519 deletions src/acc/opencl/acc_opencl.c

Large diffs are not rendered by default.
