Syr2 fix #2013
On KokkosEco_Trilinos_Weaver_CUDA112_opt-uvm the SYR2 test generates a compile-time error, probably due to mixed use of host and device views when comparing implemented vs. reference results.
Thanks @lucbv ! I was just trying to work out a standalone reproducer without Trilinos, this saves some time :)
@ndellingwood my attempt to reproduce did not work, but nonetheless I think this change might help based on the error reported.
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection is Not Necessary for this Pull Request.
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Build Information:
Test Name: KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight
Test Name: KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930_Tpls_CLANG13CUDA10
Test Name: KokkosKernels_PullRequest_GNU1021
Test Name: KokkosKernels_PullRequest_GNU1021_Light_LayoutRight
Test Name: KokkosKernels_PullRequest_Tpls_GNU1021
Test Name: KokkosKernels_PullRequest_Tpls_INTEL19_solo
Test Name: KokkosKernels_PullRequest_CLANG1001_solo
Test Name: KokkosKernels_PullRequest_A64FX_Tpls_ARMPL2110
Test Name: KokkosKernels_PullRequest_A64FX_GCC1020
Test Name: KokkosKernels_PullRequest_VEGA90A_ROCM560
Test Name: KokkosKernels_PullRequest_VEGA90A_Tpls_ROCM560
Pull Request Author: lucbv
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED. Pull Request Auto Testing has PASSED
Build Information:
Test Name: KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight
Test Name: KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930_Tpls_CLANG13CUDA10
Test Name: KokkosKernels_PullRequest_GNU1021
Test Name: KokkosKernels_PullRequest_GNU1021_Light_LayoutRight
Test Name: KokkosKernels_PullRequest_Tpls_GNU1021
Test Name: KokkosKernels_PullRequest_Tpls_INTEL19_solo
Test Name: KokkosKernels_PullRequest_CLANG1001_solo
Test Name: KokkosKernels_PullRequest_A64FX_Tpls_ARMPL2110
Test Name: KokkosKernels_PullRequest_A64FX_GCC1020
Test Name: KokkosKernels_PullRequest_VEGA90A_ROCM560
Test Name: KokkosKernels_PullRequest_VEGA90A_Tpls_ROCM560
Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ eeprude ndellingwood ]!
Status Flag 'Pull Request AutoTester' - Pull Request MUST BE MERGED MANUALLY BY Project Team - This Repo does not support Automerge
After the merge of this PR, the nightly Cuda/11.2.2 UVM build is still failing to compile with error:
Yeah, I don't know why we get the following device:
@lucbv here is the configuration used on Weaver for reference (using source override for the kokkos and kokkos-kernels repos), though the builds can take a while:
source /projects/ppc64le-pwr9-rhel8/legacy-env.sh
module purge
module load git cmake/3.23.1 cuda/11.2.2/gcc/8.3.1 openmpi/4.1.1/gcc/8.3.1/cuda/11.2.2 openblas/0.3.18/gcc/8.3.1
module load metis/5.1.0/gcc/8.3.1 hdf5/1.10.7/gcc/8.3.1/openmpi/4.1.1 parmetis/4.0.3/gcc/8.3.1/openmpi/4.1.1 zlib/1.2.11/gcc/8.3.1 boost/1.70.0/gcc/8.3.1
module list
export OMPI_CXX=$KOKKOS_DIR/bin/nvcc_wrapper
cmake \
-D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR} \
-D CMAKE_CXX_COMPILER="`which mpicxx`" \
-D CMAKE_CXX_FLAGS="-g" \
-D CMAKE_C_COMPILER="`which mpicc`" \
-D CMAKE_C_FLAGS="-g" \
-D CMAKE_Fortran_COMPILER="`which mpifort`" \
-D CMAKE_Fortran_FLAGS="-g" \
-D CMAKE_BUILD_TYPE:STRING=RELEASE \
-D CMAKE_CXX_STANDARD=17 \
-D CMAKE_INSTALL_PREFIX=$PWD/install \
-D TPL_ENABLE_CUDA:STRING=ON \
-D TPL_ENABLE_MPI:STRING=ON \
-D MPI_BASE_DIR:PATH="$OPENMPI_ROOT" \
-D MPI_BIN_DIR:PATH="$OPENMPI_BIN" \
-D MPI_EXEC_POST_NUMPROCS_FLAGS:STRING="-map-by;socket:PE=4" \
-D TPL_ENABLE_BLAS:STRING=ON \
-D BLAS_LIBRARY_DIRS:FILEPATH="$OPENBLAS_ROOT/lib" \
-D BLAS_LIBRARY_NAMES:STRING="openblas" \
-D TPL_ENABLE_LAPACK:STRING=ON \
-D LAPACK_INCLUDE_DIRS:FILEPATH="$OPENBLAS_ROOT/include" \
-D LAPACK_LIBRARY_DIRS:FILEPATH="$OPENBLAS_ROOT/lib" \
-D LAPACK_LIBRARY_NAMES:STRING="openblas" \
-D TPL_ENABLE_Boost:BOOL=ON \
-D Boost_INCLUDE_DIRS:FILEPATH="$BOOST_ROOT/include" \
-D Boost_LIBRARY_DIRS:FILEPATH="$BOOST_ROOT/lib" \
-D TPL_ENABLE_BoostLib:BOOL=ON \
-D BoostLib_INCLUDE_DIRS:FILEPATH="$BOOST_ROOT/include" \
-D BoostLib_LIBRARY_DIRS:FILEPATH="$BOOST_ROOT/lib" \
-D TPL_ENABLE_Netcdf:BOOL=OFF \
-D Trilinos_ENABLE_TESTS=OFF \
-D Trilinos_ENABLE_EXAMPLES=OFF \
-D Trilinos_ENABLE_Kokkos=ON \
-D Kokkos_ENABLE_CUDA=ON \
-D Kokkos_ENABLE_CUDA_LAMBDA=ON \
-D Kokkos_ENABLE_CUDA_UVM=ON \
-D Kokkos_ARCH_VOLTA70=ON \
-D Kokkos_ARCH_POWER9=ON \
-D Kokkos_ENABLE_TESTS=ON \
-D Trilinos_ENABLE_KokkosKernels=ON \
-D KokkosKernels_ENABLE_TESTS=ON \
-D Trilinos_ENABLE_Tpetra=ON \
-D Tpetra_ENABLE_CUDA=ON \
-D Tpetra_ENABLE_SERIAL=ON \
-D Tpetra_ENABLE_TESTS=ON \
-D Trilinos_ENABLE_Sacado=ON \
-D Sacado_ENABLE_TESTS=ON \
-D Trilinos_ENABLE_Amesos2=ON \
-D Amesos2_ENABLE_TESTS=ON \
-D Trilinos_ENABLE_Teuchos=ON \
-D Teuchos_ENABLE_TESTS=ON \
-D Trilinos_ENABLE_Ifpack2=ON \
-D Ifpack2_ENABLE_TESTS=ON \
-D Trilinos_ENABLE_Belos=ON \
-D Belos_ENABLE_TESTS=ON \
-D Trilinos_ENABLE_Stokhos=ON \
-D Stokhos_ENABLE_TESTS=ON \
-D Trilinos_ENABLE_Phalanx=ON \
-D Phalanx_ENABLE_TESTS=ON \
-D Trilinos_ENABLE_Zoltan2=ON \
-D Zoltan2_ENABLE_TESTS=ON \
-D Trilinos_ENABLE_Anasazi=ON \
-D Trilinos_ENABLE_MueLu=ON \
-D MueLu_ENABLE_TESTS=ON \
-D Trilinos_ENABLE_Panzer=ON \
-D Panzer_ENABLE_TESTS=ON \
-D Trilinos_ENABLE_Intrepid2=ON \
-D Intrepid2_ENABLE_TESTS=ON \
-D Trilinos_ENABLE_SEACAS=OFF \
-D Trilinos_ENABLE_Zoltan2Sphynx=OFF \
-D Trilinos_ENABLE_ShyLU_NodeTacho=OFF \
-D KokkosKernels_INST_MEMSPACE_CUDAUVMSPACE=ON \
-D Stokhos_KokkosCrsMatrixUQPCEUnitTest_Cuda_MPI_1_SET_RUN_SERIAL=ON \
-D Stokhos_TpetraCrsMatrixUQPCEUnitTest_Cuda_MPI_4_SET_RUN_SERIAL=ON \
-D Stokhos_KokkosCrsMatrixMPVectorUnitTest_Cuda_MPI_1_SET_RUN_SERIAL=ON \
-D Intrepid2_unit-test_MonolithicExecutable_Intrepid2_Tests_MPI_1_SET_RUN_SERIAL=ON \
-D Kokkos_CoreUnitTest_CudaTimingBased_MPI_1_SET_RUN_SERIAL=ON \
-D Kokkos_CoreUnitTest_Default_MPI_1_SET_RUN_SERIAL=ON \
-DKokkos_SOURCE_DIR_OVERRIDE:STRING=kokkos \
-DKokkosKernels_SOURCE_DIR_OVERRIDE:STRING=kokkos-kernels \
$TRILINOS_DIR
I've been trying to do it outside of Trilinos to make things a bit easier, but I might just do what you show above and only enable Kokkos and Kokkos Kernels. Hopefully that's enough to reproduce the issue (maybe Tpetra as well for ETI?).
Yeah, enabling Tpetra may trigger a code path we aren't hitting in the standalone builds.
Hi Luc and Nathan. Sorry, I was away from home for a few hours. I looked at the code for about 15 minutes and could not find anything "obvious" that could explain the problem. Luc, let me know if you would like to meet in your office to look at this issue together. This Thursday I am free until 10:59 AM MT and from 3:01 PM MT on.
I split the remaining errors off into a separate issue for easier tracking, #2027.
Small change to the SYR2 unit test so that it uses host views, not device views, when comparing results.
It seems more consistent and will hopefully fix the issue in the KokkosEco_Trilinos_Weaver_CUDA112_opt-uvm nightly build.
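
For readers unfamiliar with the check being discussed, here is a minimal sketch of a host-side comparison against a SYR2 reference. It is not the Kokkos Kernels test code: it uses plain `std::vector` storage as a stand-in for host views, and the names `reference_syr2` and `count_mismatches` are hypothetical. In the actual test, the host data would presumably come from host mirrors of the device views (for example via `Kokkos::create_mirror_view_and_copy`), so that both sides of the comparison live in host memory.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference SYR2 on plain host memory:
//   A += alpha * x * y^T + alpha * y * x^T
// A is a dense n-by-n matrix stored row-major. For simplicity the full
// matrix is updated; a BLAS-style SYR2 updates only one triangle.
void reference_syr2(double alpha, const std::vector<double>& x,
                    const std::vector<double>& y, std::vector<double>& A) {
  const std::size_t n = x.size();
  assert(y.size() == n && A.size() == n * n);
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      A[i * n + j] += alpha * (x[i] * y[j] + y[i] * x[j]);
    }
  }
}

// Count entries that differ by more than a mixed absolute/relative tolerance.
// Both arguments are host data, so no device memory is touched here.
std::size_t count_mismatches(const std::vector<double>& result,
                             const std::vector<double>& expected, double tol) {
  assert(result.size() == expected.size());
  std::size_t bad = 0;
  for (std::size_t k = 0; k < result.size(); ++k) {
    const double diff = std::abs(result[k] - expected[k]);
    if (diff > tol * (1.0 + std::abs(expected[k]))) ++bad;
  }
  return bad;
}
```

A test would copy the computed matrix to the host, run the reference on a host copy of the original input, and require `count_mismatches` to return zero.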