Replace SFINAE by if constexpr for create_mirror* #6

Open
wants to merge 136 commits into
base: refactor/create-mirror/variables
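
For context, the refactoring named in the title can be illustrated with a minimal sketch. This is not the Kokkos implementation: `HostBuffer`, `DeviceBuffer`, `mirror_sfinae`, `mirror_constexpr`, and `check_mirrors` are hypothetical names chosen for illustration, and the sketch assumes a C++17 compiler. It only shows the general pattern of collapsing a pair of SFINAE-selected overloads into one function that branches with `if constexpr`.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for host- and device-resident containers (not Kokkos types).
struct HostBuffer {
  static constexpr bool host_accessible = true;
  int value = 0;
};
struct DeviceBuffer {
  static constexpr bool host_accessible = false;
  int value = 0;
};

// Before: two overloads, exactly one of which survives substitution (SFINAE).
template <class Src>
std::enable_if_t<Src::host_accessible, Src> mirror_sfinae(const Src& src) {
  return src;  // already host accessible: return a copy of the source
}
template <class Src>
std::enable_if_t<!Src::host_accessible, HostBuffer> mirror_sfinae(const Src& src) {
  return HostBuffer{src.value};  // otherwise build a host-side counterpart
}

// After: a single function; the untaken branch is discarded at compile time,
// so the deduced return type depends only on the branch that is kept.
template <class Src>
auto mirror_constexpr(const Src& src) {
  if constexpr (Src::host_accessible) {
    return src;
  } else {
    return HostBuffer{src.value};
  }
}

// Both formulations yield the same return types.
static_assert(std::is_same_v<decltype(mirror_sfinae(std::declval<DeviceBuffer>())),
                             decltype(mirror_constexpr(std::declval<DeviceBuffer>()))>);
static_assert(std::is_same_v<decltype(mirror_constexpr(std::declval<HostBuffer>())),
                             HostBuffer>);

// Small runtime check that both versions carry the value across.
inline void check_mirrors() {
  DeviceBuffer d{7};
  assert(mirror_sfinae(d).value == 7);
  assert(mirror_constexpr(d).value == 7);
}
```

In this sketch the single-function form keeps the selection logic in one place, which is the kind of simplification the commit titles below describe for the `create_mirror*` family.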
b4bc406
Reenable TestHIP_Memory_Requirements
Rombur Feb 2, 2024
c1a8006
Don't use Fedora development version in GitHub CI
masterleinad Mar 13, 2024
841b3a9
Fix deep copy when filling Rank-7 views
cedricchevalier19 Feb 15, 2024
a2f2ba4
TestViewCopy_c.hpp: add new unit test for deep copy (ViewFill)
cedricchevalier19 Feb 16, 2024
ae4d001
TestViewCopy_c.hpp: better handling for OpenMPTarget
cedricchevalier19 Mar 11, 2024
46354d2
Use builtin for atomic_fetch in the HIP backend
Rombur Mar 18, 2024
70603df
Combine the two Impl::create_mirror functions into one with constexpr
thierryantoun Mar 21, 2024
6524e2b
Format create_mirror
thierryantoun Mar 21, 2024
1635ff8
Combine the four Impl::create_mirror_view in one using if constexpr
thierryantoun Mar 21, 2024
d3bb03a
Combine the two create_mirror_view_and_copy functions into one with c…
pzehner Mar 21, 2024
872dc42
Fix Makefile.kokkos for Threads
masterleinad Mar 22, 2024
2035e31
Fix a bug in Makefile when using AMD GPU architectures (#6892)
Rombur Mar 25, 2024
8d734b0
Cuda: Fix configuring with CMake 3.28.4 (#6898)
masterleinad Mar 28, 2024
a53d30a
Merge pull request #6896 from masterleinad/fix_makefile_threads
dalg24 Mar 28, 2024
68c6684
Update Intel GPU architectures in Makefile (#6895)
masterleinad Mar 28, 2024
e2cfdec
Drop Experimental::LayoutTiled class template
dalg24 Mar 29, 2024
51b98e1
Get rid of now unnecessary use of is_layouttiled trait
dalg24 Mar 29, 2024
1efeb5d
Deprecate is_layouttiled trait
dalg24 Mar 29, 2024
5eac0bc
Merge pull request #6876 from masterleinad/disable_fedora_rawhide
dalg24 Apr 1, 2024
6355510
Move `Kokkos::Array` tests to a more suitable place (#6905)
dalg24 Apr 1, 2024
391e040
Do not return a copy of the input functor for Kokkos::Experimental::f…
tpadioleau Apr 2, 2024
b667853
Drop specialization of ViewMapping for Kokkos::Array
dalg24 Mar 29, 2024
059cd15
Accommodate users that depend on a code that define silly macros (#6909)
dalg24 Apr 2, 2024
2aecb1d
SYCL: Fix multi-GPU support and add test (#6887)
masterleinad Apr 3, 2024
5cf0951
Merge pull request #6910 from tpadioleau/remove-return-functor-copy-f…
dalg24 Apr 3, 2024
caa139c
SYCL: Unroll shuffle loops for top-level parallel_reduce and parallel…
masterleinad Apr 3, 2024
9220154
Merge branch 'refactor/create-mirror/variables' into refactor/create-…
pzehner Apr 4, 2024
a833fb0
Preparing readme for develop as the default branch (#6796)
cedricchevalier19 Apr 4, 2024
682291f
Format with clang
thierryantoun Apr 4, 2024
b89d618
Merge branch 'refactor/create-mirror/constexpr' of https://github.com…
thierryantoun Apr 4, 2024
497b438
CHANGELOG.md: 4.3.00 update
ndellingwood Apr 4, 2024
cc21a54
Merge pull request #6919 from ndellingwood/dev-changelog-4300
dalg24 Apr 4, 2024
4b90930
Refactor: Uniformize `create_mirror*` parameter name for views (#6917)
pzehner Apr 5, 2024
1256f69
Merge pull request #6822 from CExA-project/fix-deep-copy
dalg24 Apr 5, 2024
98b1a38
SYCL: Improve team_reduce implementation (#6562)
masterleinad Apr 5, 2024
e93b168
Merge pull request #6907 from dalg24/rm_experimental_layout_tiled
dalg24 Apr 6, 2024
e52cda3
Merge pull request #6785 from Rombur/memory_test
dalg24 Apr 8, 2024
55c5757
Use recommended/max team size functions in Cuda ParallelFor and Reduc…
tcclevenger Apr 8, 2024
8cf8410
SYCL: Fix range in subgroup scan for workgroup_scan
masterleinad Apr 8, 2024
7b41536
Merge pull request #6924 from masterleinad/fix_sycl_workgroup_scan
crtrott Apr 9, 2024
3a27cdb
Add ROCm 6.0 in the nightly CI
Rombur Apr 10, 2024
1fe8108
Merge pull request #6906 from dalg24/make_view_of_arrays_less_special
dalg24 Apr 10, 2024
74c8122
Merge pull request #6926 from Rombur/latest_rocm
dalg24 Apr 10, 2024
164519d
MI300 support unified memory support (#6877)
crtrott Apr 10, 2024
6ea7be7
cuda: reduction with `RangePolicy`: fix grid dimensions to work for l…
fnrizzi Apr 10, 2024
0099c10
Fix nightly CI
Rombur Apr 11, 2024
b0c2566
Merge pull request #6930 from Rombur/fix_nightly
dalg24 Apr 11, 2024
a2af4e0
Deprecate trailing Proxy template argument in Kokkos::Array
dalg24 Apr 11, 2024
92e02b5
CUDA: Update nvcc_wrapper
Apr 12, 2024
b5ec79b
Merge pull request #6936 from rgayatri23/issue_6874
dalg24 Apr 12, 2024
d88e2a5
bring back --fmad option to nvcc_wrapper (#6931)
glesur Apr 12, 2024
de3a263
Merge pull request #6934 from dalg24/deprecate_kokkos_array_proxy_tem…
dalg24 Apr 15, 2024
f2d3780
Remove unnecessary header include
dalg24 Apr 15, 2024
8c7cc95
Merge pull request #6940 from dalg24/unused_limits_header_include_in_…
dalg24 Apr 16, 2024
a8115e5
Adding converting constructor in Kokkos::RandomAccessIterator (#6929)
yasahi-hpc Apr 16, 2024
0e3a673
Use if constexpr for offset view create_mirror*
pzehner Apr 16, 2024
856bcc2
Use if constexpr for dynamic view create_mirror*
pzehner Apr 16, 2024
f1f4741
Use if constexpr for dynamic rank view create_mirror*
pzehner Apr 16, 2024
f94e8d3
Prefer standard C++ feature testing to guard the C++20 requires expre…
dalg24 Apr 16, 2024
c9e21ce
Add `kokkos_swap(Array<T, N>)` specialization
dalg24 Apr 16, 2024
730d8d8
Deprecate specialization of Kokkos::pair for a single element
dalg24 Apr 17, 2024
906e8ce
Merge pull request #6942 from dalg24/fix_nightlies_cxx20_requires_exp…
dalg24 Apr 17, 2024
d914fe3
Fix deprecated warning from `Kokkos::Array` specialization (#6945)
dalg24 Apr 17, 2024
e1b8afa
Add comments
pzehner Apr 17, 2024
1287f7f
Restore inline specifiers
pzehner Apr 17, 2024
69c527a
[ci skip] Enable deprecated code and deprecated warnings in nightly CI
Rombur Apr 17, 2024
e7b486f
Serial: Use the provided execution space instance in TeamPolicy
masterleinad Apr 17, 2024
0859ab0
Fixed the link for P6601 (Threads backend change)
nliber Apr 17, 2024
d5fd512
Merge pull request #6947 from dalg24/deprecate_kokkos_pair_void_speci…
dalg24 Apr 17, 2024
04bc3d9
Merge pull request #6952 from nliber/changelog43
crtrott Apr 18, 2024
34d0db2
Add test
masterleinad Apr 18, 2024
44fde21
Use Kokkos::AUTO for OpenMPTarget
masterleinad Apr 18, 2024
8706b68
kokkos_swap(Array) member friend should not be templated on some othe…
dalg24 Apr 18, 2024
86f5988
Fix noexcept specification for kokkos_swap on zero-sized arrays
dalg24 Apr 18, 2024
cc60295
Merge pull request #6951 from masterleinad/fix_serial_space_team_policy
crtrott Apr 19, 2024
3dc6f55
Add maybe_unused
pzehner Apr 22, 2024
2b54e2c
Mutualize check functions
pzehner Apr 22, 2024
7b6edf7
Simplify code
pzehner Apr 22, 2024
1ee800a
Fix missing maybe_unused
pzehner Apr 22, 2024
205fd15
Replace deprecated sycl::device_ptr/sycl::host_ptr
masterleinad Apr 22, 2024
5932685
Introduce alias based on feature macro
masterleinad Apr 22, 2024
a782773
Kokkos::Impl::SYCLTypes:: -> Kokkos::Impl::sycl_
masterleinad Apr 22, 2024
e2b7bb9
Merge pull request #6958 from masterleinad/sycl_replace_deprecated_us…
crtrott Apr 23, 2024
cf59f31
Merge pull request #6943 from dalg24/kokkos_swap_specialization_for_k…
crtrott Apr 23, 2024
ab3cae4
Fix wrong macro guards for deprecated Kokkos::pair<T1,void> specializ…
dalg24 Apr 24, 2024
fafe861
Fix support for Kokkos::Array of const-qualified element type
dalg24 Apr 24, 2024
2e82fdd
Merge pull request #6961 from dalg24/fixup_deprcated_guards_pair_void
dalg24 Apr 24, 2024
63eef46
Try to fix the CUDA 11.0 build
dalg24 Apr 24, 2024
ebb1cb3
Revert "Try to fix the CUDA 11.0 build"
dalg24 Apr 25, 2024
031f6d9
Alternate definition of Impl::is_nothrow_swappable_v for NVCC version…
dalg24 Apr 25, 2024
2391f17
Avoid introducing a 2nd definition of the Impl::swappable trait
dalg24 Apr 25, 2024
d434f87
Do not require OpenMP support for languages other than CXX
dalg24 Apr 25, 2024
19ca9ce
Update version
crtrott Apr 26, 2024
9686392
Add Linux Foundation notice and fix C++ standard
crtrott Apr 26, 2024
1864287
Merge pull request #6967 from crtrott/update-readme-kk-version
crtrott Apr 26, 2024
7e7709f
SYCL: Avoid deprecated floating-point number abs overloads (#6959)
masterleinad Apr 27, 2024
4ec8296
OpenMPTarget: Update loop order in MDRange (#6925)
rgayatri23 Apr 28, 2024
4fd1864
Restore previous types when create_mirror_view returns the source view
pzehner Apr 29, 2024
0140760
Fix linting
pzehner Apr 29, 2024
5ad1ff4
Remove unused namespaces
pzehner Apr 29, 2024
775be75
Merge branch 'develop' into refactor/create-mirror/constexpr
pzehner Apr 29, 2024
9eb9507
Merge branch 'develop' into refactor/create-mirror/constexpr
pzehner Apr 30, 2024
77ea52f
Threads: Don't silently allow m_instance to be a nullptr (#6969)
masterleinad May 1, 2024
4f416f3
Merge pull request #6965 from dalg24/cmake_openmp_cxx
dalg24 May 1, 2024
f699a2c
Fix enabling OpenMP with HIP and "compile as CMake language"
dalg24 May 1, 2024
2574b80
Fix OpenMP+CUDA when `Kokkos_ENABLE_COMPILE_AS_CMAKE_LANGUAGE` is `ON`
dalg24 May 1, 2024
27b3ced
Merge pull request #6949 from Rombur/nightly_deprecated
dalg24 May 1, 2024
15d13f2
Merge pull request #6882 from Rombur/hip_atomic_fetch
dalg24 May 1, 2024
ed4d254
Merge pull request #6972 from dalg24/fix_kokkos_compile_language_cuda…
crtrott May 1, 2024
dbd7f58
Merge pull request #6962 from dalg24/kokkos_array_const_qualified_ele…
crtrott May 1, 2024
1c8b624
Merge branch 'develop' into refactor/create-mirror/constexpr
pzehner May 2, 2024
ccd0126
Fix fedora CI builds with flang-new
masterleinad May 1, 2024
45a1404
Fix Copyright file
crtrott May 2, 2024
85610f4
Merge pull request #6984 from crtrott/Copyright
crtrott May 2, 2024
a75dc70
Merge pull request #6982 from masterleinad/fix_fedora
crtrott May 2, 2024
c6d8647
Also use is_nothrow_swappable workaround for Intel Classic Compilers …
masterleinad May 2, 2024
b29b14f
Merge branch 'develop' into refactor/create-mirror/constexpr
pzehner May 3, 2024
69567f3
Add thread-safety tests (#6938)
masterleinad May 3, 2024
9c79202
Fix deprecation warnings with GCC for pair<T1,void> comparison operators
dalg24 May 3, 2024
7b8e3a6
Fix TPL_LIBRARY_SUFFIXES for 32-bit build
masterleinad May 6, 2024
2826017
Avoid duplicated definition of KOKKOS_IMPL_32BIT
masterleinad May 6, 2024
ccadc7d
Disable failing parallel_scan_with_reducers test
masterleinad May 6, 2024
06e4c5b
Merge pull request #6989 from dalg24/deprecated_attribute_comparison_…
dalg24 May 6, 2024
e4cc686
Merge pull request #6990 from masterleinad/fix_32bit_tpl_library_path
dalg24 May 7, 2024
d61d75a
Fix a bug when using realloc on views of non-default constructible el…
aprokop May 8, 2024
50a862c
SYCL: Prepare Parallel* for Graphs (#6988)
masterleinad May 8, 2024
f5b3422
SYCL: Fix deprecation in custom parallel_for RangePolicy implementation
masterleinad May 8, 2024
37986fd
[ci skip] update changelog for 4.3.1 (#6995)
ndellingwood May 8, 2024
7cad3e7
OpenMPTarget: Use mutex lock for parallel scan.
May 8, 2024
a69e81a
Merge pull request #6998 from rgayatri23/ompt_scan_lock
dalg24 May 9, 2024
5a5306c
Merge pull request #6997 from masterleinad/sycl_fix_custom_parallel_f…
dalg24 May 9, 2024
00170ae
Remove cuSPARSE TPL
dalg24 May 9, 2024
506da18
Merge pull request #7002 from dalg24/rm_tpl_cusparse
dalg24 May 9, 2024
1d9d0df
SYCL: Print submission command queue property (#7004)
masterleinad May 10, 2024
df018d9
Suppress deprecated warnings via pragma push/pop in the tests (#6999)
dalg24 May 13, 2024
be07ebc
Merge branch 'develop' into refactor/create-mirror/constexpr
pzehner May 14, 2024
@@ -36,7 +36,7 @@ jobs:
-DKokkos_ENABLE_DEPRECATED_CODE_4=ON \
-DKokkos_ENABLE_DEPRECATION_WARNINGS=OFF \
-DKokkos_ENABLE_COMPILER_WARNINGS=ON \
- -DCMAKE_CXX_FLAGS="-Werror -m32 -DKOKKOS_IMPL_32BIT" \
+ -DCMAKE_CXX_FLAGS="-Werror -m32" \
-DCMAKE_CXX_COMPILER=g++ \
-DCMAKE_BUILD_TYPE=RelWithDebInfo
- name: Build
2 changes: 1 addition & 1 deletion .github/workflows/continuous-integration-workflow.yml
@@ -20,7 +20,7 @@ jobs:
continue-on-error: true
strategy:
matrix:
- distro: ['fedora:latest', 'fedora:rawhide', 'ubuntu:latest']
+ distro: ['fedora:latest', 'ubuntu:latest']
cxx: ['g++', 'clang++']
cxx_extra_flags: ['']
cmake_build_type: ['Release', 'Debug']
37 changes: 36 additions & 1 deletion .jenkins_nightly
@@ -95,13 +95,48 @@ pipeline {
-DKokkos_ENABLE_BENCHMARKS=ON \
-DKokkos_ENABLE_EXAMPLES=ON \
-DKokkos_ENABLE_TESTS=ON \
-DKokkos_ENABLE_DEPRECATION_WARNINGS=OFF \
-DKokkos_ENABLE_DEPRECATED_CODE_4=ON \
-DKokkos_ENABLE_DEPRECATION_WARNINGS=ON \
-DKokkos_ENABLE_SERIAL=ON \
.. && \
make -j8 && ctest --verbose
'''
}
}
stage('HIP-ROCM-6.0') {
agent {
dockerfile {
filename 'Dockerfile.hipcc'
dir 'scripts/docker'
additionalBuildArgs '--build-arg BASE=rocm/dev-ubuntu-20.04:6.0.2-complete'
label 'rocm-docker && AMD_Radeon_Instinct_MI210'
args '-v /tmp/ccache.kokkos:/tmp/ccache --device=/dev/kfd --device=/dev/dri --security-opt seccomp=unconfined --group-add video --env HIP_VISIBLE_DEVICES=$HIP_VISIBLE_DEVICES'
}
}
steps {
sh 'ccache --zero-stats'
sh '''rm -rf build && mkdir -p build && cd build && \
cmake \
-DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DCMAKE_CXX_COMPILER=hipcc \
-DCMAKE_CXX_FLAGS="-Werror -Wno-unused-command-line-argument" \
-DCMAKE_CXX_STANDARD=20 \
-DKokkos_ARCH_NATIVE=ON \
-DKokkos_ENABLE_COMPILER_WARNINGS=ON \
-DKokkos_ENABLE_DEPRECATED_CODE_4=ON \
-DKokkos_ENABLE_DEPRECATION_WARNINGS=ON \
-DKokkos_ENABLE_TESTS=ON \
-DKokkos_ENABLE_BENCHMARKS=ON \
-DKokkos_ENABLE_HIP=ON \
.. && \
make -j8 && ctest --verbose'''
}
post {
always {
sh 'ccache --show-stats'
}
}
}
}
}
}
113 changes: 113 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,118 @@
# CHANGELOG

## [4.3.01](https://github.com/kokkos/kokkos/tree/4.3.01)
[Full Changelog](https://github.com/kokkos/kokkos/compare/4.3.00...4.3.01)

### Backend and Architecture Enhancements:

#### HIP:
* MI300 support unified memory [\#6877](https://github.com/kokkos/kokkos/pull/6877)

### Bug Fixes
* Serial: Use the provided execution space instance in TeamPolicy [\#6951](https://github.com/kokkos/kokkos/pull/6951)
* `nvcc_wrapper`: bring back support for `--fmad` option [\#6931](https://github.com/kokkos/kokkos/pull/6931)
* Fix CUDA reduction overflow for `RangePolicy` [\#6578](https://github.com/kokkos/kokkos/pull/6578)

## [4.3.00](https://github.com/kokkos/kokkos/tree/4.3.00) (2024-03-19)
[Full Changelog](https://github.com/kokkos/kokkos/compare/4.2.01...4.3.00)

### Features:
* Add `Experimental::sort_by_key(exec, keys, values)` algorithm [\#6801](https://github.com/kokkos/kokkos/pull/6801)

### Backend and Architecture Enhancements:

#### CUDA:
* Experimental multi-GPU support (from the same process) [\#6782](https://github.com/kokkos/kokkos/pull/6782)
* Link against CUDA libraries even with KOKKOS_ENABLE_COMPILE_AS_CMAKE_LANGUAGE [\#6701](https://github.com/kokkos/kokkos/pull/6701)
* Don't use the compiler launcher script if the CMake compile language is CUDA. [\#6704](https://github.com/kokkos/kokkos/pull/6704)
* nvcc(wrapper): adding "long" and "short" versions for all flags [\#6615](https://github.com/kokkos/kokkos/pull/6615)

#### HIP:
* Fix compilation when using amdclang (with ROCm >= 5.7) and RDC [\#6857](https://github.com/kokkos/kokkos/pull/6857)
* Use rocthrust for sorting, when available [\#6793](https://github.com/kokkos/kokkos/pull/6793)

#### SYCL:
* We only support OneAPI SYCL implementation: add check during initialization
  * Error out on initialization if the backend is different from `ext_oneapi_*` [\#6784](https://github.com/kokkos/kokkos/pull/6784)
  * Filter GPU devices for `ext_oneapi_*` GPU devices [\#6758](https://github.com/kokkos/kokkos/pull/6758)
* Performance Improvements
  * Avoid unnecessary zero-memset of the scratch flags in SYCL [\#6739](https://github.com/kokkos/kokkos/pull/6739)
  * Use host-pinned memory to copy reduction/scan result [\#6500](https://github.com/kokkos/kokkos/pull/6500)
* Address deprecations after oneAPI 2023.2.0 [\#6577](https://github.com/kokkos/kokkos/pull/6577)
* Make sure to call find_dependency for oneDPL if necessary [\#6870](https://github.com/kokkos/kokkos/pull/6870)

#### OpenMPTarget:
* Use LLVM extensions for dynamic shared memory [\#6380](https://github.com/kokkos/kokkos/pull/6380)
* Guard scratch memory usage in ParallelReduce [\#6585](https://github.com/kokkos/kokkos/pull/6585)
* Update linker flags for Intel GPUs [\#6735](https://github.com/kokkos/kokkos/pull/6735)
* Improve handling of printf on Intel GPUs [\#6652](https://github.com/kokkos/kokkos/pull/6652)

#### OpenACC:
* Add atomics support [\#6446](https://github.com/kokkos/kokkos/pull/6446)
* Make the OpenACC backend asynchronous [\#6772](https://github.com/kokkos/kokkos/pull/6772)

#### Threads:
* Add missing broadcast to TeamThreadRange parallel_scan [\#6601](https://github.com/kokkos/kokkos/pull/6601)

#### OpenMP:
* Improve performance of view initializations and filling with zeros [\#6573](https://github.com/kokkos/kokkos/pull/6573)

### General Enhancements

* Improve performance of random number generation when using a normal distribution on GPUs [\#6556](https://github.com/kokkos/kokkos/pull/6556)
* Allocate temporary view with the user-provided execution space instance and do not initialize in `unique` algorithm [\#6598](https://github.com/kokkos/kokkos/pull/6598)
* Add deduction guide for `Kokkos::Array` [\#6373](https://github.com/kokkos/kokkos/pull/6373)
* Provide new public headers `<Kokkos_Clamp.hpp>` and `<Kokkos_MinMax.hpp>` [\#6687](https://github.com/kokkos/kokkos/pull/6687)
* Fix/improvement to `remove_if` parallel algorithm: use the provided execution space instance for temporary allocations, drop unnecessary initialization, and avoid evaluating the predicate twice during the final pass [\#6747](https://github.com/kokkos/kokkos/pull/6747)
* Add runtime function to query the number of devices and make device ID consistent with `KOKKOS_VISIBLE_DEVICES` [\#6713](https://github.com/kokkos/kokkos/pull/6713)
* simd: support `vector_aligned_tag` [\#6243](https://github.com/kokkos/kokkos/pull/6243)
* Avoid unnecessary allocation when default constructing Bitset [\#6524](https://github.com/kokkos/kokkos/pull/6524)
* Fix constness for views in std algorithms [\#6813](https://github.com/kokkos/kokkos/pull/6813)
* Improve error message on unsafe implicit conversion in MDRangePolicy [\#6855](https://github.com/kokkos/kokkos/pull/6855)
* CTAD (deduction guides) for RangePolicy [\#6850](https://github.com/kokkos/kokkos/pull/6850)
* CTAD (deduction guides) for MDRangePolicy [\#5516](https://github.com/kokkos/kokkos/pull/5516)

### Build System Changes
* Require `Kokkos_ENABLE_ATOMICS_BYPASS` option to bypass atomic operation for Serial backend only builds [\#6692](https://github.com/kokkos/kokkos/pull/6692)
* Add support for RISCV and the Milk-V's Pioneer [\#6773](https://github.com/kokkos/kokkos/pull/6773)
* Add C++26 standard to CMake setup [\#6733](https://github.com/kokkos/kokkos/pull/6733)
* Fix Makefile when using gnu_generate_makefile.sh and make >= 4.3 [\#6606](https://github.com/kokkos/kokkos/pull/6606)
* Cuda: Fix configuring with CMake >= 3.28.4 - temporary fallback to internal CudaToolkit.cmake [\#6898](https://github.com/kokkos/kokkos/pull/6898)

### Incompatibilities (i.e. breaking changes)
* Remove all `DEPRECATED_CODE_3` option and all code that was guarded by it [\#6523](https://github.com/kokkos/kokkos/pull/6523)
* Drop guards to accommodate external code defining `KOKKOS_ASSERT` [\#6665](https://github.com/kokkos/kokkos/pull/6665)
* `Profiling::ProfilingSection(std::string)` constructor marked explicit and nodiscard [\#6690](https://github.com/kokkos/kokkos/pull/6690)
* Add bound check preconditions for `RangePolicy` and `MDRangePolicy` [\#6617](https://github.com/kokkos/kokkos/pull/6617) [\#6726](https://github.com/kokkos/kokkos/pull/6726)
* Add checks for unsafe implicit conversions in RangePolicy [\#6754](https://github.com/kokkos/kokkos/pull/6754)
* Remove Kokkos::[b]half_t volatile overloads [\#6579](https://github.com/kokkos/kokkos/pull/6579)
* Remove KOKKOS_IMPL_DO_NOT_USE_PRINTF [\#6593](https://github.com/kokkos/kokkos/pull/6593)
* Check matching static extents in View constructor [\#5190](https://github.com/kokkos/kokkos/pull/5190)
* Tools(profiling): fix typo Kokkos_Tools_Optim[i]zationGoal [\#6642](https://github.com/kokkos/kokkos/pull/6642)
* Remove variadic range policy constructor (disallow passing multiple trailing chunk size arguments) [\#6845](https://github.com/kokkos/kokkos/pull/6845)
* Improve message on view out of bounds access and always abort [\#6861](https://github.com/kokkos/kokkos/pull/6861)
* Drop `KOKKOS_ENABLE_INTEL_MM_ALLOC` macro [\#6797](https://github.com/kokkos/kokkos/pull/6797)
* Remove `Kokkos::Experimental::LogicalMemorySpace` (without going through deprecation) [\#6557](https://github.com/kokkos/kokkos/pull/6557)
* Remove `Experimental::HBWSpace` and support for linking against memkind [\#6791](https://github.com/kokkos/kokkos/pull/6791)
* Drop librt TPL and associated `KOKKOS_ENABLE_LIBRT` macro [\#6798](https://github.com/kokkos/kokkos/pull/6798)
* Drop support for old CPU architectures (`ARCH_BGQ`, `ARCH_POWER7`, `ARCH_WSM` and associated `ARCH_SSE4` macro) [\#6806](https://github.com/kokkos/kokkos/pull/6806)
* Drop support for deprecated command-line arguments and environment variables [\#6744](https://github.com/kokkos/kokkos/pull/6744)

### Deprecations
* Provide kokkos_swap as part of Core and deprecate Experimental::swap in Algorithms [\#6697](https://github.com/kokkos/kokkos/pull/6697)
* Deprecate {Cuda,HIP}::detect_device_count() and Cuda::[detect_]device_arch() [\#6710](https://github.com/kokkos/kokkos/pull/6710)
* Deprecate `ExecutionSpace::in_parallel()` [\#6582](https://github.com/kokkos/kokkos/pull/6582)

### Bug Fixes
* Fix team-level MDRange reductions: [\#6511](https://github.com/kokkos/kokkos/pull/6511)
* Fix CUDA and SYCL small value type (16-bit) team reductions [\#5334](https://github.com/kokkos/kokkos/pull/5334)
* Enable `{transform_}exclusive_scan` in place [\#6667](https://github.com/kokkos/kokkos/pull/6667)
* `fill_random` overloads that do not take an execution space instance argument should fence [\#6658](https://github.com/kokkos/kokkos/pull/6658)
* HIP,Cuda,OpenMPTarget: Fixup use provided execution space when copying host inaccessible reduction result [\#6777](https://github.com/kokkos/kokkos/pull/6777)
* Fix typo in `cuda_func_set_attribute[s]_wrapper` preventing proper setting of desired occupancy [\#6786](https://github.com/kokkos/kokkos/pull/6786)
* Avoid undefined behavior due to conversion between signed and unsigned integers in shift_{right, left}_team_impl [\#6821](https://github.com/kokkos/kokkos/pull/6821)
* Fix a bug in Makefile.kokkos when using AMD GPU architectures as `AMD_GFXYYY` [\#6892](https://github.com/kokkos/kokkos/pull/6892)

## [4.2.01](https://github.com/kokkos/kokkos/tree/4.2.01) (2023-12-07)
[Full Changelog](https://github.com/kokkos/kokkos/compare/4.2.00...4.2.01)

49 changes: 8 additions & 41 deletions Copyright.txt
@@ -1,41 +1,8 @@
//@HEADER
// ************************************************************************
//
// Kokkos v. 3.0
// Copyright (2020) National Technology & Engineering
// Solutions of Sandia, LLC (NTESS).
//
// Under the terms of Contract DE-NA0003525 with NTESS,
// the U.S. Government retains certain rights in this software.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// 1. Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
//
// 2. Redistributions in binary form must reproduce the above copyright
// notice, this list of conditions and the following disclaimer in the
// documentation and/or other materials provided with the distribution.
//
// 3. Neither the name of the Corporation nor the names of the
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
// PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
// LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact Christian R. Trott ([email protected])
//
// ************************************************************************
//@HEADER
************************************************************************

Kokkos v. 4.0
Copyright (2022) National Technology & Engineering
Solutions of Sandia, LLC (NTESS).

Under the terms of Contract DE-NA0003525 with NTESS,
the U.S. Government retains certain rights in this software.
10 changes: 0 additions & 10 deletions LICENSE
@@ -1,13 +1,3 @@
************************************************************************

Kokkos v. 4.0
Copyright (2022) National Technology & Engineering
Solutions of Sandia, LLC (NTESS).

Under the terms of Contract DE-NA0003525 with NTESS,
the U.S. Government retains certain rights in this software.


==============================================================================
Kokkos is under the Apache License v2.0 with LLVM Exceptions:
==============================================================================