Deprecate CUB iterators existing in Thrust #3304

Open
wants to merge 12 commits into
base: main

bernhardmgruber
Contributor

Fixes: #3261

cub/test/catch2_test_device_reduce.cuh (outdated)
@@ -231,7 +231,7 @@ struct dispatch_streaming_arg_reduce_t
   cudaStream_t stream)
   {
     // Constant iterator to provide the offset of the current partition for the user-provided input iterator
-    using constant_offset_it_t = ConstantInputIterator<GlobalOffsetT>;
+    using constant_offset_it_t = THRUST_NS_QUALIFIER::constant_iterator<GlobalOffsetT>;
Collaborator

Could you please check whether the SASS remains the same after this change for cub.bench.reduce.arg_extrema.base?
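
For readers unfamiliar with the iterator being swapped in the diff above, here is a minimal standalone sketch of what a constant offset iterator provides. It is not the CUB or Thrust implementation: `constant_iterator_sketch` and `partition_arg_min` are hypothetical names invented for illustration, and the way the offset is combined with a local index is an assumption based only on the code comment ("the offset of the current partition"), not on the dispatch code itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a constant iterator (NOT the CUB or Thrust type):
// every dereference yields the same stored value, whatever the position.
template <typename T>
struct constant_iterator_sketch
{
  T value;
  std::ptrdiff_t position; // tracked only so that advancing the iterator is observable

  explicit constant_iterator_sketch(T v, std::ptrdiff_t p = 0)
      : value(v)
      , position(p)
  {}

  T operator*() const { return value; }
  T operator[](std::ptrdiff_t) const { return value; }

  constant_iterator_sketch& operator++()
  {
    ++position;
    return *this;
  }

  constant_iterator_sketch operator+(std::ptrdiff_t n) const
  {
    return constant_iterator_sketch(value, position + n);
  }
};

// Hypothetical helper: arg-min over one partition, reported as an index into
// the full input. offset_it yields the partition's starting offset for every
// local element (assumed usage, see the lead-in).
template <typename OffsetIt>
std::int64_t partition_arg_min(const std::vector<int>& partition, OffsetIt offset_it)
{
  assert(!partition.empty());
  std::size_t best = 0;
  for (std::size_t i = 1; i < partition.size(); ++i)
  {
    if (partition[i] < partition[best])
    {
      best = i;
    }
  }
  // Local index plus partition offset gives the global index.
  const std::int64_t offset = offset_it[static_cast<std::ptrdiff_t>(best)];
  return offset + static_cast<std::int64_t>(best);
}
```

With a partition `{4, 2, 9}` that starts at global offset 1000, calling `partition_arg_min(partition, constant_iterator_sketch<std::int64_t>(1000))` would locate the minimum at local index 1 and add the constant offset to it. Because both iterator types in the diff are expected to behave this way, the reviewer's question is whether the generated code is also unchanged, not whether the result is.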

@bernhardmgruber force-pushed the depr_cub_iterators branch 5 times, most recently from d66b741 to 286521b on January 13, 2025 11:21

copy-pr-bot bot commented Jan 13, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@bernhardmgruber
Contributor Author

/ok to test

Contributor

🟨 CI finished in 1h 57m: Pass: 92%/78 | Total: 2d 04h | Avg: 40m 14s | Max: 1h 11m | Hits: 180%/12368
  • 🟨 cub: Pass: 89%/38 | Total: 1d 07h | Avg: 50m 19s | Max: 1h 11m | Hits: 79%/3108

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  88%/36  | Total:  1d 05h | Avg: 49m 50s | Max:  1h 11m | Hits:  79%/3108  
      🟩 arm64              Pass: 100%/2   | Total:  1h 57m | Avg: 58m 58s | Max:  1h 00m
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 05m
      🔍 nvcc               Pass:  88%/36  | Total:  1d 05h | Avg: 49m 33s | Max:  1h 11m | Hits:  79%/3108  
    🔍 gpu: v100 🔍
      🟩 h100               Pass: 100%/2   | Total: 40m 59s | Avg: 20m 29s | Max: 24m 56s
      🔍 v100               Pass:  88%/36  | Total:  1d 07h | Avg: 51m 58s | Max:  1h 11m | Hits:  79%/3108  
    🟨 ctk
      🟩 12.0               Pass: 100%/5   | Total:  4h 55m | Avg: 59m 09s | Max:  1h 05m | Hits:  81%/777   
      🟥 12.5               Pass:   0%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 10m
      🟨 12.6               Pass:  93%/31  | Total:  1d 00h | Avg: 47m 48s | Max:  1h 11m | Hits:  79%/2331  
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 05m
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 55m | Avg: 59m 09s | Max:  1h 05m | Hits:  81%/777   
      🟥 nvcc12.5           Pass:   0%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 10m
      🟨 nvcc12.6           Pass:  93%/29  | Total: 22h 34m | Avg: 46m 41s | Max:  1h 11m | Hits:  79%/2331  
    🟨 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 50m | Avg: 57m 36s | Max:  1h 00m
      🟩 Clang15            Pass: 100%/1   | Total: 54m 23s | Avg: 54m 23s | Max: 54m 23s
      🟩 Clang16            Pass: 100%/1   | Total: 53m 32s | Avg: 53m 32s | Max: 53m 32s
      🟩 Clang17            Pass: 100%/1   | Total: 57m 52s | Avg: 57m 52s | Max: 57m 52s
      🟨 Clang18            Pass:  85%/7   | Total:  5h 25m | Avg: 46m 27s | Max:  1h 05m
      🟩 GCC7               Pass: 100%/2   | Total:  1h 53m | Avg: 56m 36s | Max: 58m 57s
      🟩 GCC8               Pass: 100%/1   | Total: 52m 30s | Avg: 52m 30s | Max: 52m 30s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 52m | Avg: 56m 05s | Max: 56m 40s
      🟩 GCC10              Pass: 100%/1   | Total: 59m 01s | Avg: 59m 01s | Max: 59m 01s
      🟩 GCC11              Pass: 100%/1   | Total: 54m 32s | Avg: 54m 32s | Max: 54m 32s
      🟩 GCC12              Pass: 100%/3   | Total:  1h 35m | Avg: 31m 50s | Max: 54m 31s
      🟨 GCC13              Pass:  87%/8   | Total:  4h 52m | Avg: 36m 32s | Max:  1h 00m
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 10m | Hits:  81%/1554  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 21m | Avg:  1h 10m | Max:  1h 11m | Hits:  77%/1554  
      🟥 NVHPC24.7          Pass:   0%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 10m
    🟨 cxx_family
      🟨 Clang              Pass:  92%/14  | Total: 12h 01m | Avg: 51m 31s | Max:  1h 05m
      🟨 GCC                Pass:  94%/18  | Total: 12h 59m | Avg: 43m 17s | Max:  1h 00m
      🟩 MSVC               Pass: 100%/4   | Total:  4h 37m | Avg:  1h 09m | Max:  1h 11m | Hits:  79%/3108  
      🟥 NVHPC              Pass:   0%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 10m
    🟨 jobs
      🟨 Build              Pass:  93%/31  | Total:  1d 05h | Avg: 57m 27s | Max:  1h 11m | Hits:  79%/3108  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 21m 28s | Avg: 21m 28s | Max: 21m 28s
      🟩 GraphCapture       Pass: 100%/1   | Total: 17m 27s | Avg: 17m 27s | Max: 17m 27s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 02m | Avg: 20m 55s | Max: 26m 37s
      🟥 TestGPU            Pass:   0%/2   | Total: 29m 22s | Avg: 14m 41s | Max: 23m 48s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 40m 59s | Avg: 20m 29s | Max: 24m 56s
      🟩 90a                Pass: 100%/1   | Total: 23m 45s | Avg: 23m 45s | Max: 23m 45s
    🟨 std
      🟨 17                 Pass:  92%/14  | Total: 14h 07m | Avg:  1h 00m | Max:  1h 11m | Hits:  81%/2331  
      🟨 20                 Pass:  87%/24  | Total: 17h 44m | Avg: 44m 20s | Max:  1h 09m | Hits:  74%/777   
    
  • 🟨 cccl_c_parallel: Pass: 50%/2 | Total: 7m 08s | Avg: 3m 34s | Max: 4m 50s

    🚨 jobs: Test 🚨
      🟩 Build              Pass: 100%/1   | Total:  2m 18s | Avg:  2m 18s | Max:  2m 18s
      🔥 Test               Pass:   0%/1   | Total:  4m 50s | Avg:  4m 50s | Max:  4m 50s
    🟨 cpu
      🟨 amd64              Pass:  50%/2   | Total:  7m 08s | Avg:  3m 34s | Max:  4m 50s
    🟨 ctk
      🟨 12.6               Pass:  50%/2   | Total:  7m 08s | Avg:  3m 34s | Max:  4m 50s
    🟨 cudacxx
      🟨 nvcc12.6           Pass:  50%/2   | Total:  7m 08s | Avg:  3m 34s | Max:  4m 50s
    🟨 cudacxx_family
      🟨 nvcc               Pass:  50%/2   | Total:  7m 08s | Avg:  3m 34s | Max:  4m 50s
    🟨 cxx
      🟨 GCC13              Pass:  50%/2   | Total:  7m 08s | Avg:  3m 34s | Max:  4m 50s
    🟨 cxx_family
      🟨 GCC                Pass:  50%/2   | Total:  7m 08s | Avg:  3m 34s | Max:  4m 50s
    🟨 gpu
      🟨 v100               Pass:  50%/2   | Total:  7m 08s | Avg:  3m 34s | Max:  4m 50s
    
  • 🟥 python: Pass: 0%/1 | Total: 25m 02s | Avg: 25m 02s | Max: 25m 02s

    🟥 cpu
      🟥 amd64              Pass:   0%/1   | Total: 25m 02s | Avg: 25m 02s | Max: 25m 02s
    🟥 ctk
      🟥 12.6               Pass:   0%/1   | Total: 25m 02s | Avg: 25m 02s | Max: 25m 02s
    🟥 cudacxx
      🟥 nvcc12.6           Pass:   0%/1   | Total: 25m 02s | Avg: 25m 02s | Max: 25m 02s
    🟥 cudacxx_family
      🟥 nvcc               Pass:   0%/1   | Total: 25m 02s | Avg: 25m 02s | Max: 25m 02s
    🟥 cxx
      🟥 GCC13              Pass:   0%/1   | Total: 25m 02s | Avg: 25m 02s | Max: 25m 02s
    🟥 cxx_family
      🟥 GCC                Pass:   0%/1   | Total: 25m 02s | Avg: 25m 02s | Max: 25m 02s
    🟥 gpu
      🟥 v100               Pass:   0%/1   | Total: 25m 02s | Avg: 25m 02s | Max: 25m 02s
    🟥 jobs
      🟥 Test               Pass:   0%/1   | Total: 25m 02s | Avg: 25m 02s | Max: 25m 02s
    
  • 🟩 thrust: Pass: 100%/37 | Total: 19h 54m | Avg: 32m 16s | Max: 1h 02m | Hits: 213%/9260

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 43m 50s | Avg: 21m 55s | Max: 25m 30s
    🟩 cpu
      🟩 amd64              Pass: 100%/35  | Total: 18h 57m | Avg: 32m 29s | Max:  1h 02m | Hits: 213%/9260  
      🟩 arm64              Pass: 100%/2   | Total: 57m 19s | Avg: 28m 39s | Max: 30m 49s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  2h 57m | Avg: 35m 26s | Max: 53m 34s | Hits: 175%/1852  
      🟩 12.5               Pass: 100%/2   | Total:  1h 50m | Avg: 55m 03s | Max: 58m 31s
      🟩 12.6               Pass: 100%/30  | Total: 15h 07m | Avg: 30m 14s | Max:  1h 02m | Hits: 223%/7408  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 52m 52s | Avg: 26m 26s | Max: 28m 31s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  2h 57m | Avg: 35m 26s | Max: 53m 34s | Hits: 175%/1852  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 50m | Avg: 55m 03s | Max: 58m 31s
      🟩 nvcc12.6           Pass: 100%/28  | Total: 14h 14m | Avg: 30m 30s | Max:  1h 02m | Hits: 223%/7408  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 52m 52s | Avg: 26m 26s | Max: 28m 31s
      🟩 nvcc               Pass: 100%/35  | Total: 19h 01m | Avg: 32m 36s | Max:  1h 02m | Hits: 213%/9260  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  2h 05m | Avg: 31m 16s | Max: 32m 27s
      🟩 Clang15            Pass: 100%/1   | Total: 30m 29s | Avg: 30m 29s | Max: 30m 29s
      🟩 Clang16            Pass: 100%/1   | Total: 31m 36s | Avg: 31m 36s | Max: 31m 36s
      🟩 Clang17            Pass: 100%/1   | Total: 30m 49s | Avg: 30m 49s | Max: 30m 49s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 43m | Avg: 23m 20s | Max: 28m 58s
      🟩 GCC7               Pass: 100%/2   | Total: 58m 44s | Avg: 29m 22s | Max: 30m 11s
      🟩 GCC8               Pass: 100%/1   | Total: 29m 37s | Avg: 29m 37s | Max: 29m 37s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 06m | Avg: 33m 07s | Max: 33m 29s
      🟩 GCC10              Pass: 100%/1   | Total: 30m 47s | Avg: 30m 47s | Max: 30m 47s
      🟩 GCC11              Pass: 100%/1   | Total: 30m 09s | Avg: 30m 09s | Max: 30m 09s
      🟩 GCC12              Pass: 100%/1   | Total: 35m 31s | Avg: 35m 31s | Max: 35m 31s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 00m | Avg: 22m 32s | Max: 34m 02s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 51m | Avg: 55m 54s | Max: 58m 14s | Hits: 175%/3704  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 39m | Avg: 53m 12s | Max:  1h 02m | Hits: 239%/5556  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 50m | Avg: 55m 03s | Max: 58m 31s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/14  | Total:  6h 21m | Avg: 27m 14s | Max: 32m 27s
      🟩 GCC                Pass: 100%/16  | Total:  7h 11m | Avg: 26m 57s | Max: 35m 31s
      🟩 MSVC               Pass: 100%/5   | Total:  4h 31m | Avg: 54m 16s | Max:  1h 02m | Hits: 213%/9260  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 50m | Avg: 55m 03s | Max: 58m 31s
    🟩 gpu
      🟩 v100               Pass: 100%/37  | Total: 19h 54m | Avg: 32m 16s | Max:  1h 02m | Hits: 213%/9260  
    🟩 jobs
      🟩 Build              Pass: 100%/31  | Total: 18h 13m | Avg: 35m 16s | Max:  1h 02m | Hits: 175%/7408  
      🟩 TestCPU            Pass: 100%/3   | Total: 51m 28s | Avg: 17m 09s | Max: 36m 41s | Hits: 365%/1852  
      🟩 TestGPU            Pass: 100%/3   | Total: 49m 26s | Avg: 16m 28s | Max: 19m 42s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 19m 48s | Avg: 19m 48s | Max: 19m 48s
    🟩 std
      🟩 17                 Pass: 100%/14  | Total:  8h 55m | Avg: 38m 14s | Max:  1h 00m | Hits: 175%/5556  
      🟩 20                 Pass: 100%/21  | Total: 10h 15m | Avg: 29m 17s | Max:  1h 02m | Hits: 270%/3704  
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 78)

# Runner
53 linux-amd64-cpu16
11 linux-amd64-gpu-v100-latest-1
9 windows-amd64-cpu16
4 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

Labels: backport branch/2.8.x, cub (For all items related to CUB)
Projects
Status: In Review
Development

Successfully merging this pull request may close these issues.

Deprecate CUB iterators which exist in Thrust or libcu++
2 participants