Use cuda::std::min/max in Thrust #3364

Merged
bernhardmgruber merged 4 commits into NVIDIA:main from thrust_minmax
Jan 13, 2025

Conversation

bernhardmgruber
Contributor

No description provided.

@bernhardmgruber bernhardmgruber requested review from a team as code owners January 13, 2025 00:55
@bernhardmgruber bernhardmgruber added the thrust For all items related to Thrust. label Jan 13, 2025
@bernhardmgruber bernhardmgruber requested a review from a team as a code owner January 13, 2025 09:59

copy-pr-bot bot commented Jan 13, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@bernhardmgruber bernhardmgruber changed the title Use cuda::stdmin/max in Thrust Use cuda::std::min/max in Thrust Jan 13, 2025
@bernhardmgruber
Contributor Author

/ok to test

Contributor

🟩 CI finished in 1h 51m: Pass: 100%/78 | Total: 2d 03h | Avg: 39m 25s | Max: 1h 09m | Hits: 295%/12340
  • 🟩 cub: Pass: 100%/38 | Total: 1d 06h | Avg: 48m 52s | Max: 1h 09m | Hits: 414%/3120

    🟩 cpu
      🟩 amd64              Pass: 100%/36  | Total:  1d 05h | Avg: 48m 32s | Max:  1h 09m | Hits: 414%/3120  
      🟩 arm64              Pass: 100%/2   | Total:  1h 49m | Avg: 54m 52s | Max: 55m 44s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  4h 54m | Avg: 58m 57s | Max:  1h 06m | Hits: 415%/780   
      🟩 12.5               Pass: 100%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 07m
      🟩 12.6               Pass: 100%/31  | Total: 23h 48m | Avg: 46m 04s | Max:  1h 09m | Hits: 414%/2340  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 58m | Avg: 59m 12s | Max:  1h 02m
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 54m | Avg: 58m 57s | Max:  1h 06m | Hits: 415%/780   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 07m
      🟩 nvcc12.6           Pass: 100%/29  | Total: 21h 49m | Avg: 45m 09s | Max:  1h 09m | Hits: 414%/2340  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 58m | Avg: 59m 12s | Max:  1h 02m
      🟩 nvcc               Pass: 100%/36  | Total:  1d 04h | Avg: 48m 17s | Max:  1h 09m | Hits: 414%/3120  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 49m | Avg: 57m 19s | Max:  1h 01m
      🟩 Clang15            Pass: 100%/1   | Total: 51m 31s | Avg: 51m 31s | Max: 51m 31s
      🟩 Clang16            Pass: 100%/1   | Total: 51m 55s | Avg: 51m 55s | Max: 51m 55s
      🟩 Clang17            Pass: 100%/1   | Total: 52m 40s | Avg: 52m 40s | Max: 52m 40s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 25m | Avg: 46m 34s | Max:  1h 02m
      🟩 GCC7               Pass: 100%/2   | Total:  1h 51m | Avg: 55m 55s | Max: 56m 00s
      🟩 GCC8               Pass: 100%/1   | Total: 50m 37s | Avg: 50m 37s | Max: 50m 37s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 44m | Avg: 52m 27s | Max: 52m 28s
      🟩 GCC10              Pass: 100%/1   | Total: 54m 09s | Avg: 54m 09s | Max: 54m 09s
      🟩 GCC11              Pass: 100%/1   | Total: 56m 21s | Avg: 56m 21s | Max: 56m 21s
      🟩 GCC12              Pass: 100%/3   | Total:  1h 38m | Avg: 32m 46s | Max: 56m 29s
      🟩 GCC13              Pass: 100%/8   | Total:  4h 24m | Avg: 33m 01s | Max: 55m 44s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 16m | Avg:  1h 08m | Max:  1h 09m | Hits: 415%/1560  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 08m | Hits: 414%/1560  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 07m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/14  | Total: 11h 51m | Avg: 50m 48s | Max:  1h 02m
      🟩 GCC                Pass: 100%/18  | Total: 12h 20m | Avg: 41m 07s | Max: 56m 29s
      🟩 MSVC               Pass: 100%/4   | Total:  4h 31m | Avg:  1h 07m | Max:  1h 09m | Hits: 414%/3120  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 07m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 41m 49s | Avg: 20m 54s | Max: 25m 43s
      🟩 v100               Pass: 100%/36  | Total:  1d 06h | Avg: 50m 25s | Max:  1h 09m | Hits: 414%/3120  
    🟩 jobs
      🟩 Build              Pass: 100%/31  | Total:  1d 04h | Avg: 55m 24s | Max:  1h 09m | Hits: 414%/3120  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 21m 44s | Avg: 21m 44s | Max: 21m 44s
      🟩 GraphCapture       Pass: 100%/1   | Total: 16m 55s | Avg: 16m 55s | Max: 16m 55s
      🟩 HostLaunch         Pass: 100%/3   | Total: 54m 47s | Avg: 18m 15s | Max: 20m 29s
      🟩 TestGPU            Pass: 100%/2   | Total: 46m 10s | Avg: 23m 05s | Max: 25m 08s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 41m 49s | Avg: 20m 54s | Max: 25m 43s
      🟩 90a                Pass: 100%/1   | Total: 22m 19s | Avg: 22m 19s | Max: 22m 19s
    🟩 std
      🟩 17                 Pass: 100%/14  | Total: 13h 32m | Avg: 58m 03s | Max:  1h 09m | Hits: 415%/2340  
      🟩 20                 Pass: 100%/24  | Total: 17h 24m | Avg: 43m 30s | Max:  1h 06m | Hits: 413%/780   
    
  • 🟩 thrust: Pass: 100%/37 | Total: 19h 36m | Avg: 31m 47s | Max: 59m 38s | Hits: 254%/9220

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 41m 55s | Avg: 20m 57s | Max: 25m 51s
    🟩 cpu
      🟩 amd64              Pass: 100%/35  | Total: 18h 39m | Avg: 31m 59s | Max: 59m 38s | Hits: 254%/9220  
      🟩 arm64              Pass: 100%/2   | Total: 56m 27s | Avg: 28m 13s | Max: 29m 48s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  3h 03m | Avg: 36m 37s | Max: 59m 38s | Hits: 227%/1844  
      🟩 12.5               Pass: 100%/2   | Total:  1h 44m | Avg: 52m 13s | Max: 55m 21s
      🟩 12.6               Pass: 100%/30  | Total: 14h 48m | Avg: 29m 37s | Max: 58m 23s | Hits: 261%/7376  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 53m 50s | Avg: 26m 55s | Max: 27m 17s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 03m | Avg: 36m 37s | Max: 59m 38s | Hits: 227%/1844  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 44m | Avg: 52m 13s | Max: 55m 21s
      🟩 nvcc12.6           Pass: 100%/28  | Total: 13h 54m | Avg: 29m 48s | Max: 58m 23s | Hits: 261%/7376  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 53m 50s | Avg: 26m 55s | Max: 27m 17s
      🟩 nvcc               Pass: 100%/35  | Total: 18h 42m | Avg: 32m 03s | Max: 59m 38s | Hits: 254%/9220  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  1h 56m | Avg: 29m 14s | Max: 30m 55s
      🟩 Clang15            Pass: 100%/1   | Total: 32m 09s | Avg: 32m 09s | Max: 32m 09s
      🟩 Clang16            Pass: 100%/1   | Total: 29m 09s | Avg: 29m 09s | Max: 29m 09s
      🟩 Clang17            Pass: 100%/1   | Total: 32m 42s | Avg: 32m 42s | Max: 32m 42s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 45m | Avg: 23m 34s | Max: 29m 42s
      🟩 GCC7               Pass: 100%/2   | Total: 58m 28s | Avg: 29m 14s | Max: 29m 43s
      🟩 GCC8               Pass: 100%/1   | Total: 32m 29s | Avg: 32m 29s | Max: 32m 29s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 04m | Avg: 32m 12s | Max: 35m 05s
      🟩 GCC10              Pass: 100%/1   | Total: 34m 07s | Avg: 34m 07s | Max: 34m 07s
      🟩 GCC11              Pass: 100%/1   | Total: 34m 16s | Avg: 34m 16s | Max: 34m 16s
      🟩 GCC12              Pass: 100%/1   | Total: 31m 17s | Avg: 31m 17s | Max: 31m 17s
      🟩 GCC13              Pass: 100%/8   | Total:  2h 58m | Avg: 22m 20s | Max: 35m 54s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 52m | Avg: 56m 16s | Max: 59m 38s | Hits: 227%/3688  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 29m | Avg: 49m 46s | Max: 58m 23s | Hits: 273%/5532  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 44m | Avg: 52m 13s | Max: 55m 21s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/14  | Total:  6h 16m | Avg: 26m 51s | Max: 32m 42s
      🟩 GCC                Pass: 100%/16  | Total:  7h 13m | Avg: 27m 06s | Max: 35m 54s
      🟩 MSVC               Pass: 100%/5   | Total:  4h 21m | Avg: 52m 22s | Max: 59m 38s | Hits: 254%/9220  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 44m | Avg: 52m 13s | Max: 55m 21s
    🟩 gpu
      🟩 v100               Pass: 100%/37  | Total: 19h 36m | Avg: 31m 47s | Max: 59m 38s | Hits: 254%/9220  
    🟩 jobs
      🟩 Build              Pass: 100%/31  | Total: 17h 56m | Avg: 34m 43s | Max: 59m 38s | Hits: 227%/7376  
      🟩 TestCPU            Pass: 100%/3   | Total: 52m 43s | Avg: 17m 34s | Max: 37m 06s | Hits: 365%/1844  
      🟩 TestGPU            Pass: 100%/3   | Total: 47m 03s | Avg: 15m 41s | Max: 18m 42s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 17m 04s | Avg: 17m 04s | Max: 17m 04s
    🟩 std
      🟩 17                 Pass: 100%/14  | Total:  8h 38m | Avg: 37m 02s | Max: 59m 38s | Hits: 227%/5532  
      🟩 20                 Pass: 100%/21  | Total: 10h 15m | Avg: 29m 19s | Max: 58m 23s | Hits: 296%/3688  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 10m 36s | Avg: 5m 18s | Max: 8m 33s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 10m 36s | Avg:  5m 18s | Max:  8m 33s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total: 10m 36s | Avg:  5m 18s | Max:  8m 33s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total: 10m 36s | Avg:  5m 18s | Max:  8m 33s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 10m 36s | Avg:  5m 18s | Max:  8m 33s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 10m 36s | Avg:  5m 18s | Max:  8m 33s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 10m 36s | Avg:  5m 18s | Max:  8m 33s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total: 10m 36s | Avg:  5m 18s | Max:  8m 33s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 03s | Avg:  2m 03s | Max:  2m 03s
      🟩 Test               Pass: 100%/1   | Total:  8m 33s | Avg:  8m 33s | Max:  8m 33s
    
  • 🟩 python: Pass: 100%/1 | Total: 30m 54s | Avg: 30m 54s | Max: 30m 54s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 30m 54s | Avg: 30m 54s | Max: 30m 54s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 30m 54s | Avg: 30m 54s | Max: 30m 54s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 30m 54s | Avg: 30m 54s | Max: 30m 54s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 30m 54s | Avg: 30m 54s | Max: 30m 54s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 30m 54s | Avg: 30m 54s | Max: 30m 54s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 30m 54s | Avg: 30m 54s | Max: 30m 54s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 30m 54s | Avg: 30m 54s | Max: 30m 54s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 30m 54s | Avg: 30m 54s | Max: 30m 54s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 78)

# Runner
53 linux-amd64-cpu16
11 linux-amd64-gpu-v100-latest-1
9 windows-amd64-cpu16
4 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

@bernhardmgruber bernhardmgruber merged commit c339a52 into NVIDIA:main Jan 13, 2025
92 checks passed
@bernhardmgruber bernhardmgruber deleted the thrust_minmax branch January 13, 2025 16:47
Labels
thrust For all items related to Thrust.
Projects
Status: Done
2 participants