PTX shfl_sync #3241

Open
wants to merge 10 commits into
base: main

fbusato
Contributor

@fbusato fbusato commented Jan 4, 2025

Related to #2976

Description

Provide a C++ implementation of PTX shfl_sync.

Compared to the CUDA intrinsics, the function provides the following additional features:

  • Returns the "lane predicate", which can be used for subsequent operations instead of checking lane validity manually.
  • Performs basic input checks.
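
To illustrate why a returned lane predicate is convenient, here is a minimal host-side sketch that emulates the documented semantics of PTX shfl.sync.down for a single 32-lane warp. It is not the API added by this PR: the names ShuffleResult, emulated_shfl_down, and emulated_warp_sum are hypothetical, the segment mask is assumed to be 0, and the code runs on the CPU only to make the behavior easy to check.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kWarpSize = 32;

// Hypothetical result type: the shuffled value plus the lane predicate.
struct ShuffleResult {
  std::uint32_t value;
  bool pred;  // true if the source lane was in range
};

// Emulates what one lane observes for shfl.sync.down with segmask = 0.
// With segmask = 0, the highest readable lane is simply `clamp`.
ShuffleResult emulated_shfl_down(const std::array<std::uint32_t, kWarpSize>& warp,
                                 std::uint32_t lane,
                                 std::uint32_t delta,
                                 std::uint32_t clamp = kWarpSize - 1) {
  // Basic input checks, in the spirit of the PR description.
  assert(lane < kWarpSize);
  assert(clamp < kWarpSize);
  const std::uint32_t src = lane + (delta & 31u);  // only the low 5 bits of delta are used
  const bool pred = src <= clamp;
  // An out-of-range lane reads its own value and reports pred == false.
  return {warp[pred ? src : lane], pred};
}

// Tree reduction that relies on the predicate instead of a manual
// `lane + delta < 32` check. Lane 0 ends up holding the full sum.
std::uint32_t emulated_warp_sum(std::array<std::uint32_t, kWarpSize> warp) {
  for (std::uint32_t delta = kWarpSize / 2; delta > 0; delta /= 2) {
    std::array<std::uint32_t, kWarpSize> next = warp;
    for (std::uint32_t lane = 0; lane < kWarpSize; ++lane) {
      const ShuffleResult r = emulated_shfl_down(warp, lane, delta);
      if (r.pred) {
        next[lane] += r.value;
      }
    }
    warp = next;
  }
  return warp[0];
}
```

In this sketch, a lane whose source would fall past the clamp gets its own value back together with pred == false, so the caller can skip the accumulation without recomputing the bounds.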

@fbusato fbusato requested review from a team as code owners January 4, 2025 00:46
@fbusato fbusato requested review from miscco and wmaxey January 4, 2025 00:46
Contributor

github-actions bot commented Jan 4, 2025

🟩 CI finished in 1h 49m: Pass: 100%/170 | Total: 3d 02h | Avg: 26m 12s | Max: 1h 08m | Hits: 76%/22526
  • 🟩 libcudacxx: Pass: 100%/48 | Total: 7h 21m | Avg: 9m 11s | Max: 32m 13s | Hits: 84%/9822

    🟩 cpu
      🟩 amd64              Pass: 100%/46  | Total:  7h 13m | Avg:  9m 26s | Max: 32m 13s | Hits:  84%/9822  
      🟩 arm64              Pass: 100%/2   | Total:  7m 36s | Avg:  3m 48s | Max:  3m 51s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total: 38m 54s | Avg:  5m 33s | Max: 20m 16s | Hits:  96%/2241  
      🟩 12.5               Pass: 100%/2   | Total: 17m 28s | Avg:  8m 44s | Max:  9m 05s
      🟩 12.6               Pass: 100%/39  | Total:  6h 25m | Avg:  9m 52s | Max: 32m 13s | Hits:  80%/7581  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/4   | Total:  1h 06m | Avg: 16m 43s | Max: 21m 42s
      🟩 nvcc11.1           Pass: 100%/7   | Total: 38m 54s | Avg:  5m 33s | Max: 20m 16s | Hits:  96%/2241  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 17m 28s | Avg:  8m 44s | Max:  9m 05s
      🟩 nvcc12.6           Pass: 100%/35  | Total:  5h 18m | Avg:  9m 05s | Max: 32m 13s | Hits:  80%/7581  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/4   | Total:  1h 06m | Avg: 16m 43s | Max: 21m 42s
      🟩 nvcc               Pass: 100%/44  | Total:  6h 14m | Avg:  8m 30s | Max: 32m 13s | Hits:  84%/9822  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total: 15m 54s | Avg:  3m 58s | Max:  4m 44s
      🟩 Clang10            Pass: 100%/1   | Total:  5m 13s | Avg:  5m 13s | Max:  5m 13s
      🟩 Clang11            Pass: 100%/1   | Total:  4m 46s | Avg:  4m 46s | Max:  4m 46s
      🟩 Clang12            Pass: 100%/1   | Total:  4m 30s | Avg:  4m 30s | Max:  4m 30s
      🟩 Clang13            Pass: 100%/1   | Total:  4m 15s | Avg:  4m 15s | Max:  4m 15s
      🟩 Clang14            Pass: 100%/1   | Total:  4m 12s | Avg:  4m 12s | Max:  4m 12s
      🟩 Clang15            Pass: 100%/1   | Total:  4m 17s | Avg:  4m 17s | Max:  4m 17s
      🟩 Clang16            Pass: 100%/1   | Total:  4m 34s | Avg:  4m 34s | Max:  4m 34s
      🟩 Clang17            Pass: 100%/1   | Total:  4m 45s | Avg:  4m 45s | Max:  4m 45s
      🟩 Clang18            Pass: 100%/8   | Total:  1h 37m | Avg: 12m 10s | Max: 21m 42s
      🟩 GCC6               Pass: 100%/2   | Total:  5m 54s | Avg:  2m 57s | Max:  3m 04s
      🟩 GCC7               Pass: 100%/2   | Total:  6m 57s | Avg:  3m 28s | Max:  3m 29s
      🟩 GCC8               Pass: 100%/1   | Total:  3m 38s | Avg:  3m 38s | Max:  3m 38s
      🟩 GCC9               Pass: 100%/3   | Total:  9m 36s | Avg:  3m 12s | Max:  3m 46s
      🟩 GCC10              Pass: 100%/1   | Total:  3m 55s | Avg:  3m 55s | Max:  3m 55s
      🟩 GCC11              Pass: 100%/1   | Total:  3m 55s | Avg:  3m 55s | Max:  3m 55s
      🟩 GCC12              Pass: 100%/1   | Total:  4m 09s | Avg:  4m 09s | Max:  4m 09s
      🟩 GCC13              Pass: 100%/10  | Total:  2h 37m | Avg: 15m 45s | Max: 32m 13s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total:  5m 46s | Avg:  5m 46s | Max:  5m 46s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 20m 16s | Avg: 20m 16s | Max: 20m 16s | Hits:  96%/2241  
      🟩 MSVC14.29          Pass: 100%/1   | Total: 25m 52s | Avg: 25m 52s | Max: 25m 52s | Hits:  47%/2478  
      🟩 MSVC14.39          Pass: 100%/2   | Total: 26m 42s | Avg: 13m 21s | Max: 13m 52s | Hits:  96%/5103  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 17m 28s | Avg:  8m 44s | Max:  9m 05s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/20  | Total:  2h 29m | Avg:  7m 29s | Max: 21m 42s
      🟩 GCC                Pass: 100%/21  | Total:  3h 15m | Avg:  9m 19s | Max: 32m 13s
      🟩 Intel              Pass: 100%/1   | Total:  5m 46s | Avg:  5m 46s | Max:  5m 46s
      🟩 MSVC               Pass: 100%/4   | Total:  1h 12m | Avg: 18m 12s | Max: 25m 52s | Hits:  84%/9822  
      🟩 NVHPC              Pass: 100%/2   | Total: 17m 28s | Avg:  8m 44s | Max:  9m 05s
    🟩 gpu
      🟩 v100               Pass: 100%/48  | Total:  7h 21m | Avg:  9m 11s | Max: 32m 13s | Hits:  84%/9822  
    🟩 jobs
      🟩 Build              Pass: 100%/41  | Total:  4h 46m | Avg:  6m 58s | Max: 25m 52s | Hits:  84%/9822  
      🟩 NVRTC              Pass: 100%/4   | Total:  1h 52m | Avg: 28m 12s | Max: 32m 13s
      🟩 Test               Pass: 100%/2   | Total: 40m 29s | Avg: 20m 14s | Max: 22m 58s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  1m 58s | Avg:  1m 58s | Max:  1m 58s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total: 12m 22s | Avg: 12m 22s | Max: 12m 22s
      🟩 90a                Pass: 100%/2   | Total: 20m 31s | Avg: 10m 15s | Max: 12m 46s
    🟩 std
      🟩 11                 Pass: 100%/6   | Total: 37m 42s | Avg:  6m 17s | Max: 21m 11s
      🟩 14                 Pass: 100%/5   | Total:  1h 03m | Avg: 12m 45s | Max: 32m 13s | Hits:  96%/2241  
      🟩 17                 Pass: 100%/13  | Total:  2h 11m | Avg: 10m 07s | Max: 29m 55s | Hits:  72%/4956  
      🟩 20                 Pass: 100%/23  | Total:  3h 26m | Avg:  8m 58s | Max: 29m 29s | Hits:  96%/2625  
    
  • 🟩 cub: Pass: 100%/47 | Total: 1d 14h | Avg: 48m 38s | Max: 1h 03m | Hits: 66%/3132

    🟩 cpu
      🟩 amd64              Pass: 100%/45  | Total:  1d 12h | Avg: 48m 19s | Max:  1h 03m | Hits:  66%/3132  
      🟩 arm64              Pass: 100%/2   | Total:  1h 51m | Avg: 55m 49s | Max: 56m 57s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  5h 37m | Avg: 48m 11s | Max: 53m 42s | Hits:  66%/783   
      🟩 12.5               Pass: 100%/2   | Total:  2h 04m | Avg:  1h 02m | Max:  1h 03m
      🟩 12.6               Pass: 100%/38  | Total:  1d 06h | Avg: 47m 59s | Max:  1h 01m | Hits:  66%/2349  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 01m
      🟩 nvcc11.1           Pass: 100%/7   | Total:  5h 37m | Avg: 48m 11s | Max: 53m 42s | Hits:  66%/783   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 04m | Avg:  1h 02m | Max:  1h 03m
      🟩 nvcc12.6           Pass: 100%/36  | Total:  1d 04h | Avg: 47m 18s | Max:  1h 01m | Hits:  66%/2349  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 01m
      🟩 nvcc               Pass: 100%/45  | Total:  1d 12h | Avg: 48m 07s | Max:  1h 03m | Hits:  66%/3132  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  3h 18m | Avg: 49m 36s | Max: 53m 49s
      🟩 Clang10            Pass: 100%/1   | Total: 54m 06s | Avg: 54m 06s | Max: 54m 06s
      🟩 Clang11            Pass: 100%/1   | Total: 56m 32s | Avg: 56m 32s | Max: 56m 32s
      🟩 Clang12            Pass: 100%/1   | Total: 53m 25s | Avg: 53m 25s | Max: 53m 25s
      🟩 Clang13            Pass: 100%/1   | Total: 53m 09s | Avg: 53m 09s | Max: 53m 09s
      🟩 Clang14            Pass: 100%/1   | Total: 54m 27s | Avg: 54m 27s | Max: 54m 27s
      🟩 Clang15            Pass: 100%/1   | Total: 50m 31s | Avg: 50m 31s | Max: 50m 31s
      🟩 Clang16            Pass: 100%/1   | Total: 51m 59s | Avg: 51m 59s | Max: 51m 59s
      🟩 Clang17            Pass: 100%/1   | Total: 54m 15s | Avg: 54m 15s | Max: 54m 15s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 30m | Avg: 47m 14s | Max:  1h 01m
      🟩 GCC6               Pass: 100%/2   | Total:  1h 37m | Avg: 48m 52s | Max: 49m 15s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 47m | Avg: 53m 48s | Max: 56m 16s
      🟩 GCC8               Pass: 100%/1   | Total: 52m 27s | Avg: 52m 27s | Max: 52m 27s
      🟩 GCC9               Pass: 100%/3   | Total:  2h 31m | Avg: 50m 33s | Max: 57m 25s
      🟩 GCC10              Pass: 100%/1   | Total: 57m 03s | Avg: 57m 03s | Max: 57m 03s
      🟩 GCC11              Pass: 100%/1   | Total: 55m 58s | Avg: 55m 58s | Max: 55m 58s
      🟩 GCC12              Pass: 100%/3   | Total:  1h 36m | Avg: 32m 02s | Max: 58m 12s
      🟩 GCC13              Pass: 100%/8   | Total:  4h 47m | Avg: 35m 59s | Max: 57m 27s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
      🟩 MSVC14.16          Pass: 100%/1   | Total: 53m 42s | Avg: 53m 42s | Max: 53m 42s | Hits:  66%/783   
      🟩 MSVC14.29          Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m | Hits:  66%/783   
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 00m | Hits:  66%/1566  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 04m | Avg:  1h 02m | Max:  1h 03m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total: 15h 57m | Avg: 50m 23s | Max:  1h 01m
      🟩 GCC                Pass: 100%/21  | Total: 15h 06m | Avg: 43m 10s | Max: 58m 12s
      🟩 Intel              Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
      🟩 MSVC               Pass: 100%/4   | Total:  3h 56m | Avg: 59m 01s | Max:  1h 01m | Hits:  66%/3132  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 04m | Avg:  1h 02m | Max:  1h 03m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 37m 56s | Avg: 18m 58s | Max: 22m 02s
      🟩 v100               Pass: 100%/45  | Total:  1d 13h | Avg: 49m 57s | Max:  1h 03m | Hits:  66%/3132  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total:  1d 11h | Avg: 53m 01s | Max:  1h 03m | Hits:  66%/3132  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 18m 34s | Avg: 18m 34s | Max: 18m 34s
      🟩 GraphCapture       Pass: 100%/1   | Total: 19m 49s | Avg: 19m 49s | Max: 19m 49s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 15m | Avg: 25m 11s | Max: 29m 59s
      🟩 TestGPU            Pass: 100%/2   | Total: 51m 21s | Avg: 25m 40s | Max: 28m 12s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 37m 56s | Avg: 18m 58s | Max: 22m 02s
      🟩 90a                Pass: 100%/1   | Total: 24m 57s | Avg: 24m 57s | Max: 24m 57s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total:  4h 04m | Avg: 48m 54s | Max: 52m 57s
      🟩 14                 Pass: 100%/4   | Total:  3h 33m | Avg: 53m 15s | Max: 56m 16s | Hits:  66%/783   
      🟩 17                 Pass: 100%/12  | Total: 11h 07m | Avg: 55m 39s | Max:  1h 01m | Hits:  66%/1566  
      🟩 20                 Pass: 100%/26  | Total: 19h 20m | Avg: 44m 38s | Max:  1h 03m | Hits:  66%/783   
    
  • 🟩 thrust: Pass: 100%/46 | Total: 1d 01h | Avg: 33m 32s | Max: 1h 08m | Hits: 70%/9260

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total:  1h 04m | Avg: 32m 29s | Max: 34m 41s
    🟩 cpu
      🟩 amd64              Pass: 100%/44  | Total:  1d 00h | Avg: 33m 27s | Max:  1h 08m | Hits:  70%/9260  
      🟩 arm64              Pass: 100%/2   | Total:  1h 10m | Avg: 35m 27s | Max: 38m 47s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  3h 36m | Avg: 30m 57s | Max: 53m 10s | Hits:  63%/1852  
      🟩 12.5               Pass: 100%/2   | Total:  1h 50m | Avg: 55m 04s | Max: 55m 24s
      🟩 12.6               Pass: 100%/37  | Total: 20h 16m | Avg: 32m 52s | Max:  1h 08m | Hits:  72%/7408  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 01m | Avg: 30m 42s | Max: 32m 31s
      🟩 nvcc11.1           Pass: 100%/7   | Total:  3h 36m | Avg: 30m 57s | Max: 53m 10s | Hits:  63%/1852  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 50m | Avg: 55m 04s | Max: 55m 24s
      🟩 nvcc12.6           Pass: 100%/35  | Total: 19h 14m | Avg: 32m 59s | Max:  1h 08m | Hits:  72%/7408  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 01m | Avg: 30m 42s | Max: 32m 31s
      🟩 nvcc               Pass: 100%/44  | Total:  1d 00h | Avg: 33m 40s | Max:  1h 08m | Hits:  70%/9260  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  1h 50m | Avg: 27m 39s | Max: 31m 11s
      🟩 Clang10            Pass: 100%/1   | Total: 37m 13s | Avg: 37m 13s | Max: 37m 13s
      🟩 Clang11            Pass: 100%/1   | Total: 34m 37s | Avg: 34m 37s | Max: 34m 37s
      🟩 Clang12            Pass: 100%/1   | Total: 30m 30s | Avg: 30m 30s | Max: 30m 30s
      🟩 Clang13            Pass: 100%/1   | Total: 35m 16s | Avg: 35m 16s | Max: 35m 16s
      🟩 Clang14            Pass: 100%/1   | Total: 31m 19s | Avg: 31m 19s | Max: 31m 19s
      🟩 Clang15            Pass: 100%/1   | Total: 34m 35s | Avg: 34m 35s | Max: 34m 35s
      🟩 Clang16            Pass: 100%/1   | Total: 31m 27s | Avg: 31m 27s | Max: 31m 27s
      🟩 Clang17            Pass: 100%/1   | Total: 33m 26s | Avg: 33m 26s | Max: 33m 26s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 58m | Avg: 25m 29s | Max: 35m 30s
      🟩 GCC6               Pass: 100%/2   | Total: 55m 00s | Avg: 27m 30s | Max: 29m 50s
      🟩 GCC7               Pass: 100%/2   | Total: 58m 35s | Avg: 29m 17s | Max: 31m 18s
      🟩 GCC8               Pass: 100%/1   | Total: 36m 02s | Avg: 36m 02s | Max: 36m 02s
      🟩 GCC9               Pass: 100%/3   | Total:  1h 30m | Avg: 30m 12s | Max: 34m 33s
      🟩 GCC10              Pass: 100%/1   | Total: 32m 31s | Avg: 32m 31s | Max: 32m 31s
      🟩 GCC11              Pass: 100%/1   | Total: 35m 26s | Avg: 35m 26s | Max: 35m 26s
      🟩 GCC12              Pass: 100%/1   | Total: 36m 35s | Avg: 36m 35s | Max: 36m 35s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 39m | Avg: 27m 23s | Max: 38m 47s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total: 43m 09s | Avg: 43m 09s | Max: 43m 09s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 53m 10s | Avg: 53m 10s | Max: 53m 10s | Hits:  63%/1852  
      🟩 MSVC14.29          Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m | Hits:  63%/1852  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 31m | Avg: 50m 39s | Max:  1h 08m | Hits:  75%/5556  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 50m | Avg: 55m 04s | Max: 55m 24s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total:  9h 17m | Avg: 29m 20s | Max: 37m 13s
      🟩 GCC                Pass: 100%/19  | Total:  9h 23m | Avg: 29m 40s | Max: 38m 47s
      🟩 Intel              Pass: 100%/1   | Total: 43m 09s | Avg: 43m 09s | Max: 43m 09s
      🟩 MSVC               Pass: 100%/5   | Total:  4h 28m | Avg: 53m 40s | Max:  1h 08m | Hits:  70%/9260  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 50m | Avg: 55m 04s | Max: 55m 24s
    🟩 gpu
      🟩 v100               Pass: 100%/46  | Total:  1d 01h | Avg: 33m 32s | Max:  1h 08m | Hits:  70%/9260  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total:  1d 00h | Avg: 36m 08s | Max:  1h 08m | Hits:  63%/7408  
      🟩 TestCPU            Pass: 100%/3   | Total: 38m 21s | Avg: 12m 47s | Max: 22m 40s | Hits:  99%/1852  
      🟩 TestGPU            Pass: 100%/3   | Total: 58m 44s | Avg: 19m 34s | Max: 34m 41s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 20m 37s | Avg: 20m 37s | Max: 20m 37s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total:  2h 07m | Avg: 25m 30s | Max: 29m 09s
      🟩 14                 Pass: 100%/4   | Total:  2h 25m | Avg: 36m 22s | Max: 53m 10s | Hits:  63%/1852  
      🟩 17                 Pass: 100%/12  | Total:  8h 06m | Avg: 40m 31s | Max:  1h 03m | Hits:  63%/3704  
      🟩 20                 Pass: 100%/23  | Total: 11h 58m | Avg: 31m 15s | Max:  1h 08m | Hits:  81%/3704  
    
  • 🟩 cudax: Pass: 100%/26 | Total: 2h 24m | Avg: 5m 34s | Max: 21m 56s | Hits: 90%/312

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  2h 10m | Avg:  5m 56s | Max: 21m 56s | Hits:  90%/312   
      🟩 arm64              Pass: 100%/4   | Total: 14m 02s | Avg:  3m 30s | Max:  3m 39s
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 18m 02s | Avg:  6m 00s | Max: 10m 59s | Hits:  90%/156   
      🟩 12.5               Pass: 100%/2   | Total: 11m 26s | Avg:  5m 43s | Max:  5m 50s
      🟩 12.6               Pass: 100%/21  | Total:  1h 55m | Avg:  5m 29s | Max: 21m 56s | Hits:  91%/156   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 18m 02s | Avg:  6m 00s | Max: 10m 59s | Hits:  90%/156   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 11m 26s | Avg:  5m 43s | Max:  5m 50s
      🟩 nvcc12.6           Pass: 100%/21  | Total:  1h 55m | Avg:  5m 29s | Max: 21m 56s | Hits:  91%/156   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/26  | Total:  2h 24m | Avg:  5m 34s | Max: 21m 56s | Hits:  90%/312   
    🟩 cxx
      🟩 Clang9             Pass: 100%/1   | Total:  3m 29s | Avg:  3m 29s | Max:  3m 29s
      🟩 Clang10            Pass: 100%/1   | Total:  4m 14s | Avg:  4m 14s | Max:  4m 14s
      🟩 Clang11            Pass: 100%/1   | Total:  3m 53s | Avg:  3m 53s | Max:  3m 53s
      🟩 Clang12            Pass: 100%/1   | Total:  3m 47s | Avg:  3m 47s | Max:  3m 47s
      🟩 Clang13            Pass: 100%/1   | Total:  4m 00s | Avg:  4m 00s | Max:  4m 00s
      🟩 Clang14            Pass: 100%/1   | Total:  3m 44s | Avg:  3m 44s | Max:  3m 44s
      🟩 Clang15            Pass: 100%/1   | Total:  3m 42s | Avg:  3m 42s | Max:  3m 42s
      🟩 Clang16            Pass: 100%/1   | Total:  3m 57s | Avg:  3m 57s | Max:  3m 57s
      🟩 Clang17            Pass: 100%/1   | Total:  4m 01s | Avg:  4m 01s | Max:  4m 01s
      🟩 Clang18            Pass: 100%/4   | Total: 32m 25s | Avg:  8m 06s | Max: 21m 56s
      🟩 GCC9               Pass: 100%/1   | Total:  3m 34s | Avg:  3m 34s | Max:  3m 34s
      🟩 GCC10              Pass: 100%/1   | Total:  3m 55s | Avg:  3m 55s | Max:  3m 55s
      🟩 GCC11              Pass: 100%/1   | Total:  3m 46s | Avg:  3m 46s | Max:  3m 46s
      🟩 GCC12              Pass: 100%/2   | Total: 20m 31s | Avg: 10m 15s | Max: 16m 40s
      🟩 GCC13              Pass: 100%/4   | Total: 13m 59s | Avg:  3m 29s | Max:  3m 39s
      🟩 MSVC14.36          Pass: 100%/1   | Total: 10m 59s | Avg: 10m 59s | Max: 10m 59s | Hits:  90%/156   
      🟩 MSVC14.39          Pass: 100%/1   | Total:  9m 29s | Avg:  9m 29s | Max:  9m 29s | Hits:  91%/156   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 11m 26s | Avg:  5m 43s | Max:  5m 50s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/13  | Total:  1h 07m | Avg:  5m 10s | Max: 21m 56s
      🟩 GCC                Pass: 100%/9   | Total: 45m 45s | Avg:  5m 05s | Max: 16m 40s
      🟩 MSVC               Pass: 100%/2   | Total: 20m 28s | Avg: 10m 14s | Max: 10m 59s | Hits:  90%/312   
      🟩 NVHPC              Pass: 100%/2   | Total: 11m 26s | Avg:  5m 43s | Max:  5m 50s
    🟩 gpu
      🟩 v100               Pass: 100%/26  | Total:  2h 24m | Avg:  5m 34s | Max: 21m 56s | Hits:  90%/312   
    🟩 jobs
      🟩 Build              Pass: 100%/24  | Total:  1h 46m | Avg:  4m 25s | Max: 10m 59s | Hits:  90%/312   
      🟩 Test               Pass: 100%/2   | Total: 38m 36s | Avg: 19m 18s | Max: 21m 56s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  3m 23s | Avg:  3m 23s | Max:  3m 23s
      🟩 90a                Pass: 100%/1   | Total:  3m 23s | Avg:  3m 23s | Max:  3m 23s
    🟩 std
      🟩 17                 Pass: 100%/6   | Total: 22m 57s | Avg:  3m 49s | Max:  5m 36s
      🟩 20                 Pass: 100%/20  | Total:  2h 01m | Avg:  6m 05s | Max: 21m 56s | Hits:  90%/312   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 10m 31s | Avg: 5m 15s | Max: 8m 23s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 10m 31s | Avg:  5m 15s | Max:  8m 23s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total: 10m 31s | Avg:  5m 15s | Max:  8m 23s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total: 10m 31s | Avg:  5m 15s | Max:  8m 23s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 10m 31s | Avg:  5m 15s | Max:  8m 23s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 10m 31s | Avg:  5m 15s | Max:  8m 23s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 10m 31s | Avg:  5m 15s | Max:  8m 23s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total: 10m 31s | Avg:  5m 15s | Max:  8m 23s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 08s | Avg:  2m 08s | Max:  2m 08s
      🟩 Test               Pass: 100%/1   | Total:  8m 23s | Avg:  8m 23s | Max:  8m 23s
    
  • 🟩 python: Pass: 100%/1 | Total: 28m 15s | Avg: 28m 15s | Max: 28m 15s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 28m 15s | Avg: 28m 15s | Max: 28m 15s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 28m 15s | Avg: 28m 15s | Max: 28m 15s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 28m 15s | Avg: 28m 15s | Max: 28m 15s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 28m 15s | Avg: 28m 15s | Max: 28m 15s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 28m 15s | Avg: 28m 15s | Max: 28m 15s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 28m 15s | Avg: 28m 15s | Max: 28m 15s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 28m 15s | Avg: 28m 15s | Max: 28m 15s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 28m 15s | Avg: 28m 15s | Max: 28m 15s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 170)

# Runner
125 linux-amd64-cpu16
19 linux-amd64-gpu-v100-latest-1
15 windows-amd64-cpu16
10 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

@gevtushenko
Collaborator

@fbusato please add a sub-issue to #101 on deprecating and later dropping shuffle facilities from CUB (util_ptx.cuh) in favor of libcu++ ones.

Contributor

@bernhardmgruber bernhardmgruber left a comment

I gave this a quick review. I would love to have @ahendriksen's opinion, since it touches his work on the PTX exposure. Also, he has a way better PTX understanding than me.

Contributor

@ahendriksen ahendriksen left a comment

I have sent some comments in private as well. The data parameter should be a template parameter to allow shuffling any 32-bit value.

docs/libcudacxx/ptx/instructions/manual/shfl_sync.rst

template <dot_shfl_mode _ShuffleMode>
_CCCL_DEVICE static inline _CUDA_VSTD::uint32_t __shfl_sync_dst_lane(
shfl_mode_t<_ShuffleMode> __shfl_mode,
Collaborator

As @ahendriksen said, this must be a template argument; otherwise it would not be usable in an if constexpr.

Contributor Author

It is a bit annoying that the compiler has to instantiate multiple functions, one for each type, that perform exactly the same functionality.
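
One common way to limit that duplication is to keep a single untyped 32-bit core and let the template be a thin bit-casting wrapper. The following is only a host-side sketch under that assumption, not the code from this PR: shuffle_bits and shuffle_any are hypothetical names, and shuffle_bits is an identity placeholder standing in for the one place where the actual shuffle would be issued.

```cpp
#include <cassert>  // used only by the usage checks
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical untyped core. In a real implementation this would be the
// single function that performs the shuffle; here it just returns its input.
inline std::uint32_t shuffle_bits(std::uint32_t bits) {
  return bits;
}

// Thin wrapper accepting any trivially copyable 32-bit type.
// Each instantiation only converts to and from raw bits.
template <typename T>
T shuffle_any(T data) {
  static_assert(sizeof(T) == sizeof(std::uint32_t), "T must be a 32-bit type");
  static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
  std::uint32_t bits;
  std::memcpy(&bits, &data, sizeof(bits));  // well-defined bit cast
  bits = shuffle_bits(bits);
  std::memcpy(&data, &bits, sizeof(data));
  return data;
}
```

With this layout, shuffle_any<float> and shuffle_any<std::int32_t> are still distinct instantiations, but each one is a pair of bit copies around the same core, which an optimizing compiler can typically inline away.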

@bernhardmgruber
Contributor

@fbusato please add a sub-issue to #101 on deprecating and later dropping shuffle facilities from CUB (util_ptx.cuh) in favor of libcu++ ones.

I could not find an item for this, so I added one now: "Review and deprecate features from CUB util_ptx.cuh"

Contributor

github-actions bot commented Jan 7, 2025

🟩 CI finished in 1h 37m: Pass: 100%/170 | Total: 2d 17h | Avg: 23m 17s | Max: 1h 05m | Hits: 82%/22529
  • 🟩 libcudacxx: Pass: 100%/48 | Total: 7h 27m | Avg: 9m 19s | Max: 36m 11s | Hits: 98%/9825

    🟩 cpu
      🟩 amd64              Pass: 100%/46  | Total:  7h 19m | Avg:  9m 33s | Max: 36m 11s | Hits:  98%/9825  
      🟩 arm64              Pass: 100%/2   | Total:  7m 24s | Avg:  3m 42s | Max:  3m 48s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total: 37m 19s | Avg:  5m 19s | Max: 19m 20s | Hits:  97%/2241  
      🟩 12.5               Pass: 100%/2   | Total: 17m 50s | Avg:  8m 55s | Max:  9m 13s
      🟩 12.6               Pass: 100%/39  | Total:  6h 32m | Avg: 10m 03s | Max: 36m 11s | Hits:  98%/7584  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/4   | Total:  1h 07m | Avg: 16m 57s | Max: 22m 06s
      🟩 nvcc11.1           Pass: 100%/7   | Total: 37m 19s | Avg:  5m 19s | Max: 19m 20s | Hits:  97%/2241  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 17m 50s | Avg:  8m 55s | Max:  9m 13s
      🟩 nvcc12.6           Pass: 100%/35  | Total:  5h 24m | Avg:  9m 16s | Max: 36m 11s | Hits:  98%/7584  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/4   | Total:  1h 07m | Avg: 16m 57s | Max: 22m 06s
      🟩 nvcc               Pass: 100%/44  | Total:  6h 19m | Avg:  8m 37s | Max: 36m 11s | Hits:  98%/9825  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total: 15m 57s | Avg:  3m 59s | Max:  4m 59s
      🟩 Clang10            Pass: 100%/1   | Total:  5m 21s | Avg:  5m 21s | Max:  5m 21s
      🟩 Clang11            Pass: 100%/1   | Total:  4m 34s | Avg:  4m 34s | Max:  4m 34s
      🟩 Clang12            Pass: 100%/1   | Total:  4m 25s | Avg:  4m 25s | Max:  4m 25s
      🟩 Clang13            Pass: 100%/1   | Total:  4m 24s | Avg:  4m 24s | Max:  4m 24s
      🟩 Clang14            Pass: 100%/1   | Total:  4m 13s | Avg:  4m 13s | Max:  4m 13s
      🟩 Clang15            Pass: 100%/1   | Total:  4m 37s | Avg:  4m 37s | Max:  4m 37s
      🟩 Clang16            Pass: 100%/1   | Total:  4m 21s | Avg:  4m 21s | Max:  4m 21s
      🟩 Clang17            Pass: 100%/1   | Total:  4m 32s | Avg:  4m 32s | Max:  4m 32s
      🟩 Clang18            Pass: 100%/8   | Total:  1h 56m | Avg: 14m 34s | Max: 36m 11s
      🟩 GCC6               Pass: 100%/2   | Total:  5m 13s | Avg:  2m 36s | Max:  2m 42s
      🟩 GCC7               Pass: 100%/2   | Total:  6m 37s | Avg:  3m 18s | Max:  3m 24s
      🟩 GCC8               Pass: 100%/1   | Total:  3m 49s | Avg:  3m 49s | Max:  3m 49s
      🟩 GCC9               Pass: 100%/3   | Total: 10m 25s | Avg:  3m 28s | Max:  4m 24s
      🟩 GCC10              Pass: 100%/1   | Total:  4m 04s | Avg:  4m 04s | Max:  4m 04s
      🟩 GCC11              Pass: 100%/1   | Total:  4m 01s | Avg:  4m 01s | Max:  4m 01s
      🟩 GCC12              Pass: 100%/1   | Total:  4m 08s | Avg:  4m 08s | Max:  4m 08s
      🟩 GCC13              Pass: 100%/10  | Total:  2h 32m | Avg: 15m 15s | Max: 35m 46s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total:  6m 12s | Avg:  6m 12s | Max:  6m 12s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 19m 20s | Avg: 19m 20s | Max: 19m 20s | Hits:  97%/2241  
      🟩 MSVC14.29          Pass: 100%/1   | Total: 13m 29s | Avg: 13m 29s | Max: 13m 29s | Hits:  98%/2479  
      🟩 MSVC14.39          Pass: 100%/2   | Total: 30m 35s | Avg: 15m 17s | Max: 15m 22s | Hits:  98%/5105  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 17m 50s | Avg:  8m 55s | Max:  9m 13s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/20  | Total:  2h 49m | Avg:  8m 27s | Max: 36m 11s
      🟩 GCC                Pass: 100%/21  | Total:  3h 10m | Avg:  9m 05s | Max: 35m 46s
      🟩 Intel              Pass: 100%/1   | Total:  6m 12s | Avg:  6m 12s | Max:  6m 12s
      🟩 MSVC               Pass: 100%/4   | Total:  1h 03m | Avg: 15m 51s | Max: 19m 20s | Hits:  98%/9825  
      🟩 NVHPC              Pass: 100%/2   | Total: 17m 50s | Avg:  8m 55s | Max:  9m 13s
    🟩 gpu
      🟩 v100               Pass: 100%/48  | Total:  7h 27m | Avg:  9m 19s | Max: 36m 11s | Hits:  98%/9825  
    🟩 jobs
      🟩 Build              Pass: 100%/41  | Total:  4h 38m | Avg:  6m 47s | Max: 22m 06s | Hits:  98%/9825  
      🟩 NVRTC              Pass: 100%/4   | Total:  1h 34m | Avg: 23m 39s | Max: 33m 46s
      🟩 Test               Pass: 100%/2   | Total:  1h 11m | Avg: 35m 58s | Max: 36m 11s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  2m 03s | Avg:  2m 03s | Max:  2m 03s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total: 13m 07s | Avg: 13m 07s | Max: 13m 07s
      🟩 90a                Pass: 100%/2   | Total: 20m 41s | Avg: 10m 20s | Max: 12m 42s
    🟩 std
      🟩 11                 Pass: 100%/6   | Total: 34m 42s | Avg:  5m 47s | Max: 18m 48s
      🟩 14                 Pass: 100%/5   | Total:  1h 04m | Avg: 12m 50s | Max: 33m 46s | Hits:  97%/2241  
      🟩 17                 Pass: 100%/13  | Total:  1h 53m | Avg:  8m 44s | Max: 20m 52s | Hits:  98%/4958  
      🟩 20                 Pass: 100%/23  | Total:  3h 52m | Avg: 10m 07s | Max: 36m 11s | Hits:  97%/2626  
    
  • 🟩 cub: Pass: 100%/47 | Total: 1d 08h | Avg: 41m 58s | Max: 1h 05m | Hits: 66%/3132

    🟩 cpu
      🟩 amd64              Pass: 100%/45  | Total:  1d 07h | Avg: 41m 21s | Max:  1h 05m | Hits:  66%/3132  
      🟩 arm64              Pass: 100%/2   | Total:  1h 51m | Avg: 55m 52s | Max: 56m 02s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  3h 30m | Avg: 30m 05s | Max: 57m 46s | Hits:  66%/783   
      🟩 12.5               Pass: 100%/2   | Total:  2h 05m | Avg:  1h 02m | Max:  1h 03m
      🟩 12.6               Pass: 100%/38  | Total:  1d 03h | Avg: 43m 04s | Max:  1h 05m | Hits:  66%/2349  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 02m
      🟩 nvcc11.1           Pass: 100%/7   | Total:  3h 30m | Avg: 30m 05s | Max: 57m 46s | Hits:  66%/783   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 05m | Avg:  1h 02m | Max:  1h 03m
      🟩 nvcc12.6           Pass: 100%/36  | Total:  1d 01h | Avg: 42m 05s | Max:  1h 05m | Hits:  66%/2349  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 02m
      🟩 nvcc               Pass: 100%/45  | Total:  1d 06h | Avg: 41m 08s | Max:  1h 05m | Hits:  66%/3132  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  1h 13m | Avg: 18m 21s | Max: 49m 45s
      🟩 Clang10            Pass: 100%/1   | Total: 52m 50s | Avg: 52m 50s | Max: 52m 50s
      🟩 Clang11            Pass: 100%/1   | Total: 51m 42s | Avg: 51m 42s | Max: 51m 42s
      🟩 Clang12            Pass: 100%/1   | Total: 53m 09s | Avg: 53m 09s | Max: 53m 09s
      🟩 Clang13            Pass: 100%/1   | Total: 55m 22s | Avg: 55m 22s | Max: 55m 22s
      🟩 Clang14            Pass: 100%/1   | Total: 56m 55s | Avg: 56m 55s | Max: 56m 55s
      🟩 Clang15            Pass: 100%/1   | Total: 56m 39s | Avg: 56m 39s | Max: 56m 39s
      🟩 Clang16            Pass: 100%/1   | Total: 55m 11s | Avg: 55m 11s | Max: 55m 11s
      🟩 Clang17            Pass: 100%/1   | Total: 50m 48s | Avg: 50m 48s | Max: 50m 48s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 26m | Avg: 46m 39s | Max:  1h 02m
      🟩 GCC6               Pass: 100%/2   | Total: 47m 59s | Avg: 23m 59s | Max: 43m 25s
      🟩 GCC7               Pass: 100%/2   | Total: 17m 59s | Avg:  8m 59s | Max:  9m 00s
      🟩 GCC8               Pass: 100%/1   | Total: 59m 11s | Avg: 59m 11s | Max: 59m 11s
      🟩 GCC9               Pass: 100%/3   | Total:  1h 44m | Avg: 34m 53s | Max: 54m 01s
      🟩 GCC10              Pass: 100%/1   | Total: 57m 35s | Avg: 57m 35s | Max: 57m 35s
      🟩 GCC11              Pass: 100%/1   | Total: 57m 02s | Avg: 57m 02s | Max: 57m 02s
      🟩 GCC12              Pass: 100%/3   | Total:  1h 32m | Avg: 30m 49s | Max: 52m 36s
      🟩 GCC13              Pass: 100%/8   | Total:  4h 32m | Avg: 34m 06s | Max:  1h 01m
      🟩 Intel2023.2.0      Pass: 100%/1   | Total: 58m 27s | Avg: 58m 27s | Max: 58m 27s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 57m 46s | Avg: 57m 46s | Max: 57m 46s | Hits:  66%/783   
      🟩 MSVC14.29          Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m | Hits:  66%/783   
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 05m | Hits:  66%/1566  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 05m | Avg:  1h 02m | Max:  1h 03m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total: 13h 52m | Avg: 43m 49s | Max:  1h 02m
      🟩 GCC                Pass: 100%/21  | Total: 11h 49m | Avg: 33m 47s | Max:  1h 01m
      🟩 Intel              Pass: 100%/1   | Total: 58m 27s | Avg: 58m 27s | Max: 58m 27s
      🟩 MSVC               Pass: 100%/4   | Total:  4h 06m | Avg:  1h 01m | Max:  1h 05m | Hits:  66%/3132  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 05m | Avg:  1h 02m | Max:  1h 03m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 39m 53s | Avg: 19m 56s | Max: 23m 51s
      🟩 v100               Pass: 100%/45  | Total:  1d 08h | Avg: 42m 57s | Max:  1h 05m | Hits:  66%/3132  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total:  1d 06h | Avg: 45m 56s | Max:  1h 05m | Hits:  66%/3132  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 20m 06s | Avg: 20m 06s | Max: 20m 06s
      🟩 GraphCapture       Pass: 100%/1   | Total: 16m 02s | Avg: 16m 02s | Max: 16m 02s
      🟩 HostLaunch         Pass: 100%/3   | Total: 52m 59s | Avg: 17m 39s | Max: 19m 37s
      🟩 TestGPU            Pass: 100%/2   | Total: 46m 11s | Avg: 23m 05s | Max: 23m 11s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 39m 53s | Avg: 19m 56s | Max: 23m 51s
      🟩 90a                Pass: 100%/1   | Total: 26m 04s | Avg: 26m 04s | Max: 26m 04s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total:  1h 11m | Avg: 14m 14s | Max: 43m 25s
      🟩 14                 Pass: 100%/4   | Total:  1h 20m | Avg: 20m 11s | Max: 57m 46s | Hits:  66%/783   
      🟩 17                 Pass: 100%/12  | Total: 11h 21m | Avg: 56m 45s | Max:  1h 05m | Hits:  66%/1566  
      🟩 20                 Pass: 100%/26  | Total: 19h 00m | Avg: 43m 50s | Max:  1h 02m | Hits:  66%/783   
    
  • 🟩 thrust: Pass: 100%/46 | Total: 22h 03m | Avg: 28m 46s | Max: 1h 04m | Hits: 70%/9260

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 46m 09s | Avg: 23m 04s | Max: 31m 02s
    🟩 cpu
      🟩 amd64              Pass: 100%/44  | Total: 21h 00m | Avg: 28m 39s | Max:  1h 04m | Hits:  70%/9260  
      🟩 arm64              Pass: 100%/2   | Total:  1h 02m | Avg: 31m 26s | Max: 34m 02s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  2h 11m | Avg: 18m 47s | Max: 55m 52s | Hits:  63%/1852  
      🟩 12.5               Pass: 100%/2   | Total:  1h 43m | Avg: 51m 38s | Max: 52m 03s
      🟩 12.6               Pass: 100%/37  | Total: 18h 09m | Avg: 29m 26s | Max:  1h 04m | Hits:  72%/7408  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 58m 39s | Avg: 29m 19s | Max: 31m 46s
      🟩 nvcc11.1           Pass: 100%/7   | Total:  2h 11m | Avg: 18m 47s | Max: 55m 52s | Hits:  63%/1852  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 43m | Avg: 51m 38s | Max: 52m 03s
      🟩 nvcc12.6           Pass: 100%/35  | Total: 17h 10m | Avg: 29m 26s | Max:  1h 04m | Hits:  72%/7408  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 58m 39s | Avg: 29m 19s | Max: 31m 46s
      🟩 nvcc               Pass: 100%/44  | Total: 21h 05m | Avg: 28m 45s | Max:  1h 04m | Hits:  70%/9260  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total: 43m 49s | Avg: 10m 57s | Max: 27m 39s
      🟩 Clang10            Pass: 100%/1   | Total: 39m 17s | Avg: 39m 17s | Max: 39m 17s
      🟩 Clang11            Pass: 100%/1   | Total: 32m 54s | Avg: 32m 54s | Max: 32m 54s
      🟩 Clang12            Pass: 100%/1   | Total: 31m 23s | Avg: 31m 23s | Max: 31m 23s
      🟩 Clang13            Pass: 100%/1   | Total: 34m 30s | Avg: 34m 30s | Max: 34m 30s
      🟩 Clang14            Pass: 100%/1   | Total: 35m 48s | Avg: 35m 48s | Max: 35m 48s
      🟩 Clang15            Pass: 100%/1   | Total: 32m 28s | Avg: 32m 28s | Max: 32m 28s
      🟩 Clang16            Pass: 100%/1   | Total: 32m 15s | Avg: 32m 15s | Max: 32m 15s
      🟩 Clang17            Pass: 100%/1   | Total: 36m 52s | Avg: 36m 52s | Max: 36m 52s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 50m | Avg: 24m 22s | Max: 32m 51s
      🟩 GCC6               Pass: 100%/2   | Total:  8m 31s | Avg:  4m 15s | Max:  4m 27s
      🟩 GCC7               Pass: 100%/2   | Total:  9m 37s | Avg:  4m 48s | Max:  4m 54s
      🟩 GCC8               Pass: 100%/1   | Total: 35m 24s | Avg: 35m 24s | Max: 35m 24s
      🟩 GCC9               Pass: 100%/3   | Total:  1h 08m | Avg: 22m 47s | Max: 33m 12s
      🟩 GCC10              Pass: 100%/1   | Total: 35m 49s | Avg: 35m 49s | Max: 35m 49s
      🟩 GCC11              Pass: 100%/1   | Total: 36m 22s | Avg: 36m 22s | Max: 36m 22s
      🟩 GCC12              Pass: 100%/1   | Total: 38m 51s | Avg: 38m 51s | Max: 38m 51s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 14m | Avg: 24m 19s | Max: 36m 33s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total: 43m 11s | Avg: 43m 11s | Max: 43m 11s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 55m 52s | Avg: 55m 52s | Max: 55m 52s | Hits:  63%/1852  
      🟩 MSVC14.29          Pass: 100%/1   | Total: 56m 39s | Avg: 56m 39s | Max: 56m 39s | Hits:  63%/1852  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 27m | Avg: 49m 10s | Max:  1h 04m | Hits:  75%/5556  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 43m | Avg: 51m 38s | Max: 52m 03s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total:  8h 09m | Avg: 25m 46s | Max: 39m 17s
      🟩 GCC                Pass: 100%/19  | Total:  7h 07m | Avg: 22m 30s | Max: 38m 51s
      🟩 Intel              Pass: 100%/1   | Total: 43m 11s | Avg: 43m 11s | Max: 43m 11s
      🟩 MSVC               Pass: 100%/5   | Total:  4h 20m | Avg: 52m 00s | Max:  1h 04m | Hits:  70%/9260  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 43m | Avg: 51m 38s | Max: 52m 03s
    🟩 gpu
      🟩 v100               Pass: 100%/46  | Total: 22h 03m | Avg: 28m 46s | Max:  1h 04m | Hits:  70%/9260  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total: 20h 41m | Avg: 31m 01s | Max:  1h 04m | Hits:  63%/7408  
      🟩 TestCPU            Pass: 100%/3   | Total: 39m 57s | Avg: 13m 19s | Max: 25m 10s | Hits:  99%/1852  
      🟩 TestGPU            Pass: 100%/3   | Total: 42m 42s | Avg: 14m 14s | Max: 15m 07s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 21m 53s | Avg: 21m 53s | Max: 21m 53s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total: 22m 46s | Avg:  4m 33s | Max:  5m 27s
      🟩 14                 Pass: 100%/4   | Total:  1h 11m | Avg: 17m 54s | Max: 55m 52s | Hits:  63%/1852  
      🟩 17                 Pass: 100%/12  | Total:  7h 46m | Avg: 38m 51s | Max: 58m 04s | Hits:  63%/3704  
      🟩 20                 Pass: 100%/23  | Total: 11h 56m | Avg: 31m 10s | Max:  1h 04m | Hits:  81%/3704  
    
  • 🟩 cudax: Pass: 100%/26 | Total: 3h 00m | Avg: 6m 55s | Max: 36m 20s | Hits: 90%/312

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  2h 46m | Avg:  7m 34s | Max: 36m 20s | Hits:  90%/312   
      🟩 arm64              Pass: 100%/4   | Total: 13m 22s | Avg:  3m 20s | Max:  3m 29s
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 16m 24s | Avg:  5m 28s | Max:  9m 15s | Hits:  90%/156   
      🟩 12.5               Pass: 100%/2   | Total: 12m 54s | Avg:  6m 27s | Max:  6m 47s
      🟩 12.6               Pass: 100%/21  | Total:  2h 30m | Avg:  7m 10s | Max: 36m 20s | Hits:  90%/156   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 16m 24s | Avg:  5m 28s | Max:  9m 15s | Hits:  90%/156   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 12m 54s | Avg:  6m 27s | Max:  6m 47s
      🟩 nvcc12.6           Pass: 100%/21  | Total:  2h 30m | Avg:  7m 10s | Max: 36m 20s | Hits:  90%/156   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/26  | Total:  3h 00m | Avg:  6m 55s | Max: 36m 20s | Hits:  90%/312   
    🟩 cxx
      🟩 Clang9             Pass: 100%/1   | Total:  3m 50s | Avg:  3m 50s | Max:  3m 50s
      🟩 Clang10            Pass: 100%/1   | Total:  4m 24s | Avg:  4m 24s | Max:  4m 24s
      🟩 Clang11            Pass: 100%/1   | Total:  3m 46s | Avg:  3m 46s | Max:  3m 46s
      🟩 Clang12            Pass: 100%/1   | Total:  4m 06s | Avg:  4m 06s | Max:  4m 06s
      🟩 Clang13            Pass: 100%/1   | Total:  3m 41s | Avg:  3m 41s | Max:  3m 41s
      🟩 Clang14            Pass: 100%/1   | Total:  3m 41s | Avg:  3m 41s | Max:  3m 41s
      🟩 Clang15            Pass: 100%/1   | Total:  4m 16s | Avg:  4m 16s | Max:  4m 16s
      🟩 Clang16            Pass: 100%/1   | Total:  4m 28s | Avg:  4m 28s | Max:  4m 28s
      🟩 Clang17            Pass: 100%/1   | Total:  4m 05s | Avg:  4m 05s | Max:  4m 05s
      🟩 Clang18            Pass: 100%/4   | Total: 46m 32s | Avg: 11m 38s | Max: 35m 52s
      🟩 GCC9               Pass: 100%/1   | Total:  3m 19s | Avg:  3m 19s | Max:  3m 19s
      🟩 GCC10              Pass: 100%/1   | Total:  4m 05s | Avg:  4m 05s | Max:  4m 05s
      🟩 GCC11              Pass: 100%/1   | Total:  3m 58s | Avg:  3m 58s | Max:  3m 58s
      🟩 GCC12              Pass: 100%/2   | Total: 40m 19s | Avg: 20m 09s | Max: 36m 20s
      🟩 GCC13              Pass: 100%/4   | Total: 13m 18s | Avg:  3m 19s | Max:  3m 29s
      🟩 MSVC14.36          Pass: 100%/1   | Total:  9m 15s | Avg:  9m 15s | Max:  9m 15s | Hits:  90%/156   
      🟩 MSVC14.39          Pass: 100%/1   | Total: 10m 06s | Avg: 10m 06s | Max: 10m 06s | Hits:  90%/156   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 12m 54s | Avg:  6m 27s | Max:  6m 47s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/13  | Total:  1h 22m | Avg:  6m 22s | Max: 35m 52s
      🟩 GCC                Pass: 100%/9   | Total:  1h 04m | Avg:  7m 13s | Max: 36m 20s
      🟩 MSVC               Pass: 100%/2   | Total: 19m 21s | Avg:  9m 40s | Max: 10m 06s | Hits:  90%/312   
      🟩 NVHPC              Pass: 100%/2   | Total: 12m 54s | Avg:  6m 27s | Max:  6m 47s
    🟩 gpu
      🟩 v100               Pass: 100%/26  | Total:  3h 00m | Avg:  6m 55s | Max: 36m 20s | Hits:  90%/312   
    🟩 jobs
      🟩 Build              Pass: 100%/24  | Total:  1h 47m | Avg:  4m 29s | Max: 10m 06s | Hits:  90%/312   
      🟩 Test               Pass: 100%/2   | Total:  1h 12m | Avg: 36m 06s | Max: 36m 20s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  3m 05s | Avg:  3m 05s | Max:  3m 05s
      🟩 90a                Pass: 100%/1   | Total:  3m 20s | Avg:  3m 20s | Max:  3m 20s
    🟩 std
      🟩 17                 Pass: 100%/6   | Total: 23m 42s | Avg:  3m 57s | Max:  6m 47s
      🟩 20                 Pass: 100%/20  | Total:  2h 36m | Avg:  7m 49s | Max: 36m 20s | Hits:  90%/312   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 8m 45s | Avg: 4m 22s | Max: 6m 37s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  8m 45s | Avg:  4m 22s | Max:  6m 37s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  8m 45s | Avg:  4m 22s | Max:  6m 37s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  8m 45s | Avg:  4m 22s | Max:  6m 37s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  8m 45s | Avg:  4m 22s | Max:  6m 37s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  8m 45s | Avg:  4m 22s | Max:  6m 37s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  8m 45s | Avg:  4m 22s | Max:  6m 37s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total:  8m 45s | Avg:  4m 22s | Max:  6m 37s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 08s | Avg:  2m 08s | Max:  2m 08s
      🟩 Test               Pass: 100%/1   | Total:  6m 37s | Avg:  6m 37s | Max:  6m 37s
    
  • 🟩 python: Pass: 100%/1 | Total: 26m 13s | Avg: 26m 13s | Max: 26m 13s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 26m 13s | Avg: 26m 13s | Max: 26m 13s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 26m 13s | Avg: 26m 13s | Max: 26m 13s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 26m 13s | Avg: 26m 13s | Max: 26m 13s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 26m 13s | Avg: 26m 13s | Max: 26m 13s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 26m 13s | Avg: 26m 13s | Max: 26m 13s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 26m 13s | Avg: 26m 13s | Max: 26m 13s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 26m 13s | Avg: 26m 13s | Max: 26m 13s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 26m 13s | Avg: 26m 13s | Max: 26m 13s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 170)

# Runner
125 linux-amd64-cpu16
19 linux-amd64-gpu-v100-latest-1
15 windows-amd64-cpu16
10 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

@fbusato
Contributor Author

fbusato commented Jan 9, 2025

@ahendriksen @miscco In the last commit, I changed the return type and added the predicate as an output parameter.

Contributor

github-actions bot commented Jan 9, 2025

🟨 CI finished in 2h 59m: Pass: 98%/164 | Total: 3d 03h | Avg: 27m 26s | Max: 1h 13m | Hits: 434%/15316
  • 🟨 cub: Pass: 93%/45 | Total: 1d 12h | Avg: 48m 14s | Max: 1h 12m

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  93%/43  | Total:  1d 10h | Avg: 47m 56s | Max:  1h 12m
      🟩 arm64              Pass: 100%/2   | Total:  1h 49m | Avg: 54m 35s | Max: 54m 49s
    🔍 ctk: 12.6 🔍
      🟩 11.1               Pass: 100%/6   | Total:  4h 42m | Avg: 47m 00s | Max: 48m 31s
      🟩 12.5               Pass: 100%/2   | Total:  1h 59m | Avg: 59m 58s | Max:  1h 00m
      🔍 12.6               Pass:  91%/37  | Total:  1d 05h | Avg: 47m 48s | Max:  1h 12m
    🔍 cudacxx: nvcc12.6 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 55m | Avg: 57m 59s | Max: 59m 50s
      🟩 nvcc11.1           Pass: 100%/6   | Total:  4h 42m | Avg: 47m 00s | Max: 48m 31s
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 59m | Avg: 59m 58s | Max:  1h 00m
      🔍 nvcc12.6           Pass:  91%/35  | Total:  1d 03h | Avg: 47m 13s | Max:  1h 12m
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 55m | Avg: 57m 59s | Max: 59m 50s
      🔍 nvcc               Pass:  93%/43  | Total:  1d 10h | Avg: 47m 47s | Max:  1h 12m
    🚨 cxx_family: MSVC 🚨
      🟩 Clang              Pass: 100%/19  | Total: 15h 56m | Avg: 50m 18s | Max: 59m 50s
      🟩 GCC                Pass: 100%/21  | Total: 14h 52m | Avg: 42m 30s | Max: 58m 25s
      🔥 MSVC               Pass:   0%/3   | Total:  3h 22m | Avg:  1h 07m | Max:  1h 12m
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 59m | Avg: 59m 58s | Max:  1h 00m
    🔍 gpu: v100 🔍
      🟩 h100               Pass: 100%/2   | Total: 39m 04s | Avg: 19m 32s | Max: 22m 53s
      🔍 v100               Pass:  93%/43  | Total:  1d 11h | Avg: 49m 34s | Max:  1h 12m
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  92%/38  | Total:  1d 09h | Avg: 52m 47s | Max:  1h 12m
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 19m 11s | Avg: 19m 11s | Max: 19m 11s
      🟩 GraphCapture       Pass: 100%/1   | Total: 16m 29s | Avg: 16m 29s | Max: 16m 29s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 09m | Avg: 23m 04s | Max: 31m 38s
      🟩 TestGPU            Pass: 100%/2   | Total: 59m 52s | Avg: 29m 56s | Max: 35m 07s
    🟨 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  3h 25m | Avg: 51m 23s | Max: 59m 04s
      🟩 Clang10            Pass: 100%/1   | Total: 52m 29s | Avg: 52m 29s | Max: 52m 29s
      🟩 Clang11            Pass: 100%/1   | Total: 52m 25s | Avg: 52m 25s | Max: 52m 25s
      🟩 Clang12            Pass: 100%/1   | Total: 54m 00s | Avg: 54m 00s | Max: 54m 00s
      🟩 Clang13            Pass: 100%/1   | Total: 51m 38s | Avg: 51m 38s | Max: 51m 38s
      🟩 Clang14            Pass: 100%/1   | Total: 51m 38s | Avg: 51m 38s | Max: 51m 38s
      🟩 Clang15            Pass: 100%/1   | Total: 54m 22s | Avg: 54m 22s | Max: 54m 22s
      🟩 Clang16            Pass: 100%/1   | Total: 51m 58s | Avg: 51m 58s | Max: 51m 58s
      🟩 Clang17            Pass: 100%/1   | Total: 56m 57s | Avg: 56m 57s | Max: 56m 57s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 25m | Avg: 46m 25s | Max: 59m 50s
      🟩 GCC7               Pass: 100%/4   | Total:  3h 20m | Avg: 50m 14s | Max: 58m 25s
      🟩 GCC8               Pass: 100%/1   | Total: 54m 55s | Avg: 54m 55s | Max: 54m 55s
      🟩 GCC9               Pass: 100%/3   | Total:  2h 28m | Avg: 49m 29s | Max: 53m 59s
      🟩 GCC10              Pass: 100%/1   | Total: 54m 57s | Avg: 54m 57s | Max: 54m 57s
      🟩 GCC11              Pass: 100%/1   | Total: 53m 43s | Avg: 53m 43s | Max: 53m 43s
      🟩 GCC12              Pass: 100%/3   | Total:  1h 32m | Avg: 30m 43s | Max: 53m 07s
      🟩 GCC13              Pass: 100%/8   | Total:  4h 47m | Avg: 35m 56s | Max: 54m 49s
      🟥 MSVC14.29          Pass:   0%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
      🟥 MSVC14.39          Pass:   0%/2   | Total:  2h 19m | Avg:  1h 09m | Max:  1h 12m
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 59m | Avg: 59m 58s | Max:  1h 00m
    🟨 std
      🟩 11                 Pass: 100%/5   | Total:  4h 06m | Avg: 49m 14s | Max: 51m 18s
      🟩 14                 Pass: 100%/2   | Total:  1h 57m | Avg: 58m 44s | Max: 59m 04s
      🟨 17                 Pass:  83%/12  | Total: 10h 53m | Avg: 54m 25s | Max:  1h 07m
      🟨 20                 Pass:  96%/26  | Total: 19h 14m | Avg: 44m 23s | Max:  1h 12m
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 39m 04s | Avg: 19m 32s | Max: 22m 53s
      🟩 90a                Pass: 100%/1   | Total: 23m 59s | Avg: 23m 59s | Max: 23m 59s
    
  • 🟩 libcudacxx: Pass: 100%/46 | Total: 12h 18m | Avg: 16m 03s | Max: 1h 13m | Hits: 628%/7596

    🟩 cpu
      🟩 amd64              Pass: 100%/44  | Total: 11h 53m | Avg: 16m 13s | Max:  1h 13m | Hits: 628%/7596  
      🟩 arm64              Pass: 100%/2   | Total: 24m 52s | Avg: 12m 26s | Max: 20m 49s
    🟩 ctk
      🟩 11.1               Pass: 100%/6   | Total:  1h 09m | Avg: 11m 31s | Max: 22m 48s
      🟩 12.5               Pass: 100%/2   | Total:  1h 01m | Avg: 30m 50s | Max: 31m 31s
      🟩 12.6               Pass: 100%/38  | Total: 10h 07m | Avg: 15m 59s | Max:  1h 13m | Hits: 628%/7596  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/4   | Total:  1h 06m | Avg: 16m 41s | Max: 21m 57s
      🟩 nvcc11.1           Pass: 100%/6   | Total:  1h 09m | Avg: 11m 31s | Max: 22m 48s
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 01m | Avg: 30m 50s | Max: 31m 31s
      🟩 nvcc12.6           Pass: 100%/34  | Total:  9h 00m | Avg: 15m 54s | Max:  1h 13m | Hits: 628%/7596  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/4   | Total:  1h 06m | Avg: 16m 41s | Max: 21m 57s
      🟩 nvcc               Pass: 100%/42  | Total: 11h 11m | Avg: 15m 59s | Max:  1h 13m | Hits: 628%/7596  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total: 46m 53s | Avg: 11m 43s | Max: 21m 54s
      🟩 Clang10            Pass: 100%/1   | Total: 25m 20s | Avg: 25m 20s | Max: 25m 20s
      🟩 Clang11            Pass: 100%/1   | Total: 20m 57s | Avg: 20m 57s | Max: 20m 57s
      🟩 Clang12            Pass: 100%/1   | Total: 22m 13s | Avg: 22m 13s | Max: 22m 13s
      🟩 Clang13            Pass: 100%/1   | Total:  4m 36s | Avg:  4m 36s | Max:  4m 36s
      🟩 Clang14            Pass: 100%/1   | Total:  4m 22s | Avg:  4m 22s | Max:  4m 22s
      🟩 Clang15            Pass: 100%/1   | Total:  4m 24s | Avg:  4m 24s | Max:  4m 24s
      🟩 Clang16            Pass: 100%/1   | Total: 23m 28s | Avg: 23m 28s | Max: 23m 28s
      🟩 Clang17            Pass: 100%/1   | Total:  4m 15s | Avg:  4m 15s | Max:  4m 15s
      🟩 Clang18            Pass: 100%/8   | Total:  1h 42m | Avg: 12m 47s | Max: 21m 57s
      🟩 GCC7               Pass: 100%/4   | Total: 32m 37s | Avg:  8m 09s | Max: 22m 48s
      🟩 GCC8               Pass: 100%/1   | Total:  4m 02s | Avg:  4m 02s | Max:  4m 02s
      🟩 GCC9               Pass: 100%/3   | Total: 28m 35s | Avg:  9m 31s | Max: 22m 56s
      🟩 GCC10              Pass: 100%/1   | Total:  4m 05s | Avg:  4m 05s | Max:  4m 05s
      🟩 GCC11              Pass: 100%/1   | Total: 20m 49s | Avg: 20m 49s | Max: 20m 49s
      🟩 GCC12              Pass: 100%/1   | Total:  4m 25s | Avg:  4m 25s | Max:  4m 25s
      🟩 GCC13              Pass: 100%/10  | Total:  4h 00m | Avg: 24m 03s | Max:  1h 13m
      🟩 MSVC14.29          Pass: 100%/1   | Total: 26m 50s | Avg: 26m 50s | Max: 26m 50s | Hits: 606%/2483  
      🟩 MSVC14.39          Pass: 100%/2   | Total: 56m 00s | Avg: 28m 00s | Max: 29m 17s | Hits: 638%/5113  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 01m | Avg: 30m 50s | Max: 31m 31s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/20  | Total:  4h 18m | Avg: 12m 56s | Max: 25m 20s
      🟩 GCC                Pass: 100%/21  | Total:  5h 35m | Avg: 15m 57s | Max:  1h 13m
      🟩 MSVC               Pass: 100%/3   | Total:  1h 22m | Avg: 27m 36s | Max: 29m 17s | Hits: 628%/7596  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 01m | Avg: 30m 50s | Max: 31m 31s
    🟩 gpu
      🟩 v100               Pass: 100%/46  | Total: 12h 18m | Avg: 16m 03s | Max:  1h 13m | Hits: 628%/7596  
    🟩 jobs
      🟩 Build              Pass: 100%/39  | Total:  9h 02m | Avg: 13m 53s | Max: 31m 31s | Hits: 628%/7596  
      🟩 NVRTC              Pass: 100%/4   | Total:  1h 41m | Avg: 25m 18s | Max: 28m 13s
      🟩 Test               Pass: 100%/2   | Total:  1h 33m | Avg: 46m 40s | Max:  1h 13m
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  1m 50s | Avg:  1m 50s | Max:  1m 50s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total: 12m 49s | Avg: 12m 49s | Max: 12m 49s
      🟩 90a                Pass: 100%/2   | Total: 20m 47s | Avg: 10m 23s | Max: 13m 21s
    🟩 std
      🟩 11                 Pass: 100%/6   | Total:  1h 17m | Avg: 12m 54s | Max: 22m 58s
      🟩 14                 Pass: 100%/3   | Total: 37m 01s | Avg: 12m 20s | Max: 28m 13s
      🟩 17                 Pass: 100%/13  | Total:  3h 45m | Avg: 17m 22s | Max: 30m 10s | Hits: 640%/4966  
      🟩 20                 Pass: 100%/23  | Total:  6h 36m | Avg: 17m 13s | Max:  1h 13m | Hits: 605%/2630  
    
  • 🟩 thrust: Pass: 100%/44 | Total: 23h 23m | Avg: 31m 53s | Max: 1h 05m | Hits: 229%/7408

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 41m 33s | Avg: 20m 46s | Max: 28m 45s
    🟩 cpu
      🟩 amd64              Pass: 100%/42  | Total: 22h 17m | Avg: 31m 50s | Max:  1h 05m | Hits: 229%/7408  
      🟩 arm64              Pass: 100%/2   | Total:  1h 05m | Avg: 32m 43s | Max: 36m 00s
    🟩 ctk
      🟩 11.1               Pass: 100%/6   | Total:  2h 36m | Avg: 26m 04s | Max: 31m 43s
      🟩 12.5               Pass: 100%/2   | Total:  1h 49m | Avg: 54m 59s | Max: 56m 21s
      🟩 12.6               Pass: 100%/36  | Total: 18h 56m | Avg: 31m 34s | Max:  1h 05m | Hits: 229%/7408  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 02m | Avg: 31m 07s | Max: 33m 25s
      🟩 nvcc11.1           Pass: 100%/6   | Total:  2h 36m | Avg: 26m 04s | Max: 31m 43s
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 49m | Avg: 54m 59s | Max: 56m 21s
      🟩 nvcc12.6           Pass: 100%/34  | Total: 17h 54m | Avg: 31m 35s | Max:  1h 05m | Hits: 229%/7408  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 02m | Avg: 31m 07s | Max: 33m 25s
      🟩 nvcc               Pass: 100%/42  | Total: 22h 20m | Avg: 31m 55s | Max:  1h 05m | Hits: 229%/7408  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  1h 49m | Avg: 27m 24s | Max: 34m 35s
      🟩 Clang10            Pass: 100%/1   | Total: 34m 33s | Avg: 34m 33s | Max: 34m 33s
      🟩 Clang11            Pass: 100%/1   | Total: 32m 01s | Avg: 32m 01s | Max: 32m 01s
      🟩 Clang12            Pass: 100%/1   | Total: 31m 36s | Avg: 31m 36s | Max: 31m 36s
      🟩 Clang13            Pass: 100%/1   | Total: 32m 23s | Avg: 32m 23s | Max: 32m 23s
      🟩 Clang14            Pass: 100%/1   | Total: 34m 15s | Avg: 34m 15s | Max: 34m 15s
      🟩 Clang15            Pass: 100%/1   | Total: 31m 28s | Avg: 31m 28s | Max: 31m 28s
      🟩 Clang16            Pass: 100%/1   | Total: 36m 31s | Avg: 36m 31s | Max: 36m 31s
      🟩 Clang17            Pass: 100%/1   | Total: 34m 32s | Avg: 34m 32s | Max: 34m 32s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 53m | Avg: 24m 48s | Max: 33m 25s
      🟩 GCC7               Pass: 100%/4   | Total:  1h 47m | Avg: 26m 55s | Max: 31m 38s
      🟩 GCC8               Pass: 100%/1   | Total: 32m 13s | Avg: 32m 13s | Max: 32m 13s
      🟩 GCC9               Pass: 100%/3   | Total:  1h 32m | Avg: 30m 59s | Max: 36m 55s
      🟩 GCC10              Pass: 100%/1   | Total: 34m 30s | Avg: 34m 30s | Max: 34m 30s
      🟩 GCC11              Pass: 100%/1   | Total: 36m 50s | Avg: 36m 50s | Max: 36m 50s
      🟩 GCC12              Pass: 100%/1   | Total: 36m 27s | Avg: 36m 27s | Max: 36m 27s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 12m | Avg: 24m 04s | Max: 39m 33s
      🟩 MSVC14.29          Pass: 100%/1   | Total: 51m 52s | Avg: 51m 52s | Max: 51m 52s | Hits: 183%/1852  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 37m | Avg: 52m 27s | Max:  1h 05m | Hits: 244%/5556  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 49m | Avg: 54m 59s | Max: 56m 21s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total:  9h 10m | Avg: 28m 58s | Max: 36m 31s
      🟩 GCC                Pass: 100%/19  | Total:  8h 53m | Avg: 28m 03s | Max: 39m 33s
      🟩 MSVC               Pass: 100%/4   | Total:  3h 29m | Avg: 52m 18s | Max:  1h 05m | Hits: 229%/7408  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 49m | Avg: 54m 59s | Max: 56m 21s
    🟩 gpu
      🟩 v100               Pass: 100%/44  | Total: 23h 23m | Avg: 31m 53s | Max:  1h 05m | Hits: 229%/7408  
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total: 21h 53m | Avg: 34m 34s | Max:  1h 05m | Hits: 183%/5556  
      🟩 TestCPU            Pass: 100%/3   | Total: 52m 14s | Avg: 17m 24s | Max: 37m 15s | Hits: 365%/1852  
      🟩 TestGPU            Pass: 100%/3   | Total: 37m 10s | Avg: 12m 23s | Max: 12m 48s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 20m 47s | Avg: 20m 47s | Max: 20m 47s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total:  1h 58m | Avg: 23m 37s | Max: 25m 35s
      🟩 14                 Pass: 100%/2   | Total:  1h 06m | Avg: 33m 06s | Max: 34m 35s
      🟩 17                 Pass: 100%/12  | Total:  7h 27m | Avg: 37m 15s | Max: 54m 18s | Hits: 183%/3704  
      🟩 20                 Pass: 100%/23  | Total: 12h 10m | Avg: 31m 44s | Max:  1h 05m | Hits: 274%/3704  
    
  • 🟩 cudax: Pass: 100%/26 | Total: 2h 30m | Avg: 5m 47s | Max: 20m 03s | Hits: 574%/312

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  2h 16m | Avg:  6m 12s | Max: 20m 03s | Hits: 574%/312   
      🟩 arm64              Pass: 100%/4   | Total: 14m 06s | Avg:  3m 31s | Max:  3m 33s
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 19m 25s | Avg:  6m 28s | Max: 12m 01s | Hits: 574%/156   
      🟩 12.5               Pass: 100%/2   | Total: 11m 42s | Avg:  5m 51s | Max:  6m 04s
      🟩 12.6               Pass: 100%/21  | Total:  1h 59m | Avg:  5m 41s | Max: 20m 03s | Hits: 574%/156   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 19m 25s | Avg:  6m 28s | Max: 12m 01s | Hits: 574%/156   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 11m 42s | Avg:  5m 51s | Max:  6m 04s
      🟩 nvcc12.6           Pass: 100%/21  | Total:  1h 59m | Avg:  5m 41s | Max: 20m 03s | Hits: 574%/156   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/26  | Total:  2h 30m | Avg:  5m 47s | Max: 20m 03s | Hits: 574%/312   
    🟩 cxx
      🟩 Clang9             Pass: 100%/1   | Total:  3m 51s | Avg:  3m 51s | Max:  3m 51s
      🟩 Clang10            Pass: 100%/1   | Total:  4m 20s | Avg:  4m 20s | Max:  4m 20s
      🟩 Clang11            Pass: 100%/1   | Total:  3m 57s | Avg:  3m 57s | Max:  3m 57s
      🟩 Clang12            Pass: 100%/1   | Total:  4m 06s | Avg:  4m 06s | Max:  4m 06s
      🟩 Clang13            Pass: 100%/1   | Total:  3m 57s | Avg:  3m 57s | Max:  3m 57s
      🟩 Clang14            Pass: 100%/1   | Total:  3m 36s | Avg:  3m 36s | Max:  3m 36s
      🟩 Clang15            Pass: 100%/1   | Total:  4m 14s | Avg:  4m 14s | Max:  4m 14s
      🟩 Clang16            Pass: 100%/1   | Total:  4m 00s | Avg:  4m 00s | Max:  4m 00s
      🟩 Clang17            Pass: 100%/1   | Total:  4m 03s | Avg:  4m 03s | Max:  4m 03s
      🟩 Clang18            Pass: 100%/4   | Total: 31m 11s | Avg:  7m 47s | Max: 20m 03s
      🟩 GCC9               Pass: 100%/1   | Total:  3m 33s | Avg:  3m 33s | Max:  3m 33s
      🟩 GCC10              Pass: 100%/1   | Total:  3m 37s | Avg:  3m 37s | Max:  3m 37s
      🟩 GCC11              Pass: 100%/1   | Total:  3m 37s | Avg:  3m 37s | Max:  3m 37s
      🟩 GCC12              Pass: 100%/2   | Total: 22m 59s | Avg: 11m 29s | Max: 19m 03s
      🟩 GCC13              Pass: 100%/4   | Total: 13m 23s | Avg:  3m 20s | Max:  3m 33s
      🟩 MSVC14.36          Pass: 100%/1   | Total: 12m 01s | Avg: 12m 01s | Max: 12m 01s | Hits: 574%/156   
      🟩 MSVC14.39          Pass: 100%/1   | Total: 12m 27s | Avg: 12m 27s | Max: 12m 27s | Hits: 574%/156   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 11m 42s | Avg:  5m 51s | Max:  6m 04s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/13  | Total:  1h 07m | Avg:  5m 10s | Max: 20m 03s
      🟩 GCC                Pass: 100%/9   | Total: 47m 09s | Avg:  5m 14s | Max: 19m 03s
      🟩 MSVC               Pass: 100%/2   | Total: 24m 28s | Avg: 12m 14s | Max: 12m 27s | Hits: 574%/312   
      🟩 NVHPC              Pass: 100%/2   | Total: 11m 42s | Avg:  5m 51s | Max:  6m 04s
    🟩 gpu
      🟩 v100               Pass: 100%/26  | Total:  2h 30m | Avg:  5m 47s | Max: 20m 03s | Hits: 574%/312   
    🟩 jobs
      🟩 Build              Pass: 100%/24  | Total:  1h 51m | Avg:  4m 38s | Max: 12m 27s | Hits: 574%/312   
      🟩 Test               Pass: 100%/2   | Total: 39m 06s | Avg: 19m 33s | Max: 20m 03s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  3m 19s | Avg:  3m 19s | Max:  3m 19s
      🟩 90a                Pass: 100%/1   | Total:  3m 00s | Avg:  3m 00s | Max:  3m 00s
    🟩 std
      🟩 17                 Pass: 100%/6   | Total: 23m 51s | Avg:  3m 58s | Max:  6m 04s
      🟩 20                 Pass: 100%/20  | Total:  2h 06m | Avg:  6m 20s | Max: 20m 03s | Hits: 574%/312   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 11m 50s | Avg: 5m 55s | Max: 9m 53s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 11m 50s | Avg:  5m 55s | Max:  9m 53s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total: 11m 50s | Avg:  5m 55s | Max:  9m 53s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total: 11m 50s | Avg:  5m 55s | Max:  9m 53s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 11m 50s | Avg:  5m 55s | Max:  9m 53s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 11m 50s | Avg:  5m 55s | Max:  9m 53s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 11m 50s | Avg:  5m 55s | Max:  9m 53s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total: 11m 50s | Avg:  5m 55s | Max:  9m 53s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  1m 57s | Avg:  1m 57s | Max:  1m 57s
      🟩 Test               Pass: 100%/1   | Total:  9m 53s | Avg:  9m 53s | Max:  9m 53s
    
  • 🟩 python: Pass: 100%/1 | Total: 25m 13s | Avg: 25m 13s | Max: 25m 13s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 25m 13s | Avg: 25m 13s | Max: 25m 13s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 25m 13s | Avg: 25m 13s | Max: 25m 13s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 25m 13s | Avg: 25m 13s | Max: 25m 13s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 25m 13s | Avg: 25m 13s | Max: 25m 13s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 25m 13s | Avg: 25m 13s | Max: 25m 13s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 25m 13s | Avg: 25m 13s | Max: 25m 13s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 25m 13s | Avg: 25m 13s | Max: 25m 13s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 25m 13s | Avg: 25m 13s | Max: 25m 13s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 164)

# Runner
122 linux-amd64-cpu16
19 linux-amd64-gpu-v100-latest-1
12 windows-amd64-cpu16
10 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

@@ -102,7 +96,7 @@ _CCCL_NODISCARD _CCCL_DEVICE static inline shfl_return_values<_Tp> shfl_sync(
 "shfl.sync.idx.b32 %0|p, %2, %3, %4, %5; \n\t\t"
 "selp.s32 %1, 1, 0, p; \n\t"
 "}"
-: "=r"(__ret), "=r"(__pred)
+: "=r"(__ret), "=r"(__pred1)
Collaborator

@miscco miscco Jan 9, 2025

Could we just use

static_cast<_CUDA_VSTD::int32_t>(__pred)

here? I thought bools were also 32-bit on GPUs.

Contributor Author

I don't have a strong opinion on that. @ahendriksen, do you have a preference between bool and int for pred?

Contributor Author

Btw, sizeof(bool) is 1 even on CUDA. A single bool value can still physically occupy a 32-bit register, because GPUs have no smaller registers.
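
To make the size point concrete, here is a minimal host-side sketch, not the code from this PR. It assumes the predicate comes out of the asm block as a 32-bit 0/1 integer (as `selp.s32` produces) and is narrowed to bool afterwards. `shfl_result`, `make_result`, and `widen_pred` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the value/predicate pair returned by the wrapper.
struct shfl_result
{
  std::uint32_t data; // value read from the source lane
  bool pred;          // lane predicate
};

// Assumption: true on mainstream host ABIs; the comment above states the same for CUDA.
static_assert(sizeof(bool) == 1, "bool occupies one byte in memory");
// The "r" constraint in the diff binds a 32-bit register, so the asm output is a 32-bit int.
static_assert(sizeof(std::int32_t) == 4, "32-bit operand expected");

// Narrow the 0/1 integer written by the asm block (raw_pred) to a bool.
inline shfl_result make_result(std::uint32_t data, std::int32_t raw_pred)
{
  return shfl_result{data, raw_pred != 0};
}

// The opposite direction, as in the suggested static_cast: bool widens to exactly 0 or 1.
inline std::int32_t widen_pred(bool pred)
{
  return static_cast<std::int32_t>(pred);
}
```

Either direction is a value conversion, so the choice between bool and int for pred is mostly about the interface; the asm operand itself has to stay 32-bit.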

Contributor

github-actions bot commented Jan 9, 2025

🟨 CI finished in 1h 37m: Pass: 99%/164 | Total: 1d 01h | Avg: 9m 19s | Max: 1h 11m | Hits: 536%/17656
  • 🟨 cub: Pass: 97%/45 | Total: 7h 44m | Avg: 10m 19s | Max: 1h 11m | Hits: 598%/2340

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  97%/43  | Total:  7h 34m | Avg: 10m 34s | Max:  1h 11m | Hits: 598%/2340  
      🟩 arm64              Pass: 100%/2   | Total:  9m 32s | Avg:  4m 46s | Max:  4m 52s
    🔍 ctk: 12.6 🔍
      🟩 11.1               Pass: 100%/6   | Total: 25m 39s | Avg:  4m 16s | Max:  4m 45s
      🟩 12.5               Pass: 100%/2   | Total: 18m 52s | Avg:  9m 26s | Max:  9m 47s
      🔍 12.6               Pass:  97%/37  | Total:  6h 59m | Avg: 11m 21s | Max:  1h 11m | Hits: 598%/2340  
    🔍 cudacxx: nvcc12.6 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  8m 44s | Avg:  4m 22s | Max:  4m 31s
      🟩 nvcc11.1           Pass: 100%/6   | Total: 25m 39s | Avg:  4m 16s | Max:  4m 45s
      🟩 nvcc12.5           Pass: 100%/2   | Total: 18m 52s | Avg:  9m 26s | Max:  9m 47s
      🔍 nvcc12.6           Pass:  97%/35  | Total:  6h 51m | Avg: 11m 45s | Max:  1h 11m | Hits: 598%/2340  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  8m 44s | Avg:  4m 22s | Max:  4m 31s
      🔍 nvcc               Pass:  97%/43  | Total:  7h 35m | Avg: 10m 35s | Max:  1h 11m | Hits: 598%/2340  
    🔍 cxx: GCC13 🔍
      🟩 Clang9             Pass: 100%/4   | Total: 21m 17s | Avg:  5m 19s | Max:  6m 19s
      🟩 Clang10            Pass: 100%/1   | Total:  6m 55s | Avg:  6m 55s | Max:  6m 55s
      🟩 Clang11            Pass: 100%/1   | Total:  5m 24s | Avg:  5m 24s | Max:  5m 24s
      🟩 Clang12            Pass: 100%/1   | Total:  5m 22s | Avg:  5m 22s | Max:  5m 22s
      🟩 Clang13            Pass: 100%/1   | Total:  5m 11s | Avg:  5m 11s | Max:  5m 11s
      🟩 Clang14            Pass: 100%/1   | Total:  5m 08s | Avg:  5m 08s | Max:  5m 08s
      🟩 Clang15            Pass: 100%/1   | Total:  5m 44s | Avg:  5m 44s | Max:  5m 44s
      🟩 Clang16            Pass: 100%/1   | Total:  5m 38s | Avg:  5m 38s | Max:  5m 38s
      🟩 Clang17            Pass: 100%/1   | Total:  5m 25s | Avg:  5m 25s | Max:  5m 25s
      🟩 Clang18            Pass: 100%/7   | Total:  1h 16m | Avg: 10m 56s | Max: 30m 40s
      🟩 GCC7               Pass: 100%/4   | Total: 18m 10s | Avg:  4m 32s | Max:  5m 10s
      🟩 GCC8               Pass: 100%/1   | Total:  5m 08s | Avg:  5m 08s | Max:  5m 08s
      🟩 GCC9               Pass: 100%/3   | Total: 13m 51s | Avg:  4m 37s | Max:  5m 18s
      🟩 GCC10              Pass: 100%/1   | Total:  5m 39s | Avg:  5m 39s | Max:  5m 39s
      🟩 GCC11              Pass: 100%/1   | Total:  5m 55s | Avg:  5m 55s | Max:  5m 55s
      🟩 GCC12              Pass: 100%/3   | Total: 26m 13s | Avg:  8m 44s | Max: 16m 14s
      🔍 GCC13              Pass:  87%/8   | Total:  2h 20m | Avg: 17m 30s | Max:  1h 11m
      🟩 MSVC14.29          Pass: 100%/1   | Total: 28m 19s | Avg: 28m 19s | Max: 28m 19s | Hits: 598%/780   
      🟩 MSVC14.39          Pass: 100%/2   | Total: 59m 38s | Avg: 29m 49s | Max: 30m 33s | Hits: 598%/1560  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 18m 52s | Avg:  9m 26s | Max:  9m 47s
    🔍 cxx_family: GCC 🔍
      🟩 Clang              Pass: 100%/19  | Total:  2h 22m | Avg:  7m 30s | Max: 30m 40s
      🔍 GCC                Pass:  95%/21  | Total:  3h 35m | Avg: 10m 14s | Max:  1h 11m
      🟩 MSVC               Pass: 100%/3   | Total:  1h 27m | Avg: 29m 19s | Max: 30m 33s | Hits: 598%/2340  
      🟩 NVHPC              Pass: 100%/2   | Total: 18m 52s | Avg:  9m 26s | Max:  9m 47s
    🔍 gpu: v100 🔍
      🟩 h100               Pass: 100%/2   | Total: 20m 21s | Avg: 10m 10s | Max: 16m 14s
      🔍 v100               Pass:  97%/43  | Total:  7h 24m | Avg: 10m 19s | Max:  1h 11m | Hits: 598%/2340  
    🔍 jobs: TestGPU 🔍
      🟩 Build              Pass: 100%/38  | Total:  4h 37m | Avg:  7m 17s | Max: 30m 33s | Hits: 598%/2340  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 17m 23s | Avg: 17m 23s | Max: 17m 23s
      🟩 GraphCapture       Pass: 100%/1   | Total: 14m 59s | Avg: 14m 59s | Max: 14m 59s
      🟩 HostLaunch         Pass: 100%/3   | Total: 53m 24s | Avg: 17m 48s | Max: 21m 02s
      🔍 TestGPU            Pass:  50%/2   | Total:  1h 41m | Avg: 50m 52s | Max:  1h 11m
    🔍 std: 20 🔍
      🟩 11                 Pass: 100%/5   | Total: 22m 51s | Avg:  4m 34s | Max:  5m 59s
      🟩 14                 Pass: 100%/2   | Total: 11m 29s | Avg:  5m 44s | Max:  6m 19s
      🟩 17                 Pass: 100%/12  | Total:  1h 55m | Avg:  9m 35s | Max: 30m 33s | Hits: 598%/1560  
      🔍 20                 Pass:  96%/26  | Total:  5h 15m | Avg: 12m 07s | Max:  1h 11m | Hits: 598%/780   
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 20m 21s | Avg: 10m 10s | Max: 16m 14s
      🟩 90a                Pass: 100%/1   | Total:  4m 18s | Avg:  4m 18s | Max:  4m 18s
    
  • 🟩 libcudacxx: Pass: 100%/46 | Total: 8h 20m | Avg: 10m 52s | Max: 37m 07s | Hits: 682%/7596

    🟩 cpu
      🟩 amd64              Pass: 100%/44  | Total:  8h 13m | Avg: 11m 12s | Max: 37m 07s | Hits: 682%/7596  
      🟩 arm64              Pass: 100%/2   | Total:  6m 58s | Avg:  3m 29s | Max:  3m 47s
    🟩 ctk
      🟩 11.1               Pass: 100%/6   | Total:  1h 11m | Avg: 11m 58s | Max: 23m 57s
      🟩 12.5               Pass: 100%/2   | Total: 16m 42s | Avg:  8m 21s | Max:  8m 27s
      🟩 12.6               Pass: 100%/38  | Total:  6h 51m | Avg: 10m 49s | Max: 37m 07s | Hits: 682%/7596  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/4   | Total:  1h 04m | Avg: 16m 13s | Max: 21m 28s
      🟩 nvcc11.1           Pass: 100%/6   | Total:  1h 11m | Avg: 11m 58s | Max: 23m 57s
      🟩 nvcc12.5           Pass: 100%/2   | Total: 16m 42s | Avg:  8m 21s | Max:  8m 27s
      🟩 nvcc12.6           Pass: 100%/34  | Total:  5h 46m | Avg: 10m 11s | Max: 37m 07s | Hits: 682%/7596  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/4   | Total:  1h 04m | Avg: 16m 13s | Max: 21m 28s
      🟩 nvcc               Pass: 100%/42  | Total:  7h 15m | Avg: 10m 21s | Max: 37m 07s | Hits: 682%/7596  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total: 36m 46s | Avg:  9m 11s | Max: 23m 57s
      🟩 Clang10            Pass: 100%/1   | Total: 14m 51s | Avg: 14m 51s | Max: 14m 51s
      🟩 Clang11            Pass: 100%/1   | Total:  4m 11s | Avg:  4m 11s | Max:  4m 11s
      🟩 Clang12            Pass: 100%/1   | Total:  4m 17s | Avg:  4m 17s | Max:  4m 17s
      🟩 Clang13            Pass: 100%/1   | Total:  4m 22s | Avg:  4m 22s | Max:  4m 22s
      🟩 Clang14            Pass: 100%/1   | Total:  4m 25s | Avg:  4m 25s | Max:  4m 25s
      🟩 Clang15            Pass: 100%/1   | Total:  4m 21s | Avg:  4m 21s | Max:  4m 21s
      🟩 Clang16            Pass: 100%/1   | Total:  4m 23s | Avg:  4m 23s | Max:  4m 23s
      🟩 Clang17            Pass: 100%/1   | Total:  4m 42s | Avg:  4m 42s | Max:  4m 42s
      🟩 Clang18            Pass: 100%/8   | Total:  1h 40m | Avg: 12m 31s | Max: 23m 02s
      🟩 GCC7               Pass: 100%/4   | Total: 45m 16s | Avg: 11m 19s | Max: 23m 12s
      🟩 GCC8               Pass: 100%/1   | Total:  3m 52s | Avg:  3m 52s | Max:  3m 52s
      🟩 GCC9               Pass: 100%/3   | Total:  9m 47s | Avg:  3m 15s | Max:  3m 58s
      🟩 GCC10              Pass: 100%/1   | Total:  4m 03s | Avg:  4m 03s | Max:  4m 03s
      🟩 GCC11              Pass: 100%/1   | Total:  3m 35s | Avg:  3m 35s | Max:  3m 35s
      🟩 GCC12              Pass: 100%/1   | Total:  4m 00s | Avg:  4m 00s | Max:  4m 00s
      🟩 GCC13              Pass: 100%/10  | Total:  2h 27m | Avg: 14m 43s | Max: 37m 07s
      🟩 MSVC14.29          Pass: 100%/1   | Total: 26m 27s | Avg: 26m 27s | Max: 26m 27s | Hits: 682%/2483  
      🟩 MSVC14.39          Pass: 100%/2   | Total: 56m 35s | Avg: 28m 17s | Max: 29m 29s | Hits: 681%/5113  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 16m 42s | Avg:  8m 21s | Max:  8m 27s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/20  | Total:  3h 02m | Avg:  9m 07s | Max: 23m 57s
      🟩 GCC                Pass: 100%/21  | Total:  3h 37m | Avg: 10m 22s | Max: 37m 07s
      🟩 MSVC               Pass: 100%/3   | Total:  1h 23m | Avg: 27m 40s | Max: 29m 29s | Hits: 682%/7596  
      🟩 NVHPC              Pass: 100%/2   | Total: 16m 42s | Avg:  8m 21s | Max:  8m 27s
    🟩 gpu
      🟩 v100               Pass: 100%/46  | Total:  8h 20m | Avg: 10m 52s | Max: 37m 07s | Hits: 682%/7596  
    🟩 jobs
      🟩 Build              Pass: 100%/39  | Total:  5h 44m | Avg:  8m 50s | Max: 29m 29s | Hits: 682%/7596  
      🟩 NVRTC              Pass: 100%/4   | Total:  1h 54m | Avg: 28m 41s | Max: 37m 07s
      🟩 Test               Pass: 100%/2   | Total: 38m 33s | Avg: 19m 16s | Max: 23m 02s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  1m 54s | Avg:  1m 54s | Max:  1m 54s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total: 13m 17s | Avg: 13m 17s | Max: 13m 17s
      🟩 90a                Pass: 100%/2   | Total: 15m 43s | Avg:  7m 51s | Max: 12m 01s
    🟩 std
      🟩 11                 Pass: 100%/6   | Total:  1h 16m | Avg: 12m 42s | Max: 23m 57s
      🟩 14                 Pass: 100%/3   | Total: 45m 49s | Avg: 15m 16s | Max: 37m 07s
      🟩 17                 Pass: 100%/13  | Total:  2h 49m | Avg: 13m 03s | Max: 36m 57s | Hits: 682%/4966  
      🟩 20                 Pass: 100%/23  | Total:  3h 26m | Avg:  8m 58s | Max: 29m 29s | Hits: 681%/2630  
    
  • 🟩 thrust: Pass: 100%/44 | Total: 6h 41m | Avg: 9m 07s | Max: 34m 16s | Hits: 365%/7408

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 18m 53s | Avg:  9m 26s | Max: 13m 15s
    🟩 cpu
      🟩 amd64              Pass: 100%/42  | Total:  6h 31m | Avg:  9m 19s | Max: 34m 16s | Hits: 365%/7408  
      🟩 arm64              Pass: 100%/2   | Total:  9m 32s | Avg:  4m 46s | Max:  4m 54s
    🟩 ctk
      🟩 11.1               Pass: 100%/6   | Total: 24m 11s | Avg:  4m 01s | Max:  4m 29s
      🟩 12.5               Pass: 100%/2   | Total: 27m 55s | Avg: 13m 57s | Max: 14m 17s
      🟩 12.6               Pass: 100%/36  | Total:  5h 49m | Avg:  9m 42s | Max: 34m 16s | Hits: 365%/7408  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 10m 29s | Avg:  5m 14s | Max:  5m 18s
      🟩 nvcc11.1           Pass: 100%/6   | Total: 24m 11s | Avg:  4m 01s | Max:  4m 29s
      🟩 nvcc12.5           Pass: 100%/2   | Total: 27m 55s | Avg: 13m 57s | Max: 14m 17s
      🟩 nvcc12.6           Pass: 100%/34  | Total:  5h 38m | Avg:  9m 57s | Max: 34m 16s | Hits: 365%/7408  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 29s | Avg:  5m 14s | Max:  5m 18s
      🟩 nvcc               Pass: 100%/42  | Total:  6h 30m | Avg:  9m 18s | Max: 34m 16s | Hits: 365%/7408  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total: 20m 44s | Avg:  5m 11s | Max:  6m 49s
      🟩 Clang10            Pass: 100%/1   | Total:  6m 25s | Avg:  6m 25s | Max:  6m 25s
      🟩 Clang11            Pass: 100%/1   | Total:  5m 07s | Avg:  5m 07s | Max:  5m 07s
      🟩 Clang12            Pass: 100%/1   | Total:  5m 35s | Avg:  5m 35s | Max:  5m 35s
      🟩 Clang13            Pass: 100%/1   | Total:  5m 21s | Avg:  5m 21s | Max:  5m 21s
      🟩 Clang14            Pass: 100%/1   | Total:  5m 10s | Avg:  5m 10s | Max:  5m 10s
      🟩 Clang15            Pass: 100%/1   | Total:  5m 39s | Avg:  5m 39s | Max:  5m 39s
      🟩 Clang16            Pass: 100%/1   | Total:  5m 16s | Avg:  5m 16s | Max:  5m 16s
      🟩 Clang17            Pass: 100%/1   | Total:  5m 08s | Avg:  5m 08s | Max:  5m 08s
      🟩 Clang18            Pass: 100%/7   | Total: 45m 56s | Avg:  6m 33s | Max: 12m 20s
      🟩 GCC7               Pass: 100%/4   | Total: 44m 46s | Avg: 11m 11s | Max: 31m 46s
      🟩 GCC8               Pass: 100%/1   | Total:  5m 21s | Avg:  5m 21s | Max:  5m 21s
      🟩 GCC9               Pass: 100%/3   | Total: 13m 43s | Avg:  4m 34s | Max:  5m 38s
      🟩 GCC10              Pass: 100%/1   | Total:  5m 08s | Avg:  5m 08s | Max:  5m 08s
      🟩 GCC11              Pass: 100%/1   | Total:  6m 03s | Avg:  6m 03s | Max:  6m 03s
      🟩 GCC12              Pass: 100%/1   | Total:  6m 04s | Avg:  6m 04s | Max:  6m 04s
      🟩 GCC13              Pass: 100%/8   | Total: 59m 08s | Avg:  7m 23s | Max: 13m 15s
      🟩 MSVC14.29          Pass: 100%/1   | Total: 29m 21s | Avg: 29m 21s | Max: 29m 21s | Hits: 365%/1852  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  1h 33m | Avg: 31m 10s | Max: 34m 16s | Hits: 365%/5556  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 27m 55s | Avg: 13m 57s | Max: 14m 17s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total:  1h 50m | Avg:  5m 48s | Max: 12m 20s
      🟩 GCC                Pass: 100%/19  | Total:  2h 20m | Avg:  7m 22s | Max: 31m 46s
      🟩 MSVC               Pass: 100%/4   | Total:  2h 02m | Avg: 30m 42s | Max: 34m 16s | Hits: 365%/7408  
      🟩 NVHPC              Pass: 100%/2   | Total: 27m 55s | Avg: 13m 57s | Max: 14m 17s
    🟩 gpu
      🟩 v100               Pass: 100%/44  | Total:  6h 41m | Avg:  9m 07s | Max: 34m 16s | Hits: 365%/7408  
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total:  5h 14m | Avg:  8m 16s | Max: 31m 46s | Hits: 365%/5556  
      🟩 TestCPU            Pass: 100%/3   | Total: 49m 35s | Avg: 16m 31s | Max: 34m 16s | Hits: 365%/1852  
      🟩 TestGPU            Pass: 100%/3   | Total: 37m 09s | Avg: 12m 23s | Max: 13m 15s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total:  4m 19s | Avg:  4m 19s | Max:  4m 19s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total: 48m 28s | Avg:  9m 41s | Max: 31m 46s
      🟩 14                 Pass: 100%/2   | Total: 12m 11s | Avg:  6m 05s | Max:  6m 49s
      🟩 17                 Pass: 100%/12  | Total:  1h 58m | Avg:  9m 52s | Max: 29m 21s | Hits: 365%/3704  
      🟩 20                 Pass: 100%/23  | Total:  3h 23m | Avg:  8m 50s | Max: 34m 16s | Hits: 365%/3704  
    
  • 🟩 cudax: Pass: 100%/26 | Total: 2h 05m | Avg: 4m 49s | Max: 16m 14s | Hits: 582%/312

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  1h 55m | Avg:  5m 14s | Max: 16m 14s | Hits: 582%/312   
      🟩 arm64              Pass: 100%/4   | Total: 10m 24s | Avg:  2m 36s | Max:  2m 45s
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 17m 13s | Avg:  5m 44s | Max: 11m 16s | Hits: 582%/156   
      🟩 12.5               Pass: 100%/2   | Total:  9m 55s | Avg:  4m 57s | Max:  5m 00s
      🟩 12.6               Pass: 100%/21  | Total:  1h 38m | Avg:  4m 41s | Max: 16m 14s | Hits: 582%/156   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 17m 13s | Avg:  5m 44s | Max: 11m 16s | Hits: 582%/156   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  9m 55s | Avg:  4m 57s | Max:  5m 00s
      🟩 nvcc12.6           Pass: 100%/21  | Total:  1h 38m | Avg:  4m 41s | Max: 16m 14s | Hits: 582%/156   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/26  | Total:  2h 05m | Avg:  4m 49s | Max: 16m 14s | Hits: 582%/312   
    🟩 cxx
      🟩 Clang9             Pass: 100%/1   | Total:  3m 04s | Avg:  3m 04s | Max:  3m 04s
      🟩 Clang10            Pass: 100%/1   | Total:  3m 31s | Avg:  3m 31s | Max:  3m 31s
      🟩 Clang11            Pass: 100%/1   | Total:  3m 13s | Avg:  3m 13s | Max:  3m 13s
      🟩 Clang12            Pass: 100%/1   | Total:  2m 56s | Avg:  2m 56s | Max:  2m 56s
      🟩 Clang13            Pass: 100%/1   | Total:  3m 02s | Avg:  3m 02s | Max:  3m 02s
      🟩 Clang14            Pass: 100%/1   | Total:  3m 07s | Avg:  3m 07s | Max:  3m 07s
      🟩 Clang15            Pass: 100%/1   | Total:  3m 16s | Avg:  3m 16s | Max:  3m 16s
      🟩 Clang16            Pass: 100%/1   | Total:  3m 33s | Avg:  3m 33s | Max:  3m 33s
      🟩 Clang17            Pass: 100%/1   | Total:  3m 13s | Avg:  3m 13s | Max:  3m 13s
      🟩 Clang18            Pass: 100%/4   | Total: 24m 39s | Avg:  6m 09s | Max: 15m 50s
      🟩 GCC9               Pass: 100%/1   | Total:  2m 53s | Avg:  2m 53s | Max:  2m 53s
      🟩 GCC10              Pass: 100%/1   | Total:  3m 11s | Avg:  3m 11s | Max:  3m 11s
      🟩 GCC11              Pass: 100%/1   | Total:  3m 21s | Avg:  3m 21s | Max:  3m 21s
      🟩 GCC12              Pass: 100%/2   | Total: 19m 42s | Avg:  9m 51s | Max: 16m 14s
      🟩 GCC13              Pass: 100%/4   | Total: 10m 30s | Avg:  2m 37s | Max:  2m 55s
      🟩 MSVC14.36          Pass: 100%/1   | Total: 11m 16s | Avg: 11m 16s | Max: 11m 16s | Hits: 582%/156   
      🟩 MSVC14.39          Pass: 100%/1   | Total: 11m 14s | Avg: 11m 14s | Max: 11m 14s | Hits: 582%/156   
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  9m 55s | Avg:  4m 57s | Max:  5m 00s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/13  | Total: 53m 34s | Avg:  4m 07s | Max: 15m 50s
      🟩 GCC                Pass: 100%/9   | Total: 39m 37s | Avg:  4m 24s | Max: 16m 14s
      🟩 MSVC               Pass: 100%/2   | Total: 22m 30s | Avg: 11m 15s | Max: 11m 16s | Hits: 582%/312   
      🟩 NVHPC              Pass: 100%/2   | Total:  9m 55s | Avg:  4m 57s | Max:  5m 00s
    🟩 gpu
      🟩 v100               Pass: 100%/26  | Total:  2h 05m | Avg:  4m 49s | Max: 16m 14s | Hits: 582%/312   
    🟩 jobs
      🟩 Build              Pass: 100%/24  | Total:  1h 33m | Avg:  3m 53s | Max: 11m 16s | Hits: 582%/312   
      🟩 Test               Pass: 100%/2   | Total: 32m 04s | Avg: 16m 02s | Max: 16m 14s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  2m 35s | Avg:  2m 35s | Max:  2m 35s
      🟩 90a                Pass: 100%/1   | Total:  2m 55s | Avg:  2m 55s | Max:  2m 55s
    🟩 std
      🟩 17                 Pass: 100%/6   | Total: 18m 40s | Avg:  3m 06s | Max:  5m 00s
      🟩 20                 Pass: 100%/20  | Total:  1h 46m | Avg:  5m 20s | Max: 16m 14s | Hits: 582%/312   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 9m 16s | Avg: 4m 38s | Max: 7m 13s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  9m 16s | Avg:  4m 38s | Max:  7m 13s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  9m 16s | Avg:  4m 38s | Max:  7m 13s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  9m 16s | Avg:  4m 38s | Max:  7m 13s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  9m 16s | Avg:  4m 38s | Max:  7m 13s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  9m 16s | Avg:  4m 38s | Max:  7m 13s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  9m 16s | Avg:  4m 38s | Max:  7m 13s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total:  9m 16s | Avg:  4m 38s | Max:  7m 13s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 03s | Avg:  2m 03s | Max:  2m 03s
      🟩 Test               Pass: 100%/1   | Total:  7m 13s | Avg:  7m 13s | Max:  7m 13s
    
  • 🟩 python: Pass: 100%/1 | Total: 27m 12s | Avg: 27m 12s | Max: 27m 12s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 27m 12s | Avg: 27m 12s | Max: 27m 12s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 27m 12s | Avg: 27m 12s | Max: 27m 12s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 27m 12s | Avg: 27m 12s | Max: 27m 12s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 27m 12s | Avg: 27m 12s | Max: 27m 12s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 27m 12s | Avg: 27m 12s | Max: 27m 12s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 27m 12s | Avg: 27m 12s | Max: 27m 12s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 27m 12s | Avg: 27m 12s | Max: 27m 12s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 27m 12s | Avg: 27m 12s | Max: 27m 12s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 164)

# Runner
122 linux-amd64-cpu16
19 linux-amd64-gpu-v100-latest-1
12 windows-amd64-cpu16
10 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

Contributor

github-actions bot commented Jan 9, 2025

🟩 CI finished in 13h 56m: Pass: 100%/164 | Total: 1d 00h | Avg: 9m 00s | Max: 37m 07s | Hits: 536%/17656
  • 🟩 libcudacxx: Pass: 100%/46 | Total: 8h 20m | Avg: 10m 52s | Max: 37m 07s | Hits: 682%/7596

    🟩 cpu
      🟩 amd64              Pass: 100%/44  | Total:  8h 13m | Avg: 11m 12s | Max: 37m 07s | Hits: 682%/7596  
      🟩 arm64              Pass: 100%/2   | Total:  6m 58s | Avg:  3m 29s | Max:  3m 47s
    🟩 ctk
      🟩 11.1               Pass: 100%/6   | Total:  1h 11m | Avg: 11m 58s | Max: 23m 57s
      🟩 12.5               Pass: 100%/2   | Total: 16m 42s | Avg:  8m 21s | Max:  8m 27s
      🟩 12.6               Pass: 100%/38  | Total:  6h 51m | Avg: 10m 49s | Max: 37m 07s | Hits: 682%/7596  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/4   | Total:  1h 04m | Avg: 16m 13s | Max: 21m 28s
      🟩 nvcc11.1           Pass: 100%/6   | Total:  1h 11m | Avg: 11m 58s | Max: 23m 57s
      🟩 nvcc12.5           Pass: 100%/2   | Total: 16m 42s | Avg:  8m 21s | Max:  8m 27s
      🟩 nvcc12.6           Pass: 100%/34  | Total:  5h 46m | Avg: 10m 11s | Max: 37m 07s | Hits: 682%/7596  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/4   | Total:  1h 04m | Avg: 16m 13s | Max: 21m 28s
      🟩 nvcc               Pass: 100%/42  | Total:  7h 15m | Avg: 10m 21s | Max: 37m 07s | Hits: 682%/7596  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total: 36m 46s | Avg:  9m 11s | Max: 23m 57s
      🟩 Clang10            Pass: 100%/1   | Total: 14m 51s | Avg: 14m 51s | Max: 14m 51s
      🟩 Clang11            Pass: 100%/1   | Total:  4m 11s | Avg:  4m 11s | Max:  4m 11s
      🟩 Clang12            Pass: 100%/1   | Total:  4m 17s | Avg:  4m 17s | Max:  4m 17s
      🟩 Clang13            Pass: 100%/1   | Total:  4m 22s | Avg:  4m 22s | Max:  4m 22s
      🟩 Clang14            Pass: 100%/1   | Total:  4m 25s | Avg:  4m 25s | Max:  4m 25s
      🟩 Clang15            Pass: 100%/1   | Total:  4m 21s | Avg:  4m 21s | Max:  4m 21s
      🟩 Clang16            Pass: 100%/1   | Total:  4m 23s | Avg:  4m 23s | Max:  4m 23s
      🟩 Clang17            Pass: 100%/1   | Total:  4m 42s | Avg:  4m 42s | Max:  4m 42s
      🟩 Clang18            Pass: 100%/8   | Total:  1h 40m | Avg: 12m 31s | Max: 23m 02s
      🟩 GCC7               Pass: 100%/4   | Total: 45m 16s | Avg: 11m 19s | Max: 23m 12s
      🟩 GCC8               Pass: 100%/1   | Total:  3m 52s | Avg:  3m 52s | Max:  3m 52s
      🟩 GCC9               Pass: 100%/3   | Total:  9m 47s | Avg:  3m 15s | Max:  3m 58s
      🟩 GCC10              Pass: 100%/1   | Total:  4m 03s | Avg:  4m 03s | Max:  4m 03s
      🟩 GCC11              Pass: 100%/1   | Total:  3m 35s | Avg:  3m 35s | Max:  3m 35s
      🟩 GCC12              Pass: 100%/1   | Total:  4m 00s | Avg:  4m 00s | Max:  4m 00s
      🟩 GCC13              Pass: 100%/10  | Total:  2h 27m | Avg: 14m 43s | Max: 37m 07s
      🟩 MSVC14.29          Pass: 100%/1   | Total: 26m 27s | Avg: 26m 27s | Max: 26m 27s | Hits: 682%/2483  
      🟩 MSVC14.39          Pass: 100%/2   | Total: 56m 35s | Avg: 28m 17s | Max: 29m 29s | Hits: 681%/5113  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 16m 42s | Avg:  8m 21s | Max:  8m 27s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/20  | Total:  3h 02m | Avg:  9m 07s | Max: 23m 57s
      🟩 GCC                Pass: 100%/21  | Total:  3h 37m | Avg: 10m 22s | Max: 37m 07s
      🟩 MSVC               Pass: 100%/3   | Total:  1h 23m | Avg: 27m 40s | Max: 29m 29s | Hits: 682%/7596  
      🟩 NVHPC              Pass: 100%/2   | Total: 16m 42s | Avg:  8m 21s | Max:  8m 27s
    🟩 gpu
      🟩 v100               Pass: 100%/46  | Total:  8h 20m | Avg: 10m 52s | Max: 37m 07s | Hits: 682%/7596  
    🟩 jobs
      🟩 Build              Pass: 100%/39  | Total:  5h 44m | Avg:  8m 50s | Max: 29m 29s | Hits: 682%/7596  
      🟩 NVRTC              Pass: 100%/4   | Total:  1h 54m | Avg: 28m 41s | Max: 37m 07s
      🟩 Test               Pass: 100%/2   | Total: 38m 33s | Avg: 19m 16s | Max: 23m 02s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  1m 54s | Avg:  1m 54s | Max:  1m 54s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total: 13m 17s | Avg: 13m 17s | Max: 13m 17s
      🟩 90a                Pass: 100%/2   | Total: 15m 43s | Avg:  7m 51s | Max: 12m 01s
    🟩 std
      🟩 11                 Pass: 100%/6   | Total:  1h 16m | Avg: 12m 42s | Max: 23m 57s
      🟩 14                 Pass: 100%/3   | Total: 45m 49s | Avg: 15m 16s | Max: 37m 07s
      🟩 17                 Pass: 100%/13  | Total:  2h 49m | Avg: 13m 03s | Max: 36m 57s | Hits: 682%/4966  
      🟩 20                 Pass: 100%/23  | Total:  3h 26m | Avg:  8m 58s | Max: 29m 29s | Hits: 681%/2630  
    
  • 🟩 cub: Pass: 100%/45 | Total: 6h 54m | Avg: 9m 12s | Max: 30m 40s | Hits: 598%/2340

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  6h 45m | Avg:  9m 25s | Max: 30m 40s | Hits: 598%/2340  
      🟩 arm64              Pass: 100%/2   | Total:  9m 32s | Avg:  4m 46s | Max:  4m 52s
    🟩 ctk
      🟩 11.1               Pass: 100%/6   | Total: 25m 39s | Avg:  4m 16s | Max:  4m 45s
      🟩 12.5               Pass: 100%/2   | Total: 18m 52s | Avg:  9m 26s | Max:  9m 47s
      🟩 12.6               Pass: 100%/37  | Total:  6h 10m | Avg: 10m 00s | Max: 30m 40s | Hits: 598%/2340  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  8m 44s | Avg:  4m 22s | Max:  4m 31s
      🟩 nvcc11.1           Pass: 100%/6   | Total: 25m 39s | Avg:  4m 16s | Max:  4m 45s
      🟩 nvcc12.5           Pass: 100%/2   | Total: 18m 52s | Avg:  9m 26s | Max:  9m 47s
      🟩 nvcc12.6           Pass: 100%/35  | Total:  6h 01m | Avg: 10m 19s | Max: 30m 40s | Hits: 598%/2340  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  8m 44s | Avg:  4m 22s | Max:  4m 31s
      🟩 nvcc               Pass: 100%/43  | Total:  6h 45m | Avg:  9m 26s | Max: 30m 40s | Hits: 598%/2340  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total: 21m 17s | Avg:  5m 19s | Max:  6m 19s
      🟩 Clang10            Pass: 100%/1   | Total:  6m 55s | Avg:  6m 55s | Max:  6m 55s
      🟩 Clang11            Pass: 100%/1   | Total:  5m 24s | Avg:  5m 24s | Max:  5m 24s
      🟩 Clang12            Pass: 100%/1   | Total:  5m 22s | Avg:  5m 22s | Max:  5m 22s
      🟩 Clang13            Pass: 100%/1   | Total:  5m 11s | Avg:  5m 11s | Max:  5m 11s
      🟩 Clang14            Pass: 100%/1   | Total:  5m 08s | Avg:  5m 08s | Max:  5m 08s
      🟩 Clang15            Pass: 100%/1   | Total:  5m 44s | Avg:  5m 44s | Max:  5m 44s
      🟩 Clang16            Pass: 100%/1   | Total:  5m 38s | Avg:  5m 38s | Max:  5m 38s
      🟩 Clang17            Pass: 100%/1   | Total:  5m 25s | Avg:  5m 25s | Max:  5m 25s
      🟩 Clang18            Pass: 100%/7   | Total:  1h 16m | Avg: 10m 56s | Max: 30m 40s
      🟩 GCC7               Pass: 100%/4   | Total: 18m 10s | Avg:  4m 32s | Max:  5m 10s
      🟩 GCC8               Pass: 100%/1   | Total:  5m 08s | Avg:  5m 08s | Max:  5m 08s
      🟩 GCC9               Pass: 100%/3   | Total: 13m 51s | Avg:  4m 37s | Max:  5m 18s
      🟩 GCC10              Pass: 100%/1   | Total:  5m 39s | Avg:  5m 39s | Max:  5m 39s
      🟩 GCC11              Pass: 100%/1   | Total:  5m 55s | Avg:  5m 55s | Max:  5m 55s
      🟩 GCC12              Pass: 100%/3   | Total: 26m 13s | Avg:  8m 44s | Max: 16m 14s
      🟩 GCC13              Pass: 100%/8   | Total:  1h 30m | Avg: 11m 16s | Max: 21m 10s
      🟩 MSVC14.29          Pass: 100%/1   | Total: 28m 19s | Avg: 28m 19s | Max: 28m 19s | Hits: 598%/780   
      🟩 MSVC14.39          Pass: 100%/2   | Total: 59m 38s | Avg: 29m 49s | Max: 30m 33s | Hits: 598%/1560  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 18m 52s | Avg:  9m 26s | Max:  9m 47s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total:  2h 22m | Avg:  7m 30s | Max: 30m 40s
      🟩 GCC                Pass: 100%/21  | Total:  2h 45m | Avg:  7m 51s | Max: 21m 10s
      🟩 MSVC               Pass: 100%/3   | Total:  1h 27m | Avg: 29m 19s | Max: 30m 33s | Hits: 598%/2340  
      🟩 NVHPC              Pass: 100%/2   | Total: 18m 52s | Avg:  9m 26s | Max:  9m 47s
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 20m 21s | Avg: 10m 10s | Max: 16m 14s
      🟩 v100               Pass: 100%/43  | Total:  6h 34m | Avg:  9m 10s | Max: 30m 40s | Hits: 598%/2340  
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total:  4h 37m | Avg:  7m 17s | Max: 30m 33s | Hits: 598%/2340  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 17m 23s | Avg: 17m 23s | Max: 17m 23s
      🟩 GraphCapture       Pass: 100%/1   | Total: 14m 59s | Avg: 14m 59s | Max: 14m 59s
      🟩 HostLaunch         Pass: 100%/3   | Total: 53m 24s | Avg: 17m 48s | Max: 21m 02s
      🟩 TestGPU            Pass: 100%/2   | Total: 51m 50s | Avg: 25m 55s | Max: 30m 40s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 20m 21s | Avg: 10m 10s | Max: 16m 14s
      🟩 90a                Pass: 100%/1   | Total:  4m 18s | Avg:  4m 18s | Max:  4m 18s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total: 22m 51s | Avg:  4m 34s | Max:  5m 59s
      🟩 14                 Pass: 100%/2   | Total: 11m 29s | Avg:  5m 44s | Max:  6m 19s
      🟩 17                 Pass: 100%/12  | Total:  1h 55m | Avg:  9m 35s | Max: 30m 33s | Hits: 598%/1560  
      🟩 20                 Pass: 100%/26  | Total:  4h 25m | Avg: 10m 12s | Max: 30m 40s | Hits: 598%/780   
    
  • 🟩 thrust: Pass: 100%/44 | Total: 6h 41m | Avg: 9m 07s | Max: 34m 16s | Hits: 365%/7408

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 18m 53s | Avg:  9m 26s | Max: 13m 15s
    🟩 cpu
      🟩 amd64              Pass: 100%/42  | Total:  6h 31m | Avg:  9m 19s | Max: 34m 16s | Hits: 365%/7408  
      🟩 arm64              Pass: 100%/2   | Total:  9m 32s | Avg:  4m 46s | Max:  4m 54s
    🟩 ctk
      🟩 11.1               Pass: 100%/6   | Total: 24m 11s | Avg:  4m 01s | Max:  4m 29s
      🟩 12.5               Pass: 100%/2   | Total: 27m 55s | Avg: 13m 57s | Max: 14m 17s
      🟩 12.6               Pass: 100%/36  | Total:  5h 49m | Avg:  9m 42s | Max: 34m 16s | Hits: 365%/7408  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 10m 29s | Avg:  5m 14s | Max:  5m 18s
      🟩 nvcc11.1           Pass: 100%/6   | Total: 24m 11s | Avg:  4m 01s | Max:  4m 29s
      🟩 nvcc12.5           Pass: 100%/2   | Total: 27m 55s | Avg: 13m 57s | Max: 14m 17s
      🟩 nvcc12.6           Pass: 100%/34  | Total:  5h 38m | Avg:  9m 57s | Max: 34m 16s | Hits: 365%/7408  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 29s | Avg:  5m 14s | Max:  5m 18s
      🟩 nvcc               Pass: 100%/42  | Total:  6h 30m | Avg:  9m 18s | Max: 34m 16s | Hits: 365%/7408  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total: 20m 44s | Avg:  5m 11s | Max:  6m 49s
      🟩 Clang10            Pass: 100%/1   | Total:  6m 25s | Avg:  6m 25s | Max:  6m 25s
      🟩 Clang11            Pass: 100%/1   | Total:  5m 07s | Avg:  5m 07s | Max:  5m 07s
      🟩 Clang12            Pass: 100%/1   | Total:  5m 35s | Avg:  5m 35s | Max:  5m 35s
      🟩 Clang13            Pass: 100%/1   | Total:  5m 21s | Avg:  5m 21s | Max:  5m 21s
      🟩 Clang14            Pass: 100%/1   | Total:  5m 10s | Avg:  5m 10s | Max:  5m 10s
      🟩 Clang15            Pass: 100%/1   | Total:  5m 39s | Avg:  5m 39s | Max:  5m 39s
      🟩 Clang16            Pass: 100%/1   | Total:  5m 16s | Avg:  5m 16s | Max:  5m 16s
      🟩 Clang17            Pass: 100%/1   | Total:  5m 08s | Avg:  5m 08s | Max:  5m 08s
      🟩 Clang18            Pass: 100%/7   | Total: 45m 56s | Avg:  6m 33s | Max: 12m 20s
      🟩 GCC7               Pass: 100%/4   | Total: 44m 46s | Avg: 11m 11s | Max: 31m 46s
      🟩 GCC8               Pass: 100%/1   | Total:  5m 21s | Avg:  5m 21s | Max:  5m 21s
      🟩 GCC9               Pass: 100%/3   | Total: 13m 43s | Avg:  4m 34s | Max:  5m 38s
      🟩 GCC10              Pass: 100%/1   | Total:  5m 08s | Avg:  5m 08s | Max:  5m 08s
      🟩 GCC11              Pass: 100%/1   | Total:  6m 03s | Avg:  6m 03s | Max:  6m 03s
      🟩 GCC12              Pass: 100%/1   | Total:  6m 04s | Avg:  6m 04s | Max:  6m 04s
      🟩 GCC13              Pass: 100%/8   | Total: 59m 08s | Avg:  7m 23s | Max: 13m 15s
      🟩 MSVC14.29          Pass: 100%/1   | Total: 29m 21s | Avg: 29m 21s | Max: 29m 21s | Hits: 365%/1852  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  1h 33m | Avg: 31m 10s | Max: 34m 16s | Hits: 365%/5556  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 27m 55s | Avg: 13m 57s | Max: 14m 17s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total:  1h 50m | Avg:  5m 48s | Max: 12m 20s
      🟩 GCC                Pass: 100%/19  | Total:  2h 20m | Avg:  7m 22s | Max: 31m 46s
      🟩 MSVC               Pass: 100%/4   | Total:  2h 02m | Avg: 30m 42s | Max: 34m 16s | Hits: 365%/7408  
      🟩 NVHPC              Pass: 100%/2   | Total: 27m 55s | Avg: 13m 57s | Max: 14m 17s
    🟩 gpu
      🟩 v100               Pass: 100%/44  | Total:  6h 41m | Avg:  9m 07s | Max: 34m 16s | Hits: 365%/7408  
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total:  5h 14m | Avg:  8m 16s | Max: 31m 46s | Hits: 365%/5556  
      🟩 TestCPU            Pass: 100%/3   | Total: 49m 35s | Avg: 16m 31s | Max: 34m 16s | Hits: 365%/1852  
      🟩 TestGPU            Pass: 100%/3   | Total: 37m 09s | Avg: 12m 23s | Max: 13m 15s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total:  4m 19s | Avg:  4m 19s | Max:  4m 19s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total: 48m 28s | Avg:  9m 41s | Max: 31m 46s
      🟩 14                 Pass: 100%/2   | Total: 12m 11s | Avg:  6m 05s | Max:  6m 49s
      🟩 17                 Pass: 100%/12  | Total:  1h 58m | Avg:  9m 52s | Max: 29m 21s | Hits: 365%/3704  
      🟩 20                 Pass: 100%/23  | Total:  3h 23m | Avg:  8m 50s | Max: 34m 16s | Hits: 365%/3704  
    
  • 🟩 cudax: Pass: 100%/26 | Total: 2h 05m | Avg: 4m 49s | Max: 16m 14s | Hits: 582%/312

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  1h 55m | Avg:  5m 14s | Max: 16m 14s | Hits: 582%/312   
      🟩 arm64              Pass: 100%/4   | Total: 10m 24s | Avg:  2m 36s | Max:  2m 45s
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 17m 13s | Avg:  5m 44s | Max: 11m 16s | Hits: 582%/156   
      🟩 12.5               Pass: 100%/2   | Total:  9m 55s | Avg:  4m 57s | Max:  5m 00s
      🟩 12.6               Pass: 100%/21  | Total:  1h 38m | Avg:  4m 41s | Max: 16m 14s | Hits: 582%/156   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 17m 13s | Avg:  5m 44s | Max: 11m 16s | Hits: 582%/156   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  9m 55s | Avg:  4m 57s | Max:  5m 00s
      🟩 nvcc12.6           Pass: 100%/21  | Total:  1h 38m | Avg:  4m 41s | Max: 16m 14s | Hits: 582%/156   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/26  | Total:  2h 05m | Avg:  4m 49s | Max: 16m 14s | Hits: 582%/312   
    🟩 cxx
      🟩 Clang9             Pass: 100%/1   | Total:  3m 04s | Avg:  3m 04s | Max:  3m 04s
      🟩 Clang10            Pass: 100%/1   | Total:  3m 31s | Avg:  3m 31s | Max:  3m 31s
      🟩 Clang11            Pass: 100%/1   | Total:  3m 13s | Avg:  3m 13s | Max:  3m 13s
      🟩 Clang12            Pass: 100%/1   | Total:  2m 56s | Avg:  2m 56s | Max:  2m 56s
      🟩 Clang13            Pass: 100%/1   | Total:  3m 02s | Avg:  3m 02s | Max:  3m 02s
      🟩 Clang14            Pass: 100%/1   | Total:  3m 07s | Avg:  3m 07s | Max:  3m 07s
      🟩 Clang15            Pass: 100%/1   | Total:  3m 16s | Avg:  3m 16s | Max:  3m 16s
      🟩 Clang16            Pass: 100%/1   | Total:  3m 33s | Avg:  3m 33s | Max:  3m 33s
      🟩 Clang17            Pass: 100%/1   | Total:  3m 13s | Avg:  3m 13s | Max:  3m 13s
      🟩 Clang18            Pass: 100%/4   | Total: 24m 39s | Avg:  6m 09s | Max: 15m 50s
      🟩 GCC9               Pass: 100%/1   | Total:  2m 53s | Avg:  2m 53s | Max:  2m 53s
      🟩 GCC10              Pass: 100%/1   | Total:  3m 11s | Avg:  3m 11s | Max:  3m 11s
      🟩 GCC11              Pass: 100%/1   | Total:  3m 21s | Avg:  3m 21s | Max:  3m 21s
      🟩 GCC12              Pass: 100%/2   | Total: 19m 42s | Avg:  9m 51s | Max: 16m 14s
      🟩 GCC13              Pass: 100%/4   | Total: 10m 30s | Avg:  2m 37s | Max:  2m 55s
      🟩 MSVC14.36          Pass: 100%/1   | Total: 11m 16s | Avg: 11m 16s | Max: 11m 16s | Hits: 582%/156   
      🟩 MSVC14.39          Pass: 100%/1   | Total: 11m 14s | Avg: 11m 14s | Max: 11m 14s | Hits: 582%/156   
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  9m 55s | Avg:  4m 57s | Max:  5m 00s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/13  | Total: 53m 34s | Avg:  4m 07s | Max: 15m 50s
      🟩 GCC                Pass: 100%/9   | Total: 39m 37s | Avg:  4m 24s | Max: 16m 14s
      🟩 MSVC               Pass: 100%/2   | Total: 22m 30s | Avg: 11m 15s | Max: 11m 16s | Hits: 582%/312   
      🟩 NVHPC              Pass: 100%/2   | Total:  9m 55s | Avg:  4m 57s | Max:  5m 00s
    🟩 gpu
      🟩 v100               Pass: 100%/26  | Total:  2h 05m | Avg:  4m 49s | Max: 16m 14s | Hits: 582%/312   
    🟩 jobs
      🟩 Build              Pass: 100%/24  | Total:  1h 33m | Avg:  3m 53s | Max: 11m 16s | Hits: 582%/312   
      🟩 Test               Pass: 100%/2   | Total: 32m 04s | Avg: 16m 02s | Max: 16m 14s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  2m 35s | Avg:  2m 35s | Max:  2m 35s
      🟩 90a                Pass: 100%/1   | Total:  2m 55s | Avg:  2m 55s | Max:  2m 55s
    🟩 std
      🟩 17                 Pass: 100%/6   | Total: 18m 40s | Avg:  3m 06s | Max:  5m 00s
      🟩 20                 Pass: 100%/20  | Total:  1h 46m | Avg:  5m 20s | Max: 16m 14s | Hits: 582%/312   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 9m 16s | Avg: 4m 38s | Max: 7m 13s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  9m 16s | Avg:  4m 38s | Max:  7m 13s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  9m 16s | Avg:  4m 38s | Max:  7m 13s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  9m 16s | Avg:  4m 38s | Max:  7m 13s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  9m 16s | Avg:  4m 38s | Max:  7m 13s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  9m 16s | Avg:  4m 38s | Max:  7m 13s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  9m 16s | Avg:  4m 38s | Max:  7m 13s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total:  9m 16s | Avg:  4m 38s | Max:  7m 13s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 03s | Avg:  2m 03s | Max:  2m 03s
      🟩 Test               Pass: 100%/1   | Total:  7m 13s | Avg:  7m 13s | Max:  7m 13s
    
  • 🟩 python: Pass: 100%/1 | Total: 27m 12s | Avg: 27m 12s | Max: 27m 12s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 27m 12s | Avg: 27m 12s | Max: 27m 12s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 27m 12s | Avg: 27m 12s | Max: 27m 12s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 27m 12s | Avg: 27m 12s | Max: 27m 12s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 27m 12s | Avg: 27m 12s | Max: 27m 12s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 27m 12s | Avg: 27m 12s | Max: 27m 12s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 27m 12s | Avg: 27m 12s | Max: 27m 12s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 27m 12s | Avg: 27m 12s | Max: 27m 12s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 27m 12s | Avg: 27m 12s | Max: 27m 12s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃 Runner counts (total jobs: 164)

# Runner
122 linux-amd64-cpu16
19 linux-amd64-gpu-v100-latest-1
12 windows-amd64-cpu16
10 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

Projects
Status: In Progress
5 participants