Hello, we are encountering the following error on the latest NCCL version. Downgrading to 2.21.5 solved the issue.
g148:14937:69361 [0] NCCL INFO Channel 00/1 : 0[0] -> 7[7] via P2P/CUMEM
g148:14937:69361 [0] NCCL INFO Channel 01/1 : 0[0] -> 8[0] [send] via NET/IB/9/GDRDMA/Shared
g148:14937:69361 [0] misc/shmutils.cc:93 NCCL WARN Call to open failed: No such file or directory
g148:14937:69361 [0] misc/shmutils.cc:129 NCCL WARN Error while attaching to shared memory segment /dev/shm/nccl-glnlGK (size 14156128), error: No such file or directory (2)
Is it reproducible with something generic like all_reduce_perf from https://github.com/NVIDIA/nccl-tests? Could you share a complete log file obtained with NCCL_DEBUG=INFO NCCL_DEBUG_SUBSYS=INIT,ENV,BOOTSTRAP,ALLOC?
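One way to capture such a log is a small wrapper that sets the two debug variables quoted above and redirects the combined output to a file. This is only a sketch: the helper names `build_env` and `run_with_debug`, the log file name, and the binary path in the usage comment are illustrative assumptions, not part of NCCL or nccl-tests. For a multi-node run the command would still need to go through whatever launcher the cluster uses.

```python
import os
import subprocess

# Debug settings taken verbatim from the request above.
NCCL_DEBUG_ENV = {
    "NCCL_DEBUG": "INFO",
    "NCCL_DEBUG_SUBSYS": "INIT,ENV,BOOTSTRAP,ALLOC",
}


def build_env(base=None):
    """Return a copy of the environment with the NCCL debug variables set."""
    env = dict(os.environ if base is None else base)
    env.update(NCCL_DEBUG_ENV)
    return env


def run_with_debug(cmd, log_path):
    """Run cmd with NCCL debug logging and write stdout+stderr to log_path.

    Returns the exit code of the command.
    """
    with open(log_path, "w") as log:
        result = subprocess.run(
            cmd, env=build_env(), stdout=log, stderr=subprocess.STDOUT
        )
    return result.returncode


# Hypothetical usage, assuming nccl-tests was built under ./build:
# run_with_debug(
#     ["./build/all_reduce_perf", "-b", "8", "-e", "128M", "-f", "2", "-g", "1"],
#     "nccl_debug.log",
# )
```

The resulting file can then be attached to the issue in full rather than as an excerpt.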