GPU support under Ubuntu #223
On further review, it looks like some of the GPU-related bind mounts are not being created automatically. I found the contrib/gpu_activate_gpu_support.sh script, but I am not sure when, where, or how it is invoked. Please advise. Thanks.
Michael, we don't have a GPU system to test with at NERSC. Let me ping some of the CSCS folks and see if they can comment.
I have the same question: how does this script get invoked?
Found the commit that removed the code which was using this script: c5e66cc, but I don't see any replacement for this functionality.
We received no guidance on this and never got it to work. Sorry. Singularity worked as an alternative for us.
Cheers,
Michael
NERSC will hopefully be able to help more directly on this in the near future.
We are also facing the same issue. With /usr/lib/nvidia-384 loaded into the container, nvidia-smi shows the GPUs present on the node, but when we try to run deviceQuery and the nbody benchmark they throw the same error as above ("CUDA driver version is insufficient for CUDA runtime version"). Is there another way to test GPUs with Shifter and Slurm integration?
After digging through the sources and git history, it seems that the plan is to replace the old GPU support with the new module system; see doc/modules.rst and doc/config/udiRoot.conf.rst. So, we added these lines to our config:
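(The actual config lines did not survive in this copy of the thread. As a rough sketch of what such a module definition could look like, using the module_<name>_* settings described in doc/config/udiRoot.conf.rst; the module name "nvidia" and the paths are placeholders, not the values from the original comment:)

# udiRoot.conf -- hypothetical "nvidia" module exposing the host driver libraries (sketch only)
module_nvidia_siteFs = /usr/lib/nvidia-384:/usr/lib/nvidia-384
module_nvidia_siteEnvPrepend = LD_LIBRARY_PATH=/usr/lib/nvidia-384 PATH=/usr/lib/nvidia-384/bin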
After this, users can start jobs which require the NVIDIA libraries by specifying the corresponding module.
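(Also a sketch, not taken from the thread: with a hypothetical "nvidia" module defined as above, and assuming the Shifter Slurm SPANK plugin's --image and --module options are installed at your site, a job script might look like this:)

#!/bin/bash
#SBATCH --image=docker:nvidia/cuda:9.0-devel   # example image tag, not from this thread
#SBATCH --module=nvidia                        # hypothetical module name from the sketch above

# Run the CUDA sample inside the Shifter container with the nvidia module active.
srun shifter ./deviceQuery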
OS is Ubuntu 16.04. NVIDIA drivers are installed and working fine, and the drivers and CUDA work fine in nvidia-docker. Using driver 384.111 and CUDA 9.0 for testing. Slurm + Shifter is working fine.
But under Shifter I can't get GPU integration to work quite right. When running an image with nvidia-docker, the drivers and utilities like nvidia-smi are available and work. When running the same container via Shifter, they are not.
If I make a copy of /usr/lib/nvidia-384 available via my siteFs and set PATH and LD_LIBRARY_PATH (roughly as sketched after the error output below), nvidia-smi returns the expected output. However, CUDA demo apps (e.g. deviceQuery) report:
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL
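(For reference, the workaround described above boils down to something like the following inside the container; the mount point is site-specific, and CUDA error 35 is cudaErrorInsufficientDriver:)

# Assuming /usr/lib/nvidia-384 from the host has been made visible via siteFs:
export PATH=/usr/lib/nvidia-384/bin:$PATH
export LD_LIBRARY_PATH=/usr/lib/nvidia-384:$LD_LIBRARY_PATH

nvidia-smi      # works after the steps above
./deviceQuery   # still fails: cudaGetDeviceCount returns 35 (cudaErrorInsufficientDriver)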
Thanks,
Michael