On the Prometheus cluster we do not work on the login node. To start your work you need to reserve resources on the computing nodes using SLURM. The following document will guide you through the process specific to the NGSchool2019 workshops and hackathons.
More general references:
The SLURM sbatch scripts have been added to this repository and reside in the directories dedicated to the particular workshops. After submission they will reserve the resources for your session, set the path to the R library directory and activate the appropriate Python virtual environment.
An example setup for your work would be:
If you do not have the repository cloned in your $HOME directory, clone it to receive the files:
cd ~
git clone https://github.com/NGSchoolEU/ngs19.git
If you already have it, pull all the recent changes:
cd ~/ngs19
git pull
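To see which job scripts are available, you can list them; a minimal sketch, assuming the scripts use the .slurm extension shown below:
find ~/ngs19 -name "*.slurm"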
To reserve the resources and run the jupyter session you need to pass the appropriate configuration file to sbatch. To do this, for example, for the Deep Learning workshop:
sbatch ngs19/deep_learning.slurm
Remember that jupyter-lab will be started in the directory from which you submit the job, so submit it from the directory where you want to work.
After this, wait a while until your submitted job is processed by the system.
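You can check whether the job has already started by looking at its output file; a hedged sketch, assuming the default sbatch output file name (the workshop scripts may redirect output elsewhere):
cat slurm-<job-id>.out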
Next proceed exactly as described by Klemens Noga during his introductory workshop and summarized in his guide.
This schedules your work in an interactive jupyter session for 4 hours by default. Please save your intermediate work before this time runs out or your results will be lost! Alternatively, you can raise the time limit set in the configuration file.
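A minimal sketch of raising the limit, assuming the workshop script uses the standard #SBATCH --time directive (the exact default value in the script may differ):
# in ngs19/deep_learning.slurm, change e.g.
#SBATCH --time=04:00:00
# to
#SBATCH --time=08:00:00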
After finishing remember to free the resources by cancelling your job! To view all your submitted jobs run:
squeue -u $USER
To cancel your running job run:
scancel <job-id>
To cancel all your jobs run:
scancel -u $USER
To work on the cluster in an interactive shell session, run this example command:
srun --mem=4G --time=02:00:00 -p plgrid-now -N 1 --ntasks-per-node=1 -n 1 -A ngschool2019 --pty /bin/bash -l
This reserves a single CPU/task on a single node with 4 GB of RAM for 2 hours, using compute time from our current ngschool2019 grant.
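If your workshop needs GPU access in an interactive session, a hedged variant of the above command could look as follows; the partition name plgrid-gpu and the --gres=gpu:1 request are assumptions, so check with the organizers before relying on them:
srun --mem=4G --time=02:00:00 -p plgrid-gpu --gres=gpu:1 -N 1 --ntasks-per-node=1 -n 1 -A ngschool2019 --pty /bin/bash -l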
Next you need to source the configuration file for your particular workshop to activate the virtual environment and load all the required environment variables.
For example, for the Deep Learning workshop:
source "$PLG_GROUPS_SHARED/plggngschool/software/deep_learning/deep_learning.rc"
Now you can run an R or Python interactive shell with all the modules and libraries available for your project.
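As a quick sanity check, a sketch assuming the Deep Learning environment (keras and numpy are listed in its requirements.txt):
python -c "import keras, numpy; print('Python environment OK')"
Rscript -e '.libPaths()'   # the shared R/library directory should be listed first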
The configuration files, the R user library, the proprietary cuDNN library and the Python virtual environments reside in the directory:
$PLG_GROUPS_SHARED/plggngschool/software
Users from the plggngs_adm group have read/write privileges for this directory.
The directory structure is as follows:
software/
├── cuda
│   ├── include
│   ├── lib64
│   └── NVIDIA_SLA_cuDNN_Support.txt
├── deep_learning
│   ├── deep_learning
│   ├── deep_learning.rc
│   └── requirements.txt
├── nlp
│   ├── nlp
│   ├── nlp.rc
│   └── requirements.txt
├── non_linear_models
│   ├── non_linear_models
│   ├── non_linear_models.rc
│   └── requirements.txt
├── R
│   ├── library
│   ├── Renviron.ngschool
│   └── Rprofile.ngschool
├── reinforcement_learning
│   ├── reinforcement_learning
│   ├── reinforcement_learning.rc
│   └── requirements.txt
└── unsupervised_learning
    ├── requirements.txt
    ├── unsupervised_learning
    └── unsupervised_learning.rc
- The *.rc files contain the configuration for each workshop.
- The cuda directory contains the proprietary NVIDIA cuDNN library required for TensorFlow GPU computing.
- The R directory contains the Renviron.ngschool file, which points the R installation to the R/library folder containing all of the locally installed packages. Additionally, it points to the Rprofile.ngschool file, which sets the R option bitmapType="cairo", telling the png graphical device not to use X11 for plotting (see the quick check after this list).
- Each of the workshop folders contains a similarly named folder with the Python virtual environment and the required modules. The requirements.txt files contain the Python pip module requirements.
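As a quick check that the cairo setting works without X11, a sketch to run after sourcing your workshop's *.rc file (plot.png is just an example file name):
Rscript -e 'png("plot.png"); plot(1:10); dev.off()'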
Users from the plggngs_adm group, which contains the organizers and hackathon mentors, can install Python modules with pip and R packages after sourcing the appropriate *.rc file. The packages and modules installed in this way will be available to all the users of their workshop. For example:
source "$PLG_GROUPS_SHARED/plggngschool/software/deep_learning/deep_learning.rc"
pip install seaborn
You can also do the same from within the jupyter-lab/jupyter-notebook terminal, without the need to source the configuration file. The same is possible with jupyter shell magic, like:
!pip install seaborn
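An R package can be installed into the shared library in a similar way; a minimal sketch (data.table is just an example package), run after sourcing the appropriate *.rc file so that R_LIBS_USER points to the shared R/library directory:
R -e 'install.packages("data.table", repos = "https://cloud.r-project.org")'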
Important for TensorFlow and R users
By default the R module loads cuda==9.0, while TensorFlow depends on cuda==10.0. When doing GPU computing with R and then switching to TensorFlow, run:
module swap plgrid/apps/cuda/10.0
Similarly, after doing GPU computing in TensorFlow, if you want to do the same in R, run:
module swap plgrid/apps/cuda/9.0
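You can verify which modules (including CUDA) are currently loaded with:
module list
For reference, the contents of the main configuration files are reproduced below.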
# software/deep_learning/deep_learning.rc
export LD_LIBRARY_PATH="$PLG_GROUPS_SHARED/plggngschool/software/cuda/lib64:$LD_LIBRARY_PATH"
module load plgrid/apps/r/3.6.0
module load plgrid/libs/tensorflow-gpu/1.13.1-python-3.6
source "$PLG_GROUPS_SHARED/plggngschool/software/deep_learning/deep_learning/bin/activate"
source "$PLG_GROUPS_SHARED/plggngschool/software/R/Renviron.ngschool"
# software/deep_learning/requirements.txt
argparse
h5py
keras
numpy
pandas
pybigwig
pysam
rpy2
jupyterlab
#tensorflow >=1.8,<=1.14
# software/R/Renviron.ngschool
export R_LIBS_USER="$PLG_GROUPS_SHARED/plggngschool/software/R/library"
export R_PROFILE="$PLG_GROUPS_SHARED/plggngschool/software/R/Rprofile.ngschool"
# software/R/Rprofile.ngschool
options(bitmapType = 'cairo')