Guide: setting up workstation and cluster
Welcome! Our lab uses the following computer resources:
- Your personal computer, for quick, simple tasks
- The main workstation, `npl1.in.hwlab`, for developing code and for relatively lightweight tasks
- The CPU and GPU servers, `np-cpu-1`, `np-gpu-1`, and `np-gpu-2`, for a similar purpose, although these have much more compute than the workstation
- The o2 compute cluster, for running more intensive jobs
Our workstation and servers are shared by all lab members for interactively developing code. They are not intended for larger, more intensive jobs! To make sure everybody gets to share the resources, please be mindful and monitor your resource use with `htop`. If you are running a compute-intensive or memory-intensive job, please submit it to o2. We have a guide below on how to use O2.
Once you have access to the workstations and o2, please configure your account to use our lab's shared python environments as follows:
- On the workstation, copy the lines in our shared .bashrc onto the top of your personal bashrc at `~/.bashrc`. This will set various environment variables.
- On o2, make sure that you are part of the `polizzi` user group so that you have read/write permissions to our lab shared directory. To check whether you are, type the command `groups` to list the user groups your account is part of. If `polizzi` is not in the list, please email o2 IT and ask them to add you to the UNIX user group `polizzi`.
- Once you are in the group, on o2, copy the lines in our shared .bashrc onto the top of your personal bashrc at `~/.bashrc`.
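The account setup steps above can be sketched as two small shell helpers. This is only an illustration: the function names `in_group` and `prepend_to_bashrc` and the path `/path/to/shared_bashrc` are hypothetical, and the shared .bashrc itself is the one linked above.

```shell
# Return success if the current account (or the user given as a second
# argument) belongs to the named UNIX group.
in_group() {
    group="$1"
    shift
    # id -Gn prints group names separated by spaces; with no user argument
    # it reports on the current account.
    id -Gn "$@" | tr ' ' '\n' | grep -qxF -- "$group"
}

# Put the lines of a shared file on top of a personal bashrc.
# A backup of the original is kept next to it as <target>.bak.
prepend_to_bashrc() {
    shared="$1"
    target="${2:-$HOME/.bashrc}"
    touch "$target" &&
    cp "$target" "$target.bak" &&
    cat "$shared" "$target.bak" > "$target"
}

# Example (hypothetical path to a local copy of the shared .bashrc):
#   in_group polizzi && prepend_to_bashrc /path/to/shared_bashrc
```

Running `prepend_to_bashrc` twice would add the shared lines twice, so check `~/.bashrc` afterwards and open a new shell (or run `source ~/.bashrc`) for the changes to take effect.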
Most lab members choose to use vscode as a powerful, intuitive, and graphical way to connect to the workstation. It is easy and convenient for running python notebooks, developing python scripts, and general file management.
Install vscode on your personal computer, then install the remote ssh extension. Use `npl1.in.hwlab` as the remote host. This will let you run commands and notebooks on the workstation.
While vscode is convenient, it is limited to simple tasks. You will need other tools for more sophisticated tasks, such as uploading/downloading lots of files, viewing `.pdb` files, submitting longer jobs (to continue running after exiting vscode), or extended text-based terminal commands.
For running text-based commands, it is recommended to use `ssh` from a dedicated terminal app. This is a better alternative to the small default vscode terminal window!
- On Mac, you can use the built-in Terminal app or download iTerm2; on Windows, download putty. The default colors and font sizes are rather ugly; please modify preferences to find something that suits you better.
- Run the command `ssh {USER}@npl1.in.hwlab`, where `{USER}` is your username. This will connect you to the workstation, log you in, and present you with a text-based "shell" for interfacing with the workstation. SSH stands for secure shell. If you are new to this way of using computers, please google "introduction to the unix shell" for some guides.
- Under the hood, `ssh` is a protocol that your local computer uses to communicate with the workstation. Once the connection is established there are many things you can do. In fact, many other tools such as vscode and `scp` are built on top of an underlying `ssh` connection.
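Because these tools all sit on top of `ssh`, a host alias in `~/.ssh/config` on your personal computer can be shared between the terminal, `scp`, and the vscode remote ssh extension. The helper below is a minimal sketch that only prints such an entry; the function name `ssh_host_entry` and the alias `lab-ws` are hypothetical choices, not lab conventions.

```shell
# Print an ssh config entry for the lab workstation.
# $1: your username on the workstation
# $2: alias to use for the host (defaults to the hypothetical "lab-ws")
ssh_host_entry() {
    user="$1"
    host_alias="${2:-lab-ws}"
    printf 'Host %s\n    HostName npl1.in.hwlab\n    User %s\n' "$host_alias" "$user"
}

# Example: review the output, then append it to your ssh config.
#   ssh_host_entry your_username >> ~/.ssh/config
#   ssh lab-ws
```

With an entry like this in place, `ssh lab-ws` stands in for the full `user@host` form, and the same alias should appear in the vscode remote host list.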
For submitting longer-running commands, it is recommended to use `tmux`. See our Guide to using tmux.
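As a quick taste before reading the guide, here is a minimal sketch of the usual flow: open a named session, start the command, detach, and re-attach later. The function name `tmux_job`, the default session name `myjob`, and the script name in the comments are hypothetical.

```shell
# Open a named tmux session, or re-attach to it if it already exists
# (the -A flag makes new-session behave like attach-session in that case).
tmux_job() {
    tmux new-session -A -s "${1:-myjob}"
}

# Typical flow on the workstation:
#   tmux_job training        # opens (or re-opens) the session "training"
#   python long_script.py    # start the long-running command inside it
#   (press Ctrl-b, then d, to detach; the command keeps running)
#   tmux_job training        # after logging in again, re-attach
#   tmux ls                  # list existing sessions
```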
Transferring files to and from the workstation can be pretty clunky.
- Some use vscode; some use command-line tools such as `scp` or `rsync` on the terminal (google these if curious).
- To ease the friction of manually using `rsync`, I (jchang) like to use a helper script on my local machine as a wrapper to `rsync`. You can find it here.
- Try this! If you want to drag-and-drop files on the workstation as if they were local files on your computer, you can use the tool `sshfs`. This uses an `ssh` connection to mount a workstation directory as a separate filesystem (`fs`) on your own computer (google "unix mount" if curious). Once you download `sshfs` on your computer, run this command on your computer: `sudo sshfs -o allow_other,default_permissions {USER}@transfer.sbgrid.org:/nfs/polizzi/{USER} /PATH/TO/MOUNT/POINT`. Here `{USER}` is your sbgrid username, and `/PATH/TO/MOUNT/POINT` is an empty folder on your computer. Your workstation files will then "magically" appear under that empty folder you specified. Note that you will have to reconnect each time you lose access to the internet because the underlying ssh connection will be terminated.
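For the `rsync` route, a wrapper can be as small as the sketch below. This is not the helper script linked above, only an illustration of the idea; the names `ws_push`, `ws_pull`, `WS_HOST`, and `WS_USER` are hypothetical, and the host defaults to the workstation name used earlier in this guide.

```shell
# Assumed defaults; override them in your environment if they differ.
WS_HOST="${WS_HOST:-npl1.in.hwlab}"
WS_USER="${WS_USER:-${USER:-your_username}}"

# Copy a local path to the workstation.
# -a keeps permissions and timestamps, -v lists files, -z compresses in
# transit, --partial keeps partially transferred files for a later retry.
ws_push() {
    rsync -avz --partial "$1" "${WS_USER}@${WS_HOST}:$2"
}

# Copy a workstation path to the local machine.
ws_pull() {
    rsync -avz --partial "${WS_USER}@${WS_HOST}:$1" "$2"
}

# Examples (paths are placeholders):
#   ws_push ./results/ /path/on/workstation/results/
#   ws_pull /path/on/workstation/models/ ./models/
```

Note that `rsync` treats a trailing slash on the source as "copy the contents of this directory", while a source without the slash copies the directory itself.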
- One way is to transfer them to your local computer with the above methods and then open pymol.
- Another way is to use the protein viewer extension in vscode. It works but it's a little clunky.
- Finally, one way is to host an HTTP server on the workstation and point your pymol to load files from there. Jody uses the script here. Just run the script and copy-paste the `load` commands into your pymol command window.
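The HTTP server approach can be sketched roughly as follows. This is not Jody's script, only an illustration of the idea under some assumptions: the function name `pymol_load_cmds` is hypothetical, port 8000 is an arbitrary choice, and the workstation port is assumed to be reachable from your computer.

```shell
# Print one PyMOL "load" command per .pdb file in a directory, assuming
# that directory is being served over HTTP at the given base URL.
pymol_load_cmds() {
    dir="$1"
    base_url="$2"
    for f in "$dir"/*.pdb; do
        [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
        printf 'load %s/%s\n' "$base_url" "$(basename "$f")"
    done
}

# Example on the workstation (directory and port are placeholders):
#   cd /path/to/pdbs && python3 -m http.server 8000 &
#   pymol_load_cmds . http://npl1.in.hwlab:8000
```

If the port is not directly reachable, forwarding it over ssh with something like `ssh -L 8000:localhost:8000 {USER}@npl1.in.hwlab` and using `http://localhost:8000` as the base URL may work instead.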
TODO
Some helpful links for now:
- How to choose a partition
- Scratch vs machine-local temporary filesystems. It is recommended to copy the vdM database to the machine-local temporary filesystem for faster I/O when using COMBS on O2. Transfer and unzipping will take a few minutes but the overall time savings can be substantial.
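The copy-then-unpack step could look something like the sketch below, run at the start of a job. Everything specific here is an assumption: the function name `stage_local` is hypothetical, the database is assumed to be packaged as a `.tar.gz` archive (use `unzip` or similar for other formats), and the paths in the example are placeholders. Check the O2 documentation linked above for the actual machine-local location.

```shell
# Copy an archive to a machine-local directory and unpack it there.
# $1: path to the archive (assumed to be .tar.gz)
# $2: destination directory on the machine-local filesystem
stage_local() {
    archive="$1"
    dest="$2"
    mkdir -p "$dest" &&
    cp "$archive" "$dest/" &&
    tar -xzf "$dest/$(basename "$archive")" -C "$dest" &&
    rm "$dest/$(basename "$archive")"   # drop the copied archive once unpacked
}

# Example inside a job script (both paths are placeholders):
#   stage_local /path/to/vdm_database.tar.gz /tmp/$USER/vdm
```

Files staged this way live only on the node that ran the job, so each job would need to repeat the step and should point COMBS at the local copy.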