From 450bd3e6b17c0f777eb7303f030040838641deb1 Mon Sep 17 00:00:00 2001
From: Serguei Mokhov
We receive support from the rest of AITS teams, such as NAG, SAG, FIS, and DOG.
We have a great number of open-source software packages available and installed on “Speed” – various
Python, CUDA versions, C++/Java compilers, OpenGL, OpenFOAM, OpenCV, TensorFlow,
-OpenMPI, OpenISS, MARF [24], etc. There are also a number of commercial packages, subject to
-licensing contributions, available, such as MATLAB [13, 23], Abaqus [1], Ansys, Fluent [2],
+OpenMPI, OpenISS, MARF [26], etc. There are also a number of commercial packages, subject to
+licensing contributions, available, such as MATLAB [13, 25], Abaqus [1], Ansys, Fluent [2],
etc.
To see the packages available, run ls -al /encs/pkg/ on speed.encs. In particular, there are
over 2200 programs available in /encs/bin and /encs/pkg under Scientific Linux 7 (EL7). We are
@@ -2103,13 +2105,26 @@ The work “Haotao Lai. An OpenISS framework specialization for deep learning-based
person re-identification. Master’s thesis, Department of Computer Science and
Software Engineering, Concordia University, Montreal, Canada, August 2019.
https://spectrum.library.concordia.ca/id/eprint/985788/” using TensorFlow and Keras
on OpenISS adjusted to run on Speed based on the repositories: and their forks by the team.
+
+
+
For long-term users who started off with Grid Engine, here are some resources to ease the transition
and the mapping of the job submission process.
Queues are called “partitions” in SLURM. Our mapping from the GE queues to SLURM
partitions is as follows:
@@ -2176,11 +2188,11 @@ We also have a new partition pt that covers SPEED2 nodes, which previously did not
exist.
Commands and command options mappings are found in Figure 11 from
Speed: The GCS ENCS Cluster
Concordia University
Montreal, Quebec, Canada
rt-ex-hpc~AT~encs.concordia.ca
-
∗The group acknowledges the initial manual version VI produced by Dr. Scott Bunnell while with
us as well as Dr. Tariq Daradkeh for his instructional support of the users and contribution of
examples.
1.2
https://www.concordia.ca/ginacody/aits.html
1.5
1.6 Available Software
3.3
tracking. In 34th British Machine Vision Conference (BMVC), Aberdeen, UK, November 2023.
https://arxiv.org/abs/2309.05829 and https://github.com/goutamyg/MVT
+
3.3
https://doi.org/10.1177/0278364920913945
- A History
-A.1 Acknowledgments
-
-
-
-A.1
2023; working on the scheduler, scheduling research, end user support, and integration
of examples, such as YOLOv3 in Section 2.15.4.0, and other tasks. We have a continued
collaboration on HPC/scheduling research.
A.2 Migration from UGE to SLURM
https://slurm.schedmd.com/rosetta.pdf
https://slurm.schedmd.com/pdfs/summary.pdf
Other helpful resources from similar organizations that have either used SLURM for a while or
also transitioned to it:
https://docs.alliancecan.ca/wiki/Running_jobs
https://www.depts.ttu.edu/hpcc/userguides/general_guides/Conversion_Table_1.pdf
https://docs.mpcdf.mpg.de/doc/computing/clusters/aux/migration-from-sge-to-slurm
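To illustrate the kind of command mapping these resources tabulate, here is a minimal sketch of a lookup helper. The function name `ge_to_slurm` is hypothetical and is not part of Speed or SLURM; it covers only three well-known equivalents, so the linked conversion tables remain the reference for anything else.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Speed or SLURM): print the SLURM
# counterpart of a few common Grid Engine commands.
ge_to_slurm() {
  case "$1" in
    qsub)  echo "sbatch"  ;;  # submit a batch job script
    qstat) echo "squeue"  ;;  # list queued and running jobs
    qdel)  echo "scancel" ;;  # cancel a job by ID
    *)
      # Anything else is outside this sketch; report it and fail.
      echo "no mapping known for: $1" >&2
      return 1
      ;;
  esac
}

ge_to_slurm qsub
ge_to_slurm qstat
ge_to_slurm qdel
```

The same renaming applies to options: a queue selected with `qsub -q` becomes a partition selected with `sbatch -p`.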
Bourne shell/bash: Sample .bashrc file:
@@ -2216,37 +2228,37 @@
Note that you will need to either log out and back in, or execute a new shell, for the environment changes in the updated .tcshrc or .bashrc file to be applied (important).
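A minimal sketch of why this matters: a shell that is already running does not re-read its startup file, while a newly started one does. The temporary file and the variable `SPEED_DEMO_VAR` are hypothetical, and `BASH_ENV` is used here only as a stand-in, so that a non-interactive `bash` reads the temporary file the way a new interactive shell reads `.bashrc`.

```shell
#!/usr/bin/env bash
# Stand-in for an edited ~/.bashrc (temporary file, hypothetical variable).
rc_file="$(mktemp)"
echo 'export SPEED_DEMO_VAR=updated' > "$rc_file"

# The current shell has not read the file, so the variable is still unset.
current_value="${SPEED_DEMO_VAR:-unset}"

# A newly started bash reads the file named by BASH_ENV before running.
new_shell_value="$(BASH_ENV="$rc_file" bash -c 'printf "%s" "$SPEED_DEMO_VAR"')"

echo "current shell: $current_value"
echo "new shell:     $new_shell_value"
rm -f "$rc_file"
```

In practice, starting a new interactive `bash` has the same effect for a real `.bashrc`.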
Brief summary of Speed evolution phases.

Phase 4 had 7 SuperMicro servers with 4x A100 80GB GPUs each added, dubbed “SPEED2”. We also moved from Grid Engine to SLURM.

Phase 3 had 4 vidpro nodes added from Dr. Amer, totalling 6x P6 and 6x V100 GPUs.

Phase 2 saw 6x NVIDIA Tesla P6 GPUs and 8x more compute nodes added. The P6s replaced 4x FirePro S7150.

Phase 1 of Speed had the following configuration:

Below is a list of resources and facilities similar to Speed at various capacities. Depending on your research group and needs, they might be available to you. They are not managed by HPC/NAG of AITS, so contact their respective representatives.
@@ -2553,7 +2565,7 @@ There are various Lambda Labs and other GPU servers and similar computers acquired by individual
researchers; if you are a member of their research group, contact them directly. These resources are not managed by us.
- [23] Rob Schreiber. MATLAB. Scholarpedia, 2(6):2929, 2007.
+ [23] Farshad Rezaei and Marius Paraschivoiu. Placing a small-scale vertical axis wind turbine on
+ roof-top corner of a building. In Proceedings of the CSME International Congress, June 2022.
+ https://doi.org/10.7939/r3-j7v7-m909.
+
+ [24] Farshad Rezaei and Marius Paraschivoiu. Computational challenges of simulating vertical axis
+ wind turbine on the roof-top corner of a building. Progress in Canadian Mechanical Engineering,
+ 6, 1–6 2023. http://hdl.handle.net/11143/20861.
+
+ [25] Rob Schreiber. MATLAB. Scholarpedia, 2(6):2929, 2007. http://www.scholarpedia.org/article/MATLAB.
- [24] The MARF Research and Development Group. The Modular Audio Recognition
+ [26] The MARF Research and Development Group. The Modular Audio Recognition
Framework and its Applications. [online], 2002–2014. http://marf.sf.net and http://arxiv.org/abs/0905.1235, last viewed May 2015.