pyshpool
pyshpool is for multiprocessing (especially across multiple nodes in parallel) on HPC systems, using shell or Python. Welcome aboard!
The current release is v2.2 for pyshpool and v2.5.1 for pyshNode-static. Armed with pyshpool, feel free to enjoy yourself.
├── LICENSE
├── README.md
├── demoScipt.sh
├── inpTaskList.dat
├── poolRunShell-static
├── poolRunShell-v2.2
├── previousVersion
│   ├── pyshNodesV22-static
│   ├── pyshNodesV23-static
│   ├── pyshNodesV24-static
│   └── pyshNodesV242-static
├── pyshNodeV251-static
├── shellPoolv2.2
└── src
    └── fishpool_free.jpeg
- python version V2.2
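To use the Python version, make sure that multiprocessing is available in your Python environment. It ships with the Python standard library, so in most environments nothing needs to be installed. If you do need to install multiprocessing, please turn to this site.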
- shell version V2.2
The shell version has no prerequisites. You can use pyshpool directly.
- pyshpool static version
The pyshpool-static version has no prerequisites; just run it with a one-line command.
- pyshSplit (Updated: 2023-10-04)
On some clusters, multi-node jobs are inefficient because of queueing, so pyshSplit was revised to submit multiple single-node jobs instead. The input task list is split into the specified number of jobs, and each job runs as a single-node batch.
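The splitting step described above can be sketched in Python. This is only an illustration: the function name split_tasks and the contiguous, near-equal chunking are assumptions, since the exact way pyshSplit divides the list is not stated here.

```python
def split_tasks(tasks, job_count):
    """Split tasks into job_count contiguous chunks of near-equal size.

    Hypothetical helper for illustration; not taken from pyshSplit itself.
    """
    if job_count < 1:
        raise ValueError("job_count must be at least 1")
    # Every chunk gets `base` tasks; the first `extra` chunks get one more.
    base, extra = divmod(len(tasks), job_count)
    chunks = []
    start = 0
    for index in range(job_count):
        size = base + (1 if index < extra else 0)
        chunks.append(tasks[start:start + size])
        start += size
    return chunks
```

Under these assumptions, a ten-line task list split into three jobs yields chunks of 4, 3, and 3 tasks, and each chunk could then be written to its own task list for a single-node batch.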
- pyshNode static version (Updated: VersionV2.5.1 2022-01-24)
The pyshNode version has no prerequisites; just run it with a one-line command. Make sure pyshpool-static is in the same folder as pyshNode-static. The command-line options closely follow those of a Slurm script.
- pyshNode4 (Updated: VersionV2.6.1 2024-10-29)
Because HPC V4.0 is much more powerful, the default parameters and the accountName have been updated for the multi-node version. Make sure pyshpool-static is in the same folder as pyshNode-static, with permissions set by chmod 777!
- pyshNode4-multipleThread (Updated: VersionV2.7.1 2025-02-02)
Because HPC V4.0 is much more powerful, the default parameters and the accountName have been updated for the multi-node version. Because some tasks run more efficiently with multiple threads than with multiple processes, a multipleThread version has been added. Make sure pyshpool-static is in the same folder as pyshNode-static, with permissions set by chmod 777!
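To illustrate the thread-based idea, here is a minimal sketch that runs task lines with a fixed number of worker threads. The names run_task and run_threaded are hypothetical, and this is not the actual multipleThread implementation. It assumes a POSIX shell is available to run each line.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_task(command):
    """Run one task line through the shell and return its exit code."""
    return subprocess.run(command, shell=True).returncode


def run_threaded(commands, thread_num):
    """Run commands with at most thread_num worker threads.

    Exit codes are returned in the same order as the input commands.
    """
    with ThreadPoolExecutor(max_workers=thread_num) as executor:
        return list(executor.map(run_task, commands))
```

Threads are a reasonable fit in this sketch because each worker spends its time waiting on an external process rather than executing Python code.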
onelineCommand:
./pyshOnNodes
-p <str:partitionName>[general]
-J <str:>[multiNode]
-N <int:nodeNumber>[2]
-n <int:taskPerNode>[24]
-e <str:email>[NULL]
-w <str:nodelist>[NULL]
-i <inpTaskList>[inpTaskList.dat]
-A <str:>[accountName]
details:
./pyshOnNodes
--partition[-p] <str:partitionName>[general]
--job-name[-J] <str:>[multiNode]
--nodes[-N] <int:nodeNumber>[2]
--ntasks[-n] <int:taskPerNode>[24]
--mail-user[-e] <str:email>[NULL]
--nodelist[-w] <str:nodelist>[NULL]
--inputTaskList[-i] <inpTaskList>[inpTaskList.dat]
--account[-A] <str:>[accountName]
These variables are the same as in sbatch:
--partition[-p] <str:partitionName>[general]
--job-name[-J] <str:>[multiNode]
--nodes[-N] <int:nodeNumber>[2]
--ntasks[-n] <int:taskPerNode>[24]
--nodelist[-w] <str:nodelist>[NULL]
--mail-user[-e] <str:email>[NULL]
<inpTaskList> is a list of your tasks; please use absolute paths in the list where possible.
Updated on 2022-01-24: Removed the two default settings ACCOUNT and QOS. They are very restrictive and may lead to errors on other HPC platforms.
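As a rough illustration of how the sbatch-style options listed above could map onto an sbatch command line, consider the sketch below. The DEFAULTS dictionary and build_sbatch_args are hypothetical names, and whether pyshOnNodes forwards its options this way is an assumption. Note that in sbatch itself --ntasks sets the total number of tasks, while the listing above labels the value taskPerNode; the sketch simply passes the value through.

```python
# Defaults copied from the option listing; None means "not set".
DEFAULTS = {
    "partition": "general",
    "job-name": "multiNode",
    "nodes": 2,
    "ntasks": 24,
    "mail-user": None,
    "nodelist": None,
}


def build_sbatch_args(overrides=None):
    """Return an sbatch argument list built from defaults plus overrides.

    Hypothetical helper for illustration; options set to None are omitted.
    """
    options = dict(DEFAULTS)
    options.update(overrides or {})
    args = ["sbatch"]
    for name, value in options.items():
        if value is not None:
            args.append("--%s=%s" % (name, value))
    return args
```

For example, build_sbatch_args({"partition": "cpu-share", "ntasks": 40}) would keep the default job name and node count while replacing the partition and task count.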
In short, you can run multiprocessing jobs on HPC (high-performance computing) systems with shell or Python in two steps.
- make sure pyshpool is executable
chmod 777 pyshpool
- run the one-line command
./pyshpool -c CPUNUM -i inputList
The input job list contains your job commands, one per line (delimited by '\n'). For details, please see the help information or the documentation. An input job list (the running task list) may look like the following:
# command + args (outputs/logs)
bash demoScript.sh 1 > log1.log
bash demoScript.sh 2 > log2.log
bash demoScript.sh 3 > log3.log
bash demoScript.sh 4 > log4.log
bash demoScript.sh 5 > log5.log
bash demoScript.sh 6 > log6.log
bash demoScript.sh 7 > log7.log
bash demoScript.sh 8 > log8.log
bash demoScript.sh 9 > log9.log
bash demoScript.sh 10 > log10.log
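Conceptually, a task list like the one above is consumed by a pool of workers. The following is a minimal sketch of that idea using Python's multiprocessing module. The names parse_tasks, run_task, and run_pool are hypothetical, and the real pyshpool may behave differently (for example in how it handles blank lines or failures). It assumes a POSIX shell is available to run each line.

```python
import subprocess
from multiprocessing import Pool


def parse_tasks(text):
    """Return the non-blank lines of a task list, one command per line."""
    return [line.strip() for line in text.split("\n") if line.strip()]


def run_task(command):
    """Run one task line through the shell and return its exit code."""
    return subprocess.run(command, shell=True).returncode


def run_pool(commands, cpu_num):
    """Run the commands with at most cpu_num worker processes."""
    with Pool(processes=cpu_num) as pool:
        return pool.map(run_task, commands)
```

Reading inputList, passing its text to parse_tasks, and calling run_pool with the CPUNUM value roughly mirrors the -i and -c options. In a script, that call should sit under an if __name__ == "__main__": guard so that worker processes can import the module safely.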
Notes (Updated on 2022-01-24): If your programs rely on dependencies, using the absolute path for each command is recommended. Alternatively, you can call binary dependencies without an absolute path, or load your environment on the same line as your program.
If you are familiar with the parallel/xargs/tee commands, you will need very little time to get comfortable with pyshNode.
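One way to follow the absolute-path recommendation is to generate the task list programmatically. The sketch below is an illustration only: build_task_line and build_task_list are hypothetical helpers, and the default program, script, and log names are taken from the demo list.

```python
import os
import shlex
import shutil


def build_task_line(program, script, index, log_dir):
    """Return one task line that uses absolute paths where it can."""
    # Resolve the program on PATH; fall back to the bare name if not found.
    program_path = shutil.which(program) or program
    script_path = os.path.abspath(script)
    log_path = os.path.join(os.path.abspath(log_dir), "log%d.log" % index)
    # Quote each path so that spaces do not break the shell command.
    return "%s %s %d > %s" % (
        shlex.quote(program_path),
        shlex.quote(script_path),
        index,
        shlex.quote(log_path),
    )


def build_task_list(count, program="bash", script="demoScript.sh", log_dir="."):
    """Return task lines numbered 1..count, in the style of the demo list."""
    return [build_task_line(program, script, i, log_dir)
            for i in range(1, count + 1)]
```

Writing the lines returned by build_task_list(10), joined with '\n', to a file would produce a list shaped like the demo list, with each path expanded relative to the current working directory.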
- Clone this GitHub repository
git clone https://github.com/jligm-hash/pyshpool.git
- Make pyshpool executable, then check the demo script and the task list
chmod 777 poolRunShell-static; chmod 777 pyshNodes-static
cat demoScript.sh
head inpTaskList.dat
You will get
#!/bin/bash
RANDOM=$1
seudoRunningTime=$((1 + $RANDOM % 10))
echo "Running program $seudoRunningTime s"
sleep $seudoRunningTime
and
bash demoScript.sh 1 > log1.log
bash demoScript.sh 2 > log2.log
bash demoScript.sh 3 > log3.log
bash demoScript.sh 4 > log4.log
bash demoScript.sh 5 > log5.log
bash demoScript.sh 6 > log6.log
bash demoScript.sh 7 > log7.log
bash demoScript.sh 8 > log8.log
bash demoScript.sh 9 > log9.log
bash demoScript.sh 10 > log10.log
- Run the one-line pyshNode command and enjoy yourself
./pyshOnNodes -p general -J demoPyshNodes -N 2 -n 24 -i inpTaskList.dat # forHpc2
./pyshOnNodes -p cpu-share -J demoPyshNodes -N 2 -n 40 -i inpTaskList.dat # forHpc3
To request access, just send an email (github@jligm-hash).