Hybrid Spatiotemporal Neural Network (HSTNN) enables fine-grained hybridization of RNNs/LSTMs and SNNs. The three-stage training approach offers a flexible trade-off among accuracy, robustness, and computational cost.
Currently, this repository demonstrates the main HSTNN accuracy results on four selected datasets. The training approaches and code are all located in `experiments`. Each sub-directory in `experiments` is named as:
[RNN Type]-[Dataset]-[#Neurons-per-Layer]-[SNN Lens]-[SNN Decay&Threshold]
Note that Lens is set to 0.5*k (Equation 9 of the manuscript), where k is the shape parameter of the spiking surrogate functions.
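The naming scheme above can be read programmatically. The following is a minimal sketch, not part of the repository: the `ExperimentName` class and both helper functions are hypothetical names introduced for illustration. The last field is kept as raw text because how decay and threshold are encoded in it is not specified here.

```python
from dataclasses import dataclass


@dataclass
class ExperimentName:
    rnn_type: str
    dataset: str
    neurons_per_layer: int
    lens: float
    decay_threshold: str  # raw text; its internal encoding is an open assumption


def parse_experiment_dir(name: str) -> ExperimentName:
    """Split a sub-directory name into the five fields of the naming scheme."""
    # Split at most four times so the last field keeps any extra hyphens.
    parts = name.split("-", 4)
    if len(parts) != 5:
        raise ValueError(f"expected 5 hyphen-separated fields, got {len(parts)}: {name!r}")
    rnn_type, dataset, nhid, lens, tail = parts
    return ExperimentName(rnn_type, dataset, int(nhid), float(lens), tail)


def shape_parameter(lens: float) -> float:
    """Invert Lens = 0.5 * k to recover the surrogate shape parameter k."""
    return 2.0 * lens


exp = parse_experiment_dir("rnn-ptb-120-0.5-0.6-bias")
print(exp)
print("k =", shape_parameter(exp.lens))
```

For the directory `rnn-ptb-120-0.5-0.6-bias`, this yields 120 neurons per layer and a Lens of 0.5, which corresponds to k = 1.0 under the relation above.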
In each case, the main training and inference Python file is `[RNN Type]-[Dataset].py` (e.g. `rnn-ptb.py`). It contains the implementations of the neurons, the networks, and the three-stage training process.
To run multiple training or inference tasks in parallel on several GPUs, the control file `mprun-[RNN Type]-[Dataset].py` (e.g. `mprun-rnn-smnist.py`) launches the tasks. The hyperparameters (epochs, learning rate, dropout, clip, etc.) are collected in the corresponding dictionaries.
After the training and inference tasks finish, an automatic result aggregator collects the statistics. Currently, it supports two output types: LaTeX table scripts and CSV files. The CSV results are stored in the `results` directory.
Clone the GitHub repo:
git clone https://github.com/shibizhao/hstnn.git
cd hstnn
git submodule update --init --recursive
The authors recommend using a virtual environment (e.g. Anaconda).
conda create -n hstnn python=3.9.16
conda activate hstnn
Install the dependencies in requirements.txt:
pip install -r requirements.txt
Enable the environment variables by:
source setup.sh
We use four datasets for evaluation: Penn-Treebank, Sequential MNIST, Neuromorphic MNIST, and DvsGesture.
Download the Penn-Treebank dataset: Reference Website
Move the dataset files to data/penn-treebank
mkdir -p data/penn-treebank
mv <PTB Download Directory>/ptb.*.txt data/penn-treebank
`torchvision.datasets` will automatically download the MNIST dataset to the `data/MNIST` directory if `args.data` is set to `../../data/MNIST/`.
To eliminate read/write conflicts on the dataset across multiple parallel tasks, please enter the `data/` directory and execute:
cd data/
python smnist-dataset-gen.py
Download the DVS-Gesture dataset from IBM Link and extract the files in DvsGesture.tar.gz into ./data/DvsGesture
On the first execution (training or testing), the code checks whether `./data/DvsGesture/dvs_gestures_events.hdf5` exists and generates it if it does not.
Similarly, to eliminate read/write conflicts on the dataset across multiple parallel tasks, please enter the `data/` directory and execute:
cd data/
python3 dvs-hdf5-gen.py
Set up the environment variables and enter the corresponding experiment folder:
cd experiments/rnn-ptb-120-0.5-0.6-bias
Edit the code of the parallel task manager, especially the available GPUs, seeds, and training/testing modes. For example, if there are two available GPUs:
gpus = 2
To train the RNNs using seed=1111, 2222, 3333, 4444, 5555:
PTB_XNN_Parameter["nhid"] = 120
PTB_XNN_Parameter["mode"] = "train1"
PTB_XNN_Parameter["seed"] = seed_list # or [1111, 2222, 3333, 4444, 5555]
PTB_XNN_Parameter["ratio"] = [0] # DO NOT set a list with more than one element
PTB_XNN_Parameter["model"] = ["rnn"]
Note that `nhid` is the number of neurons per layer. Since `nlayers` is set to 2 by default, this network contains 240 RNN neurons in total.
There will be 5 parallel tasks and they will be dispatched on the two GPUs in a round-robin fashion.
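As a rough illustration of the dispatch described above, the sketch below expands a parameter dictionary into one task per (model, seed, ratio) combination and assigns GPUs round-robin. It is an assumption about the behavior, not the repository's `multi_task` implementation; `plan_tasks` is a hypothetical helper.

```python
from itertools import product


def plan_tasks(params: dict, gpus: int) -> list:
    """Expand a parameter dictionary into (gpu, model, seed, ratio) tuples.

    Task i is placed on GPU i % gpus, i.e. plain round-robin assignment.
    """
    combos = product(params["model"], params["seed"], params["ratio"])
    return [
        (index % gpus, model, seed, ratio)
        for index, (model, seed, ratio) in enumerate(combos)
    ]


params = {
    "model": ["rnn"],
    "seed": [1111, 2222, 3333, 4444, 5555],
    "ratio": [0],
}
plan = plan_tasks(params, gpus=2)
for gpu, model, seed, ratio in plan:
    print(f"gpu={gpu} model={model} seed={seed} ratio={ratio}")
```

With one model, five seeds, and one ratio, the plan contains five tasks that alternate between GPU 0 and GPU 1. Adding a second model to the list would double the task count under this scheme.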
If you want to train the SNNs with the same settings (network architecture, learning rate, training epochs, etc.), please extend the `model` list, for example:
PTB_XNN_Parameter["model"] = ["rnn", "snn"]
To enable these parallel training tasks, please uncomment the execution statement.
# RNN/SNN-Train/Test and HSTNN-Stage1-Train/Test
multi_task(path, PTB_XNN_Parameter)
Finally, execute the script in the current directory:
python mprun-rnn-ptb.py
During training, the logs are written to the `log/` folder, for example `train_ptb_rnn_1111.log`.
After training, we can test the performance of the best model by changing the working mode to `test1` and re-running the script: `python mprun-rnn-ptb.py`
PTB_XNN_Parameter["mode"] = "test1"
The performance will also be saved in the `log` folder, in a file named like `test_ptb_rnn_1111.log`.
To train and evaluate the directly hybrid networks (marked as `ffs` in the parallel task manager), please set `ratio` properly. It denotes the SNN ratio in each layer. For example,
PTB_FFS_Parameter["nhid"] = 120
PTB_FFS_Parameter["mode"] = "train1"
PTB_FFS_Parameter["seed"] = seed_list # or [1111, 2222, 3333, 4444, 5555]
PTB_FFS_Parameter["ratio"] = [0.75]
PTB_FFS_Parameter["model"] = ["ffs"]
In this setting, the directly hybrid networks will have two RNN/SNN hybrid layers, with 90 SNN neurons and 30 RNN neurons in each layer. Please ensure that the product of `nhid` and `ratio` is an integer. To train instances with different SNN ratios, extend the `ratio` list. Currently, the authors mainly use `0.05, 0.25, 0.5, 0.75, 0.95` as the basic list and `0.15, 0.35, 0.55, 0.65, 0.85` as the extended list.
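The integer requirement on `nhid` times `ratio` can be checked before launching any task. The sketch below is illustrative only; `split_neurons` is a hypothetical helper, not a function from this repository. It uses exact fractions because binary floating point can make a product such as 120 * 0.05 deviate slightly from a whole number.

```python
from fractions import Fraction


def split_neurons(nhid: int, ratio: float) -> tuple:
    """Return (snn_count, rnn_count) for one hybrid layer.

    Raises ValueError when nhid * ratio is not a whole number.
    """
    # Going through str() keeps the decimal value the user typed, e.g. 0.05 -> 1/20.
    snn = Fraction(str(ratio)) * nhid
    if snn.denominator != 1:
        raise ValueError(f"nhid * ratio = {float(snn)} is not an integer")
    return int(snn), nhid - int(snn)


basic = [0.05, 0.25, 0.5, 0.75, 0.95]
extended = [0.15, 0.35, 0.55, 0.65, 0.85]
for ratio in basic + extended:
    snn_count, rnn_count = split_neurons(120, ratio)
    print(f"ratio={ratio}: {snn_count} SNN + {rnn_count} RNN neurons per layer")
```

With `nhid = 120`, every ratio in the basic and extended lists produces a whole-number split, and a ratio of 0.75 gives the 90/30 split mentioned above.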
Similarly, to enable the parallel tasks, please uncomment the corresponding statement:
# FFS-Train/Test
multi_task(path, PTB_FFS_Parameter)
After training, we can also evaluate the best model on the test set by setting the working mode to `test1`. The training/testing logs for each instance will be saved in the `log/` folder, for example `train_ptb_ffs_1111_0.05.log` and `test_ptb_ffs_1111_0.05.log`.
The HSTNN training process has three stages (Adaption, Selection, and Restoration). The first stage can be merged into the training of pure RNNs/SNNs (because the redundant HSTNN has no ratio), while the third stage is similar to the training of the directly hybrid networks. Therefore, to obtain a trained HSTNN in the Adaption stage, modify or extend the `model` list in `PTB_XNN_Parameter`:
PTB_XNN_Parameter["model"] = ["rnn", "snn", "hybrid"]
The settings of `nhid` and `nlayer` determine the architecture of the redundant HSTNN: there are `nlayer` hybrid layers, each with `nhid` RNN/LSTM neurons and `nhid` SNN neurons.
Currently, the first training stages of the RNN/LSTM, SNN, and redundant HSTNN all share the same hyperparameters. Similarly, after training in the adaption stage, we can use `test1` to evaluate the performance of the redundant HSTNN, which will be reported in `log/test_ptb_hybrid_1111.log` (without the ratio).
The selection and restoration stages are both integrated in the working mode `train2`. Note that only the HSTNNs have the `train2` and `test2` modes, and that the selection and restoration stages can only be run after the adaption stage has finished.
PTB_HBR_Parameter["mode"] = "train2"
After training in the restoration stage, we can get the final HSTNN performance by setting the working mode to `test2`.
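The mode rules described above can be captured in a small guard. This is a sketch under assumptions: `check_mode` is a hypothetical helper, and it assumes that the HSTNN model is identified by the string `"hybrid"`, as the model list and log names in this README suggest.

```python
STAGE1_MODES = {"train1", "test1"}  # pure RNN/SNN, directly hybrid, and HSTNN adaption
STAGE2_MODES = {"train2", "test2"}  # HSTNN selection + restoration only

# Intended order of the working modes for an HSTNN run.
HSTNN_MODE_ORDER = ["train1", "test1", "train2", "test2"]


def check_mode(model: str, mode: str) -> None:
    """Raise ValueError if the working mode is not valid for the given model."""
    if mode in STAGE1_MODES:
        return
    if mode in STAGE2_MODES:
        if model != "hybrid":
            raise ValueError(f"mode {mode!r} is only available for HSTNNs, not {model!r}")
        return
    raise ValueError(f"unknown working mode: {mode!r}")


check_mode("rnn", "train1")     # accepted
check_mode("hybrid", "train2")  # accepted
try:
    check_mode("snn", "train2")
except ValueError as err:
    print("rejected:", err)
```

The guard does not verify that the adaption stage has actually finished. That check would depend on the checkpoint layout, which is not described here.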
Similarly, to enable the parallel tasks, please uncomment the corresponding statement:
# HBR-Stage2-Train/Test
multi_task(path, PTB_HBR_Parameter)
The training/testing logs for each instance will be saved in the `log/` folder, for example `train_ptb_hybrid_1111_0.05.log` and `test_ptb_hybrid_1111_0.05.log`.
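The log file names shown in this README appear to follow one pattern: phase, dataset, model, seed, and an optional ratio. The sketch below reconstructs that pattern from the examples only. `log_name` is a hypothetical helper, and the repository's own naming code may differ.

```python
from typing import Optional


def log_name(mode: str, dataset: str, model: str, seed: int,
             ratio: Optional[float] = None) -> str:
    """Build a log path in the style of the examples in this README."""
    # "train1"/"train2" map to "train"; "test1"/"test2" map to "test".
    phase = "train" if mode.startswith("train") else "test"
    stem = f"{phase}_{dataset}_{model}_{seed}"
    # Pure networks and the redundant (stage 1) HSTNN carry no ratio suffix.
    if ratio is not None:
        stem += f"_{ratio}"
    return f"log/{stem}.log"


print(log_name("train1", "ptb", "rnn", 1111))
print(log_name("test1", "ptb", "hybrid", 1111))
print(log_name("test2", "ptb", "hybrid", 1111, 0.05))
```

Such a helper can be handy for checking which of the expected logs already exist before rerunning a batch of tasks.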
- The checkpoints are saved in the `checkpoint/` directory.
- Please set the `multi_task(xxx)` statements properly. When training HSTNNs in the restoration stage, please comment out the other `multi_task(xxx)` statements.
- The `traces/` directory stores the Hessian traces and is used by `pyhessian/`.
- The script `latex-code-gen.py` summarizes the performance of pure NNs, directly hybrid networks, and HSTNNs. It is best to run it after all `test1` and `test2` tasks have finished: `python latex-code-gen.py`.
Training HSTNNs with multiple random seeds and hybrid ratios is very time-consuming, so the authors list some of their practices and profiling data here to help users.
Software: The HSTNN environment is set up as a virtual environment in Anaconda 22.9.0. The versions of Python, PyTorch, and CUDA are 3.9.16, 1.13.1, and 11.7, respectively.
Hardware: All experiments were run on a server equipped with two-way Intel Xeon Gold [email protected] and 768GB DDR4-2933 memory. There are two NVIDIA RTX 3080Ti GPUs, each with 12GB of GDDR6X device memory.
CPU Core, CUDA Memory, and Main Memory Usage: The profiled data are reference values only, especially the epoch time, which is easily affected by workload, GPU utilization, temperature, and other factors. All listed epoch times were collected under high workload.
Experiment | CPU Core | CUDA Memory | Main Memory | Epoch Time |
---|---|---|---|---|
RNN-PTB | 1 per instance | 600 MB ~ 650 MB | 1.9 GB ~ 2.0 GB | 30s ~ 60s |
RNN-SMNIST | 1 per instance | 350 MB ~ 400 MB | 1.9 GB ~ 2.0 GB | 33s ~ 47s |
RCNN-DVSGes | 1 per instance | 7000 MB ~ 11500 MB | 3.8 GB ~ 4.4 GB | 20s ~ 30s (per 20 epochs) |
RCNN-NMNIST | 1 per instance | 4000 MB ~ 9000 MB | 11.0 GB ~ 12.0 GB | 100s ~ 200s |
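The CUDA memory column gives a rough upper bound on how many instances fit on one GPU. The sketch below is only back-of-the-envelope arithmetic: it takes the upper end of each range in the table, assumes 12 GB corresponds to 12 * 1024 MB, and ignores CPU cores, main memory, and the slowdown from sharing a GPU. `max_instances_per_gpu` is a hypothetical helper.

```python
# Upper end of the profiled CUDA memory range per instance, in MB (from the table).
PEAK_CUDA_MB = {
    "RNN-PTB": 650,
    "RNN-SMNIST": 400,
    "RCNN-DVSGes": 11500,
    "RCNN-NMNIST": 9000,
}


def max_instances_per_gpu(experiment: str, gpu_memory_mb: int = 12 * 1024,
                          reserve_mb: int = 0) -> int:
    """Estimate how many instances fit in device memory at their profiled peak."""
    usable = gpu_memory_mb - reserve_mb
    return usable // PEAK_CUDA_MB[experiment]


for name in PEAK_CUDA_MB:
    print(f"{name}: up to {max_instances_per_gpu(name)} instance(s) per 12 GB GPU")
```

By this estimate, the two RCNN experiments fit only one instance per 12 GB GPU, while the RNN experiments leave room for many concurrent instances. In practice, the epoch time under high workload is likely the tighter limit.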