PyTorch implementation for Cross-Modal Dynamic Transfer Learning for Multi-Modal Emotion Recognition
Cross-Modal Dynamic Transfer Learning for Multi-Modal Emotion Recognition (Accepted at IEEE Access)
We propose Cross-Modal Dynamic Transfer Learning (CDaT), a representation learning method that dynamically filters low-confident modalities and complements them with high-confident ones through uni-modal masking and cross-modal representation transfer learning. An auxiliary network learns confidence scores that determine which modality is low-confident and how much should be transferred from the other modalities. Because the transfer operates on low-level uni-modal representations through a probabilistic knowledge transfer loss, CDaT can be combined with any fusion model in a model-agnostic way.
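As a rough, non-authoritative illustration of the idea (the paper and the code in this repo define the actual architecture and losses), the PyTorch sketch below shows one way confidence-guided cross-modal transfer can be wired up. The names `pkt_loss`, `ConfidenceNet`, and `dynamic_transfer_loss`, the similarity-based PKT formulation, and the confidence-gap weighting are illustrative assumptions, not the repository's API.

```python
# NOTE: illustrative sketch only; see the paper and this repo for the real implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


def pkt_loss(student: torch.Tensor, teacher: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """Probabilistic knowledge transfer: align the pairwise cosine-similarity
    distributions of two feature batches via a KL divergence."""
    s = F.normalize(student, p=2, dim=1)
    t = F.normalize(teacher, p=2, dim=1)
    s_sim = (s @ s.t() + 1.0) / 2.0                    # similarities rescaled to [0, 1]
    t_sim = (t @ t.t() + 1.0) / 2.0
    s_prob = s_sim / s_sim.sum(dim=1, keepdim=True)    # row-wise conditional probabilities
    t_prob = t_sim / t_sim.sum(dim=1, keepdim=True)
    return (t_prob * torch.log((t_prob + eps) / (s_prob + eps))).sum(dim=1).mean()


class ConfidenceNet(nn.Module):
    """Auxiliary network that predicts a per-sample confidence score for one modality."""

    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim // 2), nn.ReLU(),
            nn.Linear(dim // 2, 1), nn.Sigmoid(),
        )

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        return self.net(feat).squeeze(-1)              # shape: (batch,)


def dynamic_transfer_loss(feat_a, feat_b, conf_net_a, conf_net_b):
    """Transfer from the more confident modality to the less confident one,
    weighted by the size of the confidence gap."""
    conf_a = conf_net_a(feat_a)
    conf_b = conf_net_b(feat_b)
    gap = (conf_a - conf_b).mean()
    if gap >= 0:                                       # modality a is more confident: a teaches b
        return gap.abs() * pkt_loss(feat_b, feat_a.detach())
    return gap.abs() * pkt_loss(feat_a, feat_b.detach())


# Example with two hypothetical uni-modal feature batches (e.g., text and audio encoders)
text_feat = torch.randn(8, 128)
audio_feat = torch.randn(8, 128)
conf_text, conf_audio = ConfidenceNet(128), ConfidenceNet(128)
loss = dynamic_transfer_loss(text_feat, audio_feat, conf_text, conf_audio)
```

In this sketch, the lower-confidence modality's features are pulled toward the feature geometry of the higher-confidence modality, and the strength of the transfer scales with the confidence gap.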
- Clone this repo:

  ```bash
  git clone git@github.com:SoyeonHH/CDaT.git
  ```
- Set up the environment (requires conda) from the `environment.yml` file:

  ```bash
  conda env create -f environment.yml
  bash init.sh
  ```
- Download the dataset, put it in the `data` folder, and modify the paths in `train.py`:

  ```python
  mosei_data_dir = 'YOUR_PATH'
  iemocap_data_dir = 'YOUR_PATH'
  ```
- Train the model:

  ```bash
  bash train.sh
  ```
- Test the model:

  ```bash
  bash inference.sh
  ```

  Note that you should modify the model path and data path before running inference.
Some portions of the code were adapted from the TAILOR repo. We thank the authors for their great work.