We currently release the code and models for:

- ImageNet-1K pretraining
- ImageNet-1K pretraining + Token Labeling
- Large resolution fine-tuning
**03/6/2022**

Some models with `head_dim=64` are released, which reduce the memory cost for downstream tasks.

**01/19/2022**

- Pretrained models on ImageNet-1K with Token Labeling.
- Large resolution fine-tuning.

**01/13/2022**

[Initial commits]:

- Pretrained models on ImageNet-1K.
The following models and logs can be downloaded from Google Drive: total_models, total_logs.
We also release the models on Baidu Cloud: total_models (bdkq), total_logs (ttub).
| Model | Top-1 | #Param. | FLOPs | Model | Log | Shell |
| --- | --- | --- | --- | --- | --- | --- |
| UniFormer-S | 82.9 | 22M | 3.6G | | | run.sh |
| UniFormer-S† | 83.4 | 24M | 4.2G | | | run.sh |
| UniFormer-B | 83.8 | 50M | 8.3G | | - | run.sh |
| UniFormer-B+Layer Scale | 83.9 | 50M | 8.3G | | | run.sh |
Though Layer Scale is helpful for training deep models, we encountered some problems when fine-tuning on video datasets. Hence, we only use the models trained without it for video tasks.
Since UniFormer-S† uses `head_dim=32`, which incurs a high memory cost for downstream tasks, we re-train this model with `head_dim=64`. All models are trained at 224x224 resolution.
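The memory saving can be sketched with a toy calculation (this is an illustration, not the UniFormer code, and the channel/token counts below are example numbers): with a fixed channel dimension, the attention maps have shape `(num_heads, N, N)`, so doubling `head_dim` from 32 to 64 halves the number of heads and thus the attention-map memory.

```python
# Toy illustration (not the actual UniFormer code): for a fixed channel
# dimension, the attention maps occupy num_heads * N * N elements, so
# head_dim=64 uses half the attention-map memory of head_dim=32.
def attn_map_elements(channels, head_dim, num_tokens):
    num_heads = channels // head_dim
    return num_heads * num_tokens * num_tokens

# Example numbers: 512 channels, 14x14 = 196 tokens.
mem32 = attn_map_elements(channels=512, head_dim=32, num_tokens=196)
mem64 = attn_map_elements(channels=512, head_dim=64, num_tokens=196)
assert mem32 == 2 * mem64
```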
| Model | Top-1 | #Param. | FLOPs | Model | Log | Shell |
| --- | --- | --- | --- | --- | --- | --- |
| UniFormer-S† | 83.4 | 24M | 4.2G | | | run.sh |
The following models and logs can be downloaded from Google Drive: total_models, total_logs.
We also release the models on Baidu Cloud: total_models (p05h), total_logs (wsvi).
We follow LV-ViT to train our models with Token Labeling. Please see token_labeling for more details.
| Model | Top-1 | #Param. | FLOPs | Model | Log | Shell |
| --- | --- | --- | --- | --- | --- | --- |
| UniFormer-S | 83.4 (+0.5) | 22M | 3.6G | | | run.sh |
| UniFormer-S† | 83.9 (+0.5) | 24M | 4.2G | | | run.sh |
| UniFormer-B | 85.1 (+1.3) | 50M | 8.3G | | | run.sh |
| UniFormer-L+Layer Scale | 85.6 | 100M | 12.6G | | | run.sh |
Since UniFormer-S/S†/B use `head_dim=32`, which incurs a high memory cost for downstream tasks, we re-train these models with `head_dim=64`. All models are trained at 224x224 resolution.
| Model | Top-1 | #Param. | FLOPs | Model | Log | Shell |
| --- | --- | --- | --- | --- | --- | --- |
| UniFormer-S | 83.4 (+0.5) | 22M | 3.6G | | | run.sh |
| UniFormer-S† | 83.6 (+0.2) | 24M | 4.2G | | | run.sh |
| UniFormer-B | 84.8 (+1.0) | 50M | 8.3G | | | run.sh |
The following models and logs can be downloaded from Google Drive: total_models, total_logs.
We also release the models on Baidu Cloud: total_models (p05h), total_logs (wsvi).
We fine-tune the above models with Token Labeling at a resolution of 384x384. Please see token_labeling for more details.
| Model | Top-1 | #Param. | FLOPs | Model | Log | Shell |
| --- | --- | --- | --- | --- | --- | --- |
| UniFormer-S | 84.6 | 22M | 11.9G | | | run.sh |
| UniFormer-S† | 84.9 | 24M | 13.7G | | | run.sh |
| UniFormer-B | 86.0 | 50M | 27.2G | | | run.sh |
| UniFormer-L+Layer Scale | 86.3 | 100M | 39.2G | | | run.sh |
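As a back-of-the-envelope check on these numbers (assuming convolution cost scales linearly with token count): the token count grows quadratically with resolution, so (384/224)² ≈ 2.94. A purely convolutional model would therefore go from 3.6G to roughly 10.6G; the reported 11.9G for UniFormer-S is somewhat higher, consistent with global self-attention growing quadratically in the token count.

```python
# Quick arithmetic check on the 224 -> 384 fine-tuning FLOPs for UniFormer-S.
token_scale = (384 / 224) ** 2          # ~2.94x more tokens at 384x384
conv_only_estimate = 3.6 * token_scale  # ~10.6 GFLOPs if everything scaled linearly
assert 10.5 < conv_only_estimate < 10.7
assert conv_only_estimate < 11.9        # measured FLOPs are higher (attention term)
```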
Our repository is built on the DeiT repository, but we add some useful features:
- Calculating accurate FLOPs and parameters with fvcore (see check_model.py).
- Auto-resuming.
- Saving best models and backup models.
- Generating training curve (see generate_tensorboard.py).
- Clone this repo:

  ```shell
  git clone https://github.com/Sense-X/UniFormer.git
  cd UniFormer
  ```

- Install PyTorch 1.7.0+ and torchvision 0.8.1+:

  ```shell
  conda install -c pytorch pytorch torchvision
  ```

- Install other packages:

  ```shell
  pip install timm
  pip install fvcore
  ```
Simply run the training scripts in `exp` as follows:

```shell
bash ./exp/uniformer_small/run.sh
```

If training is interrupted abnormally, you can simply rerun the script to auto-resume. Sometimes the latest checkpoint may not be saved properly; in that case, set the resumed model via `--resume ${work_path}/ckpt/backup.pth`.
Simply run the evaluation scripts in `exp` as follows:

```shell
bash ./exp/uniformer_small/test.sh
```

It evaluates the last model by default. You can select another model via `--resume`.
You can generate the training curves as follows:

```shell
python3 generate_tensorboard.py
```

Note that you should install `tensorboardX` first.
You can calculate the FLOPs and parameters via:

```shell
python3 check_model.py
```
This repository is built using the timm library and the DeiT repository.