After preparing the data and before running the training script, set data_root in the scripts to your dataset directory, e.g.
data_root='./dataset'
Download the HowTo100M (link) meta files and videos, and organize the data as shown below:
Dataset
│
├── pretrain_dataset
│ ├── caption.json
│ └── videos_ht
Download the YT-Temporal (link) meta files and videos, and organize the data as shown below:
Dataset
│
├── pretrain_dataset
│ └── videos_yt
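Before launching pretraining, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: check_layout is a hypothetical helper, and it assumes the three paths shown in the trees above sit directly under the directory passed as its argument (the same directory data_root points to).

```shell
#!/usr/bin/env bash
# Sketch: verify the expected pretraining layout under a data root.
# check_layout is a hypothetical helper, not part of the repository.
check_layout() {
  local root="$1"
  local missing=0
  # Paths taken from the directory trees above.
  for path in \
    "pretrain_dataset/caption.json" \
    "pretrain_dataset/videos_ht" \
    "pretrain_dataset/videos_yt"; do
    if [ ! -e "$root/$path" ]; then
      echo "missing: $root/$path"
      missing=1
    fi
  done
  # Returns 0 when everything exists, 1 otherwise.
  return "$missing"
}
```

For example, `check_layout ./dataset && bash scripts/pretrain_mae_vam.sh` would start pretraining only when all three paths exist, and would otherwise print each missing path.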
Pretraining Script
bash scripts/pretrain_mae_vam.sh