Rethinking Diffusion for Text-Driven Human Motion Generation
Zichong Meng
Yiming Xie
Xiaogang Peng
Zeyu Han
Huaizu Jiang
Northeastern University
arXiv 2024
- Release the clean code for implementation.
- Release the evaluation code and the pretrained models.
- Release the simple and minimalist version of the code for implementation.
- Release the updated version of AE weights and scripts.
- Will release the updated version of AE weights, with more support and scripts, soon after cleaning the code.
conda env create -f environment.yml
conda activate MARDM
We tested our code with Python 3.10.13, PyTorch 2.2.0, and CUDA 12.1.
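As an optional sanity check (not part of the original scripts), you can verify the installed PyTorch build and CUDA availability with:
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"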
rm -rf checkpoints
mkdir checkpoints
cd checkpoints
mkdir t2m
mkdir kit
cd t2m
echo -e "Downloading evaluation models for HumanML3D dataset"
gdown --fuzzy https://drive.google.com/file/d/1ejiz4NvyuoTj3BIdfNrTFFZBZ-zq4oKD/view?usp=sharing
echo -e "Unzipping humanml3d evaluators"
unzip evaluators_humanml3d.zip
echo -e "Cleaning humanml3d evaluators zip"
rm evaluators_humanml3d.zip
cd ../kit/
echo -e "Downloading pretrained models for KIT-ML dataset"
gdown --fuzzy https://drive.google.com/file/d/1kobWYZdWRyfTfBj5YR_XYopg9YZLdfYh/view?usp=sharing
echo -e "Unzipping kit evaluators"
unzip evaluators_kit.zip
echo -e "Cleaning kit evaluators zip"
rm evaluators_kit.zip
cd ../../
rm -rf glove
echo -e "Downloading glove (in use only by the evaluators)"
gdown --fuzzy https://drive.google.com/file/d/1cmXKUT31pqd7_XpJAiWEo1K81TMYHA5n/view?usp=sharing
unzip glove.zip
echo -e "Cleaning GloVe zip\n"
rm glove.zip
echo -e "Downloading done!"
cd checkpoints/t2m
echo -e "Downloading pretrained models for HumanML3D dataset"
gdown --fuzzy https://drive.google.com/file/d/1TBybFByAd-kD4AuFgMyR3ZBt4VV43Sif/view?usp=sharing
gdown --fuzzy https://drive.google.com/file/d/1csjlxi0uOhfPPEwiThsR0gaj7_VDmgb6/view?usp=sharing
gdown --fuzzy https://drive.google.com/file/d/1nWoEcN4rEFKi4Xyf_ObKinDmSQNPKXgU/view?usp=sharing
gdown --fuzzy https://drive.google.com/file/d/1nfX_j8VzMmynqKv8x68pXrsL3c0qWLXA/view?usp=sharing
echo -e "Unzipping"
unzip MARDM_SiT_XL.zip
unzip MARDM_DDPM_XL.zip
unzip length_estimator.zip
unzip AE_humanml3d.zip
echo -e "Cleaning zips"
rm MARDM_SiT_XL.zip
rm MARDM_DDPM_XL.zip
rm length_estimator.zip
rm AE_humanml3d.zip
cd ../../
You do not need to download the datasets if you only want to generate motions from textual instructions.
If you want to reproduce and evaluate our method, you can obtain both HumanML3D and KIT-ML following the instructions in the HumanML3D repository. By default, the data path is set to ./datasets.
For the dataset Mean and Std, you are welcome to use the eval_mean.npy and eval_std.npy provided in utils, or you can calculate them from your obtained dataset using:
python utils/cal_mean_std.py
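The script above is the supported way to do this. Purely for illustration, a minimal sketch of the underlying computation is shown below; the ./datasets/HumanML3D/new_joint_vecs path, the output file names, and the assumption that each sample is a .npy array of shape (num_frames, feature_dim) are our own and not necessarily what utils/cal_mean_std.py uses:

import glob
import numpy as np

# Illustrative input location: per-sample motion feature arrays of shape (num_frames, feature_dim).
feature_files = glob.glob("./datasets/HumanML3D/new_joint_vecs/*.npy")

# Concatenate all frames from all clips, then take per-dimension statistics.
all_frames = np.concatenate([np.load(f) for f in feature_files], axis=0)
mean = all_frames.mean(axis=0)
std = all_frames.std(axis=0)

# Illustrative output names, mirroring the provided eval_mean.npy / eval_std.npy.
np.save("eval_mean.npy", mean)
np.save("eval_std.npy", std)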
python sample.py --name MARDM_SiT_XL --text_prompt "A person is running on a treadmill."
In a txt file, each line should follow the format <text description>#<motion length>.
You can put NA as the motion length to let the model determine the motion length
(if any line in the file uses NA, all other lines should use NA as well).
python sample.py --name MARDM_SiT_XL --text_path ./text_prompt.txt
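For reference, a hypothetical ./text_prompt.txt could look like the following (prompts and lengths are only examples; replace the lengths with NA on every line to let the model decide them):

A person is running on a treadmill.#120
A person walks forward, turns around, and sits down.#196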
python train_AE.py --name AE --dataset_name t2m --batch_size 256 --epoch 50 --lr_decay 0.05
# MARDM SiT-based (best results)
python train_MARDM.py --name MARDM_SiT_XL --model "MARDM-SiT-XL" --dataset_name t2m --batch_size 64 --ae_name AE
# MARDM DDPM-based
python train_MARDM.py --name MARDM_DDPM_XL --model "MARDM-DDPM-XL" --dataset_name t2m --batch_size 64 --ae_name AE
python train_AE.py --name AE --dataset_name kit --batch_size 512 --epoch 50 --lr_decay 0.1
# MARDM SiT-based (best results)
python train_MARDM.py --name MARDM_SiT_XL --model "MARDM-SiT-XL" --dataset_name kit --batch_size 16 --ae_name AE --milestones 20000
# MARDM DDPM-based
python train_MARDM.py --name MARDM_DDPM_XL --model "MARDM-DDPM-XL" --dataset_name kit --batch_size 16 --ae_name AE --milestones 20000
python evaluation_AE.py --name AE --dataset_name t2m
# MARDM SiT-based (best results)
python evaluation_MARDM.py --name MARDM_SiT_XL --model "MARDM-SiT-XL" --dataset_name t2m --cfg 4.5
# MARDM DDPM-based
python evaluation_MARDM.py --name MARDM_DDPM_XL --model "MARDM-DDPM-XL" --dataset_name t2m --cfg 4.5
python evaluation_AE.py --name AE --dataset_name kit
# MARDM SiT-based (best results)
python evaluation_MARDM.py --name MARDM_SiT_XL --model "MARDM-SiT-XL" --dataset_name kit --cfg 2.5
# MARDM DDPM-based
python evaluation_MARDM.py --name MARDM_DDPM_XL --model "MARDM-DDPM-XL" --dataset_name kit --cfg 2.5
python edit.py --name MARDM_SiT_XL -msec 0.3,0.6 --text_prompt "A man dancing around." --source_motion 000612.npy
This code stands on the shoulders of giants; we would like to thank the following contributors whose code ours is based on:
Our original raw implementation is heavily based on T2M, T2M-GPT, MMM, and MoMask. The diffusion part is primarily based on DDPM, DiT, SiT, MAR, HOI-Diff, InterGen, MDM, and MLD.
For the open-sourced version, we decided to restructure (and partially rewrite) the code into a simple and minimalist PyTorch implementation that gets rid of PyTorch Lightning's implicit hooks, out-of-scope variable usage, and implicit argparse calls. We hope this minimalist implementation leads to better code comprehension and encourages contributions from the motion generation community. Thank you.
If you find this repository useful for your work, please consider citing it as follows:
@article{meng2024rethinking,
title={Rethinking Diffusion for Text-Driven Human Motion Generation},
author={Meng, Zichong and Xie, Yiming and Peng, Xiaogang and Han, Zeyu and Jiang, Huaizu},
journal={arXiv preprint arXiv:2411.16575},
year={2024}
}