PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models
Yiming Zhang*, Zhening Xing*, Yanhong Zeng†, Youqing Fang, Kai Chen†
(*equal contribution, †corresponding author)
PIA is a personalized image animation method which can generate videos with high motion controllability and strong text and image alignment.
[2024/01/03] Add Replicate Demo & API!
[2024/01/03] Add third-party Colab!
[2023/12/28] PIA can animate a 1024x1024 image with just 16GB of GPU memory using scaled_dot_product_attention!
[2023/12/25] HuggingFace demo is available now! 🤗 Hub
[2023/12/22] Release the model and demo of PIA. Try it to make your personalized movie!
- Online Demo on OpenXLab
- Checkpoint on Google Drive or HuggingFace
Use the following commands to install PyTorch==2.0.0 and the other dependencies:
conda env create -f environment-pt2.yaml
conda activate pia
If you want to use a lower version of PyTorch (e.g. 1.13.1), use the following commands instead:
conda env create -f environment.yaml
conda activate pia
We strongly recommend PyTorch==2.0.0, which supports scaled_dot_product_attention for memory-efficient image animation.
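To confirm that the active environment actually exposes scaled_dot_product_attention, a small check along the following lines may help. This is only a sketch: it assumes PyTorch, if installed, is importable as `torch`, and the helper name `has_sdpa` is made up for illustration and is not part of PIA.

```python
import importlib.util


def has_sdpa() -> bool:
    """Return True if torch.nn.functional.scaled_dot_product_attention exists.

    `has_sdpa` is a hypothetical helper name, not part of PIA or PyTorch.
    """
    # Avoid an ImportError when PyTorch is not installed at all.
    if importlib.util.find_spec("torch") is None:
        return False
    import torch.nn.functional as F

    # The function was added in PyTorch 2.0; older releases lack the attribute.
    return hasattr(F, "scaled_dot_product_attention")


if __name__ == "__main__":
    print("scaled_dot_product_attention available:", has_sdpa())
```

If this prints `False` inside the `pia` environment, the installed PyTorch is probably older than 2.0.0.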
conda install git-lfs
git lfs install
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 models/StableDiffusion/
git clone https://huggingface.co/Leoxing/PIA models/PIA/
bash download_bashscripts/1-RealisticVision.sh
bash download_bashscripts/2-RcnzCartoon.sh
bash download_bashscripts/3-MajicMix.sh
You can also download pia.ckpt manually from the links on Google Drive or HuggingFace.
Put checkpoints as follows:
└── models
    ├── DreamBooth_LoRA
    │   ├── ...
    ├── PIA
    │   ├── pia.ckpt
    └── StableDiffusion
        ├── vae
        ├── unet
        └── ...
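Before running inference, it can be useful to verify that the layout above is in place. The sketch below checks only the paths named in the tree, relative to the repository root; the names `EXPECTED` and `missing_checkpoints` are illustrative and not part of the PIA code base.

```python
from pathlib import Path

# Paths taken from the layout above, relative to the repository root.
EXPECTED = [
    "models/DreamBooth_LoRA",
    "models/PIA/pia.ckpt",
    "models/StableDiffusion/vae",
    "models/StableDiffusion/unet",
]


def missing_checkpoints(root="."):
    """Return the expected paths that do not exist under `root`."""
    base = Path(root)
    return [p for p in EXPECTED if not (base / p).exists()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing:", *missing, sep="\n  ")
    else:
        print("All expected checkpoint paths are present.")
```

Run it from the repository root; any path it lists still needs to be downloaded or moved into place.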
Image-to-video results can be obtained by running:
python inference.py --config=example/config/lighthouse.yaml
python inference.py --config=example/config/harry.yaml
python inference.py --config=example/config/majic_girl.yaml
After running the commands above, you can find the results in example/result.
You can control the motion magnitude with the --magnitude parameter:
python inference.py --config=example/config/xxx.yaml --magnitude=0 # Small Motion
python inference.py --config=example/config/xxx.yaml --magnitude=1 # Moderate Motion
python inference.py --config=example/config/xxx.yaml --magnitude=2 # Large Motion
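To compare all three settings for one input, the commands above can be generated and run in a loop. The following is a minimal sketch that uses only the `--config` and `--magnitude` flags shown above; the function names `magnitude_commands` and `run_all` are hypothetical, and the script is assumed to be run from the repository root.

```python
import subprocess
import sys


def magnitude_commands(config, magnitudes=(0, 1, 2)):
    """Build one inference command per motion magnitude.

    Only the --config and --magnitude flags from the README are used.
    """
    return [
        [sys.executable, "inference.py", f"--config={config}", f"--magnitude={m}"]
        for m in magnitudes
    ]


def run_all(config):
    """Run the commands one after another; stop on the first failure."""
    for cmd in magnitude_commands(config):
        subprocess.run(cmd, check=True)
```

For example, `run_all("example/config/lighthouse.yaml")` would launch inference three times, once each for small, moderate, and large motion.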
Examples:
python inference.py --config=example/config/labrador.yaml
python inference.py --config=example/config/bear.yaml
python inference.py --config=example/config/genshin.yaml
Input Image | Small Motion | Moderate Motion | Large Motion
a golden labrador is running | | |
1bear is walking, ... | | |
cherry blossom, ... | | |
To achieve style transfer, run inference with the --style_transfer flag (don't forget to set the base model in xxx.yaml):
Examples:
python inference.py --config example/config/concert.yaml --style_transfer
python inference.py --config example/config/anya.yaml --style_transfer
Input Image | 1man is smiling | 1man is crying | 1man is singing
Realistic Vision | | |
RCNZ Cartoon 3d | | |

| 1girl smiling | 1girl open mouth | 1girl is crying, pout
RCNZ Cartoon 3d | | |
You can generate a looping video with the --loop parameter:
python inference.py --config=example/config/xxx.yaml --loop
Examples:
python inference.py --config=example/config/lighthouse.yaml --loop
python inference.py --config=example/config/labrador.yaml --loop
Input Image | lightning, lighthouse | sun rising, lighthouse | fireworks, lighthouse
Input Image | labrador jumping | labrador walking | labrador running
We provide a training script for PIA. It borrows heavily from AnimateDiff, so please prepare the dataset and configuration files according to the AnimateDiff guideline.
After preparation, you can train the model by running the following command using torchrun:
torchrun --nnodes=1 --nproc_per_node=1 train.py --config example/config/train.yaml
We have open-sourced AnimateBench on HuggingFace. It includes images, prompts, and configs for evaluating PIA and other image animation methods.
Yiming Zhang: [email protected]
Zhening Xing: [email protected]
Yanhong Zeng: [email protected]
The code is built upon AnimateDiff, Tune-a-Video, and PySceneDetect.