
Uncovering the Text Embedding in Text-to-Image Diffusion Models

This is the official implementation of the paper "Uncovering the Text Embedding in Text-to-Image Diffusion Models".

Requirements

Our code is based on stable-diffusion. This project requires a single GPU with 48 GB of memory. First, clone the repository and build the environment:

git clone https://github.com/wuqiuche/DiffusionDisentanglement
cd DiffusionDisentanglement
conda env create -f environment.yaml
conda activate ldm

You will also need to download the pretrained stable-diffusion model:

mkdir -p models/ldm/stable-diffusion-v1
wget -O models/ldm/stable-diffusion-v1/model.ckpt https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/resolve/main/sd-v1-4.ckpt

Methods

Controllable Image Editing via Text Embedding Manipulation.
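
The editing pipeline operates on the per-token CLIP text embedding that conditions the diffusion U-Net. As a rough illustration of the idea (not the repository's actual code), the sketch below encodes two prompts with the CLIP ViT-L/14 text encoder used by Stable Diffusion v1 and swaps one token's embedding; all variable names and the swap position are ours.

import torch
from transformers import CLIPTokenizer, CLIPTextModel

# Stable Diffusion v1 conditions on CLIP ViT-L/14 text embeddings (77 x 768).
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

def embed(prompt):
    tokens = tokenizer(prompt, padding="max_length", max_length=77, return_tensors="pt")
    with torch.no_grad():
        return encoder(tokens.input_ids).last_hidden_state  # [1, 77, 768]

source = embed("a photo of a cat")   # embedding to edit
target = embed("a photo of a dog")   # embedding providing the replacement token

edited = source.clone()
edited[:, 5, :] = target[:, 5, :]    # hypothetical: swap the token at position 5 ("cat" -> "dog")
# `edited` would then replace the usual text conditioning passed to the diffusion U-Net.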

Semantic Directions in SVD of Text Embedding.
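
The semantic directions come from a singular value decomposition of the text embedding matrix. A minimal sketch of the idea, reusing the illustrative embed helper above: decompose the [77, 768] embedding, then shift it along a right singular direction to edit a semantic attribute. The direction index and strength below are hypothetical.

# Minimal SVD sketch (our illustration, not the repository's exact procedure).
E = embed("a photo of a cat")[0]           # [77, 768] token-by-dimension matrix
U, S, Vh = torch.linalg.svd(E, full_matrices=False)

k, alpha = 0, 2.0                          # hypothetical direction index and strength
shifted = E + alpha * Vh[k].unsqueeze(0)   # broadcast the direction over all tokens
edited = shifted.unsqueeze(0)              # back to [1, 77, 768] for conditioning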

Apply diverse operations to the text embedding by changing the --mode flag:

/bin/bash edit_all.sh

Run swap editing on the generated ImageNet-R-TI2I dataset:

/bin/bash edit_swap.sh

Calculate the scores on the ImageNet-R-TI2I dataset:

/bin/bash score.sh
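
score.sh aggregates the editing metrics over the dataset; the exact metrics are defined in the script. As one common choice for text-to-image editing evaluation, a CLIP similarity between the edited image and the target prompt can be computed as below (purely illustrative, with our own names and a hypothetical file path).

import torch
from PIL import Image
from transformers import CLIPProcessor, CLIPModel

# Illustrative CLIP text-image similarity; the actual metrics live in score.sh.
model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

def clip_score(image_path, prompt):
    inputs = processor(text=[prompt], images=Image.open(image_path),
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    return (img * txt).sum(-1).item()      # cosine similarity in [-1, 1]

print(clip_score("edited.png", "a photo of a dog"))  # hypothetical file and prompt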

Results

Our method supports Object Replace, Action Edit, Fader Control, Style Transfer, and Semantic Directions.

Parent Repository

This code is adapted from https://github.com/CompVis/stable-diffusion and https://github.com/UCSB-NLP-Chang/DiffusionDisentanglement.
