NÜWA

NÜWA is a unified multimodal pre-trained model that can generate new visual data (images and videos) or manipulate existing visual data across 8 visual synthesis tasks, each illustrated below.

Text-to-Image (T2I)

t2i

Sketch-to-Image (S2I)

s2i

Image Completion (I2I)

i2i

Text-Guided Image Manipulation (TI2I)

ti2i

Text-to-Video (T2V)

t2v

Video Prediction (V2V)

v2v

Sketch-to-Video (S2V)

s2v

Text-Guided Video Manipulation (TV2V)

out_final