missing VAE encoder with DWT #77

ink1 · 2021-12-05T01:19:52Z

@shonenkov Great work everyone!
As far as I can tell, there is only a VAE decoder with DWT and no corresponding encoder.
Encoding with get_vae(dwt=True) produces the same number of tokens as get_vae(dwt=False) for the same image size, but the tokens themselves differ. The DWT decoder also doubles the original image size. The result is large but blurry, and I see quality loss even after downscaling it to the original image size. An image encoded and then decoded with the default VQ GAN model still looks better than one from the DWT model.
@bes-dev Is this because the model needs to be retrained end to end, as you mentioned in #42?
I would expect a compatible VAE DWT encoder to encode a 512x512 image into 1024 tokens, and the decoder to restore the image back to 512x512.
I think for now the VAE with DWT needs 256x256 image prompts rather than 512x512, but then the resulting quality is unfortunately not worth the effort. Looking forward to seeing DALL-E trained end to end on 512x512 images.
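
To make the size mismatch concrete, here is a rough sketch of the token arithmetic. It assumes the encoder downsamples each side by a factor of 8 (so 256x256 gives 32x32 = 1024 tokens), which is not confirmed in this thread, and that a single DWT level halves or doubles each side. The function names are hypothetical and are not part of the library.

```python
import math

# Assumption: the VQ GAN encoder shrinks each image side by a factor of 8.
DOWNSAMPLE = 8


def encoded_tokens(side: int, dwt_encoder: bool = False) -> int:
    """Token count for a square image with the given side length."""
    if dwt_encoder:
        # Hypothetical DWT encoder: one wavelet level halves each side first.
        side //= 2
    grid = side // DOWNSAMPLE
    return grid * grid


def decoded_side(tokens: int, dwt_decoder: bool = False) -> int:
    """Side length of the square image decoded from a square token grid."""
    side = math.isqrt(tokens) * DOWNSAMPLE
    if dwt_decoder:
        # The DWT decoder doubles each side via one inverse wavelet level.
        side *= 2
    return side


# Current behaviour: the plain encoder paired with the DWT decoder.
print(encoded_tokens(512), decoded_side(encoded_tokens(512), dwt_decoder=True))
# Expected behaviour: a matching DWT encoder gives a 512 -> 512 round trip.
print(encoded_tokens(512, dwt_encoder=True),
      decoded_side(encoded_tokens(512, dwt_encoder=True), dwt_decoder=True))
```

Under these assumptions, only a 256x256 prompt yields 1024 tokens with the existing encoder, which is why the DWT decoder then returns a 512x512 image.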

bes-dev · 2021-12-05T10:51:32Z

@ink1 Yes, the available checkpoint of the DWT VQVAE was trained for only a few iterations on a small dataset as a proof of concept. To reach production quality, it needs to be trained longer on a larger dataset. At the moment I don't have enough resources to do that, but I think the Sber guys will do it on their side.

RyPoints · 2021-12-31T20:17:33Z

@ink1 Same thing @bes-dev and I were talking about over here: bes-dev/vqvae_dwt_distiller.pytorch#1

Awaiting the retraining here as well.

ink1 mentioned this issue Dec 5, 2021

New 512x with image inputs #53

Open
