latent space #24
Comments
The model trained in latent space achieved a similar FID to the one trained in image space. I therefore believe that training from scratch on the COCO dataset has essentially reached the performance upper bound. The focus should not be on algorithm design but rather on the selection and quality of the training data, the use of pretrained models, or model scaling.
Training an LDM or ADM from scratch on COCO or VG alone is simply insufficient and results in poor generalization. This explains why the generation quality is so poor when a never-before-seen layout is given at test time. I believe this is primarily due to the inadequate quantity and poor quality of the data and annotations.
Thank you for your response. It was very helpful to me.
I modified gaussian_diffusion.py and then trained the model in the latent space. The best FID I achieved was 22.17733931330713. That is why I asked you about the FID for COCO 256 in the latent space; I would like to know whether my modifications were effective.
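For reference, the FID being compared here is the Fréchet distance between Gaussians fitted to feature statistics of real and generated images. A minimal NumPy sketch of that distance is below; note this is an illustrative toy, not the evaluation pipeline used in this repo, and in practice the means and covariances would come from Inception-v3 features of the two image sets rather than the toy values used here.

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2):
    ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * sqrtm(sigma1 @ sigma2))."""
    diff = mu1 - mu2
    # Matrix square root of the covariance product; discard tiny imaginary parts.
    covmean = linalg.sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))

# Toy check: identical Gaussians give distance 0; shifting the mean by a
# vector m adds exactly ||m||^2.
mu, sigma = np.zeros(4), np.eye(4)
print(round(frechet_distance(mu, sigma, mu, sigma), 6))           # -> 0.0
print(round(frechet_distance(mu, sigma, np.ones(4), sigma), 6))   # -> 4.0
```

Because the distance depends on the fitted covariances, FID values are only comparable when both sides use the same feature extractor and the same number of samples.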
Thank you for your previous response.
I am also curious: what is the best FID score you achieved when training the COCO 256x256 model in the latent space?
If you could answer, that would be great! Thanks!