Questions about fine-tuning ControlNet-Tile to achieve super-resolution #73

Open
snowflakewang opened this issue Jul 19, 2024 · 4 comments

Comments

@snowflakewang

Hello, thank you for your great work on high-resolution image-to-3D generation!
I noticed that you used a ControlNet-Tile based on SD1.5 for the first stage of super-resolution. I am curious which data you used to fine-tune it. Fine-tuning a ControlNet usually requires paired data (e.g. image-normal, image-depth, or LR-HR pairs).

Thank you :)

@wukailu
Contributor

wukailu commented Jul 30, 2024

We use pairs of multiview high-resolution and low-resolution images. The multiview images come from Blender renderings of the Objaverse dataset.

@snowflakewang
Author

Thank you for your reply! Do you mean that you render the Objaverse 3D dataset at two different resolutions (one relatively high and one relatively low) to construct the data pairs?

@wukailu
Contributor

wukailu commented Jul 30, 2024

Yes, we use a (256, 512) resolution pair for the first stage of super-resolution training. The 256-resolution image is augmented by downsampling it to a random resolution, upsampling it back to 256, and adding some random noise, which yields a 256-resolution image with artifacts. This lets the super-resolution model at this stage correct some minor errors in generation.
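For readers trying to reproduce this, here is a minimal sketch of the degradation described above (random downsample, upsample back to 256, add noise). It is not the authors' code: the nearest-neighbour resampling, the `min_size` lower bound, and the `noise_std` value are all assumptions chosen for illustration, since the thread does not state the interpolation method, the resolution range, or the noise level.

```python
import numpy as np


def resize_nearest(img: np.ndarray, size: int) -> np.ndarray:
    """Resize an HxWxC array to size x size with nearest-neighbour sampling."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size  # source row for each output row
    cols = np.arange(size) * w // size  # source column for each output column
    return img[rows[:, None], cols[None, :]]


def degrade(lr: np.ndarray, rng: np.random.Generator,
            min_size: int = 64, noise_std: float = 0.02) -> np.ndarray:
    """Downsample to a random size, upsample back, then add Gaussian noise.

    lr is a float array in [0, 1] with shape (256, 256, 3).
    min_size and noise_std are hypothetical values, not taken from the thread.
    """
    full = lr.shape[0]
    small = int(rng.integers(min_size, full + 1))  # random intermediate resolution
    resampled = resize_nearest(resize_nearest(lr, small), full)
    noisy = resampled + rng.normal(0.0, noise_std, size=resampled.shape)
    return np.clip(noisy, 0.0, 1.0).astype(np.float32)


rng = np.random.default_rng(0)
lr_clean = rng.random((256, 256, 3), dtype=np.float32)  # stand-in for a 256x256 render
lr_degraded = degrade(lr_clean, rng)  # conditioning image, paired with the 512 render
```

In a real pipeline a bicubic or bilinear resize from an image library would probably be a closer match to typical generation artifacts than nearest-neighbour sampling; that choice is left open here.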

@Learningm

Learningm commented Dec 11, 2024

@wukailu Hi, I would like to ask about some details of training ControlNet-Tile.
I tried to re-implement it with diffusers by fine-tuning the tile model, tiling 4 views into one sheet.
Data processing details:
low-resolution: render at 256x256, tile the 4 views (256x256 each) into a 512x512 image, then upsample to 1024x1024
high-resolution: render at 512x512, tile the 4 views (512x512 each) into a 1024x1024 image
How should the background color be set? Should a white background be composited behind the alpha channel?
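The data processing above could be sketched roughly as follows. This is only an illustration of the setup described in this comment: the white background is the option being asked about, not something the maintainers have confirmed, and the straight (non-premultiplied) alpha, the row-major 2x2 layout, and the pixel-repeat upsampling are all assumptions. The helper names are hypothetical.

```python
import numpy as np


def composite_on_white(rgba: np.ndarray) -> np.ndarray:
    """Blend an HxWx4 float image in [0, 1] over white (assumes straight alpha)."""
    rgb, alpha = rgba[..., :3], rgba[..., 3:4]
    return rgb * alpha + (1.0 - alpha)


def tile_2x2(views: list) -> np.ndarray:
    """Arrange four HxWx3 views into one (2H)x(2W)x3 sheet, row-major."""
    top = np.concatenate(views[:2], axis=1)
    bottom = np.concatenate(views[2:4], axis=1)
    return np.concatenate([top, bottom], axis=0)


def upsample_2x(img: np.ndarray) -> np.ndarray:
    """Double height and width by repeating pixels (a stand-in for a real resize)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)


# Stand-ins for four RGBA renders per resolution (fully transparent here).
lr_views = [composite_on_white(np.zeros((256, 256, 4), dtype=np.float32)) for _ in range(4)]
hr_views = [composite_on_white(np.zeros((512, 512, 4), dtype=np.float32)) for _ in range(4)]

control = upsample_2x(tile_2x2(lr_views))  # 512x512 sheet upsampled to 1024x1024
target = tile_2x2(hr_views)                # 1024x1024 sheet
```

Whatever background is chosen, using the same one for both the conditioning sheet and the target sheet keeps the pair consistent.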

Then I ran into a color shift problem, similar to the one mentioned in lllyasviel/ControlNet-v1-1-nightly#125 (comment)

Could you please suggest how to solve this problem? Thank you.


3 participants