
Question about repeating logic to explain in a tutorial video #640

Open
FurkanGozukara opened this issue Jul 12, 2023 · 10 comments

Comments

@FurkanGozukara

FurkanGozukara commented Jul 12, 2023

I am a software engineer, so if you can explain the logic I would appreciate it very much.

This is super important for me to understand.

What I want to achieve is using 100 different class images for every training image over the entire training. So for 10 training images, a total of 1000 class images would be used.

So in each epoch, each training image would be trained once alongside one different class image, repeated for 100 epochs.

@kohya-ss

First case

Let's assume I don't use class images.

My repeat count is set to 20.

I have 10 train images.

So what happens in 1 epoch?

Does it train each training image 20 times? Is that equivalent to 20 epochs?

Case 2

My training-image repeat count is set to 20.

I have 10 train images

I have 10000 class images

Class image repeat count is set to 1

So what happens in 1 epoch?

Does it train each training image 20 times, each time with a different class image? Is every training image paired with a different class image? What is the logic?

Case 3

My training-image repeat count is set to 1.

I have 10 train images

I have 10000 class images

Class image repeat count is set to 20

So what happens in 1 epoch?

Does it train each training image once, then train 20 different class images? I really have no idea about this one.

Case 4

My training-image repeat count is set to 20.

I have 10 train images

I have 10000 class images

Class image repeat count is set to 30

Now what would happen in 1 epoch?

Thank you so much for the explanations.

@FurkanGozukara
Author

@kohya-ss I have done over 15 training runs with SDXL.
Hopefully I will release a tutorial soon.
I hope you can give me more info regarding the repeating logic.

@kohya-ss
Owner

kohya-ss commented Jul 25, 2023

Thank you for creating detailed tutorials for my repository. I truly appreciate it. I'm sorry for the delayed response.

Train images and class images, along with the repeat count, can be a bit confusing, I understand. The following descriptions may help clarify:

Case 1:
In this scenario, it is almost equivalent to training for 20 epochs. However, when the repeat count is set to 20, the images are first repeated before being shuffled. This means that the same images can be processed consecutively, such as 'image1, image1, image5, image2, image2...' during training. For SD training, it is trained with random time steps from 0 to 999, so consecutive occurrences of the same image should not pose a significant issue.

Case 2:
In this case, the number of training images including repetitions is calculated first, which results in 200 images (20 repeats * 10). During training, class images will be used in the same quantity as the training images. Therefore, only the first 200 class images will be used for training, and the remaining 9800 images will not be utilized.
(To use all class images for training, you would need to set the product of training image count and repeat count to be greater than or equal to the number of class images.)

Case 3:
It is not recommended to set the repeat count for class images to a value greater than 1 because class images will automatically repeat up to the number of training images including repetitions.
If such a specification is used, the training images including repetitions consist of 10 images, while the class images including repetitions consist of 10000 * 20 = 200,000 images. Therefore, only the first 10 images of the repeated class images will be used for training, resulting in effectively using only the first class image.

Case 4:
Similar to Case 3, it is not recommended to set the repeat count for class images to a value greater than 1.
However, let's consider what happens if such a specification is used. In this scenario, the training images including repetitions consist of 200 images, and the class images including repetitions consist of 300,000 images. The first 200 of the repeated class images will be used for training. Hence, the number of class images effectively used in training will be 200 / 30 = 6.66..., which means 7 images.
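To make the rule concrete, here is a minimal Python sketch of the selection behavior described in the four cases above (illustrative only, not the actual sd-scripts code; the per-image repeat-then-truncate behavior is inferred from the explanations):

```python
import random

def effective_datasets(train_images, train_repeats, class_images, class_repeats):
    # Each training image is repeated, then the whole list is shuffled,
    # so the same image can appear consecutively (Case 1).
    train = [img for img in train_images for _ in range(train_repeats)]
    random.shuffle(train)

    # Each class image is repeated consecutively, then only the first
    # len(train) entries are used; the rest are ignored (Cases 2-4).
    class_pool = [img for img in class_images for _ in range(class_repeats)]
    used_class = class_pool[: len(train)]
    return train, used_class

train_images = [f"train_{i}" for i in range(10)]
class_images = [f"class_{i}" for i in range(10000)]

# Case 2: 10 * 20 = 200 effective training images, 200 distinct class images used
train, used = effective_datasets(train_images, 20, class_images, 1)
print(len(train), len(set(used)))  # 200 200

# Case 4: still 200 effective training images, but only 7 distinct class images
train, used = effective_datasets(train_images, 20, class_images, 30)
print(len(train), len(set(used)))  # 200 7
```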

I hope this helps you.

@FurkanGozukara
Author

FurkanGozukara commented Jul 26, 2023

Thank you so much for all the answers. So it looks like our only option is to increase the repeat count for training images in order to use more different class images. However, this increases the effective epoch count, so it could cause overtraining.

To counter this we would need to reduce the learning rate, and then training takes more time :)

I wish you could implement the class-image logic of the DreamBooth extension for AUTOMATIC1111, which is made by https://github.com/d8ahazard/sd_dreambooth_extension

@kohya-ss

So instead of repeating, we could set how many unique class images to use per training image.

That way I could set the training-image repeat count to 1 and use 100 unique class images for each training image.

So let's say I train for 200 epochs.

Then 1 class image is used in each epoch, and since I have 100 unique class images, the same class image is used at most 2 times. They could also be drawn purely at random from the pool; that would still work, I guess.

With that logic, for 10 training images I would need 1000 class images, independent of the number of epochs.

It would simplify the process. I understand your repeat logic is designed for balancing datasets of different concepts, but it is not practical for this use case.

@kohya-ss
Owner

Your observations are absolutely valid. However, two factors make the problem more complex.

The first reason is that the dataloader operates with multiple processes, and the processes are recreated at each epoch. Since state from the previous epoch is not retained, it becomes challenging to determine which class images to use for each epoch.
Let's consider using class images sequentially. In such a scenario, an implementation that shuffles the class-image set with a pre-defined random seed at the beginning of each epoch to decide which class images to use becomes necessary, and this can be a considerably laborious task.
(If it is OK to select class images completely randomly, the implementation might be simpler.)
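As a minimal sketch of that seeded-shuffle idea (the function name and seed handling here are assumptions, not actual sd-scripts code), each dataloader worker can recompute the epoch's selection from the epoch number alone, so no state has to survive process recreation:

```python
import random

def class_images_for_epoch(class_pool, n_needed, epoch, base_seed=42):
    # Deterministic per-epoch selection: every dataloader worker derives
    # the same order from (base_seed, epoch), so no cross-epoch state
    # has to be carried over when processes are recreated.
    rng = random.Random(base_seed + epoch)
    shuffled = list(class_pool)  # copy so the pool itself is untouched
    rng.shuffle(shuffled)
    # Wrap around if the pool is smaller than the number needed.
    return [shuffled[i % len(shuffled)] for i in range(n_needed)]
```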

The second reason is Aspect Ratio Bucketing. Class images are also stored in each bucket, which further complicates determining which class images to use at each epoch. Regarding this point, I don't have any good ideas on how to implement it properly.
(Perhaps storing information in the bucket as class-image groups instead of individual class images, and then sequentially extracting images from that group in the dataset's __getitem__ method, could be a possible approach; see the sketch below.)
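A rough sketch of that parenthetical idea (the class name and structure are hypothetical):

```python
class ClassImageGroup:
    # Hypothetical bucket entry holding a *group* of class images instead
    # of a single image; __getitem__-time code asks for the next one.
    def __init__(self, image_paths):
        self.image_paths = image_paths

    def get(self, step):
        # Extract images from the group sequentially, wrapping around.
        # Deriving the index from the step (rather than keeping mutable
        # state) sidesteps the multi-process dataloader problem above.
        return self.image_paths[step % len(self.image_paths)]
```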

Please understand that due to these reasons, addressing the issue immediately is challenging.

@FurkanGozukara
Author

FurkanGozukara commented Jul 26, 2023

@kohya-ss Totally valid points.

Looks like this approach would be best: pick each class image in each bucket totally at random at each epoch.

This way I think we can simplify the repeat logic for class images and allow any arbitrary number of class images.

Like 200 unique class images per training image for 200 epochs :)

I would gladly make a new tutorial and explain the new logic.

With totally random picking we should get roughly 63% uniqueness among the class images used during training; when drawing n items at random from a pool of n, the expected fraction of distinct picks approaches 1 - 1/e ≈ 63% as n goes to infinity.
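A quick way to check that figure (the pool size here is arbitrary): drawing n class images uniformly at random from a pool of n leaves about 1 - 1/e of the picks distinct.

```python
import random

n = 10_000
picks = [random.randrange(n) for _ in range(n)]  # n draws with replacement
print(len(set(picks)) / n)  # ~0.632, i.e. 1 - 1/e
```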

@kohya-ss
Owner

Sure, you are correct. Randomly selecting class images does solve one problem. However, the problem of Aspect Ratio Bucketing still remains.

For example, let's consider a case where the batch size is 4, and there are 3 training images and 1 class image in one bucket. In a single batch, these four images will be included. Consequently, throughout the epoch, to balance the number of training and class images, in one batch of other buckets, we will need to combine 1 training image with 3 class images.

Perhaps the easiest approach is to randomly determine the class images to be used for each epoch and reconstruct the buckets accordingly at the beginning of each epoch, but even that is not straightforward.

I would be extremely grateful if someone could submit a pull request...😅

Therefore, please understand that the current method is a compromise to use class images as evenly as possible while being relatively easy to implement.

Regarding your initial question, reducing the number of epochs in proportion to the set repeat count would be a good solution. In cases with a large repeat count, the bias in the order within the epoch should not pose a significant problem.
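For example (illustrative numbers, with class-image steps ignored for simplicity): both setups below present each training image the same number of times in total, but the high-repeat setup draws on 20 times more class images within its epoch.

```python
num_train = 10

# Setup A: repeats=1, 200 epochs -> 10 * 1 * 200 = 2000 image presentations,
# but only the first 10 class images are ever used.
steps_a = num_train * 1 * 200

# Setup B: repeats=20, 10 epochs -> the same 2000 presentations,
# and 10 * 20 = 200 distinct class images are used.
steps_b = num_train * 20 * 10

assert steps_a == steps_b == 2000
```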

@FurkanGozukara
Author


I get your point, and it is valid. I think you could throw an error and tell people to get more training images or fix their class images. That would be more appropriate than trying to set up all the buckets :D This is what happens with the DreamBooth extension:

it generates the missing class images, so people understand they don't have enough images at the correct resolution.
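A sketch of such a check (hypothetical; not an existing sd-scripts option) could fail fast instead of silently repeating class images:

```python
def check_class_image_count(n_train, train_repeats, n_class):
    # Hypothetical validation: the thread above establishes that
    # n_train * train_repeats class images are consumed per epoch.
    needed = n_train * train_repeats
    if n_class < needed:
        raise ValueError(
            f"{needed} class images are needed "
            f"({n_train} training images x {train_repeats} repeats), "
            f"but only {n_class} were found; add more class images "
            f"or lower the repeat count."
        )
```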

@FurkanGozukara
Author

This is still highly debated.

I hope you can find time to add this simple logic:

allow any arbitrary number of class images;

at each epoch, a random class image is used from a pre-cached pool.

@cavit99

cavit99 commented Aug 11, 2023

So a workaround at the moment could be to train as normal and determine which epoch is most successful, then train all over again with the equivalent number of steps but one big epoch that uses many more repeats and thus many more reg images.

@FurkanGozukara
Author

I hope you can add this; it is a really highly requested feature:

setting any arbitrary number of regularization images.

I would like to use 1 different reg image for every step of training.
