Question about repeating logic to explain in a tutorial video #640
@kohya-ss I have done over 15 training runs with SDXL.
Thank you for creating detailed tutorials for my repository; I truly appreciate it, and I'm sorry for the delayed response. Train images and class images, together with the repeat count, can be a bit confusing, I understand. The following descriptions may help clarify:

Case 1:
Case 2:
Case 3:
Case 4:

I hope this helps you.
Thank you so much for all the answers. So it looks like our only option is increasing the repeat count for training images in order to use more distinct class images. However, this increases the effective epoch count, so it could cause overtraining; to counter that we need to reduce the learning rate, which in turn makes training take longer. :)

I wish you could implement the class-image logic of the AUTOMATIC1111 DreamBooth extension, which is made by https://github.com/d8ahazard/sd_dreambooth_extension. Instead of repeating, we could set how many unique class images to use per training image. That way I could set the training-image repeat count to 1 and use 100 unique class images for each training image. Let's say I then run 200 epochs: one class image is used per epoch, and since I have 100 unique class images, the same class image is used at most 2 times. They could also be drawn purely at random from the pool; that would still work, I guess.

With that logic, for 10 training images I would need 1000 class images, independent of the number of epochs. It would simplify the process. I understand your repeat logic is made for balancing datasets of different concepts, but it is not practical here.
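The scheme proposed above could be sketched roughly as follows. All names here are hypothetical illustration, not the actual API of sd-scripts or sd_dreambooth_extension: each training image owns a pool of unique class images, and every epoch pairs it with one class image drawn at random from that pool.

```python
import random

# Hypothetical sketch of the proposed scheme (illustrative names only, not
# the actual sd-scripts or sd_dreambooth_extension API): each training image
# owns a pool of unique class images, and every epoch pairs it with one
# class image drawn at random from that pool.
def build_epoch_pairs(train_images, class_pools, rng):
    """Return one (train_image, class_image) pair per training image."""
    return [(img, rng.choice(class_pools[img])) for img in train_images]

rng = random.Random(0)
train_images = [f"train_{i:02d}.png" for i in range(10)]
# 100 unique class images per training image -> 1000 class images in total
class_pools = {img: [f"class_{img}_{j:03d}.png" for j in range(100)]
               for img in train_images}

for epoch in range(200):
    pairs = build_epoch_pairs(train_images, class_pools, rng)
    # ...each pair would contribute one training step and one class step...

print(len(pairs))  # 10 pairs per epoch
```

With 200 epochs and 100-image pools, each class image is drawn about twice on average, matching the "at most 2 times" intuition above when sampling without replacement per cycle; with pure random sampling the counts merely average out to 2.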
Your observations are absolutely valid. However, two factors make the problem more complex.

The first is that the dataloader runs in multiple processes, and those processes are recreated at each epoch. Since state from the previous epoch is not retained, it becomes difficult to determine which class images to use in each epoch.

The second is Aspect Ratio Bucketing. Class images are also stored in each bucket, which further complicates determining which class images to use at each epoch. I don't have any good ideas on how to implement this properly.

Please understand that, for these reasons, addressing the issue immediately is challenging.
@kohya-ss Totally valid points. It looks like this approach would be best: randomly pick each class image in each bucket at each epoch. That way, I think we can do away with the repeat logic for class images and set any arbitrary number of class images we wish, e.g. 200 unique class images per training image for 200 epochs. :) I would gladly make a new tutorial and explain the new logic. With purely random picking we should get roughly 63% uniqueness among the class images during training; random picking with replacement approaches 1 − 1/e ≈ 63% coverage as the pool grows.
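The uniqueness figure can be checked with a quick calculation: drawing n samples with replacement from a pool of n items covers an expected fraction 1 − (1 − 1/n)^n of the pool, which tends to 1 − 1/e ≈ 0.632 as n grows. A small sketch (my own helper functions, not part of any library):

```python
import random

# Drawing n samples with replacement from a pool of n items touches an
# expected fraction 1 - (1 - 1/n)^n of the pool, which tends to
# 1 - 1/e ≈ 0.632 as n grows (about 63%, close to the figure quoted above).
def expected_unique_fraction(n):
    return 1 - (1 - 1 / n) ** n

def simulated_unique_fraction(n, trials, rng):
    total = 0.0
    for _ in range(trials):
        drawn = {rng.randrange(n) for _ in range(n)}  # n draws w/ replacement
        total += len(drawn) / n
    return total / trials

print(round(expected_unique_fraction(100), 3))  # 0.634
print(round(simulated_unique_fraction(100, 500, random.Random(42)), 3))
```

The simulated value should land very close to the analytic 0.634 for a pool of 100.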
Sure, you are correct. Randomly selecting class images does solve one problem. However, the problem of Aspect Ratio Bucketing still remains. For example, consider a case where the batch size is 4, and one bucket contains 3 training images and 1 class image. A single batch will include these four images. Consequently, to balance the number of training and class images throughout the epoch, one batch in another bucket will need to combine 1 training image with 3 class images.

Perhaps the easiest approach is to randomly determine the class images to be used for each epoch and rebuild the buckets accordingly at the beginning of each epoch, but even that is not straightforward. I would be extremely grateful if someone could submit a pull request... 😅

Therefore, please understand that the current method is a compromise that uses class images as evenly as possible while being relatively easy to implement. Regarding your initial question, reducing the number of epochs in proportion to the repeat count would be a good solution. With a large repeat count, the bias in the ordering within an epoch should not pose a significant problem.
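The balancing compromise described above can be modeled very roughly like this. This is a simplified sketch under my own assumptions, not the repository's actual code: each epoch contains the training images times their repeat count and the class images times theirs, and equal totals give the 1:1 pairing.

```python
# A simplified model (my assumption, not the repository's exact code) of how
# repeat counts balance the two datasets: each epoch contains
# train_images * train_repeats training samples and
# class_images * class_repeats class samples; the trainer interleaves them,
# so equal totals give the 1:1 pairing described above.
def epoch_samples(train_images, train_repeats, class_images, class_repeats):
    return train_images * train_repeats, class_images * class_repeats

train_total, class_total = epoch_samples(10, 20, 200, 1)
print(train_total, class_total)  # 200 200, i.e. balanced 1:1
```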
I get your point and it is valid. I think you could throw an error and tell people to get more training images or fix their class images; that would be more appropriate than trying to balance all the buckets. :D This is what happens with the DreamBooth extension: it generates the missing class images, so people understand they don't have enough images at the correct resolution.
This is still a much-discussed topic. I hope you can find time to add this simple logic: any arbitrary number of class images, with a random class image drawn from a pre-cached pool at each epoch.
So a workaround at the moment could be to train as normal and determine which epoch is most successful, then train again for the equivalent number of steps but in one big epoch that uses many more repeats, and thus many more regularization images.
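The arithmetic behind this workaround can be sketched as follows. The function name and batch-size handling are my own, class-image steps are ignored for simplicity, and batch size 1 is assumed; this is an illustration, not the trainer's actual step accounting.

```python
# Arithmetic behind the workaround: if the best checkpoint fell at epoch E
# with repeat count R, then one "big epoch" with repeat count E * R covers
# the same number of training-image steps while cycling through far more
# regularization images. (Illustrative sketch: class-image steps are
# ignored, and integer division handles the batch size.)
def equivalent_big_epoch(best_epoch, repeats, train_images, batch_size):
    steps_original = best_epoch * train_images * repeats // batch_size
    big_repeats = best_epoch * repeats
    steps_big = train_images * big_repeats // batch_size  # a single epoch
    return big_repeats, steps_original, steps_big

big_repeats, s0, s1 = equivalent_big_epoch(best_epoch=10, repeats=20,
                                           train_images=10, batch_size=1)
print(big_repeats, s0, s1)  # 200 2000 2000
```

So a run whose best checkpoint was at epoch 10 with 20 repeats would be re-run as one epoch with 200 repeats, letting far more unique regularization images rotate through the same 2000 steps.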
I hope you can add this; it is a really highly requested feature: setting any arbitrary number of regularization images. I would like to use a different regularization image for every training step.
I am a software engineer, so if you can explain the logic I would appreciate that very much.
This is super important for me to understand.
What I want to achieve is, for every training image, using 100 different class images over the entire training run. So for 10 training images, 1000 class images would be used in total.
So in every epoch, each training image is trained once and one different class image is trained, for 100 epochs.
@kohya-ss
Case 1
Let's assume I don't use class images.
My repeat count is set to 20.
I have 10 train images.
So in 1 epoch, what is happening?
Is it training the same training image 20 times? Does that make it equal to 20 epochs?
Case 2
My training images' repeat count is set to 20.
I have 10 train images.
I have 10000 class images.
The class image repeat count is set to 1.
So in 1 epoch, what is happening?
Does it train the same training image 20 times, each time with a different class image? Is every train image trained against a different class image? What is the logic?
Case 3
My training images' repeat count is set to 1.
I have 10 train images.
I have 10000 class images.
The class image repeat count is set to 20.
So in 1 epoch, what is happening?
Does it train the training image once and then train 20 different class images? I really have no idea about this one.
Case 4
My training images' repeat count is set to 20.
I have 10 train images.
I have 10000 class images.
The class image repeat count is set to 30.
Now what would happen in 1 epoch?
Thank you so much for the explanations.
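As a back-of-the-envelope reading of the four cases above, the raw per-epoch sample counts are simply the image counts multiplied by their repeat values. This is my interpretation for illustration, not an authoritative statement of the trainer's behavior.

```python
# Back-of-the-envelope sample counts for the four cases above, assuming one
# epoch means every (image, repeat) pair is seen once. This is my reading of
# the repeat logic, not an authoritative statement of the trainer's behavior.
def samples_per_epoch(train_images, train_repeats, class_images, class_repeats):
    return train_images * train_repeats, class_images * class_repeats

cases = {
    "case 1": (10, 20,     0,  1),  # no class images
    "case 2": (10, 20, 10000,  1),
    "case 3": (10,  1, 10000, 20),
    "case 4": (10, 20, 10000, 30),
}
for name, args in cases.items():
    print(name, samples_per_epoch(*args))
# case 1 -> (200, 0): the same 10 images, each seen 20 times per epoch
```

Note that in cases 2 through 4 the class-image total dwarfs the training total, which is exactly why the maintainer describes the current method as a compromise that spreads class images as evenly as possible across epochs rather than consuming them all.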