No safetensors file is generated after training #1874

Open
m0r1s1 opened this issue Jan 13, 2025 · 3 comments

m0r1s1 commented Jan 13, 2025

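Hello.
I would like to ask for advice about the issue in the title.
I don't think hardware is the problem: the PC is a high-spec machine with an RTX 4090 graphics card.
The console shows the output below.
Is some kind of error occurring?
Thank you in advance.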

15:13:53-341137 INFO Command executed.
2025-01-13 15:13:59 INFO Loading settings from train_util.py:4174
D:/Data/Packages/kohya_ss/outpu
ts/config_lora-20250113-151353.
toml...
INFO D:/Data/Packages/kohya_ss/outpu train_util.py:4193
ts/config_lora-20250113-151353
2025-01-13 15:13:59 INFO prepare tokenizers sdxl_train_util.py:138
2025-01-13 15:14:00 INFO update token length: 75 sdxl_train_util.py:163
INFO Using DreamBooth method. train_network.py:172
INFO prepare images. train_util.py:1815
INFO found directory train_util.py:1762
D:\Data\Packages\kohya_ss\train
ing\1_test contains 1 image
files
WARNING No caption file found for 1 train_util.py:1793
images. Training will continue
without captions for these
images. If class token exists,
it will be used. /
1枚の画像にキャプションファイル
が見つかりませんでした。これら
の画像についてはキャプションな
しで学習を続行します。class
tokenが存在する場合はそれを使い
ます。
WARNING D:\Data\Packages\kohya_ss\train train_util.py:1800
ing\1_test\1.png
INFO 1 train images with repeating. train_util.py:1856
INFO 0 reg images. train_util.py:1859
WARNING no regularization images / train_util.py:1864
正則化画像が見つかりませんでし

INFO [Dataset 0] config_util.py:572
batch_size: 1
resolution: (512, 512)
enable_bucket: True
network_multiplier: 1.0
min_bucket_reso: 256
max_bucket_reso: 2048
bucket_reso_steps: 64
bucket_no_upscale: True

                           [Subset 0 of Dataset 0]                         
                             image_dir:                                    
                         "D:\Data\Packages\kohya_ss\trai                   
                         ning\1_test"                                      
                             image_count: 1                                
                             num_repeats: 1                                
                             shuffle_caption: False                        
                             keep_tokens: 0                                
                             keep_tokens_separator:                        
                             caption_separator: ,                          
                             secondary_separator: None                     
                             enable_wildcard: False                        
                             caption_dropout_rate: 0.0                     
                             caption_dropout_every_n_epo                   
                         ches: 0                                           
                             caption_tag_dropout_rate:                     
                         0.0                                               
                             caption_prefix: None                          
                             caption_suffix: None                          
                             color_aug: False                              
                             flip_aug: False                               
                             face_crop_aug_range: None                     
                             random_crop: False                            
                             token_warmup_min: 1,                          
                             token_warmup_step: 0,                         
                             alpha_mask: False,                            
                             is_reg: False                                 
                             class_tokens: test                            
                             caption_extension: .txt                       
                                                                           
                                                                           
                INFO     [Dataset 0]                     config_util.py:578
                INFO     loading image sizes.             train_util.py:911

100%|██████████| 1/1 [00:00<?, ?it/s]
INFO make buckets train_util.py:917
WARNING min_bucket_reso and train_util.py:934
max_bucket_reso are ignored if
bucket_no_upscale is set,
because bucket reso is defined
by image size automatically /
bucket_no_upscaleが指定された場
合は、bucketの解像度は画像サイズ
から自動計算されるため、min_buck
et_resoとmax_bucket_resoは無視さ
れます
INFO number of images (including train_util.py:963
repeats) /
各bucketの画像枚数(繰り返し回数
を含む)
INFO bucket 0: resolution (512, 512), train_util.py:968
count: 1
INFO mean ar error (without repeats): train_util.py:973
0.0
WARNING clip_skip will be sdxl_train_util.py:352
unexpected /
SDXL学習ではclip_skipは動作
しません
INFO preparing accelerator train_network.py:225
accelerator device: cuda
INFO loading model for process sdxl_train_util.py:33
0/1
INFO load StableDiffusion sdxl_train_util.py:74
checkpoint:
D:/Data/Models/StableDiffusi
on/juggernautXL_version6Rund
iffusion.safetensors
INFO building U-Net sdxl_model_util.py:198
INFO loading U-Net from sdxl_model_util.py:202
checkpoint
2025-01-13 15:14:02 INFO U-Net:
INFO building text encoders sdxl_model_util.py:211
INFO loading text encoders from sdxl_model_util.py:264
checkpoint
INFO text encoder 1:
INFO text encoder 2:
INFO building VAE sdxl_model_util.py:285
2025-01-13 15:14:03 INFO loading VAE from checkpoint sdxl_model_util.py:290
INFO VAE:
INFO Enable xformers for U-Net train_util.py:3040
import network module: networks.lora
INFO [Dataset 0] train_util.py:2323
INFO caching latents. train_util.py:1095
INFO checking cache validity... train_util.py:1105
100%|██████████| 1/1 [00:00<?, ?it/s]
INFO caching latents... train_util.py:1144
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
File "D:\Data\Packages\kohya_ss\sd-scripts\sdxl_train_network.py", line 185, in <module>
trainer.train(args)
File "D:\Data\Packages\kohya_ss\sd-scripts\train_network.py", line 272, in train
train_dataset_group.cache_latents(vae, args.vae_batch_size, args.cache_latents_to_disk, accelerator.is_main_process)
File "D:\Data\Packages\kohya_ss\sd-scripts\library\train_util.py", line 2324, in cache_latents
dataset.cache_latents(vae, vae_batch_size, cache_to_disk, is_main_process, file_suffix)
File "D:\Data\Packages\kohya_ss\sd-scripts\library\train_util.py", line 1146, in cache_latents
cache_batch_latents(vae, cache_to_disk, batch, subset.flip_aug, subset.alpha_mask, subset.random_crop)
File "D:\Data\Packages\kohya_ss\sd-scripts\library\train_util.py", line 2772, in cache_batch_latents
raise RuntimeError(f"NaN detected in latents: {info.absolute_path}")
RuntimeError: NaN detected in latents: D:\Data\Packages\kohya_ss\training\1_test\1.png
Traceback (most recent call last):
File "runpy.py", line 196, in _run_module_as_main
File "runpy.py", line 86, in _run_code
File "D:\Data\Packages\kohya_ss\venv\Scripts\accelerate.EXE\__main__.py", line 7, in <module>
sys.exit(main())
File "D:\Data\Packages\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main
args.func(args)
File "D:\Data\Packages\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1017, in launch_command
simple_launcher(args)
File "D:\Data\Packages\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 637, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['D:\Data\Packages\kohya_ss\venv\Scripts\python.exe', 'D:/Data/Packages/kohya_ss/sd-scripts/sdxl_train_network.py', '--config_file', 'D:/Data/Packages/kohya_ss/outputs/config_lora-20250113-151353.toml']' returned non-zero exit status 1.
15:14:05-544494 INFO Training has ended.
15:16:07-836381 INFO Start training LoRA Standard ...
15:16:07-837378 INFO Validating lr scheduler arguments...
15:16:07-838375 INFO Validating optimizer arguments...
15:16:07-838375 INFO Validating D:/Data/Packages/kohya_ss/outputs existence
and writability... SUCCESS
15:16:07-839371 INFO Validating
D:/Data/Models/StableDiffusion/juggernautXL_version6Ru
ndiffusion.safetensors existence... SUCCESS
15:16:07-840368 INFO Validating D:\Data\Packages\kohya_ss\training
existence... SUCCESS
15:16:09-570969 INFO Folder 1_test: 1 repeats found
15:16:09-571966 INFO Folder 1_test: 1 images found
15:16:09-572962 INFO Folder 1_test: 1 * 1 = 1 steps
15:16:09-573959 INFO Regulatization factor: 1
15:16:09-573959 INFO Total steps: 1
15:16:09-574956 INFO Train batch size: 1
15:16:09-575953 INFO Gradient accumulation steps: 1
15:16:09-576950 INFO Epoch: 2
15:16:09-576950 INFO max_train_steps (1 / 1 / 1 * 2 * 1) = 2
15:16:09-577946 INFO stop_text_encoder_training = 0
15:16:09-578943 INFO lr_warmup_steps = 0
15:16:09-581818 INFO Saving training config to
D:/Data/Packages/kohya_ss/outputs\test1-1_20250113-151
609.json...
15:16:09-584563 INFO Executing command:
D:\Data\Packages\kohya_ss\venv\Scripts\accelerate.EXE
launch --dynamo_backend no --dynamo_mode default
--mixed_precision fp16 --num_processes 1
--num_machines 1 --num_cpu_threads_per_process 2
D:/Data/Packages/kohya_ss/sd-scripts/sdxl_train_networ
k.py --config_file
D:/Data/Packages/kohya_ss/outputs/config_lora-20250113
-151609.toml
15:16:09-591017 INFO Command executed.
2025-01-13 15:16:15 INFO Loading settings from train_util.py:4174
D:/Data/Packages/kohya_ss/outpu
ts/config_lora-20250113-151609.
toml...
INFO D:/Data/Packages/kohya_ss/outpu train_util.py:4193
ts/config_lora-20250113-151609
2025-01-13 15:16:15 INFO prepare tokenizers sdxl_train_util.py:138
2025-01-13 15:16:16 INFO update token length: 75 sdxl_train_util.py:163
INFO Using DreamBooth method. train_network.py:172
INFO prepare images. train_util.py:1815
INFO found directory train_util.py:1762
D:\Data\Packages\kohya_ss\train
ing\1_test contains 1 image
files
WARNING No caption file found for 1 train_util.py:1793
images. Training will continue
without captions for these
images. If class token exists,
it will be used. /
1枚の画像にキャプションファイル
が見つかりませんでした。これら
の画像についてはキャプションな
しで学習を続行します。class
tokenが存在する場合はそれを使い
ます。
WARNING D:\Data\Packages\kohya_ss\train train_util.py:1800
ing\1_test\1.png
INFO 1 train images with repeating. train_util.py:1856
INFO 0 reg images. train_util.py:1859
WARNING no regularization images / train_util.py:1864
正則化画像が見つかりませんでし

INFO [Dataset 0] config_util.py:572
batch_size: 1
resolution: (512, 512)
enable_bucket: True
network_multiplier: 1.0
min_bucket_reso: 256
max_bucket_reso: 2048
bucket_reso_steps: 64
bucket_no_upscale: True

                           [Subset 0 of Dataset 0]                         
                             image_dir:                                    
                         "D:\Data\Packages\kohya_ss\trai                   
                         ning\1_test"                                      
                             image_count: 1                                
                             num_repeats: 1                                
                             shuffle_caption: False                        
                             keep_tokens: 0                                
                             keep_tokens_separator:                        
                             caption_separator: ,                          
                             secondary_separator: None                     
                             enable_wildcard: False                        
                             caption_dropout_rate: 0.0                     
                             caption_dropout_every_n_epo                   
                         ches: 0                                           
                             caption_tag_dropout_rate:                     
                         0.0                                               
                             caption_prefix: None                          
                             caption_suffix: None                          
                             color_aug: False                              
                             flip_aug: False                               
                             face_crop_aug_range: None                     
                             random_crop: False                            
                             token_warmup_min: 1,                          
                             token_warmup_step: 0,                         
                             alpha_mask: False,                            
                             is_reg: False                                 
                             class_tokens: test                            
                             caption_extension: .txt                       
                                                                           
                                                                           
                INFO     [Dataset 0]                     config_util.py:578
                INFO     loading image sizes.             train_util.py:911

100%|██████████| 1/1 [00:00<?, ?it/s]
INFO make buckets train_util.py:917
WARNING min_bucket_reso and train_util.py:934
max_bucket_reso are ignored if
bucket_no_upscale is set,
because bucket reso is defined
by image size automatically /
bucket_no_upscaleが指定された場
合は、bucketの解像度は画像サイズ
から自動計算されるため、min_buck
et_resoとmax_bucket_resoは無視さ
れます
INFO number of images (including train_util.py:963
repeats) /
各bucketの画像枚数(繰り返し回数
を含む)
INFO bucket 0: resolution (512, 512), train_util.py:968
count: 1
INFO mean ar error (without repeats): train_util.py:973
0.0
WARNING clip_skip will be sdxl_train_util.py:352
unexpected /
SDXL学習ではclip_skipは動作
しません
INFO preparing accelerator train_network.py:225
accelerator device: cuda
INFO loading model for process sdxl_train_util.py:33
0/1
INFO load StableDiffusion sdxl_train_util.py:74
checkpoint:
D:/Data/Models/StableDiffusi
on/juggernautXL_version6Rund
iffusion.safetensors
2025-01-13 15:16:17 INFO building U-Net sdxl_model_util.py:198
INFO loading U-Net from sdxl_model_util.py:202
checkpoint
2025-01-13 15:16:19 INFO U-Net:
INFO building text encoders sdxl_model_util.py:211
INFO loading text encoders from sdxl_model_util.py:264
checkpoint
INFO text encoder 1:
INFO text encoder 2:
INFO building VAE sdxl_model_util.py:285
INFO loading VAE from checkpoint sdxl_model_util.py:290
INFO VAE:
INFO Enable xformers for U-Net train_util.py:3040
import network module: networks.lora
2025-01-13 15:16:20 INFO [Dataset 0] train_util.py:2323
INFO caching latents. train_util.py:1095
INFO checking cache validity... train_util.py:1105
100%|██████████| 1/1 [00:00<?, ?it/s]
INFO caching latents... train_util.py:1144
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
File "D:\Data\Packages\kohya_ss\sd-scripts\sdxl_train_network.py", line 185, in <module>
trainer.train(args)
File "D:\Data\Packages\kohya_ss\sd-scripts\train_network.py", line 272, in train
train_dataset_group.cache_latents(vae, args.vae_batch_size, args.cache_latents_to_disk, accelerator.is_main_process)
File "D:\Data\Packages\kohya_ss\sd-scripts\library\train_util.py", line 2324, in cache_latents
dataset.cache_latents(vae, vae_batch_size, cache_to_disk, is_main_process, file_suffix)
File "D:\Data\Packages\kohya_ss\sd-scripts\library\train_util.py", line 1146, in cache_latents
cache_batch_latents(vae, cache_to_disk, batch, subset.flip_aug, subset.alpha_mask, subset.random_crop)
File "D:\Data\Packages\kohya_ss\sd-scripts\library\train_util.py", line 2772, in cache_batch_latents
raise RuntimeError(f"NaN detected in latents: {info.absolute_path}")
RuntimeError: NaN detected in latents: D:\Data\Packages\kohya_ss\training\1_test\1.png
Traceback (most recent call last):
File "runpy.py", line 196, in _run_module_as_main
File "runpy.py", line 86, in _run_code
File "D:\Data\Packages\kohya_ss\venv\Scripts\accelerate.EXE\__main__.py", line 7, in <module>
sys.exit(main())
File "D:\Data\Packages\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main
args.func(args)
File "D:\Data\Packages\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1017, in launch_command
simple_launcher(args)
File "D:\Data\Packages\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 637, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['D:\Data\Packages\kohya_ss\venv\Scripts\python.exe', 'D:/Data/Packages/kohya_ss/sd-scripts/sdxl_train_network.py', '--config_file', 'D:/Data/Packages/kohya_ss/outputs/config_lora-20250113-151609.toml']' returned non-zero exit status 1.
15:16:21-852432 INFO Training has ended.
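
Each run in the log above ends with two tracebacks. The second one (`subprocess.CalledProcessError`) only reports that the training script exited with status 1; the first one carries the actual cause. As a minimal sketch, the root error could be pulled out of such a log like this. The function name `find_root_errors` and the shortened sample text are illustrative assumptions, not part of kohya_ss.

```python
import re

# Matches final traceback lines such as "RuntimeError: NaN detected in latents: <path>".
_ERROR_LINE = re.compile(
    r"^(?P<kind>\w+(?:\.\w+)*(?:Error|Exception)): (?P<message>.*)$"
)


def find_root_errors(log_text: str) -> list[tuple[str, str]]:
    """Return (exception type, message) pairs, skipping the launcher's wrapper error."""
    found = []
    for line in log_text.splitlines():
        match = _ERROR_LINE.match(line.strip())
        # CalledProcessError only says that the child process failed, not why.
        if match and not match.group("kind").endswith("CalledProcessError"):
            found.append((match.group("kind"), match.group("message")))
    return found


# Shortened, hypothetical sample in the same shape as the log above.
sample = (
    "RuntimeError: NaN detected in latents: training/1_test/1.png\n"
    "subprocess.CalledProcessError: Command 'python.exe' returned non-zero exit status 1.\n"
)
print(find_root_errors(sample))
```

For the log in this issue, the line that matters is `RuntimeError: NaN detected in latents`, raised while caching latents for `1.png`.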

@AUTOMATIC2222

Check (enable) the "No half VAE" option.
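
Some background on why this option can help: the command in the log uses `--mixed_precision fp16`, and half-precision floats cannot represent values above 65504. A commonly cited explanation for `NaN detected in latents` with SDXL is that intermediate values inside the VAE exceed that range, become infinite, and then turn into NaN. The sketch below imitates that effect with the Python standard library only. `to_fp16` is a hypothetical helper, and the numbers are made up for illustration rather than taken from the real VAE.

```python
import math
import struct


def to_fp16(value: float) -> float:
    """Round a Python float to half precision; out-of-range values become infinity."""
    try:
        # "e" is the struct format code for IEEE 754 binary16 (half precision).
        return struct.unpack("<e", struct.pack("<e", value))[0]
    except OverflowError:
        # struct refuses to pack values beyond the fp16 range; hardware would give inf.
        return math.copysign(math.inf, value)


activation = 300.0                          # made-up intermediate value
squared_wide = activation * activation      # 90000.0 fits in a wider float type
squared_half = to_fp16(squared_wide)        # exceeds the fp16 maximum of 65504

print(squared_half)                  # inf
print(squared_half - squared_half)   # nan, because inf - inf is undefined
print(squared_wide - squared_wide)   # 0.0
```

Keeping the VAE in full precision, which is what "No half VAE" is understood to do, avoids this kind of overflow at the cost of somewhat more memory during latent caching.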

m0r1s1 commented Jan 13, 2025

Thank you very much!
I was able to get it working.

@AUTOMATIC2222

> Thank you very much! I was able to achieve it.

You're welcome.
