issue about training the model #37

Open
xrjiang527 opened this issue Aug 21, 2019 · 5 comments


@xrjiang527

===> Training Epoch: [1/1000]... Learning Rate: 0.000100
Epoch: [1/1000]: 100%|#####################################| 1000/1000 [11:54<00:00, 1.40it/s, Batch Loss: 0.5712]

Epoch: [1/1000] Avg Train Loss: 9.177065
===> Validating...
[Set5] PSNR: 12.43 SSIM: 0.0694 Loss: 0.629247 Best PSNR: 12.43 in Epoch: [1]
===> Saving last checkpoint to [experiments/RDN_in3f64_x2/epochs/last_ckp.pth] ...]
===> Saving best checkpoint to [experiments/RDN_in3f64_x2/epochs/best_ckp.pth] ...]

The validation results are clearly wrong (PSNR 12.43 on Set5 after the first epoch).
What should I do to solve this problem?
Thank you.

@Paper99
Owner

Paper99 commented Aug 21, 2019

When training RDN, please make sure that rgb_range in your *.json file is set to 1.
If it already is, an average training loss of 9.177 is far too large for rgb_range=1; a value below 1 would be reasonable.
My guess is that something is wrong with your training data.
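
One quick way to test both points is to confirm that the configured rgb_range and the value range of a loaded training batch agree. The following is a minimal sketch, not code from the repository: `find_key` and `range_matches` are hypothetical helpers, the options content is made up for illustration, and the batch is assumed to be available as a flat list of pixel values.

```python
import json

def find_key(node, key):
    """Return the first value stored under `key` anywhere in nested JSON data, or None."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None

def range_matches(pixels, rgb_range, tol=1e-6):
    """True if every pixel value lies in [0, rgb_range] (within a small tolerance)."""
    return all(-tol <= p <= rgb_range + tol for p in pixels)

# Hypothetical options content; a real run would parse your own *.json file instead.
opt = json.loads('{"scale": 2, "rgb_range": 1, "datasets": {"train": {"mode": "LRHR"}}}')
rgb_range = find_key(opt, "rgb_range")

batch_ok = [0.0, 0.25, 1.0]      # already scaled to [0, 1]
batch_bad = [0.0, 64.0, 255.0]   # still 8-bit values, inconsistent with rgb_range=1
print(rgb_range, range_matches(batch_ok, rgb_range), range_matches(batch_bad, rgb_range))
```

If the batch fails this check, the mismatch is between the data pipeline and the configuration rather than in the network itself.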

@xrjiang527
Author

I used Prepare_TrainData_HR_LR.m to generate the HR/LR training pairs. When preparing the x2 data, I only changed the scale from '4' to '2'. Could something be wrong with the training data?
Thank you!
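
A simple sanity check on the generated data is to verify that every HR image is exactly scale times larger than its LR counterpart in both dimensions. This is a minimal sketch under that assumption; `pair_is_consistent` and `inconsistent_pairs` are hypothetical helpers, and the sizes below are made up rather than read from real files.

```python
def pair_is_consistent(hr_size, lr_size, scale):
    """True if the HR (height, width) is exactly `scale` times the LR (height, width)."""
    (hr_h, hr_w), (lr_h, lr_w) = hr_size, lr_size
    return hr_h == lr_h * scale and hr_w == lr_w * scale

def inconsistent_pairs(sizes, scale):
    """Return the names of HR/LR pairs whose sizes do not match the given scale."""
    return [name for name, (hr, lr) in sizes.items()
            if not pair_is_consistent(hr, lr, scale)]

# Made-up sizes for illustration; in practice, read them from the generated image files.
sizes = {
    "0001": ((96, 96), (48, 48)),   # valid x2 pair
    "0002": ((96, 96), (24, 24)),   # LR patch that looks like it came from an x4 run
}
print(inconsistent_pairs(sizes, scale=2))
```

Any name reported here points to a pair that should be regenerated before training.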

@Paper99
Owner

Paper99 commented Aug 21, 2019

I re-checked the training process for RDN x2, and it works correctly.
My log is shown below:

Method: RDN || Scale: 2 || Epoch Range: (1 ~ 1000)

===> Training Epoch: [1/1000]...  Learning Rate: 0.000100
Epoch: [1/1000]: 100%|██████████| 1000/1000 [05:30<00:00,  2.82it/s, Batch Loss: 0.0297]

Epoch: [1/1000]   Avg Train Loss: 0.068141
===> Validating...
[Set5] PSNR: 28.15   SSIM: 0.9314   Loss: 0.038095   Best PSNR: 28.15 in Epoch: [1]
===> Saving last checkpoint to [experiments/RDN_in3f64_x2/epochs/last_ckp.pth] ...]
===> Saving best checkpoint to [experiments/RDN_in3f64_x2/epochs/best_ckp.pth] ...]

Maybe you can clone the latest code and try again, or regenerate your training data.

@Paper99
Owner

Paper99 commented Sep 26, 2019

I found the cause of the problem you mentioned. Comment out this line to train the model without Kaiming initialization.
This is a very interesting phenomenon for image SR.
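
For context on what is being switched off, the sketch below compares the weight scale of Kaiming (He) normal initialization, which uses a standard deviation of sqrt(2 / fan_in) in fan-in mode for ReLU, with a uniform initialization bounded by 1 / sqrt(fan_in), which to my knowledge is what PyTorch's Conv2d uses by default. The layer shape (64 input channels, 3x3 kernel) is an assumption for illustration, and the thread does not confirm that this scale difference is the actual cause.

```python
import math

def conv_fan_in(in_channels, kernel_size):
    """Number of inputs feeding one output unit of a square-kernel convolution."""
    return in_channels * kernel_size * kernel_size

def kaiming_normal_std(fan_in):
    """Standard deviation of Kaiming normal init (fan-in mode, ReLU gain sqrt(2))."""
    return math.sqrt(2.0 / fan_in)

def uniform_std(bound):
    """Standard deviation of a uniform distribution on [-bound, bound]."""
    return bound / math.sqrt(3.0)

fan_in = conv_fan_in(64, 3)                          # assumed layer shape
std_kaiming = kaiming_normal_std(fan_in)
std_default = uniform_std(1.0 / math.sqrt(fan_in))   # assumed default bound
ratio = std_kaiming / std_default
print(f"kaiming={std_kaiming:.4f} default={std_default:.4f} ratio={ratio:.3f}")
```

Under these assumptions the Kaiming weights are larger by a constant factor of sqrt(6) per layer, and in a deep network such a factor can compound across layers, which may be worth checking when the first-epoch loss is unexpectedly large.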

@Senwang98

@Paper99 Have you trained the whole RDN to completion? How were the final results?


3 participants