The NAN loss value in SSD #10

Closed
dodgaga opened this issue Apr 3, 2018 · 1 comment
dodgaga commented Apr 3, 2018

Hi,
I ran the baseline SSD code to train on the CTW dataset with a batch size of 12 (instead of 14, because of limited GPU memory), but the loss is NaN. I just followed the "CTW dataset tutorial (Part 3: detection baseline)" and didn't change anything except the batch size. Can you give me some advice?

I0403 09:59:07.896572 38087 solver.cpp:259] Train net output #0: mbox_loss = nan (* 1 = nan loss)
I0403 09:59:08.678768 38087 sgd_solver.cpp:138] Iteration 860, lr = 0.001
I0403 09:59:25.406322 38087 solver.cpp:243] Iteration 870, loss = nan
I0403 09:59:25.406674 38087 solver.cpp:259] Train net output #0: mbox_loss = nan (* 1 = nan loss)
I0403 09:59:25.406772 38087 sgd_solver.cpp:138] Iteration 870, lr = 0.001
I0403 09:59:40.899689 38087 solver.cpp:243] Iteration 880, loss = nan
I0403 09:59:40.899760 38087 solver.cpp:259] Train net output #0: mbox_loss = nan (* 1 = nan loss)
I0403 09:59:41.602229 38087 sgd_solver.cpp:138] Iteration 880, lr = 0.001
I0403 09:59:57.435994 38087 solver.cpp:243] Iteration 890, loss = nan
I0403 09:59:57.436153 38087 solver.cpp:259] Train net output #0: mbox_loss = nan (* 1 = nan loss)
I0403 09:59:57.436187 38087 sgd_solver.cpp:138] Iteration 890, lr = 0.001
I0403 10:00:14.717105 38087 solver.cpp:243] Iteration 900, loss = nan
I0403 10:00:14.717172 38087 solver.cpp:259] Train net output #0: mbox_loss = nan (* 1 = nan loss)
I0403 10:00:14.717288 38087 sgd_solver.cpp:138] Iteration 900, lr = 0.001
I0403 10:00:31.561822 38087 solver.cpp:243] Iteration 910, loss = nan
I0403 10:00:31.562093 38087 solver.cpp:259] Train net output #0: mbox_loss = nan (* 1 = nan loss)
I0403 10:00:32.274315 38087 sgd_solver.cpp:138] Iteration 910, lr = 0.001
I0403 10:00:48.392671 38087 solver.cpp:243] Iteration 920, loss = nan
I0403 10:00:48.392729 38087 solver.cpp:259] Train net output #0: mbox_loss = nan (* 1 = nan loss)
I0403 10:00:48.392833 38087 sgd_solver.cpp:138] Iteration 920, lr = 0.001
I0403 10:01:04.803617 38087 solver.cpp:243] Iteration 930, loss = nan
I0403 10:01:04.804121 38087 solver.cpp:259] Train net output #0: mbox_loss = nan (* 1 = nan loss)
I0403 10:01:05.511602 38087 sgd_solver.cpp:138] Iteration 930, lr = 0.001
I0403 10:01:21.101698 38087 solver.cpp:243] Iteration 940, loss = nan
I0403 10:01:21.101753 38087 solver.cpp:259] Train net output #0: mbox_loss = nan (* 1 = nan loss)
I0403 10:01:21.807464 38087 sgd_solver.cpp:138] Iteration 940, lr = 0.001
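
For anyone debugging a similar run, it can help to find the first iteration at which the loss turned NaN, since the excerpt above only shows iterations where it had already diverged. The snippet below is a minimal sketch that assumes the log lines follow the same `Iteration N, loss = X` format as above; the function name `first_nan_iteration` and the sample list are made up for illustration.

```python
import re

# Matches lines such as "... solver.cpp:243] Iteration 870, loss = nan"
LOSS_RE = re.compile(r"Iteration (\d+), loss = (\S+)")


def first_nan_iteration(log_lines):
    """Return the first iteration whose reported loss is NaN, or None."""
    for line in log_lines:
        match = LOSS_RE.search(line)
        if match and match.group(2).lower() in ("nan", "-nan"):
            return int(match.group(1))
    return None


# Illustrative input: three lines copied from the log above.
sample = [
    "I0403 09:59:08.678768 38087 sgd_solver.cpp:138] Iteration 860, lr = 0.001",
    "I0403 09:59:25.406322 38087 solver.cpp:243] Iteration 870, loss = nan",
    "I0403 09:59:25.406674 38087 solver.cpp:259] Train net output #0: mbox_loss = nan (* 1 = nan loss)",
]
print(first_nan_iteration(sample))  # prints 870
```

To scan a whole log, pass an open file object instead of the list, since iterating over a file yields its lines.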

dodgaga commented Apr 3, 2018

I lowered the initial learning rate, following SSD issue 543, and that solved the problem.
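
A rough sketch of what lowering the initial learning rate can look like, assuming the solver settings live in a plain-text Caffe solver prototxt with a `base_lr` field. The helper name `scale_base_lr`, the factor of 0.1, and the sample solver text are all assumptions for illustration; the thread does not say which value was actually used.

```python
import re


def scale_base_lr(solver_text, factor=0.1):
    """Return the solver text with its first base_lr value multiplied by factor."""
    def repl(match):
        new_lr = float(match.group(2)) * factor
        return "{}{:g}".format(match.group(1), new_lr)

    # Only the first base_lr line is rewritten; everything else is left untouched.
    return re.sub(r"(?m)^(\s*base_lr:\s*)([0-9.eE+-]+)", repl, solver_text, count=1)


# Hypothetical solver fragment, not taken from the CTW baseline.
solver = 'base_lr: 0.001\nlr_policy: "multistep"\n'
print(scale_base_lr(solver))
```

If the solver file is generated by a training script rather than edited by hand, the same change would be made where that script sets the learning rate.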

dodgaga closed this as completed Apr 3, 2018