
I got loss = nan ..... #8

Open
jiachen0212 opened this issue Nov 12, 2018 · 5 comments
@jiachen0212

Solving VGG_VOC0712_SSD_300x300_train
I1112 16:14:25.373922 12274 solver.cpp:295] Learning Rate Policy: multistep
I1112 16:14:26.191864 12274 solver.cpp:243] Iteration 0, loss = 393.183
I1112 16:14:26.191922 12274 solver.cpp:259] Train net output #0: mbox_loss = 463.676 (* 1 = 463.676 loss)
I1112 16:14:26.191949 12274 sgd_solver.cpp:138] Iteration 0, lr = 0.001
I1112 16:14:37.323004 12274 solver.cpp:243] Iteration 10, loss = 1.70709e+06
I1112 16:14:37.323065 12274 solver.cpp:259] Train net output #0: mbox_loss = 1.00702e+07 (* 1 = 1.00702e+07 loss)
I1112 16:14:37.323082 12274 sgd_solver.cpp:138] Iteration 10, lr = 0.001
I1112 16:14:48.218691 12274 solver.cpp:243] Iteration 20, loss = nan
I1112 16:14:48.218749 12274 solver.cpp:259] Train net output #0: mbox_loss = nan (* 1 = nan loss)
I1112 16:14:48.218765 12274 sgd_solver.cpp:138] Iteration 20, lr = 0.001
I1112 16:14:59.353508 12274 solver.cpp:243] Iteration 30, loss = nan
I1112 16:14:59.353610 12274 solver.cpp:259] Train net output #0: mbox_loss = nan (* 1 = nan loss)
I1112 16:14:59.353627 12274 sgd_solver.cpp:138] Iteration 30, lr = 0.001

I only copied the scripts into /src and /include, then modified the train.prototxt, but the loss diverges to nan as shown above. There must be something wrong. Should I make any other changes?
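For context, the focal loss that the custom mbox_loss layer is meant to add on top of SSD's confidence loss can be sketched numerically as follows. This is the standard formulation from Lin et al. (RetinaNet), not necessarily this repo's exact implementation; note the probability clipping, which is one common guard against log(0) producing nan:

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    # Standard binary focal loss: FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t).
    # p is the predicted probability of the positive class, y is the 0/1 label.
    # Clipping p keeps log() finite -- without a guard like this, a confident
    # wrong prediction (p -> 0 for y = 1) drives the loss to inf/nan.
    p = np.clip(p, 1e-7, 1.0 - 1e-7)
    p_t = np.where(y == 1, p, 1.0 - p)
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)
```

The (1 - p_t)^gamma factor down-weights easy examples, so well-classified background boxes contribute almost nothing; with gamma = 0 it reduces to alpha-weighted cross-entropy.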

@jiachen0212

I changed base_lr to 0.0001 and training now gradually converges:
I1112 17:21:25.095569 25993 solver.cpp:259] Train net output #0: mbox_loss = 4.49065 (* 1 = 4.49065 loss)
I1112 17:21:25.095585 25993 sgd_solver.cpp:138] Iteration 110, lr = 0.0001
I1112 17:21:36.336560 25993 solver.cpp:243] Iteration 120, loss = 5.18974
I1112 17:21:36.336649 25993 solver.cpp:259] Train net output #0: mbox_loss = 4.87355 (* 1 = 4.87355 loss)
I1112 17:21:36.336664 25993 sgd_solver.cpp:138] Iteration 120, lr = 0.0001
I1112 17:21:47.554883 25993 solver.cpp:243] Iteration 130, loss = 5.10873
I1112 17:21:47.554940 25993 solver.cpp:259] Train net output #0: mbox_loss = 4.38195 (* 1 = 4.38195 loss)
I1112 17:21:47.554955 25993 sgd_solver.cpp:138] Iteration 130, lr = 0.0001
I1112 17:21:58.538343 25993 solver.cpp:243] Iteration 140, loss = 4.94118
I1112 17:21:58.538398 25993 solver.cpp:259] Train net output #0: mbox_loss = 4.48055 (* 1 = 4.48055 loss)
I1112 17:21:58.538422 25993 sgd_solver.cpp:138] Iteration 140, lr = 0.0001
I1112 17:22:09.673602 25993 solver.cpp:243] Iteration 150, loss = 4.66486
I1112 17:22:09.673713 25993 solver.cpp:259] Train net output #0: mbox_loss = 4.88057 (* 1 = 4.88057 loss)
I1112 17:22:09.673729 25993 sgd_solver.cpp:138] Iteration 150, lr = 0.0001
I1112 17:22:20.823526 25993 solver.cpp:243] Iteration 160, loss = 4.64496
I1112 17:22:20.823588 25993 solver.cpp:259] Train net output #0: mbox_loss = 4.61728 (* 1 = 4.61728 loss)
I1112 17:22:20.823606 25993 sgd_solver.cpp:138] Iteration 160, lr = 0.0001
I1112 17:22:32.045835 25993 solver.cpp:243] Iteration 170, loss = 4.2356
I1112 17:22:32.045897 25993 solver.cpp:259] Train net output #0: mbox_loss = 4.33995 (* 1 = 4.33995 loss)
I1112 17:22:32.045913 25993 sgd_solver.cpp:138] Iteration 170, lr = 0.0001
I1112 17:22:43.136955 25993 solver.cpp:243] Iteration 180, loss = 4.34946
I1112 17:22:43.137040 25993 solver.cpp:259] Train net output #0: mbox_loss = 4.53899 (* 1 = 4.53899 loss)
I1112 17:22:43.137056 25993 sgd_solver.cpp:138] Iteration 180, lr = 0.0001
I1112 17:22:54.255470 25993 solver.cpp:243] Iteration 190, loss = 4.43321
I1112 17:22:54.255529 25993 solver.cpp:259] Train net output #0: mbox_loss = 4.61062 (* 1 = 4.61062 loss)
I1112 17:22:54.255547 25993 sgd_solver.cpp:138] Iteration 190, lr = 0.0001
I1112 17:23:05.433228 25993 solver.cpp:243] Iteration 200, loss = 4.18751
I1112 17:23:05.433287 25993 solver.cpp:259] Train net output #0: mbox_loss = 4.0284 (* 1 = 4.0284 loss)
I1112 17:23:05.433302 25993 sgd_solver.cpp:138] Iteration 200, lr = 0.0001
I1112 17:23:16.648353 25993 solver.cpp:243] Iteration 210, loss = 4.21189
I1112 17:23:16.648444 25993 solver.cpp:259] Train net output #0: mbox_loss = 4.16318 (* 1 = 4.16318 loss)

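For reference, the fix above is a one-line edit in the solver definition. A minimal sketch (field names per Caffe's SolverParameter; the lr_policy matches the log above, other fields are left as in the stock SSD solver):

```protobuf
# solver.prototxt -- only base_lr changed
base_lr: 0.0001        # was 0.001, which made mbox_loss blow up to nan by iteration 20
lr_policy: "multistep"
```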
@peijinwang

How do I add multibux_focal_loss to the code? There is no .cu file.

@jiachen0212

@dubiwei the multibux_focal_loss layer does not need a .cu file. (A Caffe layer without a GPU implementation falls back to its CPU Forward/Backward.)

@jiachen0212

After 120k (12w) iterations I got:
Iteration 120000, loss = 1.6058
I1114 06:50:20.753614 26071 solver.cpp:433] Iteration 120000, Testing net (#0)
I1114 06:50:20.753654 26071 net.cpp:693] Ignoring source layer mbox_loss
I1114 06:51:53.275323 26071 solver.cpp:546] Test net output #0: detection_eval = 0.524345
I1114 06:51:53.275491 26071 solver.cpp:337] Optimization Done.
I1114 06:51:53.275498 26071 caffe.cpp:254] Optimization Done.

@jiachen0212

I also tried setting max_iter to 100k (10w) but still got a slightly lower mAP: 0.517319.
