How about the result with SSD? #1
I'm glad to see your work with focal loss. Have you gotten better performance with focal loss than with OHEM in SSD? Moreover, have you tested focal loss with your other project, MobileNet-SSD? Thanks!
Comments
I tested the solution. The loss computation may have errors.
@mychina75 I found the error and corrected it. For verification, I checked the gradient with check_focal_diff.py.
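(check_focal_diff.py itself isn't reproduced in this thread; below is a minimal sketch of that kind of gradient check, assuming a single-logit sigmoid focal loss. Names and setup are illustrative, not the repo's code.)

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def focal_loss(x, alpha=0.25, gamma=2.0):
    # FL(p) = -alpha * (1 - p)^gamma * log(p), with p = sigmoid(x),
    # for a single positive example on one logit.
    p = sigmoid(x)
    return -alpha * (1.0 - p) ** gamma * np.log(p)

def focal_loss_grad(x, alpha=0.25, gamma=2.0):
    # Analytic dFL/dx via the chain rule, using dp/dx = p * (1 - p):
    # dFL/dx = alpha * (gamma * p * (1-p)^gamma * log(p) - (1-p)^(gamma+1))
    p = sigmoid(x)
    return alpha * (gamma * p * (1.0 - p) ** gamma * np.log(p)
                    - (1.0 - p) ** (gamma + 1.0))

# Central-difference check at a few logits.
eps = 1e-6
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    numeric = (focal_loss(x + eps) - focal_loss(x - eps)) / (2.0 * eps)
    analytic = focal_loss_grad(x)
    print("x=%+.1f  numeric=%.8f  analytic=%.8f  diff=%.2e"
          % (x, numeric, analytic, abs(numeric - analytic)))
```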
The loss can decrease steadily now, but the evaluation result of the model is getting worse...
@mychina75
@chuanqi305 I tested your focal loss with SSD, not MobileNet-SSD. I merged your code and changed mining_type to NONE. The final loss can decrease to 0.3, but detection_eval = 74%, worse than OHEM's 77%. Do you have other training tricks?
@bailvwangzi No, I did not get a higher mAP either; just the same as OHEM.
@chuanqi305 I0904 13:14:08.266741 13639 solver.cpp:433] Iteration 2000, Testing net (#0)
My implementation is almost the same as yours, apart from some minor differences that are described in the paper. I shall test my function and investigate whether these differences are crucial or not.
@mychina75 Too few iterations; you should evaluate after iteration 30000~50000.
@XiongweiWu Can you talk about some details? In my test, the performance has not improved; focal loss is not better than OHEM.
@chuanqi305 Is it normal?
No, the loss should be < 10 after 10 iterations. Maybe there is a bug in your network structure?
@chuanqi305 Sorry for replying late. I did a series of experiments on VOC07 with Fast R-CNN and a ZF backbone. The baseline is 57.1%. In your implementation, alpha is shared across all categories and only one (K+1)-class classifier is learned, while the paper says K two-class classifiers are trained and alpha is class-dependent. I used your code directly and achieved 53.3% mAP in my settings, and when I set all alphas to 1 the accuracy reached 57.4%, slightly better than the baseline. However, when I used all proposals to train, the performance dropped to 56.8% (worse than OHEM). The difficulty is the loss weight in the bounding-box regression loss, since we cannot use all samples to smooth it. I will test on SSD today and hope you can also share some results.
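(For readers comparing the two formulations: a minimal sketch, with made-up shapes and names, of a (K+1)-way softmax focal loss where alpha is either a single shared scalar, as in this repo, or a class-dependent vector, as the paper implies.)

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def softmax_focal_loss(logits, labels, alpha, gamma=2.0):
    # logits: (N, K+1) scores; labels: (N,) class ids with 0 = background.
    # alpha may be a scalar shared by every class, or a (K+1,) vector
    # so each class gets its own weight (the class-dependent variant).
    p = softmax(logits)
    pt = p[np.arange(len(labels)), labels]            # prob of the true class
    at = alpha[labels] if np.ndim(alpha) else alpha   # per-class or shared
    return float((-at * (1.0 - pt) ** gamma * np.log(pt)).mean())

K = 20
logits = np.random.randn(8, K + 1)
labels = np.random.randint(0, K + 1, size=8)
print(softmax_focal_loss(logits, labels, alpha=0.25))   # shared scalar
print(softmax_focal_loss(logits, labels,                # class-dependent:
      alpha=np.array([0.75] + [0.25] * K)))             # background vs. foreground
```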
@XiongweiWu Hi, I changed a two-stage net, RON (very similar to FPN), into a one-stage net just like the paper did, and used all proposals to train. But my AP is too low. Do you have time to check my net? Thanks.
@bailvwangzi I'm training the normal SSD and SSD with focal loss together, using ResNet-101 as the base network. The detection_eval of normal SSD is 0.68 at iteration 10000, but the detection_eval of SSD with focal loss is just 0.45 at iteration 20000. It seems that SSD with focal loss becomes very hard to train. Have you run into this during training?
@zhanglonghao1992 Same here. I get up to 74 mAP after 180k iterations. To avoid the effect of initialization, I use a normal SSD model (e.g. normal SSD at iteration 10000) as the pre-trained model to fine-tune; it converges faster.
@bailvwangzi @zhanglonghao1992 Hi, I just finished the ablation experiment on SSD with focal loss trained on the VOC07 dataset. The performance of SSD is not as good as the paper said ><. SSD's benchmark is 77.4% with data augmentation and 62% without, while my results with focal loss are 74.1% and 66%. I remember the original paper said they removed all data augmentation tricks except mirroring. I need more time to investigate: maybe the dataset, maybe the learning parameters, maybe the implementation (I think the implementation should be quite simple...).
@XiongweiWu Nice ablation work, thanks. Looking forward to your better results!
@bailvwangzi Hi, my mAP is still 0.6 after 180k iterations using SSD with focal loss. You said your mAP at 180k iterations is 0.74? How did you do that? Did you change the learning rate or use a normal SSD model to initialize?
@chuanqi305 Hi, I use your code on SSD with ResNet-101 but the final result is 0.6. Did you change the learning rate or some other params? What about the pre-trained model?
@chuanqi305 When I use VGG16, lr_rate=0.001 makes the loss become NaN, but ResNet-101 is OK with 0.001. I have to set lr_rate=0.0001 to train VGG16.
@XiongweiWu Hi, could you leave your QQ or e-mail address? I ran into some trouble when training SSD with focal loss on VGG16 and ResNet-101.
@mychina75 Same here. Have you solved that?
@zhanglonghao1992 No... I cannot get a better result. Maybe I need to change some parameters?
@mychina75 It only happens when I use VGG16. The 'Missing true_pos for label' warning never appears when I use ResNet-101. I don't know why.
@chuanqi305 Thank you very much for sharing your focal loss implementation. I tested your code and also found no improvement with respect to the original SSD. Maybe focal loss is not the key factor in RetinaNet?
@pbdahzou Maybe focal loss is similar to OHEM in its training effect. RetinaNet uses the FPN framework; maybe the key factor is the 'Deconvolution'.
Has anyone tried both kinds of losses together, i.e. something like two stacked loss layers: layer { ... } layer { ... }?
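(One hypothetical reading of "together": keep the focal down-weighting per box and still apply OHEM-style hard-negative selection on top of it. A rough sketch only; nothing in this repo actually does this, and all names are invented.)

```python
import numpy as np

def focal_plus_ohem(pos_losses, neg_losses, neg_pos_ratio=3):
    # pos_losses: (P,) focal-weighted losses of matched (positive) boxes.
    # neg_losses: (M,) focal-weighted losses of background boxes.
    # Keep only the hardest negatives at SSD's usual 3:1 neg:pos ratio,
    # then average over the selected set instead of over all boxes.
    k = min(len(neg_losses), neg_pos_ratio * max(len(pos_losses), 1))
    hardest_neg = np.sort(neg_losses)[::-1][:k]   # top-k by loss value
    n = len(pos_losses) + k
    return (pos_losses.sum() + hardest_neg.sum()) / max(n, 1)

pos = np.random.rand(5) * 2.0     # pretend focal losses of 5 positives
neg = np.random.rand(100) * 0.5   # and of 100 negatives
print(focal_plus_ohem(pos, neg))
```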
It seems this has implemented a softmax focal loss, whereas the original RetinaNet paper described using a sigmoid instead of a softmax to compute p (see equation 5 and the paragraph below it). Also see this discussion: kuangliu/pytorch-retinanet#6. Has anyone tried sigmoid for the focal loss layer?
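(For comparison, a minimal sketch of the sigmoid variant from equation (5) of the paper: K independent binary classifiers, each computing p with a sigmoid, instead of one softmax over K+1 classes. Shapes and names here are assumptions, not this repo's code.)

```python
import numpy as np

def sigmoid_focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    # logits:  (N, K) raw scores, one binary classifier per class, with no
    #          background class; an all-zero target row means background.
    # targets: (N, K) one-hot labels.
    # FL(pt) = -alpha_t * (1 - pt)^gamma * log(pt), eq. (5) in the paper.
    p = 1.0 / (1.0 + np.exp(-logits))
    pt = np.where(targets == 1, p, 1.0 - p)           # p_t as defined in the paper
    at = np.where(targets == 1, alpha, 1.0 - alpha)   # alpha_t likewise
    loss = -at * (1.0 - pt) ** gamma * np.log(np.clip(pt, 1e-12, 1.0))
    # RetinaNet normalizes the summed loss by the number of positive anchors.
    return loss.sum() / max(targets.sum(), 1.0)

K = 20
logits = np.random.randn(4, K)
targets = np.zeros((4, K))
targets[0, 3] = 1.0
targets[2, 7] = 1.0
print(sigmoid_focal_loss(logits, targets))
```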
@mathmanu Even worse.
This work is useful for experimentation. Could you please add a license file? Maybe the same license as Caffe's. Thank you.
@mathmanu I chose an MIT license; feel free to use this project.
Thank you. |
@zhanglonghao1992 When I try SSD with VGG16 and base_lr = 0.001, the loss becomes NaN. When I change base_lr to 0.0001, the loss decreases; after 10k iterations it is about 1.9, but the final detection_eval only reaches 0.29. It is similar to your problem. Have you solved it?
@zhanglonghao1992 How is the result? Does this help in detecting smaller objects with better accuracy?