I referred to your code and Hao Mood's code and trained the BCNN model, fine-tuning all layers,
and the best test accuracy I could reach was ~73% with the pretrained VGG16 and ~61% without it.
Is it easy to reach the 84% accuracy you report?
I used almost the same hyperparameter settings, except for the batch size:
due to memory constraints, I could only set the batch size to 12.
I suspect the small batch size hurts training, but I have no evidence.
The VGG16 I used doesn't include BN layers, and people often say
the noise from a small batch size actually helps generalization, so I'm not sure why it would hurt here.
Since a small batch size increases the variance of the gradients,
I also tried tuning the learning rate to compensate, but still couldn't improve the result.
Could you give me some advice on how to reach the 84% accuracy,
or confirm that it is not possible to reach 84% accuracy when the batch size is 12?
Hi @hcygeorge
In my experience, BCNNs are tricky to train and quite sensitive to hyperparameters, including batch size.
Long ago I tried to replicate the results using LuaTorch and, like you, had to reduce the batch size. I could get close (~1-2% gap) to the official results by tweaking the learning rate and the momentum according to the changed batch size.
My suggestion would be to keep trying to tweak the LR and momentum, or try a larger batch size.
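Concretely, a common heuristic (which I haven't verified specifically for BCNNs) is the linear scaling rule: scale the learning rate by the ratio of your batch size to the reference batch size. A minimal sketch, assuming a PyTorch setup; the model stand-in, reference values, and weight decay below are illustrative, not the official hyperparameters:

```python
import torch
import torch.nn as nn

# Stand-in for the BCNN; substitute your actual network.
model = nn.Linear(512 * 512, 200)  # bilinear features -> 200 CUB classes

# Illustrative reference setup -- not the exact official hyperparameters.
ref_batch_size, ref_lr = 64, 1.0

batch_size = 12
# Linear scaling rule: shrink the lr in proportion to the batch size,
# since gradient variance grows as the batch shrinks.
lr = ref_lr * batch_size / ref_batch_size

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=lr,
    momentum=0.9,       # momentum may also need retuning at small batch sizes
    weight_decay=1e-8,  # illustrative value
)
```

If memory is the hard constraint, gradient accumulation over several mini-batches is another way to simulate a larger effective batch without increasing memory per step.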
Recently I decided to downsize the images to 224x224 in order to increase the batch size to 64.
With the pretrained VGG16 model, the test accuracy of the BCNN reached 71%, while the train accuracy
reached ~100%. So I think that is about the best result we can get using BCNN on this down-sized
dataset.
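For reference, the preprocessing I used for the 224x224 runs was roughly the standard torchvision pipeline (a sketch; my actual augmentation may have differed slightly):

```python
import torchvision.transforms as transforms

# Roughly the 224x224 preprocessing; the normalization constants are the
# standard ImageNet statistics used with pretrained VGG16.
train_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.RandomCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```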