The author wrote the following in the paper:
Additionally, we found that it was important to put very little or no weight decay (l2 regularization) on the depthwise filters since there are so few parameters in them.
Therefore, I think we should set decay_mult: 0.0 for the depthwise layers in the MobileNet prototxt.
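For reference, a minimal sketch of what that would look like for one depthwise layer (layer and blob names here are illustrative, not copied from this repo's prototxt; the point is only the decay_mult: 0.0 in the weight param block, with group equal to num_output making the convolution depthwise):

```
layer {
  name: "conv2_1/dw"
  type: "Convolution"
  bottom: "conv1"
  top: "conv2_1/dw"
  param {
    lr_mult: 1
    decay_mult: 0.0   # no L2 weight decay on the depthwise filters
  }
  convolution_param {
    num_output: 32
    kernel_size: 3
    stride: 1
    pad: 1
    group: 32         # group == num_output -> depthwise convolution
    bias_term: false  # BatchNorm/Scale layers typically follow, so no bias
    weight_filler { type: "msra" }
  }
}
```

The pointwise (1x1) convolutions and the fully connected layer would keep their usual decay_mult (e.g. 1.0), so only the depthwise filters are excluded from regularization.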
Isn't this line taken from the MobileNetV1 paper? I couldn't find any such statement in the MobileNetV2 paper.
I wonder whether all parameters are meant to be decayed in MobileNetV2 training - at least, that's the understanding I get from looking at the (very few) repositories that provide a training script, e.g.:
https://github.com/Randl/MobileNetV2-pytorch