-
Notifications
You must be signed in to change notification settings - Fork 617
Description
Stochastic Depth (aka layer dropout) has been shown to speed up and improve training in ResNets, as well as overall accuracy on testing sets. Essentially, every training step a random subset of residual layers are entirely removed from the network, and training proceeds on the remaining layers. Direct connections are made between the missing layers.
It is described in this paper: https://arxiv.org/pdf/1603.09382.pdf. (Deep Networks with Stochastic Depth by Gao Huang, Yu Sun, Zhuang Liu, Daniel Sedra, Kilian Q. Weinberger)
I can't think of a way to implement this with the python API without reconstructing the model every training iteration, and I'm not familiar the with the C++ API / Cudnn to try to write the op myself.
Of course I'm willing to try any python-only suggestions.
Thanks in advance,
Alex