Aug 24, 2016

Good list! I think it's important to note that this article is (intentionally) focused on modern CNN architectures, and not "deep learning" in general.

I'd also add in the following "technique" articles: Geoff Hinton et al.'s dropout paper[0] and Ioffe and Szegedy's Batch Normalization paper[1]. I don't think there's been enough time for the dust to settle, but I'm excited about the possibilities Stochastic Depth[2] could offer, too.

[0]: http://arxiv.org/abs/1207.0580

[1]: http://arxiv.org/abs/1502.03167

[2]: http://arxiv.org/abs/1603.09382