The kind of math that would be necessary to understand papers like .
It's not so much that there are one or two specific results, but rather that there are far more researchers working on this now, and quite a few of them are working on more theoretical stuff. Sometimes that theory leads to really practical outcomes - a good example would be https://arxiv.org/abs/1701.07875 . Or another by the same researcher: https://arxiv.org/abs/1506.00059 .
In the last year or two, we've seen interesting results in nearly every area of theoretical deep learning research, including generalization, optimization, generative modelling, Bayesian models, and network architecture.