Toronto Probability

Event Information A Central Limit Theorem for Deep Neural Networks and Products of Random Matrices
14:10 to 15:00 on Monday, September 24, 2018
FI210, Fields Institute, 222 College St.
Mihai Nica
http://www.math.toronto.edu/mnica/
University of Toronto

We study the input-output Jacobian matrix of deep ReLU neural networks initialized with random weights. We reduce the problem to studying certain products of random matrices and show that the norms of the columns of this matrix are approximately log-normally distributed. The result holds for a large class of random weights. The variance depends on the depth-to-width aspect ratio of the network; this result provides an explanation for why very deep networks can suffer from the "vanishing and exploding" gradient problem that makes these networks difficult to train. Based on joint work with Boris Hanin.

Some background material (which will be covered in the talk) can be reviewed in this preprint: https://arxiv.org/abs/1801.03744