UBC Math Department Colloquium: Aukosh Jagannath
Topic
The training dynamics and local geometry of high-dimensional learning
Speakers
Aukosh Jagannath
Details
Many modern data science tasks can be expressed as optimizing a complex, random function in high dimensions. The go-to method for such problems is stochastic gradient descent (SGD), which performs remarkably well in practice (cf. the success of modern neural networks). However, the rigorous analysis of SGD on natural, high-dimensional statistical models is in its infancy. In this talk, we study a general model that captures a broad range of learning tasks, from matrix and tensor PCA to training multi-layer neural networks to classify mixture models. We show that, in the high-dimensional limit, the evolution of natural summary statistics along training converges to a closed, finite-dimensional dynamical system called their effective dynamics. We then turn to understanding the landscape of training from the point of view of the algorithm. We show that in this limit the spectra of the Hessian and information matrices admit an effective spectral theory: the limiting empirical spectral measure and outliers have explicit characterizations that depend only on these summary statistics. Using these tools, we find a rich phenomenology, including the failure to converge from natural initializations, degenerate diffusive limits, and dynamical spectral transitions. This talk surveys a series of joint works with G. Ben Arous (NYU), R. Gheissari (Northwestern), and J. Huang (U Penn).
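To make the idea of a summary statistic concrete, here is a minimal toy sketch (illustrative only, not code or a model from the talk): online SGD on a spiked matrix model, where the only quantity tracked is the overlap m_t between the current iterate and the planted spike v. The dimension, signal-to-noise ratio, step size, and the Gaussian stand-in for the noise gradient are all assumptions made for the example.

```python
# Toy sketch (illustrative only): online SGD for a spiked matrix model
# Y_t = lam * v v^T + noise, estimating the unit spike v on the sphere.
# The scalar overlap m_t = <x_t, v> plays the role of a summary statistic:
# in high dimensions its trajectory concentrates around a deterministic curve,
# and it alone tells us whether SGD recovers the spike.
# All parameter values (d, lam, eta, steps) are arbitrary choices.

import numpy as np

rng = np.random.default_rng(0)

d = 2000            # ambient dimension
lam = 3.0           # signal-to-noise ratio of the planted spike
steps = 10_000      # online steps, one fresh sample per step
eta = 1.0 / d       # step size scaled with dimension

v = rng.standard_normal(d)
v /= np.linalg.norm(v)           # planted unit spike
x = rng.standard_normal(d)
x /= np.linalg.norm(x)           # random (uninformative) initialization

overlaps = []
for t in range(steps):
    # The gradient of the loss -x^T Y_t x splits into a signal part along v
    # and a noise part; an i.i.d. Gaussian vector stands in for the noise term
    # here (its exact scaling is unimportant for this illustration).
    noise_grad = rng.standard_normal(d)
    grad = -2.0 * lam * (v @ x) * v + noise_grad
    grad -= (grad @ x) * x       # project onto the tangent space of the sphere
    x -= eta * grad
    x /= np.linalg.norm(x)       # retract back to the unit sphere
    overlaps.append(v @ x)

# The sign of v is not identifiable from a quadratic loss, so report |m_t|.
print(f"|overlap| at start: {abs(overlaps[0]):.3f}, at end: {abs(overlaps[-1]):.3f}")
```

In a run like this, the overlap typically lingers near zero before climbing toward a value close to 1; in the high-dimensional limit, this one-dimensional trajectory is the kind of object that the effective dynamics mentioned in the abstract describe.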