Università di Genova ¦ MaLGa

A rainbow in deep network black boxes






Florentin Guth - École Normale Supérieure, Paris


One of many puzzles in deep learning is that every training run results in a different set of network weights, yet nevertheless leads to the same performance. A central observation of prior work is that independently trained networks learn the same representations at each layer, up to rotations. This suggests that all networks learn the same thing, up to an internal rotation symmetry at each layer. Inspired by these results, we introduce a probabilistic model of the trained weights. The model is specified by a weight distribution at each layer, in the manner of a multi-layer mean-field model. Crucially, weights at different layers are not independent: each layer is "aligned" to the previous one. The model is estimated from one or several trained networks, and then allows generating new network weights which, in some cases, perform as well without any further training. An important ingredient of the model is the covariance of the weight distribution at each layer, which describes the "dimensionality" properties of the network. I will show that the weight distributions can be made Gaussian by using a structured network architecture. This approach also allows us to describe the training dynamics of such networks.
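The two ingredients of the model described above, layer-wise weight covariances and rotation alignment between layers, can be illustrated with a toy NumPy sketch. This is an illustrative assumption of mine, not the speaker's implementation: the function names `sample_layer` and `alignment` are hypothetical, and the alignment step is sketched as an orthogonal Procrustes problem.

```python
import numpy as np

rng = np.random.default_rng(0)

def alignment(W_a, W_b):
    """Hypothetical sketch: the orthogonal rotation best mapping W_b's
    column space onto W_a's (orthogonal Procrustes, via SVD of W_a^T W_b)."""
    U, _, Vt = np.linalg.svd(W_a.T @ W_b)
    return U @ Vt

def sample_layer(cov_sqrt, width, align=None):
    """Draw `width` weight rows from N(0, C) with C = cov_sqrt @ cov_sqrt.T,
    optionally rotated into the previous layer's basis."""
    G = rng.standard_normal((width, cov_sqrt.shape[0]))  # white Gaussian rows
    W = G @ cov_sqrt.T                                   # color rows with the layer covariance
    if align is not None:
        W = W @ align                                    # align to the previous layer
    return W

# Two independently sampled layers share the same covariance (hence the same
# "dimensionality"), but sit in different rotated bases; Procrustes recovers
# an orthogonal map between them.
cov_sqrt = np.diag([2.0, 1.0, 0.5])
W1 = sample_layer(cov_sqrt, 1000)
W2 = sample_layer(cov_sqrt, 1000)
A = alignment(W1, W2)
```

The rows of each sampled matrix share the prescribed covariance, while `A` is an orthogonal matrix, matching the picture of networks that agree up to an internal rotation at each layer.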


Florentin Guth is a finishing PhD student at École Normale Supérieure in Paris, advised by Prof. Stéphane Mallat. Starting in September 2023, he will be a Faculty Fellow at the Center for Data Science at New York University and a Research Fellow at the Center for Computational Neuroscience at the Flatiron Institute. His research aims to understand how deep learning escapes the curse of dimensionality by exploiting the structure of real-world learning problems, with a focus on image classification and generation tasks. It brings to light properties of image datasets that deep networks rely on, as well as some of the mathematical structure of learned network weights and computations.


Monday July 17th, 15:00


Room 705, DIMA, Via Dodecaneso 35