Foundations of deep convolutional models through kernel methods
Alberto Bietti - New York University
Deep learning has been most successful in tasks where the data presents a rich structure, such as images, audio, or text. The choice of network architecture is believed to play a key role in exploiting this structure, for instance through convolutions and pooling on natural signals, yet a precise study of these properties and how they affect learning guarantees is still missing. Another challenge for the theoretical understanding of deep learning models is that they are often over-parameterized and known to be powerful function approximators, while being seemingly easy to optimize using gradient methods. We study deep models through the lens of kernel methods, which define functional spaces for learning in a non-parametric manner, and which arise naturally when considering the optimization of infinitely-wide networks in certain regimes. This allows us to study invariance and stability properties of various convolutional architectures through the geometry of the kernel mapping, as well as approximation properties of learning in different regimes.
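To make the connection between wide networks and kernels concrete (this sketch is illustrative and not taken from the talk), one can consider the simplest case: a one-hidden-layer ReLU network with Gaussian weights. As the width grows, the inner product of the hidden-layer features converges to a closed-form kernel, the arc-cosine kernel of order 1 of Cho and Saul. The snippet below compares the closed form against a finite-width Monte Carlo estimate; all function names are chosen here for illustration.

```python
import numpy as np

def arccos_kernel(x, y):
    # Closed form of E_w[relu(w.x) relu(w.y)] for w ~ N(0, I):
    # k(x, y) = ||x|| ||y|| (sin t + (pi - t) cos t) / (2 pi),
    # where t is the angle between x and y (arc-cosine kernel, order 1).
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    cos_t = np.clip(x @ y / (nx * ny), -1.0, 1.0)
    t = np.arccos(cos_t)
    return nx * ny * (np.sin(t) + (np.pi - t) * np.cos(t)) / (2 * np.pi)

def finite_width_kernel(x, y, width=200_000, seed=0):
    # Monte Carlo estimate: average over `width` random ReLU units.
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((width, x.shape[0]))
    fx = np.maximum(W @ x, 0.0)
    fy = np.maximum(W @ y, 0.0)
    return fx @ fy / width

x = np.array([1.0, 0.5, -0.3])
y = np.array([0.2, -1.0, 0.7])
print(arccos_kernel(x, y), finite_width_kernel(x, y))
```

At large width the two values agree closely, which is the sense in which an infinitely wide network "is" a kernel method; the deep convolutional kernels studied in the talk extend this idea to structured architectures.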
Alberto is a Faculty Fellow/Postdoc at the NYU Center for Data Science in New York. He completed his PhD in 2019 from Inria and Université Grenoble-Alpes under the supervision of Julien Mairal, and later spent part of 2020 as a postdoc at Inria Paris hosted by Francis Bach. His research interests revolve around machine learning, optimization and statistics, with a focus on developing the theoretical foundations of deep learning.
2021-02-16 at 3:00 pm