Title
The role of depth in neural networks: function space geometry and learnability
Speaker
Rebecca Willett - University of Chicago
Abstract
Neural network architectures play a key role in determining which functions are fit to training data and the generalization properties of the learned predictors. For instance, imagine training an overparameterized neural network with weight decay to interpolate a set of training samples; the network architecture influences which interpolating function is learned. In this talk, I will describe new insights into the role of network depth in machine learning using the notion of representation costs: how much it “costs” for a neural network to represent some function f. First, we will see that there is a family of functions that can be learned with depth-3 networks when the number of samples is polynomial in the input dimension d, but that cannot be learned with depth-2 networks unless the number of samples is exponential in d. Conversely, no function that is easy to learn with depth-2 networks is difficult to learn with depth-3 networks. Together, these results mean deeper networks have an unambiguous advantage over shallower networks in terms of sample complexity. Second, I will show that adding linear layers to a ReLU network yields a representation cost that favors functions with latent low-dimensional structure, such as single- and multi-index models. These results highlight the role of network depth from a function-space perspective and yield new tools for understanding neural network generalization.
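As a rough sketch of the central definition (a standard formulation from this line of work; the exact variant used in the talk may differ), the depth-L representation cost of a function f can be written as

    R_L(f) = \min_{\theta} \{ \|\theta\|_2^2 : f_\theta = f \},

where the minimum is over the parameters \theta of depth-L networks that realize f exactly. Under this view, training an overparameterized depth-L network to interpolate the data with weight decay selects, among all interpolating functions, one with (approximately) minimal R_L, so the geometry of R_L across depths governs which interpolant is learned.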
Bio
Rebecca Willett is a Professor of Statistics and Computer Science and the Director of AI at the Data Science Institute at the University of Chicago, and she holds a courtesy appointment at the Toyota Technological Institute at Chicago. Her research is focused on machine learning foundations, scientific machine learning, and signal processing. She is the Deputy Director for Research at the NSF-Simons Foundation National Institute for Theory and Mathematics in Biology and a member of the NSF Institute for the Foundations of Data Science Executive Committee. She is the Faculty Director of the Eric and Wendy Schmidt AI in Science Postdoctoral Fellowship and helps direct the Air Force Research Lab University Center of Excellence on Machine Learning. Willett received the inaugural SIAM Activity Group on Data Science Career Prize in 2024 and the National Science Foundation CAREER Award in 2007, was a member of the DARPA Computer Science Study Group, received an Air Force Office of Scientific Research Young Investigator Program award in 2010, was named a Fellow of the Society for Industrial and Applied Mathematics in 2021, and was named a Fellow of the IEEE in 2022. She completed her PhD in Electrical and Computer Engineering at Rice University in 2005 and was an Assistant and then tenured Associate Professor of Electrical and Computer Engineering at Duke University from 2005 to 2013. She was an Associate Professor of Electrical and Computer Engineering, Harvey D. Spangler Faculty Scholar, and Fellow of the Wisconsin Institutes for Discovery at the University of Wisconsin-Madison from 2013 to 2018.
When
Monday, March 4th, 4:00 pm
Where
Room 322 @ DIBRIS/DIMA, Via Dodecaneso 35, Genoa