Dynamical low-rank training
Francesco Tudisco - University of Edinburgh, UK
As model and data sizes increase, modern AI progress faces pressing questions about time, cost, energy consumption, and accessibility. As a consequence, growing attention has been devoted to network pruning techniques that reduce model size and computational footprint while retaining model performance. The majority of these techniques focus on reducing inference costs by pruning the network after a full training pass. A smaller number of methods address the reduction of training costs, mostly by compressing the network via low-rank layer factorizations. In fact, a variety of empirical and theoretical evidence has recently shown that deep networks exhibit a form of low-rank bias, hinting at the existence of high-performing low-rank subnetworks. In this talk, I will present our recent work on low-rank models in deep learning, including some of our results on implicit low-rank bias and our dynamical low-rank training algorithm, which uses a form of Riemannian optimization to train small factorized network layers while simultaneously adjusting their rank. I will describe several convergence and approximation results, as well as experimental evidence comparing its performance with recent pruning techniques on several convolutional network models.
I am an Associate Professor (Reader) in Machine Learning within the School of Mathematics at the University of Edinburgh, UK. I am also affiliated with the Numerical Analysis and Data Science group at the Gran Sasso Science Institute, Italy. My research interests lie at the intersection of matrix analysis, scientific computing, optimization, and machine learning. My recent work includes the use of model order reduction techniques from matrix and tensor differential equations to design efficient training algorithms for deep learning models, the stability and bias of deep networks and neural PDEs, the analysis of neural networks in the infinite-width and infinite-depth limits, nonlinear spectral theory with applications to machine learning on graphs, and the use of physics-informed machine learning models to reconstruct and compress flow data.
Monday, November 6th, 15:00
Room 322, DIBRIS, Via Dodecaneso 35