Seminar

What Neural Networks Memorize and Why

24/03/2023

Title

Speaker

Vitaly Feldman - Apple ML Research

Abstract

Deep learning algorithms tend to fit the entire training dataset (nearly) perfectly including mislabeled examples and outliers. In addition, in extreme cases, complex models appear to memorize entire input examples, including seemingly irrelevant information (social security numbers from text, for example). This propensity to memorize seemingly useless data and the resulting large generalization gap have puzzled many practitioners and is not explained by existing theories of machine learning. We provide a simple conceptual explanation and a theoretical model demonstrating that memorization of labels is necessary for achieving close-to-optimal generalization error when learning from long-tailed data distributions. We also describe natural prediction problems for which every sufficiently accurate training algorithm must encode, in the prediction model, essentially all the information about a large subset of its training examples. This remains true even when most of that information is ultimately irrelevant to the task at hand. Finally, we demonstrate the utility of memorization and support our explanation empirically. These results rely on a new technique for efficiently estimating memorization and influence of training data points.

Bio

Vitaly Feldman is a research scientist at Apple ML Research working on foundations of machine learning and privacy-preserving data analysis. His recent research interests include tools for analysis of generalization, distributed privacy-preserving learning, privacy-preserving optimization, and adaptive data analysis. Vitaly holds a Ph.D. from Harvard (2006, advised by Leslie Valiant) and was previously a research scientist at Google Research (Brain team) and IBM Research - Almaden. His work on understanding of memorization in learning was recognized by the 2021 Caspar Bowden Award for Outstanding Research in Privacy Enhancing Technologies and his research on foundations of adaptive data analysis was featured in CACM Research Highlights and Science. His works were also recognized by COLT Best Student Paper Award in 2005 and 2013 (student co-authored) and by the IBM Research Best Paper Award in 2014, 2015 and 2016. He served as a program co-chair for COLT 2016 and ALT 2021 conferences and as a co-organizer of the Simons Institute Program on Data Privacy in 2019.

When

March 24th 2023, 14:30

Where

Room 322, UniGe DIBRIS, Via Dodecaneso 35