MaLGa Colloquia: The Emerging Science of Machine Learning Benchmarks
Speaker
Moritz Hardt - Max Planck Institute for Intelligent Systems, Tübingen, Germany
Abstract
Benchmarks have played a central role in the progress of machine learning research since the 1980s. Although researchers have done much with them, we still know little about how and why benchmarks work as an engine of scientific progress. In this talk, I will trace the rudiments of an emerging science of benchmarks through selected empirical and theoretical observations. Looking back, I'll discuss the key scientific lessons about benchmarks from the ImageNet era, focusing on the validity of model rankings. Looking ahead, I'll talk about new challenges to benchmarking and evaluation in the era of large language models. The results we'll encounter challenge conventional wisdom and underscore the benefits of developing a science of benchmarks.
Bio
Moritz Hardt is a director at the Max Planck Institute for Intelligent Systems. Prior to joining the institute, he was Associate Professor of Electrical Engineering and Computer Sciences at the University of California, Berkeley. His research contributes to the scientific foundations of machine learning and algorithmic decision making from a social perspective. He is a co-author of the textbooks Fairness and Machine Learning: Limitations and Opportunities (MIT Press) and Patterns, Predictions, and Actions: Foundations of Machine Learning (Princeton University Press).
When
Monday, March 31st, 16:00
Where
Room 322, UniGe DIBRIS/DIMA, Via Dodecaneso 35