What Has Your Model Learned? Interpreting the Behavior of Pre-Trained Transformers
Our understanding of what drives the success of pre-trained Transformers is still limited. In this talk, I will discuss our work and other recent studies that aim to understand what Transformers learn during pre-training and how that knowledge is represented in the model. I will present several methods for analyzing self-attention patterns in Transformers and probing the information encoded in them. I will also discuss overparameterization in such models from the perspective of the lottery ticket hypothesis and show how the information stored by a model can be analyzed by studying the stability of “lucky” subnetworks across different downstream tasks. I will conclude by discussing some of our ongoing work on improving both the self-attention mechanism and the pre-training methods for these models.
Anna Rumshisky is an Associate Professor of Computer Science at the University of Massachusetts Lowell, where she leads the Text Machine Lab for NLP. Her primary research area is machine learning for natural language processing, with a focus on deep learning techniques. She has contributed to a number of application areas, including computational social science and clinical informatics. She received her Ph.D. from Brandeis University and completed postdoctoral training at MIT CSAIL, where she currently holds a Research Affiliate position. She received the NSF CAREER award in 2017, and her work won the best thematic paper award at NAACL-HLT 2019. Her research has been funded by the NSF, NIH, Army Research Office, and National Endowment for the Humanities, as well as other government agencies and industry sponsors.
Author: Anna Rumshisky