AI Seminar

AI Seminar: Iryna Gurevych – Comment – Link – Revise: Towards a General Framework for Modelling Interconnected Texts

Iryna Gurevych
Professor, Computer Science
Technische Universität Darmstadt, Germany

Zoom (password: UMichAI)


The ability to find and interpret cross-document relations is crucial in many fields of human activity, from social media to collaborative writing. While natural language processing has made tremendous progress in extracting information from single texts, a general NLP framework for modelling interconnected texts, including their versions and related documents, is missing. The talk reports on our ongoing efforts to establish such a framework. We address several challenges related to this. First, NLP has an acute need for diverse data to model cross-document tasks. We discuss our new, ethically sound data acquisition strategies and present unique cross-document datasets, along with a generic data model that can capture text structure and cross-document relations in heterogeneous documents. Second, we report on a study that instantiates our framework in the domain of scientific peer reviews. Third, to model cross-document relations, we need to make transformers aware of the structural relations within and across documents, yet it is unclear how much structure they already encode. To this end, we present preliminary insights into probing long-document transformers for structure. Our results pave the way to move NLP forward towards more human-like interpretation of text in the context of other texts.


Iryna Gurevych (PhD 2003, U. Duisburg-Essen, Germany) is a professor of Computer Science and director of the Ubiquitous Knowledge Processing (UKP) Lab at the Technical University (TU) of Darmstadt in Germany. Her main research interests are in machine learning for large-scale language understanding and text semantics. Iryna’s work has received numerous awards, including the ACL Fellow award in 2020 and the first Hessian LOEWE Distinguished Chair award (2.5 million euros) in 2021. Iryna is co-director of the NLP program within ELLIS, a network of excellence in machine learning. She is currently the vice-president of the Association for Computational Linguistics. In 2022, she was awarded an ERC Advanced Grant (2.5 million euros).

The AI Seminars are sponsored by LG AI Research.


AI Lab

Faculty Host

Rada Mihalcea