Dagstuhl Seminar 25282

Theory of Neural Language Models

(Jul 06 – Jul 11, 2025)

Permalink
Please use the following short url to reference this page: https://www.dagstuhl.de/25282

Organizers

  • Pablo Barcelo
  • David Chiang
  • George Cybenko
  • Lena Strobl

Contact

Motivation

Artificial intelligence (AI) has gone through multiple “summers” and “winters,” with the current summer based on large neural models that generate text, images, and other content. ChatGPT and other neural language models (NLMs), which model sequences of tokens, have taken center stage, not only in natural language processing but across a wide range of applications. This interest spans the academic, corporate, government, investor, and consumer sectors, and is driven not only by scientific investigation but also by competition to deliver products and gain social-media followers. Whereas experimental research is surging ahead, theoretical research aimed at foundational questions about neural networks is lagging behind. Clear thinking about what NLMs can and cannot do is needed more than ever.

Recurrent neural networks (RNNs), the old guard of NLMs, have been studied theoretically for decades in relation to finite automata and Turing machines. At present, the dominant NLMs are based on transformers, whose computational power is a new and rapidly growing area of research. Transformers have been related to a wide variety of formal models from computability and complexity theory, such as counter automata, Turing machines, Boolean circuits, and first-order logic. However, a unified and comprehensive theory of the abilities and limitations of transformers is not yet in sight.

Such a theory would ideally answer questions like:

  • How do transformers, RNNs, other NLMs, and their variants compare with one another in expressivity and trainability?
  • How do the successes and failures of NLMs predicted by theoretical models manifest in practice?
  • What modifications, or what wholly new architectures, are suggested by the theory?

There is a small but growing community of researchers investigating such questions about NLMs. This Dagstuhl Seminar aims to bring this community together to lay a foundation for continued work in this area, to identify central open problems, and to foster new collaborations. To achieve these goals, the seminar will leave ample room for informal discussions on topics suggested by the participants themselves, along the lines of an Open Space meeting.

Copyright Pablo Barcelo, David Chiang, George Cybenko, and Lena Strobl

Classification
  • Computational Complexity
  • Formal Languages and Automata Theory
  • Machine Learning

Keywords
  • Expressivity and Trainability
  • Machine Learning
  • Formal Languages and Automata Theory
  • Computational Complexity
  • Logic