Dagstuhl Seminar 25282

Theory of Neural Language Models

(Jul 06 – Jul 11, 2025)

Permalink
Please use the following short url to reference this page: https://www.dagstuhl.de/25282

Organizers

  • Pablo Barcelo
  • David Chiang
  • George Cybenko
  • Lena Strobl

Contact

Motivation

Artificial intelligence (AI) has gone through multiple “summers” and “winters,” with the current summer based on large neural models that generate text, images, and other content. ChatGPT and other neural language models (NLMs), which model sequences of tokens, have taken center stage, not only in natural language processing but across a wide range of applications. This interest spans the academic, corporate, government, investor, and consumer sectors, and is driven not only by scientific investigation but also by competition to deliver products and gain social-media followers. Whereas experimental research is surging ahead, theoretical research aimed at foundational questions about neural networks is lagging behind. Clear thinking about what NLMs can and cannot do is needed more than ever.

Recurrent neural networks (RNNs), the old guard of NLMs, have been studied theoretically for decades in relation to finite automata and Turing machines. At present, the dominant NLMs are based on transformers, whose computational power is a new and rapidly growing area of research. Transformers have been related to a wide variety of formal models from computability and complexity theory, such as counter automata, Turing machines, Boolean circuits, and first-order logic. However, a unified and comprehensive theory of the abilities and limitations of transformers is not yet in sight.

Such a theory would ideally answer questions like:

  • How do transformers, RNNs, other NLMs, and their variants compare with one another in expressivity and trainability?
  • How do the successes and failures of NLMs predicted by theoretical models manifest in practice?
  • What modifications, or what wholly new architectures, are suggested by the theory?

There is a small but growing community of researchers investigating such questions about NLMs. This Dagstuhl Seminar aims to bring this community together to lay a foundation for continued work in this area, to identify central open problems, and to foster new collaborations. To achieve these goals, the seminar will leave ample room for informal discussions on topics suggested by the participants themselves, along the lines of an Open Space meeting.

Copyright Pablo Barcelo, David Chiang, George Cybenko, and Lena Strobl

Classification
  • Computational Complexity
  • Formal Languages and Automata Theory
  • Machine Learning

Keywords
  • Expressivity and Trainability
  • Machine Learning
  • Formal Languages and Automata Theory
  • Computational Complexity
  • Logic