Dagstuhl Seminar 25282
Theory of Neural Language Models
(Jul 06 – Jul 11, 2025)
Organizers
- Pablo Barcelo (PUC - Santiago de Chile, CL)
- David Chiang (University of Notre Dame, US)
- George Cybenko (Dartmouth College Hanover, US)
- Lena Strobl (University of Umeå, SE)
Contact
- Michael Gerke (for scientific matters)
- Jutka Gasiorowski (for administrative matters)
Artificial intelligence (AI) has gone through multiple “summers” and “winters,” with the current summer based on large neural models that generate text, images and other content. ChatGPT and other neural language models (NLMs), which model sequences of tokens, have taken center stage, not only in natural language processing, but across a wide range of applications. Whereas experimental research is surging ahead, theoretical research aimed at foundational questions about neural networks is lagging behind. Moreover, interest in NLMs spans the academic, corporate, government, investor, and consumer sectors, and is driven not only by scientific investigation but also by competition to deliver products and gain social-media followers. Clear thinking about what NLMs can and can't do is needed more than ever.
The old guard of NLMs, recurrent neural networks (RNNs), have been studied theoretically for decades, in relation to finite automata and Turing machines. At present, the dominant NLMs are based on transformers, whose computational power is a new and rapidly growing area of research. Transformers have been related to a wide variety of formal models from computability and complexity theory, such as counter automata, Turing machines, Boolean circuits, and first-order logic. However, a unified and comprehensive theory of the abilities and limitations of transformers is not yet in sight.
Such a theory would ideally answer questions like:
- How do transformers, RNNs, other NLMs, and their variants, compare with one another in expressivity and trainability?
- How do the successes and failures of NLMs predicted by theoretical models manifest in practice?
- What modifications, or what wholly new architectures, are suggested by the theory?
There is a small but growing community of researchers that investigates such questions about NLMs. This Dagstuhl Seminar aims to bring this community together to lay a foundation for continued work in this area, identify central open problems, and foster new collaborations. To achieve these goals, the seminar will leave ample room for informal discussions on topics suggested by the participants themselves, along the lines of an Open Space meeting.
Classification
- Computational Complexity
- Formal Languages and Automata Theory
- Machine Learning
Keywords
- Expressivity and Trainability
- Machine Learning
- Formal Languages and Automata Theory
- Computational Complexity
- Logic