Dagstuhl Seminar 25441
Competitions and Empirical Evaluations in Automated Reasoning
(Oct 26 – Oct 31, 2025)
Organizers
- Johannes Klaus Fichte (Linköping University, SE)
- Matti Järvisalo (University of Helsinki, FI)
- Aina Niemetz (Stanford University, US)
- Guido Tack (Monash University - Clayton, AU)
Contact
- Michael Gerke (for scientific matters)
- Simone Schilke (for administrative matters)
Automated Reasoning (AR) emerged as a field in the late 1960s. Its sub-areas cover different aspects of deductive reasoning as practiced in mathematics and formal logic, such as automated theorem proving (ATP), propositional satisfiability (SAT), and various generalizations. Practical and theoretical research has enabled ground-breaking success across a vast range of applications of formal methods.
At the core of this success are incredibly sophisticated and complex pieces of software (so-called solvers), which tackle specific problems in sub-areas of AR. Recurring (typically annual) solver competitions and large events such as the FLoC Olympic Games play a significant role in the practical success of AR. Solver competitions enable AR communities to showcase the current state of the art, document practical advancements, identify challenges from research and industry, and push solver developers not only to aim for reliable and robust tools for a wide range of applications, but to improve their tools beyond their current limitations.
Perhaps even more importantly, competitions are often considered the gold standard for empirical evaluations and have a significant impact on the research community and its empirical methods. Organizing a competition that yields meaningful results both for developers and users of the tools is challenging and requires a tremendous amount of work. And, since competitions serve as a driving force for new advancements in the area, certain decisions related to competition organization have a significant impact, as they often establish standards for formats and experimental evaluations. Thus, competition organizers, competition participants, authors, and reviewers all face similar issues, including how to:
- evaluate the scientific and theoretical value of advancements through competitions and assess the academic and practical implications of decisions made in competitions;
- collect, archive, compile, select, and distribute benchmark instances and solvers;
- construct meaningful evaluation measures and metrics (beyond pure runtime);
- ensure repeatability and easy replicability of results;
- design and architect experimental evaluations with a long-term perspective, as stable, easy-to-establish setups that use computational infrastructure efficiently;
- build a community and encourage effective communication with participants, among participants, and with other researchers.
Surprisingly, there is little interaction between competitions across AR communities, or between researchers working on empirical evaluations. Interaction happens primarily during the (short) preparation phase prior to competitions and, briefly, at conferences. Critical reflection and in-depth discussion of long-term implications among all stakeholders in the area rarely take place.
This Dagstuhl Seminar will center on these issues and discuss questions and solutions, with the aim of building a community of practice around competition organization and empirical evaluation in AR.
Classification
- Artificial Intelligence
- Distributed, Parallel, and Cluster Computing
- Logic in Computer Science
Keywords
- Automated Reasoning
- Constraint Solving
- Competitions
- Empirical Evaluation
- Design of Empirical Experiments