Dagstuhl Seminar 23031

Frontiers of Information Access Experimentation for Research and Education

(Jan 15 – Jan 20, 2023)

Permalink
Please use the following short url to reference this page: https://www.dagstuhl.de/23031

Organizers

  • Christine Bauer
  • Ben Carterette
  • Nicola Ferro
  • Norbert Fuhr

Summary

Information access – which includes Information Retrieval (IR), Recommender Systems (RS), and Natural Language Processing (NLP) – has a long tradition of relying heavily on experimental evaluation, dating back to the mid-1950s, a tradition that has driven the research and evolution of the field. Today, however, research and development of information access systems face new challenges: such systems are called on to support a much wider and increasingly demanding set of user tasks (informational, educational, and entertainment, to name a few), while research settings and available opportunities have evolved substantially (e.g., better platforms, richer data, but also developments within the scientific culture) and shape the way we do research and experimentation. Consequently, it is critical that the next generation of scientists is equipped with a portfolio of evaluation methods that reflects the field’s challenges and opportunities and helps ensure internal validity (e.g., measures, statistical analyses, and effect sizes that support establishing a trustworthy cause-effect relationship between treatments and outcomes), construct validity (e.g., measuring the right thing rather than a partial proxy), and external validity (e.g., critically assessing to what extent findings hold in other situations, domains, and user groups). A robust portfolio of such methods will contribute to developing more responsible experimental practices.
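
A minimal sketch of this kind of internal-validity analysis, assuming Python with SciPy available and made-up per-topic nDCG scores (none of this is material from the seminar): it pairs a significance test with an effect size when comparing two systems over the same topics.

    # Compare two systems on paired per-topic scores: report a significance
    # test and an effect size together. Scores are invented for illustration.
    from statistics import mean, stdev
    from scipy import stats

    # Hypothetical per-topic nDCG scores of two systems on the same topics.
    system_a = [0.42, 0.55, 0.38, 0.61, 0.47, 0.52, 0.44, 0.58]
    system_b = [0.45, 0.59, 0.41, 0.60, 0.53, 0.57, 0.49, 0.62]

    # Paired t-test: the topic is the experimental unit, so scores are paired.
    t_stat, p_value = stats.ttest_rel(system_b, system_a)

    # Cohen's d for paired samples: mean difference over SD of differences.
    diffs = [b - a for a, b in zip(system_a, system_b)]
    effect_size = mean(diffs) / stdev(diffs)

    print(f"t = {t_stat:.2f}, p = {p_value:.4f}, d = {effect_size:.2f}")

Reporting the effect size next to the p-value separates "statistically detectable" from "practically meaningful", which is the distinction internal validity is concerned with.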

Therefore, we face two questions: Can we re-innovate how we do research and experimentation in the field, addressing emerging challenges in experimental processes, to develop the next generation of information access systems? And how can a new paradigm of experimentation be leveraged to improve education, giving an adequate basis to the next generation of researchers and developers?

This Dagstuhl Seminar brought together experts from various sub-fields of information access – namely IR, RS, NLP, information science, and human-computer interaction – to create a joint understanding of the problems and challenges presented above, to discuss existing solutions and impediments, and to propose next steps to be pursued in the area.

To stimulate thinking around these themes, prior to the seminar, we challenged participants with the following questions:

  • Which experimentation methodologies are most promising to further develop and create a culture around?
  • In which ways can we address concerns related to Fairness, Accountability, and Transparency (FAccT) in experimentation practices? How can we establish FAccT-E, i.e., FAccT in Experimentation?
  • How can industry and academia better work together on experimentation?
  • How can critical experimentation methods and skills be taught and developed in academic teaching?
  • How can we foster collaboration and run shared infrastructures that enable collaborative and joint experimentation? How can we organize shared evaluation activities that take advantage of new hybrid forms of participation?

We started the seminar week with a series of long and short talks delivered by participants, partly in response to the questions above. This helped establish common ground and understanding and let the topics and themes emerge that participants wished to explore as the main output of the seminar.

This led to the definition of five working groups, which explored challenges, opportunities, and next steps in the following areas:

  • Reality check: The working group identified the main challenges in doing real-world studies in RS and IR research and pointed to best practices and remaining challenges: how to do domain-specific or longitudinal studies, how to recruit the right participants, how to use existing infrastructure or create new infrastructure (including appropriate data representation), and how, why, and what to measure.
  • Human-machine-collaborative relevance judgment frameworks: The working group studied the motivation for using Large Language Models (LLMs) to automatically generate relevance assessments in information retrieval evaluation, and raised research questions about how LLMs can help human assessors with the assessment task, whether machines can replace humans in assessing and annotating, and under which conditions human assessors cannot be replaced by machines (a sketch of such a pipeline follows this list).
  • Overcoming methodological challenges in IR and RS through awareness and education: Given the potential limitations of today’s predominant experimentation practices, we find that we need to better equip the various actors in the scientific ecosystem in terms of scientific methods, and we identify a corresponding set of helpful resources and initiatives, which will allow them to adopt a more holistic perspective when evaluating such systems.
  • Results-blind reviewing: Current review processes place undue emphasis on performance, rejecting papers that focus on insights if they show no performance improvements. We propose a results-blind reviewing process that forces reviewers to put more emphasis on the theoretical background, the hypotheses, the methodological plan, and the analysis plan of an experiment, thus improving the overall quality of accepted papers.
  • Guidance for authors: The Information Retrieval community has over time developed expectations regarding papers, but these expectations are largely implicit. In contrast to adjacent disciplines, efforts in the ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) community have been rather sparse, mostly individuals expressing their own views. Drawing on materials from other disciplines, we have built a draft set of guidelines that aims to be understandable, broad, and concise. We believe that our proposal is general and uncontroversial, can be used by the main venues, and can be maintained through an open and continuous effort driven by, and for, the community.
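
One way to make the human-machine-collaborative idea concrete is sketched below, under stated assumptions: llm_judge is a hypothetical placeholder for a real LLM call (a crude term-overlap heuristic keeps the sketch runnable), and a confidence threshold decides which documents still go to human assessors. This illustrates the working group's question; it is not a method proposed at the seminar.

    # Sketch: an LLM proposes relevance labels with a confidence score;
    # only low-confidence documents are routed to human assessors.
    from dataclasses import dataclass

    @dataclass
    class Judgment:
        doc_id: str
        label: int         # e.g., 0 = non-relevant, 1 = relevant
        confidence: float  # model-reported confidence in [0, 1]

    def llm_judge(doc_id: str, topic: str, doc_text: str) -> Judgment:
        # Hypothetical stand-in for an LLM call: in practice, prompt the model
        # with topic and document, then parse a graded label and a confidence.
        terms = topic.lower().split()
        overlap = sum(t in doc_text.lower() for t in terms) / len(terms)
        return Judgment(doc_id, int(overlap >= 0.5), overlap)

    def route(topic, docs, threshold=0.8):
        auto, needs_human = [], []
        for doc_id, text in docs.items():
            j = llm_judge(doc_id, topic, text)
            (auto if j.confidence >= threshold else needs_human).append(j)
        return auto, needs_human

    docs = {"d1": "neural ranking models for web search",
            "d2": "recipes for apple pie"}
    auto, needs_human = route("neural ranking search", docs)
    print(len(auto), "auto-labeled;", len(needs_human), "sent to assessors")

The threshold trades annotation cost against the risk of trusting wrong machine labels, which is exactly the condition under which the group asks whether human assessors can be replaced.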
Copyright Christine Bauer, Ben Carterette, Nicola Ferro, and Norbert Fuhr

Motivation

This Dagstuhl Seminar will address technology-enhanced information access (information retrieval, recommender systems, etc.) and specifically focus on developing more responsible experimental practices that lead to more valid results, both in research and in scientific education.

Information access has a long tradition of relying heavily on experimental evaluation, dating back to the mid-1950s, a tradition that has driven the research and evolution of the field. Today, however, research and development of information access systems face new challenges: such systems are called on to support a much wider and increasingly demanding set of user tasks (informational, educational, entertainment, just to name a few), while research settings and available opportunities have evolved substantially (e.g., better platforms, richer data, but also developments within the scientific culture) and shape the way we do research and experimentation. Consequently, it is critical that the next generation of scientists is equipped with a portfolio of evaluation methods that reflects the field’s challenges and opportunities and helps ensure internal validity (e.g., measures, statistical analyses, and effect sizes that support establishing a trustworthy cause-effect relationship between treatments and outcomes), construct validity (e.g., measuring the right thing rather than a partial proxy), and external validity (e.g., critically assessing to what extent results hold in other situations, domains, and user groups). A robust portfolio of such methods will contribute to developing more responsible experimental practices.

Therefore, we face two questions: Can we re-innovate how we do research and experimentation in the field, addressing emerging challenges in experimental processes, to develop the next generation of information access systems? And how can a new paradigm of experimentation be leveraged to improve education, giving an adequate basis to the next generation of researchers and developers?

To cope with these problems, in this Dagstuhl Seminar we aim to address the following questions, among others:

  • Which experimentation methodologies are most promising to further develop and create a culture around?
  • In which ways can we address concerns related to Fairness, Accountability, and Transparency (FAccT) in experimentation practices? How can we establish FAccT-E, i.e., FAccT in Experimentation?
  • How can industry and academia better work together on experimentation?
  • How can critical experimentation methods and skills be taught and developed in academic teaching?
  • How can we foster collaboration and run shared infrastructures that enable collaborative and joint experimentation? How can we organize shared evaluation activities that take advantage of new hybrid forms of participation?
Copyright Christine Bauer, Ben Carterette, Nicola Ferro, and Norbert Fuhr

Participants
  • Christine Bauer (Utrecht University, NL) [dblp]
  • Joeran Beel (Universität Siegen, DE)
  • Timo Breuer (TH Köln, DE) [dblp]
  • Charles Clarke (University of Waterloo, CA) [dblp]
  • Anita Crescenzi (University of North Carolina - Chapel Hill, US)
  • Gianluca Demartini (The University of Queensland - Brisbane, AU) [dblp]
  • Giorgio Maria Di Nunzio (University of Padova, IT) [dblp]
  • Laura Dietz (University of New Hampshire - Durham, US) [dblp]
  • Guglielmo Faggioli (University of Padova, IT) [dblp]
  • Nicola Ferro (University of Padova, IT) [dblp]
  • Bruce Ferwerda (Jönköping University, SE) [dblp]
  • Maik Fröbe (Friedrich-Schiller-Universität Jena, DE)
  • Norbert Fuhr (Universität Duisburg-Essen, DE) [dblp]
  • Matthias Hagen (Friedrich-Schiller-Universität Jena, DE) [dblp]
  • Allan Hanbury (TU Wien, AT)
  • Claudia Hauff (Spotify - Amsterdam, NL) [dblp]
  • Dietmar Jannach (Alpen-Adria-Universität Klagenfurt, AT) [dblp]
  • Noriko Kando (National Institute of Informatics - Tokyo, JP) [dblp]
  • Evangelos Kanoulas (University of Amsterdam, NL) [dblp]
  • Bart Knijnenburg (Clemson University, US) [dblp]
  • Udo Kruschwitz (Universität Regensburg, DE) [dblp]
  • Birger Larsen (Aalborg University Copenhagen, DK) [dblp]
  • Meijie Li (Universität Duisburg-Essen, DE)
  • Maria Maistro (University of Copenhagen, DK) [dblp]
  • Lien Michiels (University of Antwerp, BE) [dblp]
  • Andrea Papenmeier (Universität Duisburg-Essen, DE) [dblp]
  • Martin Potthast (Universität Leipzig, DE) [dblp]
  • Paolo Rosso (Technical University of Valencia, ES) [dblp]
  • Alan Said (University of Gothenburg, SE) [dblp]
  • Philipp Schaer (TH Köln, DE) [dblp]
  • Christin Seifert (Universität Duisburg-Essen, DE)
  • Ian Soboroff (NIST - Gaithersburg, US) [dblp]
  • Damiano Spina (RMIT University - Melbourne, AU)
  • Benno Stein (Bauhaus-Universität Weimar, DE) [dblp]
  • Nava Tintarev (Maastricht University, NL) [dblp]
  • Julián Urbano (TU Delft, NL) [dblp]
  • Henning Wachsmuth (Universität Paderborn, DE) [dblp]
  • Martijn Willemsen (Eindhoven University of Technology & JADS - ‘s-Hertogenbosch, NL) [dblp]
  • Justin Zobel (The University of Melbourne, AU) [dblp]

Classification
  • Artificial Intelligence
  • Information Retrieval

Keywords
  • Information Access Systems
  • Experimentation
  • Evaluation
  • User Interaction
  • Simulation