Dagstuhl Perspectives Workshop 17442: Towards Cross-Domain Performance Modeling and Prediction: IR/RecSys/NLP

(October 29 – November 3, 2017)

Permalink

Please use the following short URL to link to this page: https://www.dagstuhl.de/17442

Organizers

Nicola Ferro (University of Padova, IT)
Norbert Fuhr (Universität Duisburg-Essen, DE)
Gregory Grefenstette (IHMC - Paris, FR)
Joseph Konstan (University of Minnesota - Minneapolis, US)

Contact

Michael Wagner (for scientific matters)
Simone Schilke (for administrative matters)

Motivation

Information systems that manage, access, extract and process unstructured information typically deal with vague and implicit information needs, natural language and complex user tasks. Examples of such systems are information retrieval (IR) systems, search engines, recommender systems (RecSys), machine translation, and so forth. The discipline behind these systems differs from other areas of computer science, and from other fields of science and engineering in general, in that it lacks models that allow us to predict system performance in a specific operational context and to design systems in advance to achieve a desired level of effectiveness. The information systems we want to look at belong to domains characterized by complex algorithms that depend on many parameters and face uncertainty both in the information to be processed and in the needs to be addressed. In these domains, the lack of predictive models is worked around by massive trials of as many combinations as possible.

These approaches, which rely on massive experimentation, construction of testbeds, and heuristics, neither scale indefinitely as the complexity of systems and tasks increases nor apply outside the context of big Internet companies, which still have the resources to cope with them.

This Dagstuhl Perspectives Workshop will deal with the problem of modelling and predicting the performance of information retrieval systems, recommender systems, and natural language processing (NLP) systems. This is an important and open issue common to all three of these neighboring fields, and it prevents both a deep scientific understanding and effective engineering of such systems. Progress in modelling and prediction would allow us to better design such systems to achieve desired performance under given operational conditions.

We believe that bringing together these three communities has strong potential for advancing research on predictive models of performance. Indeed, the idea of predictive modeling is not new: each discipline has articulated the need and taken steps in this direction, but these efforts have not yet succeeded.

We will challenge participants with a number of topics relevant to the prediction of system performance, including:

Characterisation of Corpora before Exploitation
Characterisation of IR/NLP/Rec Systems before Exploitation
Beyond A-B testing
Estimating Performance without A-B Testing
Using Simulation to Predict Performance
Can Deep Learning replace actual user performance measurement?
Predicting Human Experience and Performance
Rich Metrics: Moving from Correctness to Usefulness
Performance guarantees vs. expected average performance
Performance-related axioms and proofs

The result we seek, therefore, is a trans-disciplinary research agenda that draws from the combination of intense focus and the exchange of ideas. This "manifesto" will include steps that can be taken within as well as across disciplines, and we hope it can serve as a roadmap for both researchers and research funders.

Creative Commons BY 3.0 DE

Nicola Ferro, Norbert Fuhr, Gregory Grefenstette, and Joseph A. Konstan

Summary

Information systems that manage, access, extract and process unstructured information typically deal with vague and implicit information needs, natural language and complex user tasks. Examples of such systems are information retrieval (IR) systems, recommender systems (RecSys), and applications of natural language processing (NLP) such as machine translation, document classification, sentiment analysis or search engines. The discipline behind these systems differs from other areas of computer science, and from other fields of science and engineering in general, in that it lacks models that allow us to predict system performance in a specific operational context and to design systems in advance to achieve a desired level of effectiveness. The information systems we want to look at belong to domains characterized by complex algorithms that depend on many parameters and face uncertainty both in the information to be processed and in the needs to be addressed. In these domains, the lack of predictive models is worked around by massive trials of as many combinations as possible.

The workshop was organized as follows. The first day was devoted to plenary talks that gave a general introduction to IR, RecSys, and NLP and examined specific issues in performance modeling and prediction in these three domains. On the second day, participants split into three groups (IR, RecSys, and NLP) and explored performance modeling and prediction issues and challenges within each domain; the working groups then reconvened to present the output of their discussions in a plenary session, in order to cross-fertilize across disciplines and to identify cross-discipline themes for further investigation. On the third day, participants split into groups that explored these themes (measures, performance analysis, documenting and understanding assumptions, application features, and modeling performance) and reported back in plenary sessions to keep all participants aligned with the ongoing discussions. The fourth and fifth days were devoted to drafting this report and the manifesto that originated from the workshop.

This document gives an overview of the talks given by the participants on the first day. The outcomes of the working groups, covering both within-discipline and cross-discipline themes, as well as the identified research challenges and directions, are presented in the Dagstuhl Manifesto corresponding to this Perspectives Workshop [1].

Acknowledgements. We thank Schloss Dagstuhl for hosting us.

References

[1] N. Ferro, N. Fuhr, G. Grefenstette, J. A. Konstan, P. Castells, E. M. Daly, T. Declerck, M. D. Ekstrand, W. Geyer, J. Gonzalo, T. Kuflik, K. Lindén, B. Magnini, J.-Y. Nie, R. Perego, B. Shapira, I. Soboroff, N. Tintarev, K. Verspoor, M. C. Willemsen, and J. Zobel. Manifesto from Dagstuhl Perspectives Workshop 17442 – Towards Performance Modeling and Performance Prediction across IR/RecSys/NLP. Dagstuhl Manifestos, Schloss Dagstuhl–Leibniz-Zentrum für Informatik, Germany, 7(1), 2018.

Creative Commons BY 3.0 Unported license

Nicola Ferro, Norbert Fuhr, Gregory Grefenstette, and Joseph A. Konstan

Participants

Pablo Castells (Autonomous University of Madrid, ES)
Elizabeth M. Daly (IBM Research - Dublin, IE)
Thierry Declerck (DFKI - Saarbrücken, DE)
Michael D. Ekstrand (Boise State University, US)
Nicola Ferro (University of Padova, IT)
Norbert Fuhr (Universität Duisburg-Essen, DE)
Werner Geyer (IBM TJ Watson Research Center - Cambridge, US)
Julio Gonzalo (UNED - Madrid, ES)
Gregory Grefenstette (IHMC - Paris, FR)
Joseph Konstan (University of Minnesota - Minneapolis, US)
Tsvi Kuflik (Haifa University, IL)
Krister Lindén (University of Helsinki, FI)
Bernardo Magnini (Bruno Kessler Foundation - Trento, IT)
Jian-Yun Nie (University of Montréal, CA)
Raffaele Perego (CNR - Pisa, IT)
Bracha Shapira (Ben Gurion University - Beer Sheva, IL)
Ian Soboroff (NIST - Gaithersburg, US)
Nava Tintarev (TU Delft, NL)
Karin Verspoor (The University of Melbourne, AU)
Martijn Willemsen (TU Eindhoven, NL)
Justin Zobel (The University of Melbourne, AU)

Classification

data bases / information retrieval
modelling / simulation
society / human-computer interaction

Keywords

performance modelling
performance prediction
information retrieval
natural language processing
recommender systems
