Dagstuhl Seminar 14282
Crowdsourcing and the Semantic Web
( Jul 06 – Jul 09, 2014 )
Organizers
- Abraham Bernstein (Universität Zürich, CH)
- Jan Marco Leimeister (Universität Kassel, DE & Universität St. Gallen, CH)
- Natasha Noy (Google Inc. - Mountain View, US)
- Elena Simperl (University of Southampton, GB)
Coordinator
- Cristina Sarasua (Universität Koblenz-Landau, DE)
Contact
- Annette Beyer (for administrative matters)
Semantic technologies provide flexible and scalable solutions to master and make sense of an increasingly vast and complex data landscape. However, while this potential has been acknowledged for various application scenarios and domains, and a number of success stories exist, it is equally clear that the development and deployment of semantic technologies will always remain reliant on human input and intervention. This is due to the very nature of some of the tasks associated with the semantic data management life cycle, which are known for their knowledge-intensive and/or context-specific character; examples range from conceptual modeling in almost any flavor, to labeling resources (in different languages), describing their content using ontological terms, or recognizing similar concepts and entities. For this reason, the Semantic Web community has always looked into applying the latest theories, methods and tools from CSCW, participatory design, Web 2.0, social computing, and, more recently, crowdsourcing to find ways to engage with users and encourage their involvement in the execution of technical tasks. Existing approaches include the use of wikis as semantic content authoring environments, leveraging folksonomies to create formal ontologies, and human computation approaches such as games with a purpose or micro-tasks.
The seminar will focus on three categories of topics. First and foremost, we aim to look into existing crowdsourcing approaches and how these could be, or have been, applied to solve traditional semantic data management tasks. Particular attention will be paid to core components of a crowdsourcing-enabled data management and processing system, including methods for quality assurance and spam detection, resources, task and workflow management, as well as interfaces, and the way these components can be assembled into coherent frameworks. A second category of topics to be addressed during the seminar reaches out to other disciplines such as economics, social sciences, and design, with the aim of understanding how theories and techniques from these fields could be used to build better crowdsourcing-enabled data management systems for the Semantic Web. Last, but not least, we will discuss the use of semantic technologies within generic crowdsourcing scenarios, most notably as a means to describe data, resources and specific components.
The seminar organizers have the ambition of making this a most influential workshop; we aim to lay the foundations for a scientific community at the intersection of crowdsourcing and semantic technologies. The goal of the seminar is to shape the evolution and further development of this emerging community by devising a research roadmap that will outline the future of the field, and by publishing a special issue in a high-quality journal, or an edited book, summarizing the most important lines of research and the results of our interactions during the seminar. To achieve these goals, the seminar will proceed as follows:
- Presentations by the participants on what they perceive to be the greatest challenges to a successful confluence of the Semantic Social Web and crowdsourcing;
- An affinity mapping exercise to group and order the challenges along a variety of dimensions yet to be identified;
- A writing session where the participants jointly compose a first draft of the roadmap.
Besides a comprehensive overview of the scientific challenges that have so far remained underexplored, the results of this community-building exercise should also include a collection of basic vocabularies and services that Semantic Web researchers could exploit in order to build effective crowd-sourced and crowd-moderated systems. As a community which values technologies for open access and interoperability, we need to work towards the definition of metadata vocabularies to describe data created through crowdsourcing, and publish this data for further reuse and repurposing. This seminar, in its community-formative role, could be the starting point for the emergence of working groups that would jointly address such problems. The first two days of the seminar will be dedicated to presentations and working groups on topics related to challenges identified during the talks and Q&A sessions. The third day will focus on consolidating the results of the working groups in written form and on defining next steps and follow-up activities.
The aim of the Dagstuhl Seminar 14282: Crowdsourcing and the Semantic Web, which was held in July 2014, was to gain a better understanding of the dual relationship between crowdsourcing and Semantic Web technologies, map out an emerging research space, and identify the fundamental research challenges that will need to be addressed to ensure the future development of the field.
The seminar focused on three categories of topics. First and foremost, we looked into existing crowdsourcing approaches and how these could be, or have been, applied to solve traditional semantic data management tasks. Particular attention was paid to core components of a crowdsourcing-enabled data management and processing system, including methods for quality assurance and spam detection, resources, task and workflow management, as well as interfaces, and the way these components can be assembled into coherent frameworks. A second category of topics that was addressed during the seminar reached out to other disciplines such as economics, social sciences, and design, with the aim of understanding how theories and techniques from these fields could be used to build better crowdsourcing-enabled data management systems for the Semantic Web. Last, but not least, we discussed the use of semantic technologies within generic crowdsourcing scenarios, most notably as a means to describe data, resources and specific components.
The seminar, in its community-formative role, represented the starting point for the emergence of working groups that will in the future jointly address the identified scientific challenges. Participants were asked to provide a 1-page position statement reflecting on why they think it makes sense to consider the two topics -- crowdsourcing and Semantic Web (or Web of Data) -- at the same seminar. Specifically, participants were asked to write a statement reflecting on one or both of the following questions:
- What are the Semantic Web tasks where you felt you needed crowdsourcing? Why? What were the challenges?
- What are the crowdsourcing tasks where using semantics might help? Why? What are the challenges?
The first two days of the seminar were dedicated to presentations of topics related to position statements and working groups on use case scenarios and challenges identified during the talks and Q&A sessions. The third day focused on the consolidation of the results of the working groups and the definition of next steps and follow-up activities.
In the following sections we present the position papers written by the researchers from the crowdsourcing and Semantic Web communities who took part in the seminar. We will publish a more complete research roadmap for crowdsourcing and the Semantic Web at a later stage.
- Maribel Acosta (KIT - Karlsruher Institut für Technologie, DE)
- Sofia Angeletou (BBC - London, GB)
- Lora Aroyo (VU University Amsterdam, NL)
- Abraham Bernstein (Universität Zürich, CH)
- Irene Celino (CEFRIEL - Milano, IT)
- Philippe Cudré-Mauroux (University of Fribourg, CH)
- Roberta Cuel (University of Trento, IT)
- Gianluca Demartini (University of Fribourg, CH)
- Michael Feldman (Universität Zürich, CH)
- Yolanda Gil (USC - Marina del Rey, US)
- Carole Goble (University of Manchester, GB)
- Robert Kern (IBM Deutschland - Böblingen, DE)
- Jan Marco Leimeister (Universität Kassel, DE & Universität St. Gallen, CH)
- Atsuyuki Morishima (University of Tsukuba, JP)
- Natasha Noy (Google Inc. - Mountain View, US)
- Valentina Presutti (CNR - Rome, IT)
- Marta Sabou (MODUL Universität Wien, AT)
- Harald Sack (Hasso-Plattner-Institut - Potsdam, DE)
- Cristina Sarasua (Universität Koblenz-Landau, DE)
- Elena Simperl (University of Southampton, GB)
- Markus Strohmaier (Universität Koblenz-Landau, DE)
- Gerd Stumme (Universität Kassel, DE)
- Tania Tudorache (Stanford University, US)
- Maja Vukovic (IBM TJ Watson Research Center - Yorktown Heights, US)
- Christopher A. Welty (IBM TJ Watson Research Center - Yorktown Heights, US)
- Marco Zamarian (University of Trento, IT)
Classification
- artificial intelligence / robotics
- semantics / formal methods
- society / human-computer interaction
Keywords
- Crowdsourcing
- human computation
- games with a purpose
- Semantic Web
- Linked Data
- quality assurance
- crowd management
- workflow management
- interfaces
- gamification
- incentives