Dagstuhl Seminar 24061
Are Knowledge Graphs Ready for the Real World? Challenges and Perspective
( Feb 04 – Feb 09, 2024 )
Organizers
- David Chaves-Fraga (University of Santiago de Compostela, ES)
- Oscar Corcho (Technical University of Madrid, ES)
- Anastasia Dimou (KU Leuven, BE)
- Maria-Esther Vidal (TIB - Hannover, DE)
Contact
- Marsha Kleinbauer (for scientific matters)
- Susanne Bach-Bernhard (for administrative matters)
Summary
Graphs and knowledge bases have been around for many decades, and research results have had a tremendous impact on areas such as mathematics, artificial intelligence, and databases. Although the term had already been coined by the scientific community, technological developments and astronomical data growth have since made knowledge graph (KG) management a fundamental topic in various areas of computer science. The scientific and industrial communities have responded to the emerging field of KG management. As a result, formal frameworks for defining and representing KGs, as well as methods for creating, exploring, and analyzing KGs, have flourished to make KGs a reality. However, despite the tangible results, sustainability is still compromised by the lack of transparent and accountable management of KGs. The real-world application of KGs requires programming paradigms for KG management, transparent data integration and quality assessment techniques, and methods for maintaining access control and privacy. In addition to technological advances, societal adjustments can have a tremendous impact on the management of KGs. The seminar addressed these socio-technical challenges with a mix of invited talks, lightning talks, and small group workshops as follows:
The Incremental Creation of Knowledge Graphs. Creating a Knowledge Graph (KG) involves several open research challenges, such as data extraction, data quality, data integration, and data security. It also requires attention to architectural aspects such as scalability and interoperability. A working group was formed to discuss and focus on two main topics: the definition of a general pipeline for KG construction and its relationship to data quality. The main outcome is a standard formalization of the KG construction lifecycle and its associated components. This definition is accompanied by quality measures and provenance tracking of all steps.
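As an illustration only, the idea of a construction pipeline whose steps are accompanied by a quality measure and a provenance record can be sketched as follows. This is not the formalization produced by the working group: the class names (`Pipeline`, `StepRecord`), the two example steps, and the quality measure (the share of subjects that carry an `rdfs:label`) are all hypothetical choices made for this sketch, and triples are modeled as plain string tuples rather than with an RDF library.

```python
from dataclasses import dataclass, field
from typing import Callable

Triple = tuple[str, str, str]  # (subject, predicate, object) as plain strings


@dataclass
class StepRecord:
    """Provenance entry for one pipeline step (hypothetical structure)."""
    step: str
    triples_in: int
    triples_out: int
    quality: float


def labelled_ratio(triples: set[Triple]) -> float:
    """Toy quality measure: fraction of subjects that have an rdfs:label."""
    subjects = {s for s, _, _ in triples}
    if not subjects:
        return 1.0
    labelled = {s for s, p, _ in triples if p == "rdfs:label"}
    return len(labelled) / len(subjects)


@dataclass
class Pipeline:
    """Runs steps in order and logs provenance and quality after each one."""
    steps: list[tuple[str, Callable[[set[Triple]], set[Triple]]]]
    log: list[StepRecord] = field(default_factory=list)

    def run(self, triples: set[Triple]) -> set[Triple]:
        current = set(triples)
        for name, step in self.steps:
            result = step(current)
            self.log.append(
                StepRecord(name, len(current), len(result), labelled_ratio(result))
            )
            current = result
        return current


def drop_empty_objects(triples: set[Triple]) -> set[Triple]:
    """Cleaning step: discard triples whose object is blank."""
    return {t for t in triples if t[2].strip()}


def add_default_labels(triples: set[Triple]) -> set[Triple]:
    """Enrichment step: give unlabelled subjects their identifier as label."""
    subjects = {s for s, _, _ in triples}
    labelled = {s for s, p, _ in triples if p == "rdfs:label"}
    return triples | {(s, "rdfs:label", s) for s in subjects - labelled}


raw = {
    ("ex:a", "rdfs:label", "A"),
    ("ex:a", "ex:knows", "ex:b"),
    ("ex:b", "ex:knows", "ex:a"),
    ("ex:b", "ex:age", ""),
}
pipeline = Pipeline([
    ("drop_empty_objects", drop_empty_objects),
    ("add_default_labels", add_default_labels),
])
kg = pipeline.run(raw)
```

After the run, `pipeline.log` holds one record per step, so a change in the quality measure can be traced back to the step that caused it.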
Support of Knowledge Graph Implementation. Software engineering and programming languages have produced approaches and techniques that support complex tasks during software development, such as dependency management, error identification, testing, syntactic validation, and software lifecycle management. We look into these proposals to determine a set of requirements for software lifecycle management for knowledge graphs. These requirements will improve and facilitate the implementation of knowledge graphs in industrial and complex environments, taking into account the relationships and dependencies between all the artifacts used (ontologies, shapes, mappings, tests, etc.) as well as their evolution and versioning. To achieve this goal, we believe that a better understanding and general overview of how knowledge graphs are implemented is necessary. Therefore, a workshop on this topic has been proposed at ISWC 2024 (https://w3id.org/soflim4kg). After the workshop, the next step will be to create a community around this topic with researchers and industry stakeholders to address the identified challenges and standardize the requirements.
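To make the notion of dependencies between artifacts concrete, the following minimal sketch models them as a dependency graph, as is common in software build tools. The specific dependencies (shapes and mappings depend on the ontology, tests depend on shapes and mappings) and the function names are assumptions made for illustration; they are not requirements identified at the seminar.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency graph: each artifact maps to the artifacts it depends on.
DEPENDS_ON: dict[str, set[str]] = {
    "ontology": set(),
    "shapes": {"ontology"},
    "mappings": {"ontology"},
    "tests": {"shapes", "mappings"},
}


def build_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return an order in which every artifact follows its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


def affected_by(changed: str, depends_on: dict[str, set[str]]) -> set[str]:
    """Return all artifacts that directly or transitively depend on `changed`."""
    affected: set[str] = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for artifact, deps in depends_on.items():
            if current in deps and artifact not in affected:
                affected.add(artifact)
                frontier.append(artifact)
    return affected


order = build_order(DEPENDS_ON)
to_revalidate = affected_by("ontology", DEPENDS_ON)
```

Under these assumptions, a new version of the ontology marks the shapes, mappings, and tests for revalidation, whereas a change to the shapes alone affects only the tests.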
Access Control in Decentralized Knowledge Graphs. Access control in decentralized knowledge graphs remains a relatively underexplored area. Specifically, mechanisms for restricting access to knowledge in order to safeguard confidential information and personal data, as well as consent models for the processing of personal data, have not received substantial attention in knowledge graph management. Compliance with usage policies has also been inadequately addressed, particularly in the context of decentralized knowledge graphs. During the seminar, a dedicated group discussed approaches for managing knowledge graphs across a federation of decentralized instances.
A New Generation of Knowledge Engineers. Improving the utilization and management of knowledge graphs requires educating a diverse audience about both the social and technical aspects of knowledge work. To address this need, a dedicated working group was established. This group conducted an analysis to identify existing educational resources and gaps in knowledge, exploring how consensus could be fostered among various stakeholders in the field. Moreover, the group investigated the specific educational requirements tailored to different audiences, including professional students, undergraduates, and postgraduates. By thoroughly examining these aspects, the working group aimed to formulate strategies for enhancing education and understanding in the domain of knowledge graph utilization and management.
Motivation
Graphs and knowledge bases have been around for many decades, and research outcomes have tremendously impacted areas like mathematics, artificial intelligence, and databases. Although the term had already been coined by the scientific community, technological developments and astronomical data growth have made knowledge graph management a fundamental topic in various computer science areas, supporting novel applications in both science (e.g., biomedicine) and industry (e.g., Google’s Knowledge Graph).
Scientific and industrial communities have reacted to the emergent area of knowledge graph (KG) management. As a result, formal frameworks for KG definition and representation, and methods for KG creation, exploration, and analysis, have flourished to make KGs a reality. On the one hand, although expressive and capable of providing a shared understanding of a domain, current KGs are relatively simple semantic structures that mainly represent an assembly of factual statements arranged in entity descriptions, possibly enriched by class hierarchies and corresponding property definitions. On the other hand, despite the noticeable results, sustainability is still affected by the absence of transparent and traceable frameworks for intelligent KG governance. Therefore, the application of KGs in the real world demands 1) programming paradigms for KG management, 2) transparent data integration and quality assessment techniques, 3) scalable and sustainable approaches for knowledge creation, exploration, and analysis, and 4) access control and privacy preservation.
This Dagstuhl Seminar focuses on these relevant research topics and aspires to reflect on KGs from their more foundational computer science perspectives. The main aim of the seminar is to bring together interdisciplinary researchers from both academia and industry eager to discuss foundations, concepts, and implementations that will pave the way for the next generation of KGs ready to be used in the real world. The unique combination of these research topics should lead to breakthrough ideas to be further investigated. Specifically, the seminar aims to address the following research questions:
Q1) What are the key requirements for programming language paradigms for modeling, representing, storing, and managing KGs in the real world?
Q2) How is sustainability achieved in the context of KG management, and what are its main benefits in terms of data integration, curation, and exploration toward traceable and sustainable pipelines?
Q3) What are the key requirements in terms of data management and query processing to ensure scalability over big KGs?
Q4) What are the trade-offs between fine-grained knowledge representation (e.g., personalized KGs) and the enforcement of data privacy and access control regulations?
The ambition of the organizers is to make this seminar an influential event in the field. In particular, the aim is to use the insights and the results of the seminar to design a roadmap that will shape the future of intelligent frameworks to make KGs applicable in the real world. The dissemination plan includes 1) a report summarizing the conclusions of the seminar, e.g., as a report in the ACM SIGMOD Record and 2) the publication of a position paper describing the framework roadmap in a top-ranked journal.
Participants
- Marina Aguado (European Union Agency for Railways, FR)
- Wouter Beek (Triply B.V. - Bussum, NL)
- Eva Blomqvist (Linköping University, SE)
- Piero Andrea Bonatti (University of Naples, IT)
- Carlos Buil-Aranda (Thuban Technology Services - Madrid, ES)
- Cinzia Cappiello (Polytechnic University of Milan, IT)
- Irene Celino (CEFRIEL - Milan, IT)
- Pierre-Antoine Champin (INRIA - Sophia Antipolis, FR)
- David Chaves-Fraga (University of Santiago de Compostela, ES)
- Oscar Corcho (Technical University of Madrid, ES)
- Souripriya Das (Oracle Corp. - Nashua, US)
- Coen De Roover (VU - Brussels, BE)
- Christophe Debruyne (University of Liège, BE)
- Anastasia Dimou (KU Leuven, BE)
- Michel Dumontier (Maastricht University, NL)
- George Fletcher (TU Eindhoven, NL)
- Sandra Geisler (RWTH Aachen, DE)
- Martin Giese (University of Oslo, NO)
- Paul Groth (University of Amsterdam, NL)
- Claudio Gutierrez (University of Chile - Santiago de Chile, CL)
- Peter Haase (Metaphacts GmbH - Walldorf, DE)
- Olaf Hartig (Linköping University, SE)
- Aidan Hogan (University of Chile - Santiago de Chile, CL)
- Katja Hose (TU Wien, AT)
- Ana Iglesias-Molina (Polytechnic University of Madrid, ES)
- Samaneh Jozashoori (Metaphacts GmbH - Walldorf, DE)
- Eduard Kamburjan (University of Oslo, NO)
- Sabrina Kirrane (Wirtschaftsuniversität Wien, AT)
- Craig A. Knoblock (USC - Marina del Rey, US)
- Maurizio Lenzerini (Sapienza University of Rome, IT)
- Vanessa López (IBM Research - Dublin, IE)
- Paco Nathan (Derwen - Sebastopol, US)
- Edelmira Pasarella (UPC Barcelona Tech, ES)
- Axel Polleres (Wirtschaftsuniversität Wien, AT)
- Anisa Rula (University of Brescia, IT)
- Juan F. Sequeda (data.world - Austin, US)
- Dylan Van Assche (Ghent University, BE)
- Ivo Velitchkov (Brussels, BE)
- Maria-Esther Vidal (TIB - Hannover, DE)
Classification
- Artificial Intelligence
- Databases
- Programming Languages
Keywords
- Semantic Data Integration
- Federated Query Processing
- Programming Paradigms for Knowledge Graphs
- Intelligent Knowledge Graph Management
- Access Control and Privacy