Dagstuhl Seminar 02181
Information Integration
( Apr 29 – May 03, 2002 )
Organizers
- Vishu Krishnamurthy (Oracle Labs., US)
- Frank Leymann (Universität Stuttgart, DE)
- Nelson Mattos (IBM Silicon Valley Lab., US)
- Bernhard Mitschang (Universität Stuttgart, DE)
Information Integration subsumes all technologies needed to provide uniform manipulation of information scattered over many data stores while supporting a single system image. The data stores to be integrated are inherently heterogeneous, owned by different organizations, and distributed worldwide. Data can be structured (e.g. relational data), semi-structured (e.g. XML documents or hyper-linked HTML pages), or unstructured (e.g. opaque flat files, multi-media streams). Access to the data can be based on standardized interfaces (e.g. SQL) or on proprietary APIs (e.g. roll-your-own, or RYO, solutions).
Information integration is expected to become a key technology in many application areas, such as product data management, business process management, enterprise application integration, life science (including drug design and health care management), and entertainment (e.g. media on demand), to name but a few. Software vendors are beginning to deliver first products, each currently focusing on a particular application area. Research in Information Integration is currently carried out in several separate disciplines.
The major goal of the seminar is to bring together representatives from the different communities (research, software vendors, and users) for a first stocktaking: to reach a joint in-depth understanding of the issues, to identify and prioritize the main research items, to identify standardization needs, and to discuss demanding questions and open problems in detail. The areas to discuss include:
- How to get access to the various data stores?
- Different technologies such as SQL/MED wrappers, J2EE connectors, EAI adapters, and Web Services can be used for this purpose. When should each of these technologies be used? Can they be unified?
- What are possible system structures?
- What roles will database systems, application servers, workflow systems, messaging systems, portal servers, etc. play? How do they relate and cooperate?
- Does "Web Database Technology" suffice?
- Can XML be used as the language for describing the integrated information base? How can the "navigational access" over hyper-linked HTML pages, performed today in many application areas, be captured? How can search and query functionality be combined? How is XML stored: sliced and diced, as a whole document in its own file in the file system, or as a whole document combined with other documents in the file system? How can it be indexed effectively? How can SQL and an XML-based query be combined over the same data (i.e., an XML query against SQL data and SQL against XML)? Is a pure XML database the way to go, or will an extended relational engine be the right solution?
- How is information described?
- As different data stores are combined in a dynamic manner, the quality of the information available in a data store becomes key. Which information qualities are needed? How are they described? How can qualities be compared, assessed, measured, ...? Which metadata is relevant (schema, ontologies, ...)?
- Which federated database technologies can be used?
- What is a federated schema if structured and unstructured data are brought together? Which schema integration techniques and which federated query and search technologies are applicable?
- Which transaction model is appropriate?
- Some of the underlying data stores support classical transactions, others don't. Collective manipulation of data stores demands transactional guarantees. Which guarantees are needed? Data stores are owned by different legal entities and are often accessed via the Internet. Which concurrency models and recovery models are applicable?
With this seminar we would like to bring together, for the first time, people from different areas who all work on the broad topic of 'Information Integration'. The topic ranges from application-oriented areas such as geographic information systems or product management systems to generic areas in computer science such as repository technology, database federation, or data exchange. We expect the discussions in this seminar to be a first step toward finding the solutions needed for the various forms of 'Information Integration'. The participant list includes well-known people as well as young scientists from both industry and academia. It is our hope that the seminar will improve the understanding of this field and stimulate new collaborations between the different communities.
- Rakesh Agrawal (IBM Almaden Center, US)
- Jürgen Angele (ontoprise GmbH - Karlsruhe, DE) [dblp]
- Markus Bon (TU Kaiserslautern, DE)
- Michael L. Brodie (Verizon Information Technology, US)
- Christoph Bussler (National University of Ireland - Galway, IE) [dblp]
- Stefan Dessloch (TU Kaiserslautern, DE) [dblp]
- Barry Devlin (IBM Ireland - Dublin, IE)
- Ying Ding (Universität Innsbruck, AT) [dblp]
- Alan Downing (Oracle Labs., US)
- Dieter Fensel (Universität Innsbruck, AT)
- Marcus Flehmig (TU Kaiserslautern, DE)
- Johann-Christoph Freytag (HU Berlin, DE) [dblp]
- Carole Goble (University of Manchester, GB) [dblp]
- Jens Graupmann (Universität des Saarlandes, DE)
- R. Mark Greenwood (University of Manchester, GB)
- Theo Härder (TU Kaiserslautern, DE) [dblp]
- Axel Herbst (SAP SE - Walldorf, DE)
- Klaudia Hergula (Daimler - Stuttgart, DE)
- Jens Hündling (Hasso-Plattner-Institut - Potsdam, DE)
- Mario Jeckle (Daimler R&D - Ulm, DE)
- Johannes Klein (Microsoft Research - Redmond, US)
- Matthias Kloppmann (IBM Deutschland - Böblingen, DE)
- Dieter König (IBM Deutschland - Böblingen, DE)
- Birgitta König-Ries (Universität Jena, DE) [dblp]
- Vishu Krishnamurthy (Oracle Labs., US)
- Frank Leymann (Universität Stuttgart, DE) [dblp]
- Christoph Mangold (Universität Stuttgart, DE)
- Marcello Mariucci (Universität Stuttgart, DE)
- Nelson Mattos (IBM Silicon Valley Lab., US)
- Robert A. Meersman (Free University of Brussels, BE)
- Bernhard Mitschang (Universität Stuttgart, DE) [dblp]
- Jutta A. Mülle (KIT - Karlsruher Institut für Technologie, DE)
- Felix Naumann (HU Berlin, DE) [dblp]
- Borys Omelayenko (VU University Amsterdam, NL)
- Peter L. Peinl (Fachhochschule Fulda, DE)
- Erhard Rahm (Universität Leipzig, DE) [dblp]
- Norbert Ritter (Universität Hamburg, DE) [dblp]
- Dieter H. Roller (IBM Deutschland - Böblingen, DE)
- Hans-Jörg Schek (UMIT - Hall i. Tirol, AT)
- Ulf Schreier (FH Furtwangen, DE)
- Jürgen Sellentin (Daimler R&D - Ulm, DE)
- Gerd Stumme (Universität Kassel, DE) [dblp]
- Satish R. Thatte (Microsoft Research - Redmond, US)
- Joachim Thomas (UBS AG - Basel, CH)
- Can Türker (ETH Zürich, CH)
- Raphael Volz (KIT - Karlsruher Institut für Technologie, DE)
- Ralf Wagner (Universität Stuttgart, DE)
- Gio Wiederhold (Stanford University, US)
- Andreas Wombacher (Fraunhofer Institut - Darmstadt, DE)