Dagstuhl Seminar 01361

Foundations of Semistructured Data

( Sep 02 – Sep 07, 2001 )

Permalink
Please use the following short url to reference this page: https://www.dagstuhl.de/01361

Organizers



Summary

Traditional database systems rely on an old model: the relational data model. When Codd, a logician, proposed it in the early 1970s, the relational model started a true revolution in data management. In this simple model, data is represented as relations in first-order structures and queries as first-order logic formulas. It enabled researchers and implementors to separate the logical aspect of the data from its physical implementation. Thirty years of research and development followed, leading to today's mature, high-performance relational database systems.
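
As a minimal sketch of this idea, the Python fragment below stores two relations as sets of tuples and evaluates one first-order query over them. The relation names, their contents, and the function `people_in_city` are hypothetical, chosen only for illustration; they do not come from the seminar material.

```python
# Hypothetical relations, each represented as a set of tuples.
works_in = {("alice", "db"), ("bob", "logic"), ("carol", "db")}
located_in = {("db", "Paris"), ("logic", "Vienna")}


def people_in_city(city):
    """Evaluate the first-order query
    { p | exists d . works_in(p, d) and located_in(d, city) }.
    """
    return {p for (p, d) in works_in if (d, city) in located_in}


print(sorted(people_in_city("Paris")))
```

The query says nothing about how the tuples are stored or indexed, which is the separation of logical and physical aspects described above.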

The age of the Internet brought new data management applications and challenges. Data is now accessed over the Web and is available in a variety of formats, including HTML, XML, and several application-specific data formats. Often data is mixed with free text, and the boundary between data and text is sometimes blurred. The way the data can be retrieved also varies considerably: some instances can be downloaded entirely, while others can only be accessed through limited capabilities. To accommodate all forms and kinds of data, the database research community has introduced the "semistructured data model", where data is self-describing, irregular, and graph-like. The new model naturally captures Web data, such as HTML, XML, or other application-specific formats.
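
One common way to picture such data is as an edge-labeled graph, sketched below in Python. The node identifiers, labels, values, and the helper `follow` are all hypothetical and serve only as an illustration; the sketch is not a definition of the model or of any particular query language.

```python
# Hypothetical edge-labeled graph: node id -> list of (label, target) edges.
# The labels carry the schema information, so the data is self-describing.
edges = {
    "root": [("person", "p1"), ("person", "p2")],
    "p1": [("name", "n1"), ("email", "e1")],                  # has an email, no phone
    "p2": [("name", "n2"), ("phone", "t1"), ("phone", "t2")],  # two phones, no email
}

# Leaf nodes carry atomic values.
values = {
    "n1": "Alice",
    "e1": "alice@example.org",
    "n2": "Bob",
    "t1": "555-0100",
    "t2": "555-0101",
}


def follow(start, path):
    """Return the values (or node ids, for inner nodes) reached from
    `start` by following the given sequence of edge labels."""
    nodes = [start]
    for label in path:
        nodes = [t for n in nodes for (l, t) in edges.get(n, []) if l == label]
    return [values.get(n, n) for n in nodes]


print(follow("root", ["person", "name"]))
print(follow("root", ["person", "phone"]))
```

The two person nodes have different sets of outgoing labels, which is the kind of irregularity a fixed relational schema does not express directly.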

While researchers mostly agree on a common definition of semistructured data, there is still a lot of confusion about the logical foundations for representing and querying such data: several practical query languages have been proposed, but their formal foundations and their relationships to logical formalisms are poorly understood. This lack of understanding further prevents us from designing general solutions to typical data management problems, such as building indexes, optimizing queries, and designing storage structures. To add to the confusion, the structured document community has studied "structured text" for several years and has proposed a number of algebraic operators and accompanying index structures to express queries over structured text. This work is clearly relevant to semistructured data, but the connections between the two are still poorly understood. Current work in academia and research institutions is studying the nature of query languages for semistructured data and proposing index structures, optimization techniques, and storage mechanisms to support those queries.

This seminar brought together database researchers, logicians, and researchers in structured documents, as well as people from other communities related to the area of semistructured data, such as information retrieval, programming languages, and discrete algorithms. Besides the presentation of recent research results by the participants, additional goals were:

  • to identify the main issues for further foundational research on semistructured data,
  • to improve the mutual understanding of the communities involved concerning their respective settings and needs.

Participants
  • Serge Abiteboul (University of Paris South XI, FR) [dblp]
  • Franz Baader (TU Dresden, DE) [dblp]
  • Michael Benedikt (Bell Labs - Lisle, US) [dblp]
  • Alexandru Berlea (Universität Trier, DE)
  • Anne Brüggemann-Klein (TU München, DE)
  • François Bry (LMU München, DE) [dblp]
  • Peter Buneman (University of Edinburgh, GB) [dblp]
  • Diego Calvanese (Free University of Bozen-Bolzano, IT) [dblp]
  • Peter Fankhauser (Fraunhofer Institut - Darmstadt, DE)
  • Juliana Freire (Bell Labs - Murray Hill, US) [dblp]
  • Philippa Gardner (Imperial College London, GB) [dblp]
  • Giorgio Ghelli (University of Pisa, IT)
  • Georg Gottlob (TU Wien, AT) [dblp]
  • Gösta Grahne (Concordia Univ. - Montreal, CA)
  • Martin Grohe (HU Berlin, DE) [dblp]
  • Jan Hidders (University of Antwerp, BE) [dblp]
  • Christoph Koch (TU Wien, AT) [dblp]
  • Nikolaus Koudas (AT&T Labs Research - Florham Park, US)
  • Alberto Laender (Federal University of Minas Gerais - Belo Horizonte, BR)
  • Laks Lakshmanan (University of British Columbia - Vancouver, CA)
  • Hans Leiß (LMU München, DE)
  • Ling Liu (Georgia Institute of Technology - Atlanta, US) [dblp]
  • David Maier (Oregon Health & Science University - Beaverton, US)
  • Alberto O. Mendelzon (University of Toronto, CA)
  • Holger Meuss (LMU München, DE)
  • Gerome Miklau (University of Washington - Seattle, US) [dblp]
  • Uwe Mönnich (Universität Tübingen, DE) [dblp]
  • Frank Morawietz (Universität Tübingen, DE)
  • Frank Neven (University of Limburg, BE) [dblp]
  • Werner Nutt (Heriot-Watt University Edinburgh, GB)
  • Arnaud Sahuguet (University of Pennsylvania - Philadelphia, US) [dblp]
  • Vladimir Sazonov (University of Liverpool, GB)
  • Klaus U. Schulz (LMU München, DE)
  • Thomas Schwentick (TU Dortmund, DE) [dblp]
  • Luc Segoufin (University of Paris South XI, FR) [dblp]
  • Helmut Seidl (TU München, DE) [dblp]
  • Dan Suciu (University of Washington - Seattle, US) [dblp]
  • Val Tannen (University of Pennsylvania - Philadelphia, US) [dblp]
  • Jan Van den Bussche (Hasselt University - Diepenbeek, BE) [dblp]
  • Stijn Vansummeren (University of Limburg, BE) [dblp]
  • Emmanuel Waller (Université Paris Sud, FR)
  • Fang Wei (Universität Freiburg, DE)