Dagstuhl Seminar 08111
Ranked XML Querying
(March 9 – March 14, 2008)
Organizers
- Sihem Amer-Yahia (Yahoo! Research - New York, US)
- Divesh Srivastava (AT&T Labs Research - Florham Park, US)
- Gerhard Weikum (MPI für Informatik - Saarbrücken, DE)
This paper is based on a five-day workshop on "Ranked XML Querying" that took place at Schloss Dagstuhl in Germany in March 2008 and was attended by 27 people from three different research communities: database systems (DB), information retrieval (IR), and Web. The seminar title was interpreted in an IR-style "andish" sense (it also covered subsets of {Ranking, XML, Querying}, with larger sets being favored) rather than in the DB-style strictly conjunctive manner. In essence, the seminar addressed the integration of DB and IR technologies, with Web 2.0 being an important target area.
DB and IR have evolved as separate communities for historical reasons. They were spawned in the sixties with a focus on very different application areas: accounting and reservation systems on the DB side, and library and patent information on the IR side. Consequently, they have emphasized different methodological paradigms: precise querying over schematized data, based on logic and algebra (DB), versus keyword search and ranking over text and uncertain data, based on statistics and probability theory (IR). However, many applications now require managing both structured and unstructured data and thus mandate serious consideration of how to integrate the DB and IR worlds at both the foundational and the software-system level. These applications include Web and Web 2.0 use cases as well as more corporate-oriented scenarios such as customer support and health care. All three communities that participated in the seminar (DB, IR, Web) agreed on the importance of this general direction and came up with ten tenets, from different viewpoints, on why DB&IR integration is desirable.
All three participating communities (DB, IR, and Web) felt that looking across the fence paid off very well, and that the communities should continue learning from each other. Challenges lie ahead in areas such as Web 2.0, personal information management, and entity-relationship search; these will remain difficult and rewarding areas for a while. Combining the different and largely complementary expertise of DB and IR will be vital for well-founded and practically viable solutions.
See more: http://drops.dagstuhl.de/opus/volltexte/2008/1535/
Participants
- Sihem Amer-Yahia (Yahoo! Research - New York, US)
- Peter M.G. Apers (University of Twente, NL)
- H. Bast (MPI für Informatik - Saarbrücken, DE)
- Mariano P. Consens (University of Toronto, CA)
- Emiran Curtmola (University of California - San Diego, US)
- Arjen P. de Vries (CWI - Amsterdam, NL)
- Debora Donato (Yahoo Research - Barcelona, ES)
- Ingo Frommholz (Universität Duisburg-Essen, DE)
- Irini Fundulaki (FORTH - Heraklion, GR)
- Djoerd Hiemstra (University of Twente, NL)
- Ihab Francis Ilyas (University of Waterloo, CA)
- Panos Ipeirotis (New York University, US)
- Benny Kimelfeld (The Hebrew University of Jerusalem, IL)
- Stefan Klinger (Universität Konstanz, DE)
- Amélie Marian (Rutgers University - Piscataway, US)
- Maarten Marx (University of Amsterdam, NL)
- Yosi Mass (IBM - Haifa, IL)
- Sebastian Michel (MPI für Informatik - Saarbrücken, DE)
- Thomas Rölleke (Queen Mary University of London, GB)
- Ralf Schenkel (Universität des Saarlandes, DE)
- Harald Schöning (Software AG - Darmstadt, DE)
- Pierre Senellart (Telecom Paris Tech, FR)
- Divesh Srivastava (AT&T Labs Research - Florham Park, US)
- Kostas Stefanidis (University of Ioannina, GR)
- Martin Theobald (MPI für Informatik - Saarbrücken, DE)
- David Toman (University of Waterloo, CA)
- Gerhard Weikum (MPI für Informatik - Saarbrücken, DE)
Classification
- data bases/information retrieval
- data structures/algorithms/complexity
- web
Keywords
- Scoring methods for XML
- Ranking approximate XML answers
- Top-K query processing
- Querying structured and unstructured data
- XML Full-Text Querying
- Querying heterogeneous XML
- Extracting structure from unstructured data
- Text mining
- XML data integration