Dagstuhl Seminar 17222
Robust Performance in Database Query Processing
(May 28 – Jun 2, 2017)
Organizers
- Renata Borovica-Gajic (The University of Melbourne, AU)
- Goetz Graefe (Google - Madison, US)
- Allison Lee (Snowflake Computing Inc. - San Mateo, US)
Contact
- Andreas Dolzmann (for scientific matters)
- Susanne Bach-Bernhard (for administrative matters)
Impacts
- G-CORE : A Core for Future Graph Query Languages : article in Proceedings of the 2018 International Conference on Management of Data - Angles, Renzo; Arenas, Marcelo; Barcelo, Pablo; Boncz, Peter A.; Fletcher, George; Gutierrez, Claudio; Lindaaker, Tobias; Paradies, Marcus; Plantikow, Stefan; Sequeda, Juan; Rest, Oscar van; Voigt, Hannes - New York : ACM, 2018. - Pages: 1421-1432.
- JCC-H : Adding Join Crossing Correlations with Skew to TPC-H : article in LNCS 10661: Performance Evaluation and Benchmarking for the Analytics Era - Boncz, Peter A.; Anadiotis, Angelos-Christos; Kläbe, Steffen - Berlin : Springer, 2017. - pp. 103-119 - (Lecture notes in computer science ; 10661 : article).
- Waves of Misery After Index Creation : article in BTW2019 : Datenbanksysteme für Business, Technologie und Web : pp. 77-96 - Glombiewski, Nikolaus; Seeger, Bernhard; Graefe, Goetz - Bonn : Gesellschaft für Informatik e.V., 2019 - (Lecture notes in informatics / P ; 289 : article).
- POLAR : Adaptive and Non-invasive Join Order Selection via Plans of Least Resistance - Justen, David; Ritter, Daniel; Lamb, Andrew; Markl, Volker; Lee, Allison; Fraser, Campbell; Tran, Nga; Bodner, Thomas; Haddad, Mhd Yamen; Zeuch, Steffen; Boehm, Matthias - Irvine : VLDB Endowment, Inc., 2024. - 14 pp.
The aim of this Dagstuhl Seminar on Robust Query Processing is to bring together top researchers from both industry and academia working on plan generation and plan execution in database query processing and in cloud-based massively parallel systems for data management. Delivering robust query performance is well known to be a difficult problem for database management systems. All experienced DBAs and database users are familiar with sudden disruptions in data centers due to poor performance of queries that have performed perfectly well in the past. The goal of the seminar is to discuss the current state of the art, to identify specific research opportunities in order to improve the state of affairs in query processing, and to develop new approaches or even solutions for these opportunities.
Prior Dagstuhl Seminars have initiated discussions on robust performance of query processing in relational databases. They have proposed some metrics and benchmarks as well as initial solutions. These prior solutions have focused on data access, e.g., index scans vs. table scans as well as dynamic transitions from one to the other, and on execution algorithms, e.g., for joins and duplicate removal. Much work remains, e.g., on join sequences and on load balancing. It could be argued that the remaining work will be harder than the prior work, which is why we invite participants to this Dagstuhl Seminar on Robust Query Processing.
The seminar will start with a focus on the problem and on prior results. Thereafter, seminar participants will work in small groups on specific problems. For example, one group might focus on detecting that a chosen join sequence is suboptimal for the actual predicate selectivities, whereas another group might focus on dynamically changing the sequence of join and aggregation operations. Perhaps multiple groups will focus on changing the join sequence assuming alternative join algorithms, e.g., order-based merge joins or order-agnostic hash joins. Interleaved plenary presentations and discussions will re-focus the working groups. Towards the end of the week, we may have concrete ideas for new techniques and perhaps for publications. Both academic and industrial participants may freely use the discussion contents and results for follow-on work.
The Dagstuhl Seminar 17222 on "Robust performance in database query processing" assembled researchers from industry and academia for the third time to discuss robustness issues in database query performance. The seminar gathered 24 researchers from around the world working on plan generation and plan execution in database query processing and in cloud-based massively parallel systems, with the purpose of addressing the open research challenges with respect to the robustness of database management systems.
Unlike the previous seminars, the organizers (Renata Borovica-Gajic, Goetz Graefe and Allison Lee) this time attempted to have a focused subset of topics that the participants discussed and analyzed in more depth. From the proposed topics on algorithm choices, join sequences, updates, database utilities, parallelism and skew, column stores, physical database design, and explainability of non-robust query performance, the participants chose four topics and formed four work groups: i) one discussing updates and database utilities, ii) one discussing parallelism and skew, iii) one discussing join sequences, and iv) one focusing on the explanations of the sources of non-robust performance.
Once the topics of interest were chosen, the organizers guided the participants through a sequence of steps: first reviewing related work in the area; then defining the metrics and tests to be used for assessing the validity and robustness of a solution; next proposing specific mechanisms for the chosen approaches; and finally settling on implementation policies.
The seminar thus spent its first day reviewing prior related work, with a special emphasis on the pieces of work that appeared following the previous instances of the seminar: benchmarks (Dagstuhl 12321), Smooth Scan, and Generalized Join.
Tuesday was spent on defining metrics and tests. On Wednesday, the participants discussed possible alternative approaches and hiked together in the woods. Thursday was focused on driving one chosen approach to specific mechanisms. Finally, we spent Friday discussing the policies and presenting the overall progress.
At the end of the week, each group hoped to continue its work towards a research publication. The group on parallelism and skew hoped to first publish a survey on forms of skew and existing remedies for skew. The work group on dynamic join sequences even had a working prototype by the end of the seminar. The reports of the work groups are presented next.
Participants
- Angelos Christos Anadiotis (EPFL - Lausanne, CH)
- Tahir Azim (EPFL - Lausanne, CH)
- Peter A. Boncz (CWI - Amsterdam, NL)
- Renata Borovica-Gajic (The University of Melbourne, AU)
- Surajit Chaudhuri (Microsoft Research - Redmond, US)
- John Cieslewicz (Google - Mountain View, US)
- Thanh Do (Google - Madison, US)
- Campbell Fraser (Google - Kirkland, US)
- Johann-Christoph Freytag (HU Berlin, DE)
- Goetz Graefe (Google - Madison, US)
- Fabian Hüske (data Artisans - Berlin, DE)
- Alfons Kemper (TU München, DE)
- Allison Lee (Snowflake Computing Inc. - San Mateo, US)
- Thomas Neumann (TU München, DE)
- Anisoara Nica (SAP SE - Waterloo, CA)
- Glenn Paulley (SAP SE - Waterloo, CA)
- Ilia Petrov (Hochschule Reutlingen, DE)
- Bart Samwel (Google - Amsterdam, NL)
- Kai-Uwe Sattler (TU Ilmenau, DE)
- Knut Stolze (IBM Deutschland - Böblingen, DE)
- Immanuel Trummer (Cornell University - Ithaca, US)
- Srinivas Karthik Venkatesh (Indian Institute of Science - Bangalore, IN)
- Jiaqi Yan (Snowflake Computing Inc. - San Mateo, US)
- Marcin Zukowski (Snowflake Computing Inc. - San Mateo, US)
Related Seminars
- Dagstuhl Seminar 10381: Robust Query Processing (2010-09-19 - 2010-09-24)
- Dagstuhl Seminar 12321: Robust Query Processing (2012-08-05 - 2012-08-10)
- Dagstuhl Seminar 22111: Database Indexing and Query Processing (2022-03-13 - 2022-03-18)
- Dagstuhl Seminar 24101: Robust Query Processing in the Cloud (2024-03-03 - 2024-03-08)
Classification
- data bases / information retrieval
Keywords
- Database
- query processing
- query optimization
- query execution
- map-reduce
- systems
- cloud computing
- performance
- scalability
- robustness
- reliability
- predictability
- planning
- uncertainty