Dagstuhl Seminar 19442
Programming Languages for Distributed Systems and Distributed Data Management
(Oct 27 – Oct 31, 2019)
Organizers
- Carla Ferreira (New University of Lisbon, PT)
- Philipp Haller (KTH Royal Institute of Technology - Stockholm, SE)
- Volker Markl (TU Berlin, DE)
- Guido Salvaneschi (TU Darmstadt, DE)
- Cristina Videira Lopes (University of California - Irvine, US)
Contact
- Michael Gerke (for scientific matters)
- Annette Beyer (for administrative matters)
Developing distributed systems is a well-known, decades-old problem in computer science. Despite significant research effort dedicated to this area, programming distributed systems remains challenging. The issues of consistency, concurrency, fault tolerance, as well as (asynchronous) remote communication among heterogeneous platforms naturally show up in this class of systems, creating a demand for proper language abstractions that enable developers to tackle such challenges.
In recent years, language abstractions have been key to achieving the properties above in many industrially successful distributed systems. For example, MapReduce takes advantage of purity to parallelize task processing; complex event processing adopts declarative programming to express sophisticated event correlations; and Spark leverages functional programming for efficient fault recovery via lineage. In parallel, there have been notable advances in research on programming languages for distributed systems, such as conflict-free replicated data types, distributed information-flow security, language support for safe distribution of computations, and programming frameworks for mixed IoT/cloud development.
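As an illustration of one of the abstractions named above, the following is a minimal sketch of a conflict-free replicated data type, here a grow-only counter. It is not taken from the seminar material; the class name `GCounter` and its methods are assumptions chosen for this example. Each replica increments only its own slot, and merging takes the element-wise maximum, so merges can be applied in any order and any number of times and replicas still converge.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Hypothetical grow-only counter: one non-decreasing slot per replica."""
    replica_id: str
    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        # A replica only ever updates its own slot.
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        # The counter value is the sum over all replica slots.
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise maximum: commutative, associative, and idempotent.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


# Two replicas update independently, then exchange state.
a = GCounter("a")
b = GCounter("b")
a.increment(2)
b.increment(3)
a.merge(b)
b.merge(a)
b.merge(a)  # merging the same state again changes nothing
```

After the exchange both replicas hold the same state without any coordination beyond sending it, which is the property that makes such types attractive as a language-level abstraction.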
However, the researchers carrying out these efforts are scattered across different communities, including programming language design, type systems and theory, database systems and database theory, distributed systems, systems programming, data-centric programming, and web application development. This Dagstuhl Seminar aims to bring researchers from these different communities together.
The seminar aims to focus on answering the following major questions in addition to those raised by participants:
- Which abstractions are required in emergent fields of distributed systems, such as mixed cloud/edge computing and IoT?
- How can language abstractions be designed so that they provide a high-level interface to programmers and still allow fine-grained tuning of low-level properties when needed, possibly in a gradual way?
- Which compilation pipeline (e.g., which intermediate representation) is needed to address the issues of distributed systems, such as optimization?
- Which research issues must be solved to provide tools (e.g., debuggers, profilers) that are needed to support languages that target distributed systems?
- Which security and privacy issues come up in the context of programming languages for distributed systems and how can they be addressed?
- What benchmarks can be defined to compare language implementations for distributed systems?
The seminar accomplished the goal of bringing together the research communities of databases, distributed systems, and programming languages. The list of participants includes 24 academic and industrial researchers from Austria, Belgium, France, Germany, Portugal, Sweden, Switzerland, the UK, and the USA, with complementary expertise and research interests. The group had a balanced number of senior and junior researchers, as well as strong industrial representation.
The scientific program comprised 28 sessions. The sessions devoted to individual presentations included 16 short talks with a maximum duration of 15 minutes and 6 long contributed talks with a maximum duration of 35 minutes. In addition, the seminar included 2 plenary sessions and 4 group sessions. The first two days of the seminar were dedicated to research talks, with time allocated after each talk for discussion and exchange of ideas. In the two following mornings there were 3 plenary sessions and 2 parallel group sessions. The topics for these sessions were proposed and selected after a lively discussion among participants: the most popular topics were promoted to plenary sessions, and the remaining ones took place in two parallel sessions.
The scientific sessions discussed and collected open questions on the following topics: programming models and abstractions; security and privacy; static guarantees, type systems, and verification; distributed computing for the edge; time, synchrony, and consistency; and persistence and serialization. There was also a social topic discussing further actions to bring the three communities together. Even though research interests overlap, there is a difference in values among the communities that needs to be acknowledged and tackled.
Participants agreed on the goal of organizing follow-up events to further strengthen the connection among the database, distributed systems, and programming languages communities. In particular, the importance of extending future events to Ph.D. students, for instance with an integrated Summer School, was discussed.
Participants
- Rohan Achar (University of California - Irvine, US)
- Carlos Baquero (University of Minho, PT)
- Annette Bieniusa (TU Kaiserslautern, DE)
- Uwe Breitenbücher (Universität Stuttgart, DE)
- Sebastian Burckhardt (Microsoft Research - Redmond, US)
- Surajit Chaudhuri (Microsoft Research - Redmond, US)
- Natalia Chechina (University of Bournemouth - Poole, GB)
- Amit K. Chopra (Lancaster University, GB)
- Schahram Dustdar (Technische Universität Wien, AT)
- Patrick Thomas Eugster (University of Lugano, CH)
- Carla Ferreira (New University of Lisbon, PT)
- Torsten Grust (Universität Tübingen, DE)
- Philipp Haller (KTH Royal Institute of Technology - Stockholm, SE)
- Edward A. Lee (University of California - Berkeley, US)
- Heather Miller (Carnegie Mellon University - Pittsburgh, US)
- Aleksandar Prokopec (Oracle Labs Switzerland - Zürich, CH)
- Laurent Prosperi (Sorbonne University - Paris, FR)
- Guido Salvaneschi (TU Darmstadt, DE)
- Manuel Serrano (INRIA - Valbonne, FR)
- Marc Shapiro (Sorbonne University - Paris, FR)
- Marjan Sirjani (Mälardalen University - Västerås, SE)
- Peter Van Roy (UC Louvain, BE)
- Nobuko Yoshida (Imperial College London, GB)
- Damien Zufferey (MPI-SWS - Kaiserslautern, DE)
Classification
- data bases / information retrieval
- operating systems
- programming languages / compiler
Keywords
- distributed programming
- big data processing
- distributed computing
- distributed data management
- cloud computing