Dagstuhl Seminar 17461: Connecting Visualization and Data Management Research

( Nov 12 – Nov 17, 2017 )

Permalink

Please use the following short URL to reference this page: https://www.dagstuhl.de/17461

Organizers

Remco Chang (Tufts University - Medford, US)
Jean-Daniel Fekete (INRIA Saclay - Orsay, FR)
Juliana Freire (New York University, US)
Carlos E. Scheidegger (University of Arizona - Tucson, US)

Contact

Michael Gerke (for scientific matters)
Susanne Bach-Bernhard (for administrative matters)

Publications

Connecting Visualization and Data Management Research (Dagstuhl Seminar 17461). Remco Chang, Jean-Daniel Fekete, Juliana Freire, and Carlos E. Scheidegger. In Dagstuhl Reports, Volume 7, Issue 11, pp. 46-58, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2018)

Motivation

What prevents analysts from acquiring wisdom from data sources? To use data to better understand the world and act upon it, we need to understand both the computational and the human-centric aspects of data-intensive work. In this Dagstuhl Seminar, we will establish the foundations for the next generation of data management and visualization systems by bringing together these two largely independent communities. While exploratory data analysis (EDA) has been a pillar of data science for decades, maintaining interactivity during EDA has become difficult as data size and complexity continue to grow. Modern statistical systems typically assume that all data fit into memory in order to support interactivity. However, when faced with a large amount of data, few techniques can support EDA fluidly. During EDA, interactivity is critical: if each operation takes hours or even minutes to finish, analysts lose track of their thought process. Bad analyses cause bad interpretations, bad actions and bad policies.

As data scale and complexity increase, the novel solutions that will ultimately enable interactive, large-scale EDA will have to come from truly interdisciplinary and international work. Today, database researchers can store and query massive amounts of data, drawing on methods for distributed, streaming and approximate computation. Data mining techniques provide ways to discover unexpected patterns and to automate and scale well-defined analysis procedures. Recent systems research has looked at how to develop novel database system architectures to support the iterative, optimization-oriented workloads of data-intensive algorithms. Of course, both the inputs and outputs of these systems are ultimately driven by people, in support of analysis tasks. The life-cycle of data involves an iterative, interactive process of determining which questions to ask, which data to analyze, and which features and models are appropriate, and then interpreting the results. In order to achieve better analysis outcomes, data processing systems require improved interfaces that account for the strengths and limitations of human perception and cognition. Meanwhile, to keep up with the rising tide of data, interactive visualization tools need to integrate more techniques from databases and machine learning.

By bringing together the two disparate communities, we will lay the foundations for the next generation of data (management, mining, retrieval) and interactive visualization systems. In isolation, computational breakthroughs will forever remain locked behind inadequate interfaces, while improvements in how users experience data analysis will never scale to the volume of present-day datasets. Together, these two communities can realize their shared vision of empowering people to use data to understand and improve the world.

The main goal of this seminar is to bring together researchers from the data management community and the interactive visualization community to address the challenge of envisioning and developing the next generation of data systems that can support the cognitive, perceptual, and analytical needs of human users. Few existing systems can truly do so at scale, and with the explosive growth in data size and complexity it is more important than ever to gather researchers from the different disciplines to design a research agenda that can meet the demands of the future. Specifically, we aim to:

Formulate a research agenda around the challenge of reducing latency in interactive data systems. For example, develop novel pre-aggregation strategies that take into account the particular constraints and strengths of human perceptual systems; this will enable at-scale human-centric database indices, human-centric statistical analysis environments, and so on.
Focus on specific theoretical and practical problems that need to be solved in order to enable human-centric, large-scale data exploration.
Run special issues in leading journals such as IEEE CG&A and ACM TiiS to disseminate the developed research agenda and the research outcomes from this community.

Creative Commons BY 3.0 DE

Remco Chang, Jean-Daniel Fekete, Juliana Freire, and Carlos E. Scheidegger

Summary

What prevents analysts from acquiring wisdom from data sources? To use data to better understand the world and act upon it, we need to understand both the computational and the human-centric aspects of data-intensive work. In this Dagstuhl Seminar, we sought to establish the foundations for the next generation of data management and visualization systems by bringing together these two largely independent communities. While exploratory data analysis (EDA) has been a pillar of data science for decades, maintaining interactivity during EDA has become difficult as data size and complexity continue to grow. Modern statistical systems often assume that all data fit into memory in order to support interactivity. However, when faced with a large amount of data, few techniques can support EDA fluidly. During EDA, interactivity is critical: if each operation takes hours or even minutes to finish, analysts lose track of their thought process. Bad analyses cause bad interpretations, bad actions and bad policies.

As data scale and complexity increase, the novel solutions that will ultimately enable interactive, large-scale EDA will have to come from truly interdisciplinary and international work. Today, database systems can store and query massive amounts of data, with methods for distributed, streaming and approximate computation. Data mining techniques provide ways to discover unexpected patterns and to automate and scale well-defined analysis procedures. Recent systems research has looked at how to develop novel database system architectures to support the iterative, optimization-oriented workloads of data-intensive algorithms. Of course, both the inputs and outputs of these systems are ultimately driven by people, in support of analysis tasks. The life-cycle of data involves an iterative, interactive process of determining which questions to ask, which data to analyze, and which features and models are appropriate, and then interpreting the results. In order to achieve better analysis outcomes, data processing systems require improved interfaces that account for the strengths and limitations of human perception and cognition. Meanwhile, to keep up with the rising tide of data, interactive visualization tools need to integrate more techniques from databases and machine learning.

This Dagstuhl Seminar brought together researchers from the two communities (visualization and databases) to establish a research agenda for the development of next-generation data management and interactive visualization systems. In a short amount of time, the two communities learned from each other, identified the strengths and weaknesses of the latest techniques from both fields, and together developed a "state of the art" report on the open challenges that require the collaboration of the two communities. This report documents the outcome of this collaborative effort by all the participants.

Creative Commons BY 3.0 Unported license

Remco Chang, Jean-Daniel Fekete, Juliana Freire, and Carlos E. Scheidegger

Participants

Sihem Amer-Yahia (CNRS - St. Martin-d'Hères, FR)
Leilani Battle (University of Washington - Seattle, US)
Carsten Binnig (TU Darmstadt, DE)
Tiziana Catarci (Sapienza University of Rome, IT)
Remco Chang (Tufts University - Medford, US)
Surajit Chaudhuri (Microsoft Research - Redmond, US)
Stephan Diehl (Universität Trier, DE)
Harish Doraiswamy (New York University, US)
Steven M. Drucker (Microsoft Research - Redmond, US)
Jason Dykes (City - University of London, GB)
Jean-Daniel Fekete (INRIA Saclay - Orsay, FR)
Danyel Fisher (Microsoft Research - Redmond, US)
Juliana Freire (New York University, US)
Michael Gleicher (University of Wisconsin - Madison, US)
Hans Hagen (TU Kaiserslautern, DE)
Gerhard Heyer (Universität Leipzig, DE)
Heike Hofmann (Iowa State University - Ames, US)
Daniel A. Keim (Universität Konstanz, DE)
Tim Kraska (Brown University - Providence, US)
Heike Leitte (TU Kaiserslautern, DE)
Zhicheng Liu (Adobe Systems Inc. - Seattle, US)
Volker Markl (TU Berlin, DE)
Alexandra Meliou (University of Massachusetts - Amherst, US)
Torsten Möller (Universität Wien, AT)
Dominik Moritz (University of Washington - Seattle, US)
Hannes Mühleisen (CWI - Amsterdam, NL)
Arnab Nandi (Ohio State University - Columbus, US)
Behrooz Omidvar-Tehrani (LIG - Grenoble, FR)
Themis Palpanas (Paris Descartes University, FR)
Carlos E. Scheidegger (University of Arizona - Tucson, US)
Gerik Scheuermann (Universität Leipzig, DE)
Michael Sedlmair (Universität Wien, AT)
Thibault Sellam (Columbia University - New York, US)
Juan Soto (Technische Universität Berlin, DE)
Richard Wesley (Tableau Software - Seattle, US)
Wesley J. Willett (University of Calgary, CA)
Eugene Wu (Columbia University - New York, US)
Yifan Wu (University of California - Berkeley, US)

Classification

computer graphics / computer vision
data bases / information retrieval
society / human-computer interaction

Keywords

Information Visualization
Database Management Systems
Interactive Data Analysis
Human-Centric Computing
Big Data
