Dagstuhl Seminar 24202: Causal Inference for Spatial Data Analytics

( May 12 – May 17, 2024 )

Permalink

Please use the following short url to reference this page: https://www.dagstuhl.de/24202

Organizers

Fernando Perez Cruz (ETH Zürich, CH)
Jakob Runge (DLR - Jena, DE & TU Berlin, DE)
Martin Tomko (University of Melbourne - Carlton, AU)
Yanan Xin (ETH Zürich, CH)

Contact

Marsha Kleinbauer (for scientific matters)
Jutka Gasiorowski (for administrative matters)

Publications

Martin Tomko, Yanan Xin, and Jonas Wahl. Causal Inference for Spatial Data Analytics (Dagstuhl Seminar 24202). In Dagstuhl Reports, Volume 14, Issue 5, pp. 25-57, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Summary

Spatial data analytics has undergone a revolution in recent years due to the availability of large, observational spatial datasets and advances in spatially-explicit statistical analysis as well as in machine learning. Despite these improvements, current spatial data analysis methods primarily center on exploratory, descriptive, and predictive modeling grounded in correlational analysis. These approaches fall short of quantifying (and sometimes even identifying) causal relationships. At the same time, there is increasing interest in identifying and quantifying causal relationships in spatial data, as these relationships are key to designing effective policy interventions in critical applications such as environmental and population science, climate science, epidemiology, urban planning, and traffic management.

Causal inference has been an active field of study in statistics and philosophy for some time. It recently gained traction in the machine learning community as a promising method for enabling more intelligent AI capable of causal reasoning. Yet, the application of existing causal inference methods to the spatial domain is not straightforward, and the theoretical and methodological foundation for spatial causal analysis is still in its infancy. Spatial effects, such as spatial dependence and spatial heterogeneity, violate the fundamental assumptions of current causal inference frameworks. In addition, the large sample size, high dimensionality (space, time, attributes), and dynamic properties of spatio-temporal data pose great challenges for inferring causal effects. Thus, there is a pressing need to accelerate theoretical development in the field of spatial causal inference and to enable broader adoption of its methodological approaches, supported by a well-grounded analytical toolset. Researchers in environmental sciences, spatial econometrics, spatial statistics, theoretical GIScience, and computing/machine learning communities have started making significant, yet thus far disparate, efforts contributing to the foundations of spatial causal inference. The lack of interdisciplinary exchange of ideas, and of a comprehensive understanding of the potential applications and limitations of spatial causal inference, hinders progress across these disciplines.

As machine learning rapidly penetrates various spatial decision-making processes, the time is right to enable cross-discipline conversations around spatial causal inference, and thus maximize the impact of sound methodologies. As AI becomes widely applied to spatial data analysis supporting planning and policy-making, it is imperative to develop approaches that are interpretable, grounded, robust, and responsible. Enabling the conversations between theoretical, computational, and domain experts who are active in causal inference and its application for spatio-temporal systems will accelerate the development of more intelligent and responsible AI for spatial decision-making.

This seminar is convened to initiate conversations across disciplines on these critical questions around spatial causal inference. This five-day seminar covers topics on the definitions and theories of spatial causal inference, methodologies and applications, software and benchmark datasets, and open questions. A detailed program of the seminar is provided in Figure 1. A summary of the daily discussions is shown below.

Unified Definitions of Spatial Causal Inference. The discussion focused on the specification of the spatial component in the causal inference process, covering topics on the formalization of spatial causal inference questions, representations (e.g., Spatial DAG), modeling approaches, and practical relevance.
Methodological Challenges and Solutions. Methodological challenges were demonstrated through case studies in environmental science, transportation, advertisement and recommendations, and other social science applications. Based on these case studies, the group explored methods and ideas for modeling spatial confounding, spatial interference, spatial treatments, and evaluation of spatial causal analysis.
Open-Source Software and Benchmarks. The session featured demos of the open-source Python packages CausalML (https://github.com/uber/causalml) and Tigramite (https://github.com/jakobrunge/tigramite). Following the demonstrations, the group discussed the evaluation of causal discovery and the establishment of benchmarks for spatial causal inference.
Open Questions and the Road Ahead. The group proposed key research questions in the field of spatial causal inference and identified interests for continued collaborations on these topics.

Figure 1 Program of Dagstuhl Seminar: Causal Inference for Spatial Data Analytics (May 12th - 17th, 2024).

As a major outcome of the seminar, key challenges and research questions were identified in the field, as outlined in Section 4.4.5 Open Questions and also detailed in the notes of our daily discussions. We hope these thoughts and ideas will inspire a broader research interest in spatial causal inference and continue the exchange across disciplines, as well as between academia and industry.

The seminar resulted in a shared desire to continue these discussions in a series of workshops (the first to take place at ACM SIGSPATIAL 2024) and identified the need to establish a community (spatial-causal.org).

In the full report, we first present the position statements prepared by seminar participants on their thoughts related to spatial causal inference. Detailed notes of our daily discussions follow.

Creative Commons BY 4.0

Martin Tomko and Yanan Xin

Motivation

The ability to identify causal relationships in spatial data is increasingly important for designing effective policy interventions in environmental science, epidemiology, urban planning, and traffic management. Current spatial data analytic methods rely mainly on descriptive and predictive methods that lack explicit causal models. Spatial causal inference offers a promising solution to address this challenge by extending causal inference methodologies to spatial domains. However, this translation is challenging due to spatial effects that violate fundamental assumptions of causal inference. Spatial causal inference is therefore still in its infancy, and there is a pressing need to accelerate its theoretical development and support its adoption with a well-grounded methodological toolset. This requires an interdisciplinary exchange of ideas, as researchers in different fields, such as environmental sciences, spatial statistics, theoretical GIScience, and machine learning, are making significant but disparate efforts in the foundations of spatial causal inference.

To address these challenges, we convene the first Dagstuhl Seminar on Causal Inference for Spatial Data Analytics. The seminar brings together researchers from the machine learning, statistics, and spatial data science communities to exchange ideas on theory, methodology, application, and industrial practices related to spatial causal inference. This seminar should help to identify the research gaps and opportunities, and foster collaborations between academia and industry. We plan to establish a roadmap enabling the sound design of experimental and observational spatial causal analysis workflows. We hope to summarize the theoretical, methodological, and application recommendations in (a series of) foundational materials (papers, monographs, or edited volumes), setting the state of the art of the field and acting as an entry gateway to spatial causal inference for researchers across disciplines.

The topics to be discussed in the seminar include:

Spatial causal inference (theory, general framework, methodology, and applications)
Spatial causal machine learning (theory, methodology, and applications)
Methodological concerns in spatial causal inference: dimension reduction, causal discovery from observational data, hidden spatial confounders, invariance of causal features, spatial matching
Causal relationships - identification vs quantification
Conceptual challenges in the identification and reasoning about spatial causal relationships
Case studies of spatial causal inference across academic and industry-centered spatial applications (incl. climate science, mobility analysis, environmental science, spatial econometrics, epidemiology, spatial reasoning)
Feeding back - how can spatial expertise inform non-spatial causal analysis?
Research and educational directions to accelerate the development of causal inference for spatial data analytics

Creative Commons BY 4.0

Fernando Perez Cruz, Jakob Runge, Martin Tomko, and Yanan Xin

Participants

Kevin Credit (Maynooth University, IE)
Cécile de Bézenac (Alan Turing Institute - London, GB & University of Leeds, GB)
Simon Dirmeier (Swiss Data Science Center - Zürich, CH)
Andreas Gerhardus (DLR - Jena, DE)
Totte Harinen (Airbnb - San Francisco, US)
Dominik Janzing (Amazon Web Services - Tübingen, DE)
Urmi Ninad (TU Berlin, DE)
Markus Reichstein (MPI für Biogeochemie - Jena, DE)
Katerina Schindlerova (Universität Wien, AT)
Martin Tomko (University of Melbourne - Carlton, AU)
Jonas Wahl (TU Berlin, DE)
Jianwu Wang (University of Maryland - Baltimore County, US)
Levi John Wolf (University of Bristol, GB)
Yanan Xin (ETH Zürich, CH)
Shu Yang (North Carolina State University - Raleigh, US)
Andrew Zammit Mangion (University of Wollongong, AU)

Classification

Artificial Intelligence
Machine Learning

Keywords

spatial causal inference
spatial data analytics
causal machine learning
