Dagstuhl Seminar 24121
Trustworthiness and Responsibility in AI – Causality, Learning, and Verification
(Mar 17 – Mar 22, 2024)
Organizers
- Vaishak Belle (University of Edinburgh, GB)
- Hana Chockler (King's College London, GB)
- Sriraam Natarajan (University of Texas at Dallas - Richardson, US)
- Shannon Vallor (University of Edinburgh, GB)
- Kush R. Varshney (IBM Research - Yorktown Heights, US)
- Joost Vennekens (KU Leuven, BE)
Contact
- Marsha Kleinbauer (for scientific matters)
- Simone Schilke (for administrative matters)
Shared Documents
- Dagstuhl Materials Page (Use personal credentials as created in DOOR to log in)
Schedule
Summary
How can we trust autonomous computer-based systems? Widely accepted definitions of autonomy take the view of being “independent and having the power to make your own decisions.” While many AI systems fit that description, they are often assembled by integrating many heterogeneous technologies – including machine learning, symbolic reasoning, and optimization – and correspondingly the notion of trust is fragmented and bespoke to the individual communities. However, given that automated systems are increasingly being deployed in safety-critical environments while interoperating with humans, a system must not only be able to reason about its actions; a human user must also be able to externally validate the system's behavior. This seminar tackled the issue of trustworthiness and responsibility in autonomous systems by considering notions of cause, responsibility, and liability, together with tools to verify the behavior of the resulting system.
In the last few years, we have observed increasing contributions in terms of manifestos, position papers, and policy recommendations issued by governments and learned societies, touching on interdisciplinary research involving AI ethics. This work has primarily focused on “Fairness, Accountability, and Transparency” (FAT), with a majority focus on fairness, since individual and group fairness seem comparatively easy to define precisely. On the other hand, DARPA’s XAI agenda has led to a resurgence in diagnostic explanations, but has also ignited the question of interpretability and transparency in machine learning models, especially deep learning architectures. Our high-level motivation is that governance and regulatory practices can be viewed not only as rules and regulations imposed from afar, but instead as an integrative process of dialogue and discovery to understand why an autonomous system might fail and how to help designers and regulators address such failures through proactive governance. But before that agenda can be approached, we need to resolve an important low-level question: how can we understand trust in, and the responsibility of, the components that make up an AI system? Autonomous systems will make ‘mistakes’, and accidents will surely happen despite best efforts. How should we reason about responsibility, blame, and the causal factors affecting the trustworthiness of the system? And once that is considered, what tools can we provide to regulators, verification and validation professionals, and system designers to help them clarify the intent and content of regulations down to a machine interpretable form? Existing regulations are necessarily vague, depending on the nuance of human interpretation for actual implementation. How should they now be made more precise and quantifiable?
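To make that contrast concrete with a small illustration (offered here as a sketch, not drawn from the seminar itself): a typical group-fairness criterion such as demographic parity can be written in one line, P(Ŷ = 1 | A = a) = P(Ŷ = 1 | A = b) for two values a, b of a protected attribute A and a predictor Ŷ, whereas notions such as trust and responsibility have so far resisted comparably crisp, one-line formalisations.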
The purpose of the seminar was to initiate a debate around these theoretical foundations and practical methodologies, with the overall aim of laying the foundations for a “Trustworthiness & Responsibility in AI” framework – a systems-development methodology that integrates quantifiable responsibility and verifiable correctness into all stages of the software engineering process. As the challenge is multidisciplinary by nature, addressing it must involve experts from different domains working towards a coherent, jointly agreed framework. The seminar brought together researchers from Artificial Intelligence (AI), Machine Learning (ML), Robotics (ROB), hardware and software verification (VER), Software Engineering (SE), and the Humanities (HUM), especially Philosophy (PHI), who provided different and complementary perspectives on responsibility and correctness regarding the design of algorithms, interfaces, and development methodologies in AI. From the outset, we wished to focus especially on understanding correctness for AI systems that integrate or utilize data-driven models (i.e., ML models), and to anchor our discussions by appealing to causality (CAU). Causality is widely used in the natural sciences to understand the effect of interventions on observed correlations, allowing scientists to formulate physical and biological laws. In ML, too, there is increasing recognition that conventional models capture only statistical associations, which can be misleading in critical applications that demand human-understandable explanations. The concept of causality is central to defining a notion of responsibility, and was thus a key point in our discussions.
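As one concrete illustration of how causality can ground a quantitative notion of responsibility, consider Chockler and Halpern's degree of responsibility, given here only as a sketch rather than as a formalisation adopted by the seminar: the responsibility of setting a variable X to x for an outcome φ is 1/(k+1), where k is the smallest number of other variables whose values must be changed before changing X alone would flip φ. In an 11–0 vote, each voter bears responsibility 1/6, since five other votes must change before any single vote becomes decisive; in a 6–5 vote, each voter in the majority bears responsibility 1.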
Directions identified and discussed
The seminar involved extensive discussions between AI, ML, ROB, VER, SE, PHI and HUM researchers who have experience in the following research topics:
- Ethical aspects of AI & ML algorithms: explainability and interpretability in AI algorithms, bias & fairness, accountability, moral responsibility. For example, there were discussions on large language models, their black-box nature, and their capabilities. There was also considerable discussion of existing work on how explanations and causality might be related. Relevant papers that the participants identified included [10, 1].
- The moral and legal concepts of responsibility that underpin trust in autonomous systems, and how these relate to or can be aided by explainability or causal models of responsibility.
- Technical aspects of AI & ML algorithms: explainability and interpretability in AI algorithms, bias & fairness, accountability, quantification of responsibility. There were discussions regarding how visual input and human-in-the-loop models could provide the next frontier of explainability. Relevant papers identified by the participants included [11].
- Complex AI systems: robotics, reinforcement learning, integrated task and motion planning, mixed-initiative systems. Discussions suggested that incorporating high-level specifications from humans could considerably strengthen current approaches. Examples include recent loss-function-based approaches and program-induction directions for reinforcement learning policies [5, 4].
- Software engineering for AI systems: development methodologies, specification synthesis, formal verification of ML models (including deep learning architectures), software testing, causality. Beyond a range of recent approaches to verifying robustness properties of newer networks, there was discussion of enhancing these perspectives by modeling trust; what exactly trustworthy machine learning might look like, and which components it might involve, was also discussed. Examples of relevant work include [9, 12, 8].
- Causal analysis of counterexamples and software faults. Causality was a central topic of discussion, anchoring key perspectives on how trustworthy AI and explanations could be addressed, along with more nuanced notions such as harm. Following Joseph Halpern’s talk on how harm could be formalized, and the related discussions, a number of relevant papers were identified as promising starting points for causal analysis [2, 3]; a minimal illustration of this style of analysis is sketched after this list.
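To make the connection between causal analysis and counterexamples concrete, the following minimal brute-force sketch in Python (the signals, the toy safety property, and the function names are illustrative assumptions, not taken from any tool or talk at the seminar) searches, for each signal in a failing assignment, for the smallest contingency under which flipping that signal alone repairs the property – the quantity k that also feeds the degree-of-responsibility formula sketched above.

from itertools import combinations

# Illustrative toy model: a "counterexample" assigns boolean values to input
# signals; the safety property below is violated when at least two redundant
# sensors report a fault at the same time.
SIGNALS = ["sensor_a", "sensor_b", "sensor_c"]

def property_holds(assignment):
    """Toy safety property: no two sensors report a fault simultaneously."""
    return sum(assignment[s] for s in SIGNALS) < 2

def minimal_contingency(target, counterexample):
    """Smallest number k of *other* signals that must be flipped before flipping
    `target` alone repairs the property; None if no such contingency exists.
    (1/(k+1) then matches the degree of responsibility sketched earlier.)"""
    others = [s for s in SIGNALS if s != target]
    for k in range(len(others) + 1):
        for flipped in combinations(others, k):
            contingent = dict(counterexample)
            for s in flipped:
                contingent[s] = not contingent[s]
            if property_holds(contingent):
                continue  # the failure must persist under the contingency alone
            repaired = dict(contingent)
            repaired[target] = not repaired[target]
            if property_holds(repaired):
                return k
    return None

if __name__ == "__main__":
    counterexample = {"sensor_a": True, "sensor_b": True, "sensor_c": True}
    assert not property_holds(counterexample)
    for s in SIGNALS:
        print(s, minimal_contingency(s, counterexample))  # each signal: k = 1

On this toy counterexample, where all three sensors report a fault, each sensor needs a contingency of size k = 1, i.e. each bears responsibility 1/2 under the earlier formula; in a real setting, the boolean signals would be replaced by the relevant states or events of a failing execution trace.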
Open questions
Discussions between researchers from these different areas of expertise allowed us to explore topics at the intersection between the main areas, and to ask (and obtain partial answers on) the following questions:
- What sorts of explanations, and more generally, correctness notions are users looking for (or may be helpful for them)? How should these be generated and presented?
- How should we reason about responsibility, blame and causal factors affecting trustworthiness in individual components? How should that be expanded to the overall AI system?
- How do we define and quantify trust? Is trust achieved differently depending on the type of the user? Can trust in AI be achieved only using technology, or do we need societal changes?
- How do users reason about and handle responsibility, blame and cause in their day-to-day activities, and how do we interface those concepts with that of the AI system?
- Do our notions of responsibility and explanations increase users' trust in the technology?
- Who are the users of the technology? We envision different types of users, from policy makers and regulators to developers of the technology, to laypeople – the end-users.
- Should we differentiate the type of analysis for different categories of users?
- What tools can we provide to regulators, verification and validation professionals and system designers to help them clarify the intent and content of regulations down to a machine interpretable form?
- What tools are available to verify ML components, and do they cover the scope of “correct behavior” as understood by users and regulators?
- What SE practices are relevant for interfacing, integrating and challenging the above notions?
- How can properties of AI systems that are of interest be expressed in languages that lend themselves to formal verification or quantitative analysis?
- What kinds of user interfaces are needed to scaffold users to scrutinise the way AI systems operate?
- What frameworks are needed to reason about blame and responsibility in AI systems?
- How do we integrate research in causal structure learning with low-level ML modules used in robotics?
- How do we unify tools from causal reasoning and verification for assessing the correctness of complex AI systems?
- What challenges arise in automated reasoning and verification when considering the above mixed-initiative systems?
- Given a falsification of a specification, what kind of automated diagnosis, proof-theoretic and causal tools are needed to identify problematic components?
- How broadly will counterfactual reasoning (i.e., “what-if” reasoning) be useful to tackle such challenges?
References
- [1] Lisanne Bainbridge. Ironies of automation. Automatica, 19(6):775–779, 1983.
- [2] Sander Beckers, Hana Chockler, and Joseph Y. Halpern. A causal analysis of harm. Advances in Neural Information Processing Systems, 35:2365–2376, 2022.
- [3] Ilan Beer, Shoham Ben-David, Hana Chockler, Avigail Orni, and Richard Trefler. Explaining counterexamples using causality. Formal Methods in System Design, 40:20–40, 2012.
- [4] Vaishak Belle and Andreas Bueff. Deep inductive logic programming meets reinforcement learning. In Proceedings of the 39th International Conference on Logic Programming (ICLP). Open Publishing Association, 2023.
- [5] Craig Innes and Subramanian Ramamoorthy. Elaborating on learned demonstrations with temporal logic specifications. arXiv preprint arXiv:2002.00784, 2020.
- [6] William Kidder, Jason D'Cruz, and Kush R. Varshney. Empathy and the right to be an exception: What LLMs can and cannot do. arXiv preprint arXiv:2401.14523, 2024.
- [7] Bran Knowles, Jason D'Cruz, John T. Richards, and Kush R. Varshney. Humble AI. Communications of the ACM, 66(9):73–79, August 2023.
- [8] Ekaterina Komendantskaya and Guy Katz. Towards a certified proof checker for deep neural network verification. In Logic-Based Program Synthesis and Transformation (LOPSTR), page 198, 2023.
- [9] Maria Madsen and Shirley Gregor. Measuring human-computer trust. In Proceedings of the 11th Australasian Conference on Information Systems, 2000.
- [10] James H. Moor. Four types of ethical robot. Philosophy Now, 2009.
- [11] Thilo Spinner, Udo Schlegel, Hanna Schäfer, and Mennatallah El-Assady. explAIner: A visual analytics framework for interactive and explainable machine learning. IEEE Transactions on Visualization and Computer Graphics, 26(1), 2020.
- [12] Kush R. Varshney. Trustworthy Machine Learning. Independently published, 2022.
Motivation
How can we trust autonomous computer-based systems? Since such systems are increasingly being deployed in safety-critical environments while interoperating with humans, this question is rapidly becoming more important. This Dagstuhl Seminar aims to address this question by bringing together an interdisciplinary group of researchers from Artificial Intelligence (AI), Machine Learning (ML), Robotics (ROB), hardware and software verification (VER), Software Engineering (SE), and Social Sciences (SS), who can provide different and complementary perspectives on responsibility and correctness regarding the design of algorithms, interfaces, and development methodologies in AI.
The purpose of the seminar will be to initiate a debate around both theoretical foundations and practical methodologies for a "Trustworthiness & Responsibility in AI" framework that integrates quantifiable responsibility and verifiable correctness into all stages of the software engineering process. Such a framework will allow governance and regulatory practices to be viewed not only as rules and regulations imposed from afar, but instead as an integrative process of dialogue and discovery to understand why an autonomous system might fail and how to help designers and regulators address such failures through proactive governance.
In particular, we will consider how to reason about responsibility, blame, and causal factors affecting the trustworthiness of the system. More practically, we will also ask what tools we can provide to regulators, verification and validation professionals, and system designers to help them clarify the intent and content of regulations down to a machine interpretable form. While existing regulations are necessarily vague, and dependent on human interpretation, we will ask:
How should they now be made precise and quantifiable? What is lost in the process of quantification? How do we address factors that are qualitative in nature, and integrate such concerns in an engineering regime? In addressing these questions, the seminar will benefit from extensive discussions between AI, ML, ROB, SE, and SS researchers who have experience with ethical, societal, and legal aspects of AI, complex AI systems, software engineering for AI systems, and causal analysis of counterexamples and software faults.
As a main outcome of this Dagstuhl Seminar we plan to create one or more blueprints of a "Trustworthiness & Responsibility in AI" framework, grounded in causality and verification. This will be immediately useful as a guideline and can form the foundation for a white paper. Specifically, we will seek to produce a report detailing what we consider to be gaps in formal research around responsibility; the report should inform future experiments, papers, and project proposals that help close these gaps. We also hope that this initial material will lead to a proposal for an open workshop at a major international conference, organised by participants of the seminar. The organisers will endeavour to produce, in collaboration with other interested participants, a magazine-style article (for AI Magazine, IEEE Intelligent Systems, or similar outlets) summarising the results of the workshop and giving an overview of the research challenges that came out of it.
Participants
- Nadisha-Marie Aliman (Utrecht University, NL)
- Emma Beauxis-Aussalet (VU Amsterdam, NL) [dblp]
- Sander Beckers (University of Amsterdam, NL)
- Vaishak Belle (University of Edinburgh, GB) [dblp]
- Jan M. Broersen (Utrecht University, NL) [dblp]
- Georgiana Caltais (University of Twente - Enschede, NL)
- Hana Chockler (King's College London, GB) [dblp]
- Jens Claßen (Roskilde University, DK) [dblp]
- Sjur K. Dyrkolbotn (West. Norway Univ. of Applied Sciences - Bergen, NO) [dblp]
- Yanai Elazar (AI2 - Seattle, US)
- Esra Erdem (Sabanci University - Istanbul, TR) [dblp]
- Michael Fisher (University of Manchester, GB) [dblp]
- Sarah Alice Gaggl (TU Dresden, DE) [dblp]
- Leilani H. Gilpin (University of California - Santa Cruz, US)
- Gregor Goessler (INRIA - Grenoble, FR)
- Joseph Y. Halpern (Cornell University - Ithaca, US) [dblp]
- Till Hofmann (RWTH Aachen University, DE) [dblp]
- David Jensen (University of Massachusetts - Amherst, US)
- Leon Kester (TNO Netherlands - The Hague, NL)
- Ekaterina Komendantskaya (Heriot-Watt University - Edinburgh, GB) [dblp]
- Stefan Leue (Universität Konstanz, DE) [dblp]
- Joshua Loftus (London School of Economics and Political Science, GB)
- Mohammad Reza Mousavi (King's College London, GB) [dblp]
- Giuseppe Primiero (University of Milan, IT)
- Ajitha Rajan (University of Edinburgh, GB) [dblp]
- Subramanian Ramamoorthy (University of Edinburgh, GB) [dblp]
- Kilian Rückschloß (LMU München, DE)
- Judith Simon (Universität Hamburg, DE) [dblp]
- Luke Stark (University of Western Ontario - London, CA)
- Daniel Susser (Cornell University - Ithaca, US)
- Shannon Vallor (University of Edinburgh, GB) [dblp]
- Kush R. Varshney (IBM Research - Yorktown Heights, US)
- Joost Vennekens (KU Leuven, BE)
- Felix Weitkämper (LMU München, DE) [dblp]
Classification
- Artificial Intelligence
- Machine Learning
Keywords
- artificial intelligence
- machine learning
- causality
- responsible AI
- verification