Dagstuhl Seminar 25021
Grand Challenges for Research on Privacy Documents
(Jan 05 – Jan 10, 2025)
Organizers
- Florian Schaub (University of Michigan - Ann Arbor, US)
- Christine Utz (Radboud University Nijmegen, NL)
- Shomir Wilson (Pennsylvania State University - University Park, US)
Contact
- Andreas Dolzmann (for scientific matters)
- Jutka Gasiorowski (for administrative matters)
This Dagstuhl Seminar will gather an interdisciplinary group of researchers from privacy, natural language processing, human-computer interaction, public policy, and law to identify and characterize key challenges to research on privacy documents: privacy policies, terms of use, cookie policies, and other texts about data practices. At present, privacy documents fail to meet the needs of stakeholders in our information society. Although many Internet users have concerns about their privacy, most lack the time, knowledge, and other resources to understand these documents, which leaves them underinformed and undermines notice and choice. The needs of other stakeholders, including researchers, policymakers, and privacy practitioners, are similarly unmet. Although a growing body of research is devoted to analyzing, reconstituting, or otherwise using these documents to satisfy stakeholders’ needs, broader interdisciplinary efforts are needed.
This seminar will identify and characterize key challenges in privacy document research and produce a roadmap for tackling them in order to move the field forward. The unique gathering of researchers will be an opportunity to develop and explore the challenges interactively in a highly connected interdisciplinary setting. Potential challenge areas include:
- Privacy document selection: Deciding which privacy documents to analyze is itself a complex question. Popularity, ease of collection, sectoral focus, language, and quantity of documents are factors whose impacts are only partially understood. Additionally, duplicated effort has led to siloed knowledge about best practices.
- Privacy document retrieval: Privacy documents are typically collected from the Web or from app stores. Collection methods tend to be purpose-driven and organic, shaped by basic collection difficulties (e.g., lack of rigid standards for where to post a privacy policy and how to parse its text) and by specific project goals. The result is a space of options that researchers to date have explored only in an ad hoc fashion.
- Policy analysis: Methods vary widely, from hand-labeling of small quantities of documents to text classification using large language models. Annotation schemes similarly vary in granularity, from binary labeling of documents to frame-centric schemes for labeling phrases or sentences. While no one set of methods accommodates all research goals, the hidden work of language resource construction for this domain is often underappreciated, and prior work remains underutilized.
- Lack of a dedicated research community or publication outlets: Privacy policy analysis draws from a variety of research areas that have had limited contact. While interdisciplinary research is widely acknowledged to be important, obstacles still prevent computational researchers and law or public policy researchers from forming effective research partnerships. Also, the outputs of interdisciplinary projects that focus on privacy documents may lack a venue that recognizes their value.
- Availability and maintenance of research tools: The lack of a community makes existing work on the analysis of privacy documents hard to discover. Additionally, because tools for working with privacy documents are hard to find, their perceived utility diminishes, which is one of several factors contributing to their lack of maintenance.
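To make the retrieval challenge concrete, the following is a minimal sketch of the kind of ad hoc heuristic a collection pipeline might use to locate a privacy document on a web page. It is not a method endorsed by the seminar; the keyword list `POLICY_HINTS`, the class `PolicyLinkFinder`, and the function `find_policy_links` are hypothetical names introduced for illustration, and the sketch assumes the page's HTML has already been downloaded.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical keyword list (an assumption for illustration); real projects
# would tune this per language and sector.
POLICY_HINTS = ("privacy", "data protection", "datenschutz", "cookie")


class PolicyLinkFinder(HTMLParser):
    """Collect links whose URL or anchor text hints at a privacy document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.candidates = []   # list of (absolute_url, anchor_text)
        self._href = None      # href of the <a> element currently open
        self._text = []        # text fragments seen inside that element

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Normalize whitespace in the anchor text.
            label = " ".join("".join(self._text).split())
            haystack = (self._href + " " + label).lower()
            if any(hint in haystack for hint in POLICY_HINTS):
                url = urljoin(self.base_url, self._href)
                self.candidates.append((url, label))
            self._href = None


def find_policy_links(html, base_url):
    """Return candidate privacy-document links found in an HTML string."""
    finder = PolicyLinkFinder(base_url)
    finder.feed(html)
    finder.close()
    return finder.candidates
```

A keyword heuristic like this misses policies served through scripts, embedded in PDFs, or labeled in languages outside the list, which illustrates why retrieval choices remain project-specific.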
By bringing together researchers who face these problems, the seminar will serve as a conduit for interdisciplinary discovery of shared goals and possible solutions.
Participants
- Ruba Abu-Salma (King's College London, GB)
- Noah Apthorpe (Colgate University - Hamilton, US)
- Eleanor Birrell (Pomona College - Claremont, US)
- Travis Breaux (Carnegie Mellon University - Pittsburgh, US)
- Kai-Wei Chang (UCLA, US)
- Jose M. del Alamo (Polytechnic University of Madrid, ES)
- Rinku Dewri (University of Denver, US)
- Nico Ebert (ZHAW - Winterthur, CH)
- Kassem Fawaz (University of Wisconsin - Madison, US)
- Simone Fischer-Hübner (Karlstad University, SE)
- Sepideh Ghanavati (University of Maine, US)
- Henry Hosseini (Universität Münster - ERCIS, DE)
- Agnieszka Kitkowska (Jönköping University, SE)
- Konrad Kollnig (Maastricht University, NL)
- Timothy Libert (webXray - Sunnyvale, US)
- Kirsten Martin (University of Notre Dame, US)
- Jelena Mitrovic (Universität Passau, DE & Institute for Artificial Intelligence R&D of Serbia - Novi Sad, RS)
- Rishab Nithyanand (University of Iowa - Iowa City, US)
- Shidong Pan (Australian National University - Acton, AU)
- Harshvardhan J. Pandit (Dublin City University, IE)
- Sarah Radway (Harvard University - Allston, US)
- Florian Schaub (University of Michigan - Ann Arbor, US)
- Yan Shvartzshnaider (York University - Toronto, CA)
- Daniel Smullen (CableLabs - Louisville, US)
- Peter Story (Clark University - Worcester, US)
- Alina Stöver (TU Darmstadt, DE)
- Brian Tang (University of Michigan - Ann Arbor, US)
- Emma Tosch (Northeastern University - Boston, US)
- Christine Utz (Radboud University Nijmegen, NL)
- Kami Vaniea (University of Waterloo, CA)
- Isabel Wagner (Universität Basel, CH)
- Shomir Wilson (Pennsylvania State University - University Park, US)
- Maximiliane Windl (LMU München, DE)
- Lu Xian (University of Michigan - Ann Arbor, US)
- Tianyang Zhao (Pennsylvania State University - University Park, US)
Classification
- Computation and Language
- Computers and Society
- Human-Computer Interaction
Keywords
- Privacy
- Policy
- Law
- Text
- Usability