Dagstuhl Seminar 15431
Genomic Privacy
(October 18 – 23, 2015)
Organizers
- Jean Pierre Hubaux (EPFL - Lausanne, CH)
- Stefan Katzenbeisser (TU Darmstadt, DE)
- Bradley Malin (Vanderbilt University - Nashville, US)
- Gene Tsudik (University of California - Irvine, US)
Contact
- Andreas Dolzmann (for scientific matters)
- Susanne Bach-Bernhard (for administrative matters)
Program
The current rise of personalized medicine is based on the increasing affordability and availability of individual genome sequencing. Impressive recent advances in genome sequencing have ushered in a variety of revolutionary applications in modern healthcare and epidemiology. In particular, a better understanding of the human genome, as well as of its relationship to diseases and response to treatments, promises improvements in preventive and personalized healthcare.
At the same time, human genetics has become a 'big data' science. For roughly a decade, specific tests for Single Nucleotide Polymorphisms (SNPs), e.g., markers corresponding to specific diseases, have been well established. Furthermore, research in pharmacogenomics, which currently relies on SNPs, has helped improve drug treatment for cancer and cardiac patients. The methodology of genotyping, which takes into account hundreds to thousands of variations in positions in the genome, has tremendously increased the amount of data acquired during diagnosis. Personalized genotyping has become commercially available from several sources (such as 23andMe). Full genome sequencing and genome-wide association studies are moving towards full deployment in clinical practice. In 2000, the cost of sequencing one human genome was US$2.5 billion. Today, the price of US$200 for genome sequencing is approaching reality. Considering the benefits for (public) health and potential cost savings, widespread acquisition, storage, and usage of personal genomes are guaranteed to happen soon.
However, because of the human genome's highly sensitive nature, this progress raises important privacy and ethical concerns, which simply cannot be ignored. A digitized genome represents one of the most sensitive types of human (personal) identification data. Even worse, a genome contains information about its owner’s close relatives. Furthermore, correlations with individual data sets from so-called “omics technologies” pose even bigger threats to privacy. Leakage of personal genomic information can lead to a wide variety of attacks, many of which are not yet fully understood. Whether accidentally or intentionally revealed, a digitized genome cannot be revoked or modified. Consequently, secrecy of personal genomic data is of paramount importance. Furthermore, genomic data, unlike other types of highly sensitive information (even national secrets), does not lose its sensitivity over time. Even worse, the mechanisms available to interpret genomic data improve over time, which means that it is currently unclear how much sensitive information a genome encodes and what consequences a genomic data breach may have. Finally, it is likely that genomic data will not only be used personally to support medical treatments; great promise lies in its use in large-scale genetic studies for personalized medicine as well as in common ancestry and genetic compatibility tests. Because such uses require computing on the data, simply encrypting genomic data at rest is not a viable option, and new ways of protection need to be devised.
The second Dagstuhl Seminar on Genomic Privacy will concentrate on the following topics:
- Technical solutions for genomic privacy: We will discuss technical solutions to enable genomic data privacy, even in the presence of untrusted computing environments. We will investigate techniques that can be used for this purpose and determine whether they can meet requirements stemming from practice, such as those mentioned in the report of Dagstuhl Seminar 13412.
- Integration of genomic and physiological data: For medical purposes, genomic data often needs to be correlated with clinical and physiological data. For example, clinical studies may require finding correlations between physiological data reported during hospital stays and genomic information. So far, most technical solutions for the protection of genomic data focused on securely storing DNA data itself, but did not discuss the complex problem of combining it with physiological data.
- Protection of sensitive data within large-scale genome-wide association studies: Although large-scale genomic studies offer many advantages for medical research, they pose many privacy problems. Most prior technical solutions focus on the protection of a single human genome and do not scale to multitudes of genomes. It remains a challenge to devise scalable techniques.
This report documents the program and the outcomes of Dagstuhl Seminar 15431 "Genomic Privacy".
The second Dagstuhl Seminar on Genomic Privacy concentrated on the following topics:
- Technical solutions for genomic privacy: The participants discussed technical solutions to enable genomic data privacy, even in the presence of untrusted computing environments, and investigated protection techniques that can be used for this purpose.
- Integration of genomic and physiological data: For medical purposes, genomic data often needs to be correlated with clinical and physiological data. For example, clinical studies may require finding correlations between physiological data reported during hospital stays and genomic information. So far, most technical solutions for the protection of genomic data focused on securely storing DNA data itself, but did not discuss the complex problem of combining it with physiological data.
- Protection of sensitive data within large-scale genome-wide association studies: Although large-scale genomic studies offer many advantages for medical research, they pose many privacy problems. Most prior technical solutions focus on the protection of a single human genome and do not scale to multitudes of genomes. It remains a challenge to devise scalable techniques.
Participants
- Luk Arbuckle (CHEO Research Institute - Ottawa, CA)
- Erman Ayday (Bilkent University - Ankara, TR)
- Marina Blanton (University of Notre Dame, US)
- Dan Bogdanov (Cybernetica AS - Tartu, EE)
- Emiliano De Cristofaro (University College London, GB)
- Zekeriya Erkin (TU Delft, NL)
- Paulo Jorge Esteves-Veríssimo (University of Luxembourg, LU)
- Jacques Fellay (EPFL - Lausanne, CH)
- Kay Hamacher (TU Darmstadt, DE)
- Zhicong Huang (EPFL - Lausanne, CH)
- Jean Pierre Hubaux (EPFL - Lausanne, CH)
- Mathias Humbert (Universität des Saarlandes, DE)
- Aniket Kate (Purdue University - West Lafayette, US)
- Stefan Katzenbeisser (TU Darmstadt, DE)
- Florian Kerschbaum (SAP SE - Karlsruhe, DE)
- Oliver Kohlbacher (Universität Tübingen, DE)
- Florian Kohlmayer (TU München - Klinikum Rechts der Isar, DE)
- Alexander Kaitai Liang (Aalto University, FI)
- Huang Lin (EPFL - Lausanne, CH)
- Bradley Malin (Vanderbilt University - Nashville, US)
- Adam Molyneaux (Sophia Genetics SA - Lausanne, CH)
- Muhammad Naveed (University of Illinois - Urbana-Champaign, US)
- Jun Pang (University of Luxembourg, LU)
- Fabian Prasser (TU München - Klinikum Rechts der Isar, DE)
- Manuel Prinz (DKFZ - Heidelberg, DE)
- Jean-Louis Raisaro (EPFL - Lausanne, CH)
- Kurt Rohloff (NJIT - Newark, US)
- Vitaly Shmatikov (Cornell Tech NYC, US)
- Sean Simmons (MIT - Cambridge, US)
- Adam Davison Smith (Pennsylvania State University - University Park, US)
- Thorsten Strufe (TU Dresden, DE)
- Qiang Tang (University of Luxembourg, LU)
- Carmela Troncoso (IMDEA Software - Madrid, ES)
- Juan Ramon Troncoso Pastoriza (University of Vigo, ES)
- Gene Tsudik (University of California - Irvine, US)
- Xiao Feng Wang (Indiana University - Bloomington, US)
Related Seminars
- Dagstuhl Seminar 13412: Genomic Privacy (2013-10-06 - 2013-10-09)
Classification
- bioinformatics
- security / cryptology
Keywords
- genomics
- genetics
- health data
- privacy protection
- differential privacy
- privacy by design
- information security
- cryptography
- secure computation