Dagstuhl Seminar 24181
Computational Metabolomics: Towards Molecules, Models, and their Meaning
(Apr 28 – May 03, 2024)
Organizers
- Timothy M. D. Ebbels (Imperial College London, GB)
- Soha Hassoun (Tufts University - Medford, US)
- Ewy A. Mathé (National Institutes of Health - Bethesda, US)
- Justin J. J. van der Hooft (Wageningen University & Research, NL)
Contact
- Andreas Dolzmann (for scientific matters)
- Christina Schwarz (for administrative matters)
Metabolomics is the study of the small molecule composition of biological systems. These small molecules define biological functions, making metabolomics a broadly applied technology in the biomedical, environmental, and biotechnology fields of study. Metabolite measurements are typically produced using mass spectrometry (MS), usually coupled with liquid or gas chromatography (LC or GC), and/or nuclear magnetic resonance (NMR) spectroscopy. New technologies continue to be developed, leading to increased resolution of metabolite species detected, as well as increased sensitivity. The resulting datasets are high throughput and typically yield abundances of thousands of small chemical structures. For these reasons, the field of computational metabolomics continues to grow to address current and imminent issues in data stewardship, processing, analysis and interpretation.
This Dagstuhl Seminar is the fifth in the computational metabolomics series. Previous seminars addressed key topics in the field, including leveraging spectral data to annotate and identify routinely measured metabolites, assessing interactions between metabolites and proteins through "metaboproteomic assays", and implementing multi-omic analyses and interpreting these data through enrichment analyses. This year, we not only dug deeper into the most current and relevant topics but also introduced new topics to the series. Our overall goal was to explore how to improve the utility of metabolomics data and its scientific relevance across many disciplines, alone or in combination with other data types, by leveraging machine learning and deep learning (ML/DL).
To accomplish this, our seminar was organized into four categories. The first category was education, a topic newly introduced in this edition. Participants recognized the need for resources linking out to available education and training materials and discussed the inherent challenges of teaching multi-disciplinary topics such as computational metabolomics. Our second category was molecules, which includes annotations and measurements of metabolites and of the molecules they associate with, such as proteins, genes, and other metabolites. New areas of emphasis included the representation and classification of lipids, polymers and multi-constituent substances, and the use of electron-activated dissociation methods for data collection. Our third category was models, which encompasses data quality, uncertainty in annotating metabolites, and relationships between metabolites and other molecules. Sessions in this category were the most prominent and included the novel areas of repository-scale analyses, simulations of metabolomic data, and automation of data analysis workflows. Of note, practical sessions were held on how best to measure scientific impact and on how and where to submit data publicly, both to increase the utility of the data and to support the development of new computational methods.
Lastly, our fourth category was meaning, which represents resources, methods and tools that enable visualization and interpretation of large-scale metabolomic data in the context of biological, environmental or other sciences. This included the use and misuse of molecular networking and its current applications in the field. This year, new focus was placed on building interpretable models and leveraging increasingly available information on metabolite annotations, such as pathways and reactions. Of note, recent developments in large language model (LLM) techniques were discussed throughout our categories, with a special session dedicated to prototyping an LLM with metabolomics-specific content that can be used for training the next generation of metabolomics experts.
As in previous years, and based on positive feedback, the seminar format was again flexible, and topics were finalized on the first day of the seminar. The audience was encouraged to bring forth topics, and care was taken to rotate who moderated and took notes for the sessions. Moderators, as well as organizers, actively ensured that all participants had a voice in topic sessions. Due to the large number of topics, and to keep the groups manageable in size, parallel sessions were held. At the end of each day, a final group meeting was held to summarize the day’s discussion and to finalize the planning for the following day. One new aspect this year was the set-up of a Slack workspace that was used prior to, during, and after the seminar. This workspace facilitated communication and information exchange between participants. Overall, this seminar was highly successful, and participants were highly engaged. Topics and discussions generated much enthusiasm and concrete next steps for potential collaborations. Computational metabolomics is a very active field of study that is somewhat underrepresented in many of the main metabolomics conferences, especially when it comes to tool development. The opportunity to focus on the computational aspects was very well received, and this focus will surely grow in importance as data generation and data types continue to increase.

Metabolomic data, usually from mass spectrometry or nuclear magnetic resonance spectroscopy, is highly complex and information rich. With continuing advances in metabolite detection technologies, the volume and complexity of data are ever increasing. Computational analysis of metabolomic data, from raw data processing to biological interpretation, is therefore fundamental to realizing the impact of metabolomics on a diverse array of application fields, e.g., environmental toxicity, industrial biotechnology, and biomedicine. This Dagstuhl Seminar extends the Computational Metabolomics series to examine how we can enhance the utility and interpretation of metabolomics data and will include three major themes:
- Enhancing the utility of large-scale metabolomic data through application of state-of-the-art ML/DL methods, networks, standardization, and data sharing;
- Evaluating the robustness of, and increasing confidence in, ML/DL modeling by incorporating experimental measures and producing interpretable metrics;
- Empowering the research community in interpreting metabolomic and multi-omic data through analytics of big data and knowledge sources using public repositories.
This seminar will thus bring together a multidisciplinary set of computationally focused scientists to address the challenges and opportunities emerging in this important and rapidly changing field.
We aim to bring together many disciplines, including mass spectrometrists and NMR spectroscopists, computer scientists, biostatisticians, epidemiologists, biologists, and chemists. This Dagstuhl Seminar series has a strong record of collaborative outputs that result from gathering multi-disciplinary experts who contribute to computational metabolomics, directly or indirectly. These include new collaborations, grant applications, software tools, online resources and papers, addressing the key challenges identified in the field and discussed at the meeting. As an example, please see the paper “Recent advances in mass spectrometry-based computational metabolomics” [1], which resulted from the previous iteration of this seminar. The official report from the last seminar is published in the Dagstuhl Reports series [2].
The seminar organization will follow a flexible format, as is typical of Dagstuhl Seminars. The discussions will center around a list of proposed topics, along with others added after pre-meeting and brainstorming discussions. We will also consider short overview presentations when the discussion involves members with particularly diverse background expertise. Additionally, we plan to conduct several new pre-meeting Slack activities to solicit topics, collect relevant publications, and create a community around the seminar. As always, flexibility will be critical in enabling conversation to guide the seminar direction.
[1] https://www.sciencedirect.com/science/article/pii/S1367593123000261
[2] https://doi.org/10.4230/DagRep.12.5.1

Participants
- Wout Bittremieux (University of Antwerp, BE)
- Sebastian Böcker (Friedrich-Schiller-Universität Jena, DE)
- Carl Brunius (Chalmers University of Technology - Göteborg, SE)
- Roman Bushuiev (The Czech Academy of Sciences - Prague, CZ)
- Haley Chatelaine (NCATS - Bethesda, US)
- Ronan Daly (University of Glasgow - Bearsden, GB)
- Niek de Jonge (Wageningen University, NL)
- Kai Dührkop (Friedrich-Schiller-Universität Jena, DE)
- Timothy M. D. Ebbels (Imperial College London, GB)
- Soha Hassoun (Tufts University - Medford, US)
- Florian Huber (Hochschule Düsseldorf, DE)
- Pär Jonsson (Sartorius Stedim Data Analytics - Umeå, SE)
- Purva Kulkarni (Radboud University - Nijmegen, NL)
- Jessica Lasky-Su (Brigham and Women's Hospital & Harvard Medical School - Boston, US)
- Alice Limonciel (biocrates life sciences - Innsbruck, AT)
- Liping Liu (Tufts University - Medford, US)
- Tytus Mak (NIST - Gaithersburg, US)
- Ewy A. Mathé (National Institutes of Health - Bethesda, US)
- Hosein Mohimani (Carnegie Mellon University - Pittsburgh, US)
- María Eugenia Monge (CIBION - Buenos Aires, AR)
- Steffen Neumann (IPB - Halle, DE)
- Louis-Felix Nothias (CNRS & Université Côte d'Azur - Nice, FR)
- Daniel Raftery (University of Washington - Seattle, US)
- Raphael Reher (Universität Marburg, DE)
- Stacey N. Reinke (Edith Cowan University - Joondalup, AU)
- Hannes Röst (University of Toronto, CA)
- Juho Rousu (Aalto University, FI)
- Robin Schmid (The Czech Academy of Sciences - Prague, CZ)
- Emma Schymanski (University of Luxembourg, LU)
- Denise Slenter (Maastricht University, NL)
- Jan Stanstrup (University of Copenhagen, DK)
- Michael Andrej Stravs (Eawag - Dübendorf, CH)
- Marynka Ulaszewska-Tarantino (Thermo Fisher Scientific - Milan, IT)
- Justin J. J. van der Hooft (Wageningen University & Research, NL)
- Dries Verdegem (VIB - KU Leuven, BE)
- Juan Antonio Vizcaino (EMBL-EBI - Hinxton, GB)
- Ralf Weber (University of Birmingham, GB)
- Egon Willighagen (Maastricht University, NL)
- David Wishart (University of Alberta - Edmonton, CA)
- Michael Anton Witting (Helmholtz Zentrum München, DE)
- Nicola Zamboni (ETH Zürich, CH)
Related Seminars
- Dagstuhl Seminar 15492: Computational Metabolomics (2015-11-29 - 2015-12-04)
- Dagstuhl Seminar 17491: Computational Metabolomics: Identification, Interpretation, Imaging (2017-12-03 - 2017-12-08)
- Dagstuhl Seminar 20051: Computational Metabolomics: From Cheminformatics to Machine Learning (2020-01-26 - 2020-01-31)
- Dagstuhl Seminar 22181: Computational Metabolomics: From Spectra to Knowledge (2022-05-01 - 2022-05-06)
- Dagstuhl Seminar 26181: Computational Metabolomics: Discovery of New Molecules to Actionable Insights (2026-04-26 - 2026-04-30)
Classification
- Artificial Intelligence
- Emerging Technologies
- Information Retrieval
Keywords
- metabolomics
- bioinformatics
- cheminformatics
- multi-omics
- benchmarking