Dagstuhl Seminar 19102
3D Morphable Models
(03 Mar – 08 Mar, 2019)
Organizers
- Bernhard Egger (MIT - Cambridge, US)
- William Smith (University of York, GB)
- Christian Theobalt (MPI für Informatik - Saarbrücken, DE)
- Thomas Vetter (Universität Basel, CH)
Contact
- Shida Kunz (for scientific matters)
- Susanne Bach-Bernhard (for administrative matters)
Program
After 20 years of research around 3D Morphable Models (3DMMs), this Dagstuhl Seminar aims to capture the key ideas and people in the field and to bring the community interested in 3DMMs of faces and bodies together for the first time. Not only will this recognise the 20th birthday of the original paper but, to a greater extent, it will facilitate a collaborative approach to current limitations and open research questions, and address the lack of comparability between different approaches and the lack of coordination within the community.
A 3DMM is a statistical object model separating shape from appearance variation. Typically, 3DMMs are used as a statistical prior in computer graphics and vision. A model is learned from high quality 3D scans of multiple object instances. It reduces the dimensionality and provides a low-dimensional, parametric object representation. The resulting model is generative, which means that from a set of randomly sampled parameters a novel realistic object instance arises. The original model was defined on human faces and combines a 3D shape and appearance model. It was applied for 3D reconstruction of faces from still 2D images and 3D manipulation of those 2D images.
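As a minimal sketch of the idea described above, the following Python code learns a low-dimensional shape model and samples a novel instance from random parameters. It assumes a linear PCA model, which is one common choice but is not prescribed by the text, and it uses synthetic random data in place of registered 3D scans. All names (`build_model`, `sample_instance`, `project`, `n_components`) are hypothetical and chosen for illustration only. Only shape is modeled here; under the same assumptions, per-vertex appearance could be handled by a second model of the same form.

```python
import numpy as np


def build_model(scans, n_components):
    """Learn a linear shape model from scans in dense correspondence.

    scans: array of shape (n_scans, n_vertices, 3); the same vertex index
    is assumed to mark the same surface point in every scan.
    """
    n_scans = scans.shape[0]
    data = scans.reshape(n_scans, -1)          # flatten each scan to a vector
    mean = data.mean(axis=0)
    centered = data - mean
    # SVD of the centered data gives the principal directions (rows of vt).
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]                  # (n_components, 3 * n_vertices)
    # Standard deviation of the data along each principal direction.
    stddev = singular_values[:n_components] / np.sqrt(n_scans - 1)
    return mean, basis, stddev


def sample_instance(mean, basis, stddev, rng):
    """Draw standard-normal coefficients and decode them into a shape."""
    coeffs = rng.standard_normal(basis.shape[0])
    shape = mean + (coeffs * stddev) @ basis
    return shape.reshape(-1, 3), coeffs


def project(shape, mean, basis, stddev):
    """Encode a shape as model coefficients (inverse of the decoding)."""
    return ((shape.reshape(-1) - mean) @ basis.T) / stddev


rng = np.random.default_rng(0)
# Synthetic stand-in for 50 registered scans with 200 vertices each.
scans = rng.standard_normal((50, 200, 3))
mean, basis, stddev = build_model(scans, n_components=10)
new_shape, coeffs = sample_instance(mean, basis, stddev, rng)
recovered = project(new_shape, mean, basis, stddev)
```

In this sketch, ten coefficients stand in for 600 vertex coordinates, and projecting the sampled shape back onto the basis recovers the coefficients that generated it. With real scans, the quality of the model would depend heavily on the correspondence between scans, which is simply assumed here.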
Today, 3DMMs of the human body and faces are well researched and adopted by industry. Besides various applications in computer graphics and vision, such models are highly suitable for use in medical imaging, surgical planning, psychology, ergonomics, and anthropology, and have even proven applicable to modeling human cognition. Some of the application areas have led to commercial products. These products range from the entertainment industry through the fashion business to security applications incorporating face recognition.
Beyond current applications, emerging research directions make 3DMMs once again very timely. Recently, 3DMMs have been rediscovered in the context of deep learning. They provide a generative appearance model for self-supervision and parameterise nonrigid correspondence in geometric deep learning. Progress in 3D shape analysis and shape spaces provides new perspectives on 3DMMs via models of shape collections and shape differences.
The main topics we will cover in this Dagstuhl Seminar are correspondence, optimization, realism, physics, and evaluation in the context of 3DMMs. The goals of the seminar are:
- to build a diverse, multidisciplinary, and widely spread community to identify and approach future challenges
- to come up with a set of open measurable challenges
- to provide a perfect breeding ground for future collaboration
- to initiate an edited book or a survey paper with broad support
A total of 45 people were invited to this seminar in the first round of invitations. The seminar was fully booked after the first round, and 26 researchers from academia and industry participated. Of these, 21 researchers presented their work in talks of around 15 to 30 minutes; an abstract of each presentation is included in this report. Besides those presentations, participants presented their shared data and software in a dedicated slot. We collected this information in a list of shared resources, which we made publicly available (https://github.com/3d-morphable-models/curated-list-of-awesome-3D-Morphable-Model-software-and-data). This overview and exchange was one of the aims we initially had in mind when organizing the workshop.
At the beginning of the workshop we collected ideas for discussion in our flexible sessions; those ideas are also contained in this report. We then structured the seminar by fixing the topics of discussion for the flexible sessions. The summaries of those discussions are also contained in this report. One slot was reserved for a joint group discussion on emerging ethical concerns about the methods we are developing. This interesting and well-organized discussion was an initiative of the participants and had not been foreseen by the organizers. Another major discussion concerned how to compare different approaches and how to establish a benchmark. We did not fully converge on a final solution, but we identified currently available benchmarks and discussed what a gold-standard benchmark would look like.
Another aim of the workshop was to initiate an edited book or a survey paper with broad support. Arising from the workshop, a group of 13 junior and senior researchers started to work on a joint survey and perspective paper on 20 years of Morphable Face Models. Presentations were followed by lively discussions on current challenges and future research directions.
To further nurture the ideas of the seminar, we started a Google group for discussions, sharing news, and exchanging students (https://groups.google.com/forum/#!forum/3d-morphable-models). The group would like to meet again at Dagstuhl in 2022. The program was denser than expected, and we would like to have more time for group discussions after each set of talks. We would like to highlight five main discussion points:
- To what degree of detail do we need to model in 3D and in a physically adequate way, and what can we learn from semi-supervised or unsupervised 2D data?
- Does the model depend on the application, or is there a gold-standard model that fits all applications?
- The current deep learning revolution in computer vision enables many novel strategies and speeds up the models. However, other challenges in modeling, synthesis, and inverse rendering remain, and new deep-learning-specific challenges are introduced.
- What are the ethical implications of the models and systems we are building?
- How will the field develop in the next 20 years? Which challenges should we focus on?
We started the seminar with a short introduction of every participant. The homework was to introduce oneself with at most one slide and to prepare one important question, challenge, or goal to discuss during the seminar.
- Thabo Beeler: Non-Linear Morphable Models. How to get off Model in a meaningful way?
- Florian Bernard: Deeper integration of models of human knowledge and algorithms into learning systems. What are potential perspectives? How to best approach this?
- Michael J. Black: What’s next? Increasing realism? Deep representations? Something else?
- Volker Blanz: Expressive model also reproduces non-face structures! How to discriminate between face and non-face? Future: better regularization, rely on trained regressors, recognize glasses?
- Bernhard Egger: What to model? What to learn?
- Victoria Fernandez Abrevaya: How far are we from closing the gap between high-quality and low-quality capture devices, and can we use 3DMM for this?
- Patrik Huber: What is missing to reliably reconstruct realistic 3D faces from mostly uncontrolled 2D footage?
- Ron Kimmel: Geometry is the art of finding the “right” parametrization. Deep Learning is a technology that exploits convenient parametric spaces (CNN) for classification. Any hope for unification? Is translating geometry into algebra the answer?
- Tatsuro Koizumi: How to evaluate and assure the robustness of neural network-based reconstruction? How to improve the stability of self-supervised training?
- Adam Edward Kortylewski: Can we resolve the limitations of Deep Learning with Generative Object Models?
- Yeara Kozlov: Can physically based face modeling be replaced by machine learning?
- Andreas Morel-Forster: Fast posterior estimation – A contradiction?
- Nick Pears: How to build deeper, wider models?
- Gerard Pons-Moll: Is the Euclidean 3D space the right space to model humans, clothing and hair?
- Emanuele Rodolà: Can we make inverse spectral geometry useful in practice?
- Sami Romdhani: How to combine Deep Learning and 3D Equations to generate images?
- Javier Romero: How can Deep Nets learn from unstructured, uncalibrated views?
- Shunsuke Saito: Is there a unified representation for digital humans that does not explicitly require a prior for each component?
- William Smith: Self-supervision: holy grail or just re-discovering gradient descent-based analysis-by-synthesis? How do we make sure the gradients of our losses are really useful (Appearance loss: meaningless when far from good solution, Landmark loss: ambiguous (and not self-supervised), Rasterization: not differentiable)?
- Ayush Tewari: How can we build high quality 3D morphable models from 2D data?
- Christian Theobalt: Can we build a 4D Real World Reconstruction Loop? Ethical, Privacy, Security Questions of Parametric/Morphable Model Building and Reconstruction Algorithms
- Thomas Vetter: Did we learn much about this optimization problem (inverse rendering)?
- Stefanie Wuhrer: How to effectively learn parametric human models from captured data using minimal supervision?
- Michael Zollhöfer: What is the best representation for deep learning-based 3D reconstruction and image synthesis?
- Silvia Zuffi: How to model skin dynamics from video?
After the individual introductions, we discussed those ideas in discussion groups to identify points to discuss during the seminar. The following list is the unfiltered result of our brainstorming on open questions and challenges.
- Where to spend the next 20 years? Perfection: finer detail? Move it: Movement, new representation, new goals, new data? Break it: hair, clothing, new representation, new goals, new data?
- Why aren't we focusing on fixing the obvious errors?
- Optimization: Why aren't we doing more to understand our objective function and adapt the algorithms?
- How to predict distributions instead of point estimates?
- How much detail to model vs. overfitting?
- How to evaluate Photorealism?
- Should vision people be more aware of graphics standards for photorealism?
- Is it important to understand?
- Do we need correspondences to build 3D models and predictions?
- How to learn 3D from 2D?
- How to adapt models over time (without calibration)?
- How to deal with multi-view and video in CNNs?
- Which courses/skills are required?
- Use for society?
- What to leave for industry?
- What is the role of academia within industry (collaboration vs. isolation)?
- Representations (beyond triangle meshes) to deal with category discontinuities, e.g. smooth surface vs. hair
- Evaluation of shape and appearance reconstruction
- Connections between deep learning and parametric models
- Role of axiomatic models in learning
- Comparability: Benchmark and metrics
- Future prediction of motion
- Self-supervision
- Differentiable inverse rendering
Participants
- Thabo Beeler (Disney Research - Zürich, CH)
- Florian Bernard (MPI für Informatik - Saarbrücken, DE)
- Michael J. Black (MPI für Intelligente Systeme - Tübingen, DE)
- Volker Blanz (Universität Siegen, DE)
- Timo Bolkart (MPI für Intelligente Systeme - Tübingen, DE)
- Bernhard Egger (MIT - Cambridge, US)
- Victoria Fernandez Abrevaya (INRIA - Grenoble, FR)
- Patrik Huber (University of Surrey, GB)
- Ron Kimmel (Technion - Haifa, IL)
- Tatsuro Koizumi (University of York, GB)
- Adam Kortylewski (Johns Hopkins Univ. - Baltimore, US)
- Yeara Kozlov (ETH Zürich, CH)
- Andreas Morel-Forster (Universität Basel, CH)
- Nick Pears (University of York, GB)
- Gerard Pons-Moll (MPI für Informatik - Saarbrücken, DE)
- Emanuele Rodolà (Sapienza University of Rome, IT)
- Sami Romdhani (IDEMIA, FR)
- Javier Romero (Amazon Research - Barcelona, ES)
- Shunsuke Saito (USC - Los Angeles, US)
- William Smith (University of York, GB)
- Ayush Tewari (MPI für Informatik - Saarbrücken, DE)
- Christian Theobalt (MPI für Informatik - Saarbrücken, DE)
- Thomas Vetter (Universität Basel, CH)
- Stefanie Wuhrer (INRIA - Grenoble, FR)
- Michael Zollhöfer (Stanford University, US)
- Silvia Zuffi (IMATI - Milano, IT)
Classification
- computer graphics / computer vision
Keywords
- 3D Computer Vision
- Computer Graphics
- Statistical Modelling
- Analysis-by-Synthesis
- Generative Models