Dagstuhl Seminar 15361
Mathematical and Computational Foundations of Learning Theory
(Aug 30 – Sep 4, 2015)
Organizers
- Matthias Hein (Universität des Saarlandes, DE)
- Gábor Lugosi (UPF - Barcelona, ES)
- Lorenzo Rosasco (MIT - Cambridge, US)
Contact
- Marcel R. Ackermann (for scientific matters)
- Dagmar Glaser (for administrative matters)
Impacts
- Luc Devroye, László Györfi, Gábor Lugosi, Harro Walk: On the measure of Voronoi cells. Journal of Applied Probability 54(2), pp. 394–408. Cambridge University Press, 2017.
- Anna Ben-Hamou, Stéphane Boucheron, Élisabeth Gassiat: Pattern Coding Meets Censoring: (almost) Adaptive Coding on Countable Alphabets. arXiv.org, 2016. 24 pp.
Motivation
Machine learning is nowadays a central field in computer science. Over the last decade the statistical learning approach has been successfully applied in many areas, including bioinformatics, computer vision, robotics, and information retrieval.
This is the second seminar on the mathematical and computational foundations of learning theory. Since the last seminar in 2011, new challenges have emerged, largely driven by the availability of datasets of unprecedented size and complexity. It is now common in many applied domains of science and technology to work with datasets comprising thousands or even millions of data points, features, and attributes/categories. The need to analyze and extract information from such data has raised a host of new challenges and open questions. The 2015 Dagstuhl seminar will therefore focus on the efficient analysis of complex large-scale datasets. In particular, we will address the following topics:
While statistical modeling and computational aspects have long been considered separate steps in the design of learning algorithms, dealing effectively with big data requires new strategies in which statistical and computational complexity are taken into account simultaneously. In other words, the trade-off between optimization error and generalization error has to be exploited. On the other hand, it has recently been observed that several non-convex, NP-hard learning problems (sparse recovery, compressed sensing, dictionary learning, matrix factorization, etc.) can be solved efficiently and globally optimally under conditions on the data or the chosen model, or with additional constraints. The goal of this seminar will be a broad discussion of these two topics as well as of possible connections between them.
Data representation (e.g. the choice of kernels or features) is widely acknowledged to be the crucial step in solving learning problems. Provided with a suitable data representation and enough labeled data, supervised algorithms such as Support Vector Machines or Boosting can achieve good generalization performance. While data representations are often designed ad hoc for specific problems, the availability of huge amounts of unlabeled data has recently motivated the development of data-driven techniques, e.g. dictionary learning, that solve the problem adaptively. Indeed, although novel tools for efficient data labeling have been developed (e.g. Amazon Mechanical Turk, mturk.com), most available data are unlabeled, and reducing the amount of (human) supervision needed to solve a task effectively remains an important open challenge. While the theory of supervised learning has matured into an established field, an analogous theory of unsupervised and semi-supervised learning of data representations is still in its infancy, and progress is often assessed on a purely empirical basis. We therefore believe that the necessary next step is to put representation learning on a firm theoretical foundation.
Summary
Machine learning is nowadays a central field in computer science. Over the last decade the statistical learning approach has been successfully applied in many areas, including bioinformatics, computer vision, robotics, and information retrieval. We believe that the main reasons for the success of machine learning are its strong theoretical foundations and its multidisciplinary approach, integrating aspects of computer science, applied mathematics, and statistics, among others.
Two very successful conferences titled "Mathematical Foundations of Learning Theory", held in Barcelona in 2004 and in Paris in 2006, were inspired by this view of the foundations of machine learning. In 2011 the Dagstuhl seminar "Mathematical and Computational Foundations of Learning Theory" was organized in the same spirit, bringing together leading researchers from computer science and mathematics to discuss the state of the art and future challenges in machine learning. That seminar was the first to cover a wide range of facets of modern learning theory and was unanimously considered a success by the participants. Since 2011, new challenges have emerged, largely driven by the availability of datasets of unprecedented size and complexity. It is now common in many applied domains of science and technology to work with datasets comprising thousands or even millions of data points, features, and attributes/categories. For example, ImageNet (http://image-net.org) is a computer vision database for object recognition comprising one million images of one thousand object categories, and image representations often consist of several tens of thousands of features. Datasets of similar complexity are common in biology and information science (e.g. text classification). The need to analyze and extract information from such data has raised a host of new challenges and open questions.
The second Dagstuhl seminar on "Mathematical and Computational Foundations of Learning Theory" broadly covered recent developments in the area. The main focus was on two topics:
- Interplay between Optimization and Learning
While statistical modeling and computational aspects have long been considered separate steps in the design of learning algorithms, dealing effectively with big data requires new strategies in which statistical and computational complexity are taken into account simultaneously. In other words, the trade-off between optimization error and generalization error has to be exploited (a toy illustration of this trade-off is sketched after this list). On the other hand, it has recently been observed that several non-convex, NP-hard learning problems (sparse recovery, compressed sensing, dictionary learning, matrix factorization, etc.) can be solved efficiently and globally optimally under conditions on the data or the chosen model, or with additional constraints (see the second sketch after this list).
- Learning Data Representations
Data representation (e.g. the choice of kernels or features) is widely acknowledged to be the crucial step in solving learning problems. Provided with a suitable data representation and enough labeled data, supervised algorithms such as Support Vector Machines or Boosting can achieve good generalization performance. While data representations are often designed ad hoc for specific problems, the availability of huge amounts of unlabeled data has recently motivated the development of data-driven techniques, e.g. dictionary learning, that solve the problem adaptively (see the third sketch after this list). Indeed, although novel tools for efficient data labeling have been developed (e.g. Amazon Mechanical Turk, http://mturk.com), most available data are unlabeled, and reducing the amount of (human) supervision needed to solve a task effectively remains an important open challenge. While the theory of supervised learning has matured into an established field, an analogous theory of unsupervised and semi-supervised learning of data representations is still in its infancy, and progress is often assessed on a purely empirical basis.
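To make the optimization/generalization trade-off concrete, here is a minimal, self-contained sketch; it is our illustration, not material presented at the seminar, and the problem sizes, noise level, and iteration schedule are arbitrary choices. Gradient descent on a noisy least-squares problem keeps reducing the training (optimization) error, while the test (generalization) error first falls and then rises, so stopping the optimizer early acts as a regularizer.

```python
# Sketch: early stopping trades optimization error against generalization
# error. Gradient descent on least squares with noisy labels: the training
# error decreases monotonically, while the test error follows a U-shape.
import numpy as np

rng = np.random.default_rng(0)
n, d, sigma = 100, 80, 1.0                      # samples, dimension, label noise

X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d) / np.sqrt(d)    # planted weights, norm ~ 1
y = X @ w_true + sigma * rng.standard_normal(n) # noisy training labels
X_test = rng.standard_normal((5000, d))
y_test = X_test @ w_true                        # noiseless test labels

w = np.zeros(d)
step = 1.0 / np.linalg.norm(X, 2) ** 2          # safe step size for least squares
for t in range(1, 2001):
    w -= step * X.T @ (X @ w - y)               # gradient step on the training loss
    if t in (1, 10, 50, 200, 2000):
        train = np.mean((X @ w - y) ** 2)
        test = np.mean((X_test @ w - y_test) ** 2)
        print(f"iter {t:5d}   train MSE {train:.3f}   test MSE {test:.3f}")
```

With settings like these the test error typically bottoms out at an intermediate iteration while the training error keeps falling; the optimal stopping time is exactly where statistical and computational considerations balance.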
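The claim that certain non-convex, NP-hard problems become tractable under conditions on the data can be illustrated with sparse recovery. The following sketch, again our own illustration with arbitrary dimensions, runs iterative hard thresholding, a non-convex projected-gradient method; a Gaussian measurement matrix of this shape satisfies restricted-isometry-type conditions with high probability, under which the method typically recovers the planted sparse vector exactly.

```python
# Sketch: iterative hard thresholding (IHT) for sparse recovery, a
# non-convex problem that becomes tractable under conditions on the data
# (here a random Gaussian measurement matrix, which satisfies a restricted
# isometry property with high probability at these dimensions).
import numpy as np

rng = np.random.default_rng(0)
n, d, s = 200, 400, 10                          # measurements, dimension, sparsity

A = rng.standard_normal((n, d)) / np.sqrt(n)    # columns have roughly unit norm
x_true = np.zeros(d)
x_true[rng.choice(d, size=s, replace=False)] = rng.standard_normal(s)
y = A @ x_true                                  # noiseless measurements

def iht(A, y, s, n_iter=200):
    """Gradient step on ||y - Ax||^2 / 2, then projection onto the
    (non-convex) set of s-sparse vectors."""
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = x + A.T @ (y - A @ x)               # gradient step (unit step size)
        keep = np.argsort(np.abs(x))[-s:]       # indices of the s largest entries
        x_sparse = np.zeros_like(x)
        x_sparse[keep] = x[keep]
        x = x_sparse
    return x

x_hat = iht(A, y, s)
print("recovery error:", np.linalg.norm(x_hat - x_true))
```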
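Finally, a toy version of data-driven representation learning: the sketch below learns a dictionary from unlabeled data by alternating between sparse coding and a least-squares dictionary update. It is our illustration; in particular, the `sparse_code` step is a deliberately crude hard-thresholded least-squares coder rather than a proper sparse solver such as orthogonal matching pursuit.

```python
# Sketch: unsupervised dictionary learning by alternating minimization:
# (i) sparse-code the data in the current dictionary, (ii) update the
# dictionary by least squares. No labels are used anywhere.
import numpy as np

rng = np.random.default_rng(0)
d, k, s, m = 20, 30, 3, 2000                    # signal dim, atoms, sparsity, samples

# Synthetic unlabeled data: sparse combinations of k unknown atoms.
D_true = rng.standard_normal((d, k))
D_true /= np.linalg.norm(D_true, axis=0)
C_true = np.zeros((k, m))
for j in range(m):
    idx = rng.choice(k, size=s, replace=False)
    C_true[idx, j] = rng.standard_normal(s)
X = D_true @ C_true

def sparse_code(D, X, s):
    """Crude coder: least-squares coefficients, keeping only the s
    largest (in magnitude) per sample."""
    C = np.linalg.pinv(D) @ X
    thresh = np.sort(np.abs(C), axis=0)[-s]     # s-th largest value per column
    return np.where(np.abs(C) >= thresh, C, 0.0)

D = rng.standard_normal((d, k))                 # random initial dictionary
D /= np.linalg.norm(D, axis=0)
for _ in range(50):                             # alternating minimization
    C = sparse_code(D, X, s)
    D = X @ np.linalg.pinv(C)                   # least-squares dictionary update
    D /= np.linalg.norm(D, axis=0) + 1e-12      # renormalize the atoms

err = np.linalg.norm(X - D @ sparse_code(D, X, s)) / np.linalg.norm(X)
print("relative reconstruction error:", err)
```

Practical algorithms of this family (e.g. K-SVD or online dictionary learning) follow the same alternating pattern; the theoretical question raised above is when such non-convex alternations provably recover a good representation.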
The seminar featured a series of talks on both topics, presenting interesting and exciting new results that led to insights in both areas, as well as a great deal of discussion and interaction among the participants, which will certainly result in several follow-up papers. It also became clear during the seminar that the two topics are closely connected. Beyond these two main topics, several other aspects of learning theory were discussed, yielding a fairly complete picture of the current state of the art in the field.
Acknowledgements. We would like to thank Dagmar Glaser and the staff at Schloss Dagstuhl for their continuous support and great hospitality, which were the basis for the success of this seminar.
Participants
- Shivani Agarwal (Indian Institute of Science - Bangalore, IN) [dblp]
- Animashree Anandkumar (University of California - Irvine, US) [dblp]
- Peter L. Bartlett (University of California - Berkeley, US) [dblp]
- Shai Ben-David (University of Waterloo, CA) [dblp]
- Gilles Blanchard (Universität Potsdam, DE) [dblp]
- Stéphane Boucheron (Paris Diderot University, FR) [dblp]
- Sébastien Bubeck (Microsoft Research - Redmond, US) [dblp]
- Joachim M. Buhmann (ETH Zürich, CH) [dblp]
- Constantine Caramanis (University of Texas at Austin, US) [dblp]
- Sou-Cheng Choi (NORC - Chicago, US)
- Luc Devroye (McGill University - Montreal, CA) [dblp]
- Jack Fitzsimons (University of Oxford, GB)
- Antoine Gautier (Universität des Saarlandes, DE) [dblp]
- Rémi Gribonval (INRIA Rennes - Bretagne Atlantique, FR) [dblp]
- László Györfi (Budapest University of Technology & Economics, HU) [dblp]
- Moritz Hardt (Google Research - Mountain View, US) [dblp]
- Matthias Hein (Universität des Saarlandes, DE) [dblp]
- Prateek Jain (Microsoft Research India - Bangalore, IN) [dblp]
- Stefanie Jegelka (MIT - Cambridge, US) [dblp]
- Felix Krahmer (TU München, DE) [dblp]
- Andreas Krause (ETH Zürich, CH) [dblp]
- Lek-Heng Lim (University of Chicago, US) [dblp]
- Gábor Lugosi (UPF - Barcelona, ES) [dblp]
- Robert D. Nowak (University of Wisconsin - Madison, US) [dblp]
- Guillaume Obozinski (ENPC - Marne-la-Vallée, FR) [dblp]
- Duy Khanh Pham (Ho Chi Minh City University of Pedagogy, VN) [dblp]
- Lorenzo Rosasco (MIT - Cambridge, US) [dblp]
- Alessandro Rudi (MIT - Cambridge, US) [dblp]
- Sivan Sabato (Ben Gurion University - Beer Sheva, IL) [dblp]
- Karin Schnass (Universität Innsbruck, AT) [dblp]
- Dejan Slepčev (Carnegie Mellon University, US) [dblp]
- Nathan Srebro (TTIC - Chicago, US) [dblp]
- Yannik Stein (FU Berlin, DE) [dblp]
- Alexandre Tsybakov (UPMC - Paris, FR) [dblp]
- Ruth Urner (MPI für Intelligente Systeme - Tübingen, DE) [dblp]
- Silvia Villa (Italian Institute of Technology - Genova, IT) [dblp]
- Rachel Ward (University of Texas - Austin, US) [dblp]
- Colin White (Carnegie Mellon University, US) [dblp]
- Robert C. Williamson (Australian National University, AU) [dblp]
- Yuan Yao (Peking University, CN) [dblp]
- Ding-Xuan Zhou (City University - Hong Kong, HK) [dblp]
Related Seminars
- Dagstuhl Seminar 11291: Mathematical and Computational Foundations of Learning Theory (2011-07-17 - 2011-07-22)
Classification
- artificial intelligence / robotics
- data structures / algorithms / complexity
Keywords
- machine learning
- large scale data