Projects

Phase 2 Projects

Hyperparameter Learning Across Problems


Project leaders: Lars Schmidt-Thieme (Hildesheim)

Researchers: Nico Schilling (Hildesheim), Martin Wistuba (Hildesheim)

Administration:

Associates:

Summary:

Autonomous learning in general aims to give systems and machines more control over their own learning processes. Accordingly, the major strands of autonomous learning pursue this goal (i) through active learning [Settles, 2009], which exploits control over which samples of a specific problem to draw next, and (ii) through reinforcement learning [Taylor and Stone, 2009], semi-supervised learning [Zhu, 2006], or unsupervised learning, especially deep learning [Bengio, 2009], which rely less on expert supervision or avoid it altogether. A third strand of research (iii) exploits similarities between related learning problems, either by carrying over models from one problem to another in transfer learning [Pan and Yang, 2010] or by learning joint models for many related problems (e.g., Bayesian networks in general, Markov logic networks [Richardson and Domingos, 2006], and multi-relational models with several target variables [Drumond et al., 2013]). When models are to be learned for several, possibly many, possibly related learning problems, learning problem-specific models, and thus in particular learning model parameters, cannot be avoided.
The major bottleneck for learning many models for many problems autonomously, however, is hyperparameter selection and, more generally, model selection: state-of-the-art approaches such as grid search and random hyperparameter sampling require many runs of the learning algorithm and thus usually have to be conducted on a compute cluster rather than on a resource-restricted platform such as a robot, a car, or a mobile phone, which heavily limits the autonomy of the learning system. (Using a cloud service is often not an option either, as the data is local and communication bandwidth is restricted.)
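To make the scale of this bottleneck concrete, a quick back-of-the-envelope count (with purely illustrative grid sizes, not figures from the project) shows how many learner runs a plain grid search with cross-validation already needs for a single problem:

```python
import math

# Purely illustrative numbers: learner runs needed by an exhaustive grid search.
grid_sizes = [10, 10, 10]   # e.g. 10 candidate values each for three hyperparameters
cv_folds = 5                # 5-fold cross-validation per configuration

configurations = math.prod(grid_sizes)    # 1000 hyperparameter configurations
learner_runs = configurations * cv_folds  # 5000 training runs for one problem
print(configurations, learner_runs)
```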
The project Hyperparameter Learning Across Problems targets the problem of hyperparameter learning and, more generally, model selection across different learning problems.
The envisioned final result of the project is (i) a hyperparameter recommendation model that, based on a priori characteristics of a new learning problem and a few meta-observations of the loss of a model under specific hyperparameter configurations, recommends further hyperparameter configurations to test (by learning a model with these hyperparameters and evaluating it on hold-out data), delivering performance comparable to state-of-the-art techniques after a few iterations while requiring one or several orders of magnitude fewer runs of the learning algorithm, and (ii) an active learning strategy for sampling the next hyperparameter configuration, which reduces the number of meta-samples required by the recommendation model even further. Such a hyperparameter recommendation model will enable learning systems to learn models for new and/or many problems autonomously, with restricted resources and much faster than today.
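The following sketch illustrates how such a recommendation loop could operate. It is not the project's actual model but a minimal stand-in: it assumes a generic surrogate (a scikit-learn RandomForestRegressor, used purely as a placeholder) trained on concatenated problem meta-features and hyperparameter values to predict the hold-out loss, and a greedy lowest-predicted-loss rule in place of the active learning strategy of item (ii); all names and arguments are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def recommend_hyperparameters(meta_features, candidates, evaluate,
                              n_init=3, n_iterations=10, seed=0):
    """Hypothetical sketch of a hyperparameter recommendation loop.

    meta_features: 1-d array of a priori characteristics of the new problem.
    candidates:    2-d array, one candidate hyperparameter configuration per row.
    evaluate:      callable that trains a model with a configuration and returns
                   its hold-out loss (the expensive step whose calls we want to save).
    """
    rng = np.random.default_rng(seed)
    untried = list(range(len(candidates)))
    tried, losses = [], []

    def featurize(i):
        # Predictor variables of the surrogate: problem characteristics + configuration.
        return np.concatenate([meta_features, candidates[i]])

    # Seed the surrogate with a few random meta-observations of the loss.
    for idx in rng.choice(untried, size=n_init, replace=False).tolist():
        untried.remove(idx)
        tried.append(idx)
        losses.append(evaluate(candidates[idx]))

    for _ in range(min(n_iterations, len(untried))):
        surrogate = RandomForestRegressor(n_estimators=100, random_state=seed)
        surrogate.fit(np.array([featurize(i) for i in tried]), np.array(losses))

        # Greedy active step: test the untried configuration with lowest predicted loss.
        predictions = surrogate.predict(np.array([featurize(i) for i in untried]))
        nxt = untried.pop(int(np.argmin(predictions)))
        tried.append(nxt)
        losses.append(evaluate(candidates[nxt]))

    # Recommend the best configuration observed so far.
    return candidates[tried[int(np.argmin(losses))]]
```

An uncertainty-aware acquisition criterion such as expected improvement is a natural candidate for the active learning strategy in (ii), since it balances exploiting promising configurations against exploring uncertain ones.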
We aim to build such a hyperparameter recommendation model by generalizing hyperparameter performance across many different problems. To this end, learning problems have to be characterized so that they can serve as predictor variables in the recommendation model. One of the key ideas of our approach is to learn such a problem characterization within a latent variable model, in particular a multi-relational factorization model.
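One way to picture the latent problem characterization is a plain two-way factorization of the problems x hyperparameter-configurations performance matrix. The sketch below is a deliberate simplification of the multi-relational setting (a single relation, squared loss, plain SGD, illustrative rank and learning rate): it learns latent factors for problems and configurations whose inner product approximates the observed losses, so the factors of a problem act as its learned characterization.

```python
import numpy as np

def factorize_performance(observations, n_problems, n_configs,
                          rank=5, learning_rate=0.01, reg=0.01,
                          epochs=200, seed=0):
    """Simplified sketch: factorize observed losses so that loss[p, c] ~ U[p] . V[c].

    observations: list of (problem_index, config_index, observed_loss) triples.
    The project's multi-relational model is richer; rank, learning rate, and
    regularization here are illustrative only.
    """
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_problems, rank))  # latent problem factors
    V = 0.1 * rng.standard_normal((n_configs, rank))   # latent configuration factors

    for _ in range(epochs):
        for p, c, y in observations:
            error = U[p] @ V[c] - y
            # SGD step on the regularized squared error for this observation.
            grad_u = error * V[c] + reg * U[p]
            grad_v = error * U[p] + reg * V[c]
            U[p] -= learning_rate * grad_u
            V[c] -= learning_rate * grad_v

    # U[p] is the learned characterization of problem p; for a genuinely new
    # problem, its factors can be fitted from a few meta-observations with V
    # held fixed, and the predicted loss of configuration c is then U_new @ V[c].
    return U, V
```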
