Projects

Phase 2 Projects

Auto-Tune: Structural optimization of Machine Learning frameworks for large datasets


Project leaders: Thomas Brox (Freiburg), Philipp Hennig (Tübingen), Frank Hutter (Freiburg)

Researchers: Aaron Klein (Freiburg), Matthias Feurer (Freiburg)

Administration: Jasmin Anders (Freiburg), Petra Geiger (Freiburg)

Associates:

Summary:

Machine learning has achieved considerable success in recent years, and an ever-growing number of disciplines now rely on it. But while a core premise of the field is the automation and autonomy of computer algorithms, building a good machine learning algorithm for a particular task still crucially requires human machine learning experts, who select appropriate features, algorithms, and hyperparameters. For the increasing number of machine learning users, it can be challenging to make the right choices when faced with these degrees of freedom. Non-expert users in particular often make suboptimal choices, such as selecting algorithms based on reputation or intuitive appeal and/or leaving hyperparameters at their default values.
The recent Auto-WEKA approach [Thornton et al., 2013], co-authored by applicant Frank Hutter, has demonstrated that it is now possible to automatically and simultaneously choose a feature selection strategy, a learning algorithm, and its hyperparameters in the WEKA framework [Hall et al., 2009] so as to optimize empirical performance. Auto-WEKA formalizes this as a high-dimensional hyperparameter optimization problem with an unusually general notion of hyperparameters: the choice of model is itself a discrete-valued hyperparameter, while each individual model's hyperparameters are conditional in the sense that they are only relevant if that model is selected. Likewise, a top-level hyperparameter governs the choice between different feature selection methods, each of which has its own conditional hyperparameters.
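To make the notion of a conditional search space concrete, the following minimal Python sketch encodes a toy space with a top-level choice between two scikit-learn classifiers, each with its own conditional hyperparameters, and shows how a random configuration is drawn from it. The classifiers, ranges, and function names are chosen purely for illustration and are not part of Auto-WEKA (which builds on WEKA's Java implementations):

import math
import random

from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Illustrative conditional search space: "classifier" is the top-level categorical
# hyperparameter; all other hyperparameters are only active ("conditional")
# when their parent classifier is selected.
SEARCH_SPACE = {
    "classifier": ["random_forest", "svm"],
    "random_forest": {"n_estimators": (10, 500), "max_depth": (2, 20)},  # integer ranges
    "svm": {"C": (1e-3, 1e3), "gamma": (1e-4, 1e1)},                     # log-uniform ranges
}

def sample_configuration(space):
    """Draw one random configuration, instantiating only the hyperparameters
    that are conditional on the chosen classifier."""
    choice = random.choice(space["classifier"])
    config = {"classifier": choice}
    if choice == "random_forest":
        for name, (lo, hi) in space["random_forest"].items():
            config[name] = random.randint(lo, hi)
    else:
        for name, (lo, hi) in space["svm"].items():
            config[name] = 10 ** random.uniform(math.log10(lo), math.log10(hi))
    return config

def build_model(config):
    """Turn a configuration into an (untrained) scikit-learn estimator."""
    if config["classifier"] == "random_forest":
        return RandomForestClassifier(n_estimators=config["n_estimators"],
                                      max_depth=config["max_depth"])
    return SVC(C=config["C"], gamma=config["gamma"])

config = sample_configuration(SEARCH_SPACE)
model = build_model(config)  # would then be trained and validated on the data at hand
print(config)

A hyperparameter optimizer repeatedly proposes such configurations (randomly in this sketch, model-guided in practice), trains the corresponding model, and keeps the configuration with the best validation performance.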
As the feasibility of Auto-WEKA demonstrates, modern Bayesian optimization methods based on random forest models [Hutter et al., 2011] can effectively search this high-dimensional structured space and automatically choose feature selection methods, algorithms, and hyperparameters.
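The core loop behind such methods is sequential model-based optimization: fit a surrogate model to the configurations evaluated so far, use an acquisition function to pick the most promising configuration to try next, evaluate it, and repeat. The following self-contained Python sketch illustrates this loop with a random-forest surrogate and an expected-improvement acquisition function on a made-up two-dimensional toy objective; it is a simplified illustration of the idea, not the SMAC implementation of [Hutter et al., 2011]:

import numpy as np
from scipy.stats import norm
from sklearn.ensemble import RandomForestRegressor

def objective(x):
    # Stand-in for an expensive measurement such as cross-validation error
    # (the formula is made up for this sketch).
    return (x[0] - 0.3) ** 2 + 0.1 * np.sin(15 * x[0]) + (x[1] - 0.7) ** 2

rng = np.random.default_rng(0)
dim = 2

# Initial design: a handful of random configurations and their observed errors.
X = rng.uniform(0, 1, size=(5, dim))
y = np.array([objective(x) for x in X])

for _ in range(20):
    # Fit the random-forest surrogate to all observations so far.
    forest = RandomForestRegressor(n_estimators=50).fit(X, y)

    # Score a pool of random candidate configurations with expected improvement;
    # the spread over the forest's trees serves as a crude predictive uncertainty.
    candidates = rng.uniform(0, 1, size=(500, dim))
    per_tree = np.stack([tree.predict(candidates) for tree in forest.estimators_])
    mu, sigma = per_tree.mean(axis=0), per_tree.std(axis=0) + 1e-9
    best = y.min()
    z = (best - mu) / sigma
    ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

    # Evaluate the most promising candidate and add the result to the data.
    x_next = candidates[np.argmax(ei)]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next))

print("best configuration found:", X[np.argmin(y)], "with error", y.min())

In Auto-WEKA the objective is the cross-validation error of the full machine learning pipeline described by a configuration, which is why each evaluation can be expensive and why the number of required evaluations matters so much for large datasets.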
While Auto-WEKA provides an effective way for non-experts to achieve state-of-the-art performance in terms of human time spent, it can be costly in terms of machine time, as it requires the evaluation of thousands of configurations to reliably home in on a good region of the space. These computational demands can be tolerated for modestly sized datasets and for model classes that are inexpensive to train, but they are unacceptable for large datasets and/or computationally expensive models, such as deep belief networks [Hinton et al., 2006] and convolutional networks [LeCun et al., 1998]. The main goal of this project is to overcome these cost limitations and to make the automatic instantiation of complex model families practical for large datasets.
We cover necessary background information on the following areas: Bayesian optimization and Entropy Search, meta-learning, hyperparameter optimization, and hyperparameter optimization in computer vision.
