Projects

Phase 2 Projects

Scalable Autonomous Reinforcement Learning - From scratch to less and less structure


Project leaders: Joschka Boedecker (Freiburg), Jan Peters (TU Darmstadt)

Researchers: Tobias Springenberg (Freiburg), Simone Parisi (TU Darmstadt)

Administration:

Associates: Martin Riedmiller

Summary:

Over the course of the last decade, the framework of reinforcement learning (RL) has developed into a promising tool for learning a large variety of different tasks in robotics. During this time, substantial progress has been made towards scaling reinforcement learning to high-dimensional systems and solving tasks of increasing complexity.
Unfortunately, this scalability has been achieved by using expert knowledge to pre-structure the learning problem along several dimensions. As a consequence, state-of-the-art methods in robot reinforcement learning generally depend on hand-crafted state representations, pre-structured parametrized policies, well-shaped reward functions, and demonstrations by a human expert to aid scaling of the learning algorithm. This large amount of required pre-structuring is arguably in stark contrast to the goal of developing autonomous reinforcement learning systems.
In this project, we want to advance the field by starting with a 'classical' reinforcement learning setting for a challenging robotic task (i.e., tetherball). Solving this task with RL methods will already be a valuable contribution. From there on, we will identify the components for which the design of the learning task still requires engineering experience.
To this end, we will develop systematic methods to increase the autonomy of the learning system by going beyond traditional approaches:

(1) proposing methods for learning state representations for reinforcement learning automatically;

(2) developing generic policy classes capable of representing the large variety of control
policies that are necessary for truly autonomous behaviour;

(3) discovering informative reward functions autonomously.
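To make point (1) concrete, the following is a minimal sketch, and not the project's actual method, of learning a compact state representation from raw observations with a tied-weight linear autoencoder trained by gradient descent; all data, dimensions, and rates here are illustrative assumptions:

```python
import numpy as np

# Illustrative sketch only: a tied-weight linear autoencoder that compresses
# raw observations into a low-dimensional learned state. Data, dimensions,
# and learning rate are assumptions made for this demo.

rng = np.random.default_rng(0)
obs_dim, latent_dim = 64, 4            # raw observation size, learned state size
X = rng.normal(size=(500, obs_dim))    # stand-in for raw sensory observations

W = rng.normal(scale=0.1, size=(obs_dim, latent_dim))  # tied encoder/decoder weights

def recon_loss(W):
    """Mean squared error of reconstructing X from its latent code X @ W."""
    return float(np.mean((X @ W @ W.T - X) ** 2))

loss_before = recon_loss(W)
lr = 1e-3
for _ in range(200):
    E = X @ W @ W.T - X                              # reconstruction error
    grad = 2 * (X.T @ E @ W + E.T @ X @ W) / len(X)  # d loss / d W (tied weights)
    W -= lr * grad
loss_after = recon_loss(W)

state = X[0] @ W   # learned 4-dimensional state for the first raw observation
```

A nonlinear encoder (e.g., a deep network) would replace the linear map in practice; the point here is only that the representation is obtained from reconstruction error rather than hand-crafted features.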

Progress in each of these aspects will lift the learning algorithm to a higher level of autonomy. The advances will be grounded in the well-established theoretical framework of policy search and enabled through improvements to state-of-the-art reinforcement learning algorithms. Ultimately, the resulting system should learn to map raw sensory inputs to raw control signals from simple, generic principles, discovering structure within its environment automatically and solving difficult control tasks without expert knowledge. If successful, both the complete methodology developed within this project and its sub-parts will help to establish a new, substantially more powerful generation of reinforcement learning algorithms capable of solving complicated robot control problems autonomously.
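As a concrete instance of the policy-search framework mentioned above, here is a hedged sketch of plain REINFORCE (a score-function policy gradient) with a one-dimensional Gaussian policy on a toy one-step task; the task, reward, and hyperparameters are illustrative assumptions, not the project's tetherball setup:

```python
import numpy as np

# Hedged sketch: REINFORCE with a Gaussian policy on a toy one-step task.
# The policy must discover the rewarding action purely from sampled returns;
# task, reward, and hyperparameters are assumptions made for this demo.

rng = np.random.default_rng(1)
theta, sigma = 0.0, 0.5   # policy mean (learned) and fixed exploration noise
target = 2.0              # unknown optimum the policy should discover
lr, baseline = 0.02, 0.0

for _ in range(5000):
    a = rng.normal(theta, sigma)   # sample an action from the current policy
    r = -(a - target) ** 2         # reward: closer to the target is better
    # Score-function update: grad log pi(a) = (a - theta) / sigma^2, with a
    # running-average baseline to reduce the variance of the gradient estimate.
    theta += lr * (r - baseline) * (a - theta) / sigma ** 2
    baseline += 0.05 * (r - baseline)
```

The baseline subtraction leaves the gradient estimate unbiased while cutting its variance, which is one of the standard improvements that episodic policy-search methods build on.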

Phase 1 Projects

Scalable Autonomous Reinforcement Learning - From scratch to less and less structure

