Hierarchical relative entropy policy search

C Daniel, G Neumann, J Peters - Artificial Intelligence and …, 2012 - proceedings.mlr.press
Artificial Intelligence and Statistics, 2012
Abstract
Many real-world problems are inherently hierarchically structured. The use of this structure in an agent's policy may well be the key to improved scalability and higher performance. However, such hierarchical structures cannot be exploited by current policy search algorithms. We will concentrate on a basic, but highly relevant hierarchy: the 'mixed option' policy. Here, a gating network first decides which of the options to execute and, subsequently, the option-policy determines the action. In this paper, we reformulate learning a hierarchical policy as a latent variable estimation problem and subsequently extend the Relative Entropy Policy Search (REPS) to the latent variable case. We show that our Hierarchical REPS can learn versatile solutions while also showing an increased performance in terms of learning speed and quality of the found policy in comparison to the non-hierarchical approach.
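To make the 'mixed option' structure concrete, the following is a minimal sketch of sampling from such a policy and of treating the chosen option as a latent variable. It is not the paper's implementation: the softmax gating, the linear-Gaussian option policies, and all names (`gating_weights`, `option_gains`, `option_stds`, `option_responsibilities`) are assumptions chosen for illustration, and the REPS update itself is not shown.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def sample_action(state, gating_weights, option_gains, option_stds, rng):
    """Two-stage sampling: the gating picks an option, the option picks an action.

    Assumed (hypothetical) parameterization:
      gating_weights: (n_options, state_dim), softmax gating p(o | s)
      option_gains:   (n_options, action_dim, state_dim), linear mean of p(a | s, o)
      option_stds:    (n_options,), isotropic Gaussian noise per option
    """
    probs = softmax(gating_weights @ state)            # p(o | s)
    option = rng.choice(len(probs), p=probs)           # gating decision
    mean = option_gains[option] @ state                # option-policy mean
    action = mean + option_stds[option] * rng.standard_normal(mean.shape)
    return option, action

def option_responsibilities(state, action, gating_weights, option_gains, option_stds):
    """Posterior p(o | s, a) when the option that produced the action is unobserved.

    Uses Bayes' rule, p(o | s, a) proportional to p(o | s) * p(a | s, o),
    evaluated in log space for numerical stability.
    """
    log_prior = np.log(softmax(gating_weights @ state))
    means = option_gains @ state                       # (n_options, action_dim)
    dim = action.shape[0]
    sq_err = np.sum((action - means) ** 2, axis=1)
    log_lik = (-0.5 * sq_err / option_stds ** 2
               - dim * np.log(option_stds)
               - 0.5 * dim * np.log(2.0 * np.pi))
    return softmax(log_prior + log_lik)
```

Under these assumptions, the responsibilities play the role of the latent-variable estimate: given only a state and an action, they indicate which option most plausibly generated the action.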