Abstract
Model-based reinforcement learning (MBRL) plays an important role in developing control strategies for robotic systems. However, for complex platforms it is difficult to capture the system dynamics with analytic models. While data-driven tools offer an alternative for tackling this problem, collecting data on physical systems is non-trivial. Hence, smart solutions are required to learn dynamics models effectively from a small number of examples. In this paper we present an extension of Data as Demonstrator that handles controlled dynamics, improving the multiple-step prediction capabilities of the learned dynamics models. Results show the efficacy of our algorithm in developing LQR, iLQR, and open-loop trajectory-based control strategies on simulated benchmarks as well as physical robot platforms.
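To give a flavor of the idea, the Data-as-Demonstrator scheme for controlled systems can be sketched as follows: fit a one-step model on observed (state, control, next-state) triples, roll the learned model along each training trajectory with the recorded controls, and aggregate corrective examples that map the model's own (drifting) predicted state, paired with the executed control, back to the true next state. This is a minimal illustrative sketch with a linear least-squares model, not the paper's implementation; the function names and trajectory format are assumptions.

```python
import numpy as np


def fit_linear_model(X, U, Y):
    """Least-squares fit of x_{t+1} ~= [x_t, u_t] @ W."""
    Z = np.hstack([X, U])
    W, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    return W  # shape (dim_x + dim_u, dim_x)


def dad_controlled(trajs, n_iters=3):
    """DaD-style training loop for controlled dynamics (sketch).

    trajs: list of (states, controls) pairs, where states has shape
    (T+1, dim_x), controls has shape (T, dim_u), and states[t+1] is
    the true successor of (states[t], controls[t]).
    """
    # Initial one-step dataset from the raw trajectories.
    X = np.vstack([s[:-1] for s, u in trajs])
    U = np.vstack([u for s, u in trajs])
    Y = np.vstack([s[1:] for s, u in trajs])
    W = fit_linear_model(X, U, Y)

    for _ in range(n_iters):
        newX, newU, newY = [], [], []
        for states, controls in trajs:
            xhat = states[0]
            for t in range(len(controls)):
                # Corrective example: the model's current (possibly
                # drifted) state plus the executed control should map
                # to the ground-truth next state.
                newX.append(xhat)
                newU.append(controls[t])
                newY.append(states[t + 1])
                # Advance the learned model, not the true system.
                xhat = np.hstack([xhat, controls[t]]) @ W
        # Aggregate the corrections and refit.
        X = np.vstack([X, newX])
        U = np.vstack([U, newU])
        Y = np.vstack([Y, newY])
        W = fit_linear_model(X, U, Y)
    return W
```

The aggregation step is what targets multi-step accuracy: ordinary one-step regression never sees the distribution of states the model visits under its own rollouts, whereas the corrective pairs explicitly train the model to recover from its accumulated prediction error.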
Notes
1. Trajectories can be sub-sampled to lengths shorter than the control problem's time horizon.
2. All simulators except the helicopter are available at https://github.com/webrot9/control_simulators with C++ and Python APIs.
References
Schaal, S., et al.: Learning from demonstration. In: NIPS, pp. 1040–1046 (1997)
Bakker, B., Zhumatiy, V., Gruener, G., Schmidhuber, J.: Quasi-online reinforcement learning for robots. In: ICRA, pp. 2997–3002 (2006)
Hester, T., Quinlan, M., Stone, P.: RTMBA: A real-time model-based reinforcement learning architecture for robot control. In: ICRA, pp. 85–90 (2012)
Thrun, S.: An approach to learning mobile robot navigation. RAS 15(4), 301–319 (1995)
Matarić, M.J.: Reinforcement learning in the multi-robot domain. In: Arkin, R.C., Bekey, G.A. (eds.) Robot Colonies, pp. 73–83. Springer, New York (1997). doi:10.1007/978-1-4757-6451-2_4
Duan, Y., Liu, Q., Xu, X.: Application of reinforcement learning in robot soccer. Eng. Appl. Artif. Intell. 20(7), 936–950 (2007)
Konidaris, G., Kuindersma, S., Grupen, R., Barto, A.: Robot learning from demonstration by constructing skill trees. IJRR 0278364911428653 (2011)
Ko, J., Klein, D.J., Fox, D., Haehnel, D.: GP-UKF: Unscented Kalman filters with Gaussian process prediction and observation models. In: IROS, pp. 1901–1907 (2007)
Bagnell, J.A., Schneider, J.G.: Autonomous helicopter control using reinforcement learning policy search methods. In: ICRA, vol. 2, pp. 1615–1620 (2001)
Venkatraman, A., Hebert, M., Bagnell, J.A.: Improving multi-step prediction of learned time series models. In: AAAI, pp. 3024–3030 (2015)
Van Overschee, P., De Moor, B.: N4SID: Subspace algorithms for the identification of combined deterministic-stochastic systems. Automatica 30(1), 75–93 (1994)
Ghahramani, Z., Roweis, S.T.: Learning nonlinear dynamical systems using an EM algorithm. In: NIPS, pp. 431–437 (1999)
Siddiqi, S.M., Boots, B., Gordon, G.J.: A constraint generation approach to learning stable linear dynamical systems. In: NIPS (2007)
Van Overschee, P., De Moor, B.: Subspace identification for linear systems: theory implementation applications. Springer Science & Business Media, New York (2012)
Venkatraman, A., Boots, B., Hebert, M., Bagnell, J.A.: Data as demonstrator with applications to system identification. In: ALR Workshop, NIPS (2014)
Abbeel, P., Ng, A.Y.: Exploration and apprenticeship learning in reinforcement learning. In: ICML, pp. 1–8. ACM (2005)
Deisenroth, M., Rasmussen, C.E.: PILCO: a model-based and data-efficient approach to policy search. In: ICML, pp. 465–472 (2011)
Ross, S., Bagnell, D.: Agnostic system identification for model-based reinforcement learning. In: ICML, pp. 1703–1710 (2012)
Heess, N., Wayne, G., Silver, D., Lillicrap, T., Erez, T., Tassa, Y.: Learning continuous control policies by stochastic value gradients. In: NIPS, pp. 2926–2934 (2015)
Abbeel, P., Ganapathi, V., Ng, A.Y.: Learning vehicular dynamics, with application to modeling helicopters. In: NIPS, pp. 1–8 (2005)
Müller, K.-R., Smola, A.J., Rätsch, G., Schölkopf, B., Kohlmorgen, J., Vapnik, V.: Predicting time series with support vector machines. In: Gerstner, W., Germond, A., Hasler, M., Nicoud, J.-D. (eds.) ICANN 1997. LNCS, vol. 1327, pp. 999–1004. Springer, Heidelberg (1997). doi:10.1007/BFb0020283
Li, W., Todorov, E.: Iterative linear quadratic regulator design for nonlinear biological movement systems. In: ICINCO, vol. 1, pp. 222–229 (2004)
Rahimi, A., Recht, B.: Random features for large-scale kernel machines. In: NIPS (2007)
Acknowledgements
This material is based upon work supported in part by: National Science Foundation Graduate Research Fellowship Grant No. DGE-1252522, National Science Foundation NRI Purposeful Prediction Award No. 1227234, and ONR contract N000141512365. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Copyright information
© 2017 Springer International Publishing AG
Cite this paper
Venkatraman, A., Capobianco, R., Pinto, L., Hebert, M., Nardi, D., Bagnell, J.A. (2017). Improved Learning of Dynamics Models for Control. In: Kulić, D., Nakamura, Y., Khatib, O., Venture, G. (eds) 2016 International Symposium on Experimental Robotics. ISER 2016. Springer Proceedings in Advanced Robotics, vol 1. Springer, Cham. https://doi.org/10.1007/978-3-319-50115-4_61
Print ISBN: 978-3-319-50114-7
Online ISBN: 978-3-319-50115-4