
Improved Learning of Dynamics Models for Control

  • Conference paper
2016 International Symposium on Experimental Robotics (ISER 2016)

Part of the book series: Springer Proceedings in Advanced Robotics (SPAR, volume 1)


Abstract

Model-based reinforcement learning (MBRL) plays an important role in developing control strategies for robotic systems. However, when dealing with complex platforms, it is difficult to model system dynamics with analytic models. While data-driven tools offer an alternative to tackle this problem, collecting data on physical systems is non-trivial. Hence, smart solutions are required to effectively learn dynamics models from a small number of examples. In this paper, we present an extension to Data As Demonstrator for handling controlled dynamics in order to improve the multiple-step prediction capabilities of the learned dynamics models. Results show the efficacy of our algorithm in developing LQR, iLQR, and open-loop trajectory-based control strategies on simulated benchmarks as well as physical robot platforms.
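To make the idea in the abstract concrete, the following is a minimal, hypothetical sketch of a Data As Demonstrator-style training loop extended with control inputs: a one-step dynamics model is fit on observed transitions, rolled forward along each training trajectory using the recorded controls, and the resulting predicted states are paired with the true next states as corrective training examples. The function name dad_train, the Ridge regressor, and the trajectory layout are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' released code) of a Data-as-Demonstrator-style
# training loop with control inputs. Assumed layout: each trajectory is a pair
# (states, controls) with states of shape (T, d_x), controls of shape (T, d_u),
# and states[t + 1] following from (states[t], controls[t]).
import numpy as np
from sklearn.linear_model import Ridge


def dad_train(trajectories, n_iters=10):
    """Learn a one-step dynamics model f(x_t, u_t) -> x_{t+1} whose multi-step
    rollouts are improved by aggregating corrective examples."""
    # Initial dataset: observed one-step transitions from every trajectory.
    X = np.vstack([np.hstack([s[:-1], u[:-1]]) for s, u in trajectories])
    Y = np.vstack([s[1:] for s, _ in trajectories])
    model = Ridge(alpha=1e-3).fit(X, Y)

    for _ in range(n_iters):
        new_X, new_Y = [], []
        for states, controls in trajectories:
            x_hat = states[0]
            for t in range(len(states) - 2):
                # Roll the learned model forward with the recorded controls.
                x_hat = model.predict(np.hstack([x_hat, controls[t]])[None, :])[0]
                # Corrective example: the predicted state at time t+1, paired
                # with the recorded control u_{t+1}, should map back onto the
                # true next state x_{t+2}.
                new_X.append(np.hstack([x_hat, controls[t + 1]]))
                new_Y.append(states[t + 2])
        # Aggregate the corrective data and refit (DAgger-style aggregation).
        X = np.vstack([X, np.array(new_X)])
        Y = np.vstack([Y, np.array(new_Y)])
        model = Ridge(alpha=1e-3).fit(X, Y)
    return model
```

In the setting described in the abstract, a model learned this way would then be used inside LQR, iLQR, or open-loop trajectory optimization; the ridge regressor above is only a placeholder for whatever function class the learner actually employs.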


Notes

  1. Trajectories can be sub-sampled shorter than the control problem’s time horizon.

  2. Simulators, except the helicopter, available at https://github.com/webrot9/control_simulators with C++ and Python APIs.


Acknowledgements

This material is based upon work supported in part by: National Science Foundation Graduate Research Fellowship Grant No. DGE-1252522, National Science Foundation NRI Purposeful Prediction Award No. 1227234, and ONR contract N000141512365. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Author information


Corresponding author

Correspondence to Arun Venkatraman.



Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Venkatraman, A., Capobianco, R., Pinto, L., Hebert, M., Nardi, D., Bagnell, J.A. (2017). Improved Learning of Dynamics Models for Control. In: Kulić, D., Nakamura, Y., Khatib, O., Venture, G. (eds) 2016 International Symposium on Experimental Robotics. ISER 2016. Springer Proceedings in Advanced Robotics, vol 1. Springer, Cham. https://doi.org/10.1007/978-3-319-50115-4_61


  • DOI: https://doi.org/10.1007/978-3-319-50115-4_61


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-50114-7

  • Online ISBN: 978-3-319-50115-4

  • eBook Packages: Engineering, Engineering (R0)
