Deep Learning of Representations

Bengio, Yoshua; Courville, Aaron

doi:10.1007/978-3-642-36657-4_1

Yoshua Bengio⁴ &
Aaron Courville⁴

Part of the book series: Intelligent Systems Reference Library ((ISRL,volume 49))

4573 Accesses
20 Citations

Abstract

Unsupervised learning of representations has been found useful in many applications and benefits from several advantages, e.g., where there are many unlabeled examples and few labeled ones (semi-supervised learning), or where the unlabeled or labeled examples are from a distribution different but related to the one of interest (self-taught learning, multi-task learning, and domain adaptation). Some of these algorithms have successfully been used to learn a hierarchy of features, i.e., to build a deep architecture, either as initialization for a supervised predictor, or as a generative model. Deep learning algorithms can yield representations that are more abstract and better disentangle the hidden factors of variation underlying the unknown generating distribution, i.e., to capture invariances and discover non-local structure in that distribution. This chapter reviews the main motivations and ideas behind deep learning algorithms and their representation-learning components, as well as recent results in this area, and proposes a vision of challenges and hopes on the road ahead, focusing on the questions of invariance and disentangling.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 129.00; Price excludes VAT (USA)

Softcover Book: USD 169.99; Price excludes VAT (USA)

Hardcover Book: USD 169.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Bagnell, J.A., Bradley, D.M.: Differentiable sparse coding. In: Koller, D., Schuurmans, D., Bengio, Y., Bottou, L. (eds.) Advances in Neural Information Processing Systems 21 (NIPS 2008), pp. 113–120 (2009)
Google Scholar
Barron, A.E.: Universal approximation bounds for superpositions of a sigmoidal function. IEEE Trans. on Information Theory 39, 930–945 (1993)
Article MathSciNet MATH Google Scholar
Bartfeld, E., Grinvald, A.: Relationships between orientation-preference pinwheels, cytochrome oxidase blobs, and ocular-dominance columns in primate striate cortex. Proc. Nati. Acad. Sci. USA 89, 11905–11909 (1992)
Article Google Scholar
Becker, S., Hinton, G.E.: Learning mixture models of spatial coherence. Neural Computation 5, 267–277 (1993)
Article Google Scholar
Bengio, Y.: Learning deep architectures for AI. Foundations and Trends in Machine Learning 2(1), 1–127 (2009); also published as a book. Now Publishers (2009)
Article MathSciNet MATH Google Scholar
Bengio, Y., Delalleau, O.: Justifying and generalizing contrastive divergence. Neural Computation 21(6), 1601–1621 (2009)
Article MathSciNet MATH Google Scholar
Bengio, Y., Delalleau, O.: Shallow versus deep sum-product networks. In: The Learning Workshop, Fort Lauderdale, Florida (2011)
Google Scholar
Bengio, Y., LeCun, Y.: Scaling learning algorithms towards AI. In: Bottou, L., Chapelle, O., DeCoste, D., Weston, J. (eds.) Large Scale Kernel Machines. MIT Press (2007)
Google Scholar
Bengio, Y., Delalleau, O., Le Roux, N., Paiement, J.-F., Vincent, P., Ouimet, M.: Spectral Dimensionality Reduction. In: Guyon, I., Gunn, S., Nikravesh, M., Zadeh, L. (eds.) Feature Extraction, Foundations and Applications, vol. 207, pp. 519–550. Springer, Heidelberg (2006)
Chapter Google Scholar
Bengio, Y., Lamblin, P., Popovici, D., Larochelle, H.: Greedy layer-wise training of deep networks. In: Schölkopf, B., Platt, J., Hoffman, T. (eds.) Advances in Neural Information Processing Systems 19 (NIPS 2006), pp. 153–160. MIT Press (2007)
Google Scholar
Bengio, Y., Bastien, F., Bergeron, A., Boulanger-Lewandowski, N., Chherawala, Y., Cisse, M., Côté, M., Erhan, D., Eustache, J., Glorot, X., Muller, X., Pannetier-Lebeuf, S., Pascanu, R., Savard, F., Sicard, G.: Deep self-taught learning for handwritten character recognition. In: NIPS*2010 Deep Learning and Unsupervised Feature Learning Workshop (2010)
Google Scholar
Berkes, P., Wiskott, L.: Slow feature analysis yields a rich repertoire of complex cell properties. Journal of Vision 5(6), 579–602 (2005)
Article Google Scholar
Bourlard, H., Kamp, Y.: Auto-association by multilayer perceptrons and singular value decomposition. Biological Cybernetics 59, 291–294 (1988)
Article MathSciNet MATH Google Scholar
Braverman, M.: Poly-logarithmic independence fools bounded-depth boolean circuits. Communications of the ACM 54(4), 108–115 (2011)
Article MathSciNet Google Scholar
Breuleux, O., Bengio, Y., Vincent, P.: Quickly generating representative samples from an RBM-derived process. Neural Computation 23(8), 2058–2073 (2011)
Article MathSciNet Google Scholar
Bromley, J., Benz, J., Bottou, L., Guyon, I., Jackel, L., LeCun, Y., Moore, C., Sackinger, E., Shah, R.: Signature verification using a siamese time delay neural network. In: Advances in Pattern Recognition Systems using Neural Network Technologies, pp. 669–687. World Scientific, Singapore (1993)
Google Scholar
Cadieu, C., Olshausen, B.: Learning transformational invariants from natural movies. In: Koller, D., Schuurmans, D., Bengio, Y., Bottou, L. (eds.) Advances in Neural Information Processing Systems 21, pp. 209–216. MIT Press (2009)
Google Scholar
Cardoso, J.-F.: Multidimensional independent component analysis. In: Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 4, pp. 1941–1944 (1998)
Google Scholar
Chopra, S., Hadsell, R., LeCun, Y.: Learning a similarity metric discriminatively, with application to face verification. In: Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR 2005). IEEE Press (2005)
Google Scholar
Collobert, R., Weston, J.: A unified architecture for natural language processing: Deep neural networks with multitask learning. In: Cohen, W.W., McCallum, A., Roweis, S.T. (eds.) Proceedings of the Twenty-Fifth International Conference on Machine Learning (ICML 2008), pp. 160–167. ACM (2008)
Google Scholar
Courville, A., Bergstra, J., Bengio, Y.: A spike and slab restricted Boltzmann machine. In: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2011) (2011)
Google Scholar
Desjardins, G., Courville, A., Bengio, Y., Vincent, P., Delalleau, O.: Tempered Markov chain Monte-Carlo for training of restricted Boltzmann machine. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2010), pp. 145–152 (2010)
Google Scholar
Erhan, D., Courville, A., Bengio, Y.: Understanding representations learned in deep architectures. Technical Report 1355, Université de Montréal/DIRO (2010a)
Google Scholar
Erhan, D., Bengio, Y., Courville, A., Manzagol, P.-A., Vincent, P., Bengio, S.: Why does unsupervised pre-training help deep learning? Journal of Machine Learning Research 11, 625–660 (2010b)
MathSciNet MATH Google Scholar
Goodfellow, I., Le, Q., Saxe, A., Ng, A.: Measuring invariances in deep networks. In: Bengio, Y., Schuurmans, D., Williams, C., Lafferty, J., Culotta, A. (eds.) Advances in Neural Information Processing Systems 22 (NIPS 2009), pp. 646–654 (2009)
Google Scholar
Grimes, D.B., Rao, R.P.: Bilinear sparse coding for invariant vision. Neural Computation 17(1), 47–73 (2005)
Article Google Scholar
Gutmann, M., Hyvarinen, A.: Noise-contrastive estimation: A new estimation principle for unnormalized statistical models. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, AISTATS 2010 (2010)
Google Scholar
Hadsell, R., Chopra, S., LeCun, Y.: Dimensionality reduction by learning an invariant mapping. In: Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR 2006), pp. 1735–1742. IEEE Press (2006)
Google Scholar
Håstad, J.: Almost optimal lower bounds for small depth circuits. In: Proceedings of the 18th Annual ACM Symposium on Theory of Computing, Berkeley, California, pp. 6–20. ACM Press (1986)
Google Scholar
Håstad, J., Goldmann, M.: On the power of small-depth threshold circuits. Computational Complexity 1, 113–129 (1991)
Article MathSciNet MATH Google Scholar
Hinton, G.E.: Products of experts. In: Proceedings of the Ninth International Conference on Artificial Neural Networks (ICANN), Edinburgh, Scotland, vol. 1, pp. 1–6. IEE (1999)
Google Scholar
Hinton, G.E., Salakhutdinov, R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)
Article MathSciNet MATH Google Scholar
Hinton, G.E., Zemel, R.S.: Autoencoders, minimum description length, and Helmholtz free energy. In: Cowan, D., Tesauro, G., Alspector, J. (eds.) Advances in Neural Information Processing Systems 6 (NIPS 1993), pp. 3–10. Morgan Kaufmann Publishers, Inc. (1994)
Google Scholar
Hinton, G.E., Sejnowski, T.J., Ackley, D.H.: Boltzmann machines: Constraint satisfaction networks that learn. Technical Report TR-CMU-CS-84-119, Carnegie-Mellon University, Dept. of Computer Science (1984)
Google Scholar
Hinton, G.E., Osindero, S., Teh, Y.: A fast learning algorithm for deep belief nets. Neural Computation 18, 1527–1554 (2006)
Article MathSciNet MATH Google Scholar
Hubel, D.H., Wiesel, T.N.: Receptive fields of single neurons in the cat’s striate cortex. Journal of Physiology 148, 574–591 (1959)
Google Scholar
Hurri, J., Hyvärinen, A.: Temporal coherence, natural image sequences, and the visual cortex. In: Advances in Neural Information Processing Systems 15 (NIPS 2002), pp. 141–148 (2003)
Google Scholar
Hyvärinen, A.: Estimation of non-normalized statistical models using score matching. Journal of Machine Learning Research 6, 695–709 (2005)
MATH Google Scholar
Hyvärinen, A., Hoyer, P.: Emergence of phase and shift invariant features by decomposition of natural images into independent feature subspaces. Neural Computation 12(7), 1705–1720 (2000)
Article Google Scholar
Hyvärinen, A., Hoyer, P.O., Inki, M.O.: Topographic independent component analysis. Neural Computation 13(7), 1527–1558 (2001)
Article MATH Google Scholar
Jain, V., Seung, S.H.: Natural image denoising with convolutional networks. In: Koller, D., Schuurmans, D., Bengio, Y., Bottou, L. (eds.) Advances in Neural Information Processing Systems 21 (NIPS 2008), pp. 769–776 (2008)
Google Scholar
Jarrett, K., Kavukcuoglu, K., Ranzato, M., LeCun, Y.: What is the best multi-stage architecture for object recognition? In: Proc. International Conference on Computer Vision (ICCV 2009), pp. 2146–2153. IEEE (2009)
Google Scholar
Jenatton, R., Audibert, J.-Y., Bach, F.: Structured variable selection with sparsity-inducing norms. Technical report, arXiv:0904.3523 (2009)
Google Scholar
Jordan, M.I.: Learning in Graphical Models. Kluwer, Dordrecht (1998)
Book MATH Google Scholar
Kavukcuoglu, K., Ranzato, M., LeCun, Y.: Fast inference in sparse coding algorithms with applications to object recognition. Technical report, Computational and Biological Learning Lab, Courant Institute, NYU. Tech Report CBLL-TR-2008-12-01 (2008)
Google Scholar
Kavukcuoglu, K., Ranzato, M., Fergus, R., LeCun, Y.: Learning invariant features through topographic filter maps. In: Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR 2009), pp. 1605–1612. IEEE (2009)
Google Scholar
Kavukcuoglu, K., Sermanet, P., Boureau, Y.-L., Gregor, K., Mathieu, M., LeCun, Y.: Learning convolutional feature hierarchies for visual recognition. In: Advances in Neural Information Processing Systems 23 (NIPS 2010), pp. 1090–1098 (2010)
Google Scholar
Kingma, D., LeCun, Y.: Regularized estimation of image statistics by score matching. In: Lafferty, J., Williams, C.K.I., Shawe-Taylor, J., Zemel, R., Culotta, A. (eds.) Advances in Neural Information Processing Systems 23, pp. 1126–1134 (2010)
Google Scholar
Klampfl, S., Maass, W.: Replacing supervised classification learning by slow feature analysis in spiking neural networks. In: Bengio, Y., Schuurmans, D., Williams, C., Lafferty, J., Culotta, A. (eds.) Advances in Neural Information Processing Systems 22 (NIPS 2009), pp. 988–996 (2009)
Google Scholar
Kohonen, T.: The self-organizing map. Proceedings of the IEEE 78(9), 1464–1480 (1990)
Article Google Scholar
Kohonen, T.: Emergence of invariant-feature detectors in the adaptive-subspace self-organizing map. Biological Cybernetics 75, 281–291 (1996), doi:10.1007/s004220050295
Article MATH Google Scholar
Kohonen, T., Nemeth, G., Bry, K.-J., Jalanko, M., Riittinen, H.: Spectral classification of phonemes by learning subspaces. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 1979, vol. 4, pp. 97–100 (1979)
Google Scholar
Körding, K.P., Kayser, C., Einhäuser, W., König, P.: How are complex cell properties adapted to the statistics of natural stimuli? Journal of Neurophysiology 91, 206–212 (2004)
Article Google Scholar
Krizhevsky, A.: Convolutional deep belief networks on cifar-10 (2010) (unpublished manuscript) http://www.cs.utoronto.ca/~kriz/conv-cifar10-aug2010.pdf
Kurkova, V., Sanguineti, M.: Geometric upper bounds on rates of variable-basis approximation. IEEE Trans. on Information Theory 54, 5681–5688 (2008)
Article MathSciNet Google Scholar
Larochelle, H., Erhan, D., Bengio, Y.: Zero-data learning of new tasks. In: Proceedings of the 23rd National Conference on Artificial Intelligence, vol. 2, pp. 646–651. AAAI Press (2008)
Google Scholar
Le, Q., Ngiam, J., Chen, Z., Hao Chia, D.J., Koh, P.W., Ng, A.: Tiled convolutional neural networks. In: Lafferty, J., Williams, C.K.I., Shawe-Taylor, J., Zemel, R., Culotta, A. (eds.) Advances in Neural Information Processing Systems 23 (NIPS 2010), pp. 1279–1287 (2010)
Google Scholar
Le Roux, N., Bengio, Y.: Representational power of restricted Boltzmann machines and deep belief networks. Neural Computation 20(6), 1631–1649 (2008)
Article MathSciNet MATH Google Scholar
LeCun, Y., Boser, B., Denker, J.S., Henderson, D., Howard, R.E., Hubbard, W., Jackel, L.D.: Backpropagation applied to handwritten zip code recognition. Neural Computation 1(4), 541–551 (1989)
Article Google Scholar
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998)
Article Google Scholar
Lee, H., Grosse, R., Ranganath, R., Ng, A.Y.: Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In: Bottou, L., Littman, M. (eds.) Proceedings of the Twenty-Sixth International Conference on Machine Learning (ICML 2009). ACM, Montreal (2009a)
Google Scholar
Lee, H., Pham, P., Largman, Y., Ng, A.: Unsupervised feature learning for audio classification using convolutional deep belief networks. In: Bengio, Y., Schuurmans, D., Williams, C., Lafferty, J., Culotta, A. (eds.) Advances in Neural Information Processing Systems 22 (NIPS 2009), pp. 1096–1104 (2009b)
Google Scholar
Lee, J.A., Verleysen, M.: Nonlinear dimensionality reduction. Springer (2007)
Google Scholar
Manzagol, P.-A., Bertin-Mahieux, T., Eck, D.: On the use of sparse time-relative auditory codes for music. In: Proceedings of the 9th International Conference on Music Information Retrieval (ISMIR 2008), pp. 603–608 (2008)
Google Scholar
Olshausen, B., Field, D.J.: How close are we to understanding V1? Neural Computation 17, 1665–1699 (2005)
Article MATH Google Scholar
Olshausen, B.A., Field, D.J.: Sparse coding with an overcomplete basis set: a strategy employed by V1? Vision Research 37, 3311–3325 (1997)
Article Google Scholar
Olshausen, B.A., Anderson, C.H., Van Essen, D.C.: A neurobiological model of visual attention and invariant pattern recognition based on dynamic routing of information. J. Neurosci. 13(11), 4700–4719 (1993)
Google Scholar
Raina, R., Battle, A., Lee, H., Packer, B., Ng, A.Y.: Self-taught learning: transfer learning from unlabeled data. In: Ghahramani, Z. (ed.) Proceedings of the Twenty-Fourth International Conference on Machine Learning (ICML 2007), pp. 759–766. ACM (2007)
Google Scholar
Ranzato, M., Hinton, G.H.: Modeling pixel means and covariances using factorized third-order Boltzmann machines. In: Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR 2010), pp. 2551–2558. IEEE Press (2010)
Google Scholar
Ranzato, M., Poultney, C., Chopra, S., LeCun, Y.: Efficient learning of sparse representations with an energy-based model. In: Schölkopf, B., Platt, J., Hoffman, T. (eds.) Advances in Neural Information Processing Systems 19 (NIPS 2006), pp. 1137–1144. MIT Press (2007a)
Google Scholar
Ranzato, M., Poultney, C., Chopra, S., LeCun, Y.: Efficient learning of sparse representations with an energy-based model. In: NIPS 2006 (2007b)
Google Scholar
Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating errors. Nature 323, 533–536 (1986)
Article Google Scholar
Salakhutdinov, R.: Learning deep Boltzmann machines using adaptive MCMC. In: Bottou, L., Littman, M. (eds.) Proceedings of the Twenty-Seventh International Conference on Machine Learning (ICML 2010), vol. 1, pp. 943–950. ACM (2010)
Google Scholar
Salakhutdinov, R., Hinton, G.E.: Deep Boltzmann machines. In: Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS 2009), vol. 5, pp. 448–455 (2009)
Google Scholar
Salakhutdinov, R., Larochelle, H.: Efficient learning of deep Boltzmann machines. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2010), JMLR W&CP, vol. 9, pp. 693–700 (2010)
Google Scholar
Saul, L., Roweis, S.: Think globally, fit locally: unsupervised learning of low dimensional manifolds. Journal of Machine Learning Research 4, 119–155 (2002)
MathSciNet Google Scholar
Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization and Beyond. MIT Press, Cambridge (2002)
Google Scholar
Serre, T., Kreiman, G., Kouh, M., Cadieu, C., Knoblich, U., Poggio, T.: A quantitative theory of immediate visual recognition. Progress in Brain Research, Computational Neuroscience: Theoretical Insights into Brain Function 165, 33–56 (2007)
Article Google Scholar
Smith, E.C., Lewicki, M.S.: Efficient auditory coding. Nature 439(7079), 978–982 (2006)
Article Google Scholar
Smolensky, P.: Information processing in dynamical systems: Foundations of harmony theory. In: Rumelhart, D.E., McClelland, J.L. (eds.) Parallel Distributed Processing, ch. 6, vol. 1, pp. 194–281. MIT Press, Cambridge (1986)
Google Scholar
Taylor, G.W., Fergus, R., LeCun, Y., Bregler, C.: Convolutional Learning of Spatio-temporal Features. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part VI. LNCS, vol. 6316, pp. 140–153. Springer, Heidelberg (2010)
Chapter Google Scholar
Tenenbaum, J.B., Freeman, W.T.: Separating Style and Content with Bilinear Models. Neural Computation 12(6), 1247–1283 (2000)
Article Google Scholar
Tieleman, T.: Training restricted Boltzmann machines using approximations to the likelihood gradient. In: Cohen, W.W., McCallum, A., Roweis, S.T. (eds.) Proceedings of the Twenty-Fifth International Conference on Machine Learning (ICML 2008), pp. 1064–1071. ACM (2008)
Google Scholar
Tieleman, T., Hinton, G.: Using fast weights to improve persistent contrastive divergence. In: Bottou, L., Littman, M. (eds.) Proceedings of the Twenty-Sixth International Conference on Machine Learning (ICML 2009), pp. 1033–1040. ACM (2009)
Google Scholar
Turaga, S.C., Murray, J.F., Jain, V., Roth, F., Helmstaedter, M., Briggman, K., Denk, W., Seung, H.S.: Convolutional networks can learn to generate affinity graphs for image segmentation. Neural Computation 22, 511–538 (2010)
Article MATH Google Scholar
Vasilescu, M.A.O., Terzopoulos, D.: Multilinear independent components analysis. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), vol. 1, pp. 547–553 (2005)
Google Scholar
Vincent, P.: A connection between score matching and denoising autoencoders. Neural Computation 23(7), 1661–1674 (2011)
Article MathSciNet MATH Google Scholar
Vincent, P., Larochelle, H., Bengio, Y., Manzagol, P.-A.: Extracting and composing robust features with denoising autoencoders. In: Cohen, W.W., McCallum, A., Roweis, S.T. (eds.) Proceedings of the Twenty-Fifth International Conference on Machine Learning (ICML 2008), pp. 1096–1103. ACM (2008)
Google Scholar
Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., Manzagol, P.-A.: Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. Journal of Machine Learning Research 11, 3371–3408 (2010)
MathSciNet MATH Google Scholar
Wang, H., Ullah, M.M., Kläser, A., Laptev, I., Schmid, C.: Evaluation of local spatio-temporal features for action recognition. In: British Machine Vision Conference (BMVC), London, UK, p. 127 (2009)
Google Scholar
Welling, M.: Herding dynamic weights for partially observed random field models. In: Proceedings of the 25th Conference in Uncertainty in Artificial Intelligence (UAI 2009). Morgan Kaufmann (2009)
Google Scholar
Weston, J., Ratle, F., Collobert, R.: Deep learning via semi-supervised embedding. In: Cohen, W.W., McCallum, A., Roweis, S.T. (eds.) Proceedings of the Twenty-Fifth International Conference on Machine Learning (ICML 2008), pp. 1168–1175. ACM, New York (2008)
Google Scholar
Wiskott, L., Sejnowski, T.: Slow feature analysis: Unsupervised learning of invariances. Neural Computation 14(4), 715–770 (2002)
Article MATH Google Scholar
Younes, L.: On the convergence of Markovian stochastic algorithms with rapidly decreasing ergodicity rates. Stochastics and Stochastic Reports 65(3), 177–228 (1999)
Article MathSciNet MATH Google Scholar

Download references

Author information

Authors and Affiliations

Dept. IRO, Université de Montréal, Montréal, Canada
Yoshua Bengio & Aaron Courville

Authors

Yoshua Bengio
View author publications
You can also search for this author in PubMed Google Scholar
Aaron Courville
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

, Dipto. Ingegneria dell'Informazione, Università degli Studi di Siena, Via Roma 56, Siena, 53100, Italy
Monica Bianchini
Fac. Ingegneria, Dipto. Ingegneria dell'Informazione, Università Siena, Via Roma 56, Siena, 53100, Italy
Marco Maggini
University of Canberra, School of Electrical and Information, Adjunct Professor, Mawson Lakes Campus, ACT, 2601, Australia
Lakhmi C. Jain

Rights and permissions

Reprints and permissions

Copyright information

About this chapter

Cite this chapter

Bengio, Y., Courville, A. (2013). Deep Learning of Representations. In: Bianchini, M., Maggini, M., Jain, L. (eds) Handbook on Neural Information Processing. Intelligent Systems Reference Library, vol 49. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36657-4_1

Download citation

DOI: https://doi.org/10.1007/978-3-642-36657-4_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36656-7
Online ISBN: 978-3-642-36657-4
eBook Packages: EngineeringEngineering (R0)

Publish with us

Policies and ethics