Abstract
In 2017, Ashish Vaswani et al. from Google Brain and Google Research introduced the Transformer, a new neural network architecture for natural language processing (NLP) and other sequence-to-sequence tasks, in their paper “Attention Is All You Need.” The architecture relies entirely on attention mechanisms to process sequences, dispensing with recurrence and convolutions, which allows for parallelization, efficient training, and the ability to capture long-range dependencies in data.
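To make the core idea concrete, the following is a minimal NumPy sketch (not code from the chapter) of the scaled dot-product attention that the paper defines as Attention(Q, K, V) = softmax(QKᵀ / √d_k)V; the function name, shapes, and toy inputs here are illustrative assumptions.

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention from Vaswani et al. (2017):
    Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)           # similarity of every query to every key
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability for softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over the key dimension
    return weights @ V                        # attention-weighted sum of the values

# Toy self-attention example: 3 tokens with 4-dimensional embeddings
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(x, x, x)   # self-attention: Q = K = V = x
print(out.shape)                              # (3, 4)

Because every query attends to every key in a single matrix multiplication, the whole sequence is processed at once rather than token by token, which is the source of the parallelization and long-range-dependency benefits the abstract describes.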