Abstract
In 2017, Ashish Vaswani et al. from Google Brain and Google Research introduced the Transformer, a new neural network architecture for natural language processing (NLP) and other sequence-to-sequence tasks, in their paper “Attention Is All You Need.” The architecture relies entirely on attention mechanisms to process sequences, dispensing with recurrence and convolutions, which allows for parallelization, efficient training, and the ability to capture long-range dependencies in data.
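To make the core idea concrete, the following is a minimal NumPy sketch (not code from the chapter) of the scaled dot-product attention that the paper defines as Attention(Q, K, V) = softmax(QKᵀ / √d_k)V; the function name, shapes, and toy inputs here are illustrative assumptions.

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention from Vaswani et al. (2017):
    Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)           # similarity of every query to every key
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability for softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over the key dimension
    return weights @ V                        # attention-weighted sum of the values

# Toy self-attention example: 3 tokens with 4-dimensional embeddings
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(x, x, x)   # self-attention: Q = K = V = x
print(out.shape)                              # (3, 4)

Because every query attends to every key in a single matrix multiplication, the whole sequence is processed at once rather than token by token, which is the source of the parallelization and long-range-dependency benefits the abstract describes.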