Transformers

Chapter in Understanding Large Language Models

Abstract

In 2017, Ashish Vaswani et al. of Google Brain and Google Research proposed a revolutionary neural network architecture for natural language processing (NLP) and other sequence-to-sequence tasks in their paper "Attention Is All You Need." The architecture, called the Transformer, relies heavily on attention mechanisms to process sequences, allowing for parallelization, efficient training, and the ability to capture long-range dependencies in data.
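
As a rough illustration of the mechanism the abstract refers to, below is a minimal NumPy sketch of scaled dot-product attention, the core operation of the Transformer. The function name, array shapes, and toy data are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # (len_q, len_k) similarity scores
    # Numerically stable row-wise softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                              # weighted sum of values, (len_q, d_v)

# Toy example (hypothetical sizes): 3 query positions attending over 4 key/value positions
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 8)
```

Because every query position attends to all key positions in a single matrix product, the whole sequence can be processed in parallel rather than token by token, which is the source of the parallelization and long-range dependency benefits the abstract mentions.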


Copyright information

© 2023 The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature

Cite this chapter

Amaratunga, T. (2023). Transformers. In: Understanding Large Language Models. Apress, Berkeley, CA. https://doi.org/10.1007/979-8-8688-0017-7_3
