The Lazy Learner

Transformers In Deep Learning

My models run while I sleep!

Transformers In Self and Cross Attention

Transformers were initially designed for sequential inputs only, which are first encoded as embeddings. The original paper, "Attention Is All You Need", can be referenced here. The most general architecture of a Transformer consists of an encoder and a decoder, as discussed in the Attention article that I wrote earlier. Both the encoder and the decoder are built from a stack of layers, where each layer contains a multi-head attention sublayer followed by a layer-normalization step.
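The attention-then-normalize pattern described above can be sketched with plain NumPy. This is a minimal illustration, not the full multi-head implementation from the paper: it shows single-head scaled dot-product self-attention (queries, keys, and values all derived from the same input) followed by a residual connection and layer normalization. All function and variable names here are my own for illustration.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable row-wise softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def layer_norm(x, eps=1e-5):
    # Normalize each token's feature vector to zero mean, unit variance
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

# Toy sequence: 4 token embeddings of dimension 8.
# In self-attention, the same input supplies Q, K, and V;
# in cross-attention, Q would come from the decoder while
# K and V come from the encoder output.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))

attn_out = scaled_dot_product_attention(x, x, x)
out = layer_norm(x + attn_out)  # residual connection, then layer norm
print(out.shape)  # (4, 8)
```

In the full architecture each head would also apply learned projection matrices to Q, K, and V before attention, and the heads' outputs would be concatenated and projected again; the sketch above omits those learned weights to keep the data flow visible.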

Pinaki Pani 17 February 2024 • 4 min read