This repository contains the implementation of the concepts discussed in the blog post "Understanding Parameter Calculation in Transformer-Based Models: Simplified". The post provides a detailed explanation of the Transformer architecture and the intricacies involved in counting its parameters.
The blog post breaks down the Transformer architecture into three main components:
- Embedding: Converts an input image into a sequence of embedded patches.
- Attention: The attention layer and its trainable parameters.
- Transformer: Combines the components above into the full Transformer architecture and explains its parameter count.
Each component is explained with a focus on the transformations and the trainable parameters involved.
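As a sketch of the kind of parameter counting the post walks through, the helper below tallies the trainable parameters of a standard Transformer encoder layer (attention projections, feed-forward MLP, and layer norms). The dimensions used here are ViT-Base-style values chosen for illustration and are an assumption, not necessarily the configuration used in this repository:

```python
def attention_params(d_model: int) -> int:
    # Q, K, V, and output projections: each a d_model x d_model weight plus a bias
    return 4 * (d_model * d_model + d_model)

def mlp_params(d_model: int, d_ff: int) -> int:
    # two linear layers, d_model -> d_ff -> d_model, each with a bias
    return (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)

def layer_norm_params(d_model: int) -> int:
    # one scale vector and one shift vector
    return 2 * d_model

def encoder_layer_params(d_model: int, d_ff: int) -> int:
    # one attention block + one MLP block + two layer norms
    return attention_params(d_model) + mlp_params(d_model, d_ff) + 2 * layer_norm_params(d_model)

# Illustrative ViT-Base-like settings (assumed)
d_model, d_ff, n_layers = 768, 3072, 12
print(encoder_layer_params(d_model, d_ff))              # 7087872 per layer
print(n_layers * encoder_layer_params(d_model, d_ff))   # 85054464 for the 12-layer stack
```

With these dimensions each encoder layer holds about 7.09M parameters, so the 12-layer stack alone accounts for roughly 85M, which is the bulk of a ViT-Base-sized model before the embedding and head are added.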
The code in this repository is structured to reflect the modular nature of the Transformer architecture. It includes a breakdown of all the components of a Transformer, allowing the total parameter count to be calculated.

To run the code, install the packages listed in "requirements.txt" and then run app.py.
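The setup steps above might look like the following in a shell, assuming the commands are run from the repository root where "requirements.txt" and app.py live:

```shell
# Install the dependencies listed in requirements.txt
pip install -r requirements.txt

# Launch the app
python app.py
```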