GARRYHU/TransformerParameters

Transformer Model Parameter Calculation

This repository contains the implementation of the concepts discussed in the blog post "Understanding Parameter Calculation in Transformer-Based Models: Simplified". The post provides a detailed explanation of the Transformer architecture and the intricacies involved in counting its parameters.

https://medium.com/@geosar/understanding-parameter-calculation-in-transformer-based-models-simplified-e8c7f4e059d8

Overview

The blog post breaks down the Transformer architecture into three main components:

  1. Embedding: Converts an input image into a sequence of embedded patches.
  2. Attention: Describes the attention layer.
  3. Transformer: Explains the overall Transformer architecture and its parameters.

Each component is explained with a focus on the transformations and the trainable parameters involved.
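
As a rough illustration of what counting trainable parameters looks like for the first two components, the sketch below counts weights and biases for a patch embedding and an attention layer. The function names, the ViT-style layout (linear patch projection, learned position embeddings, optional class token), and the inclusion of bias terms are assumptions made for this example; they are not taken from the repository's code or the blog post.

```python
def patch_embedding_params(patch_size: int, channels: int, d_model: int,
                           num_patches: int, class_token: bool = True) -> int:
    """Parameter count of a ViT-style patch embedding (assumed layout)."""
    patch_dim = patch_size * patch_size * channels
    projection = patch_dim * d_model + d_model   # linear projection: weight + bias
    tokens = num_patches + (1 if class_token else 0)
    positional = tokens * d_model                # learned position embeddings
    cls = d_model if class_token else 0          # learned class token vector
    return projection + positional + cls


def attention_params(d_model: int, bias: bool = True) -> int:
    """Parameter count of self-attention: Q, K, V, and output projections."""
    per_projection = d_model * d_model + (d_model if bias else 0)
    return 4 * per_projection


if __name__ == "__main__":
    # Assumed example: 224x224 RGB image, 16x16 patches, d_model = 768.
    num_patches = (224 // 16) ** 2               # 196 patches
    print(patch_embedding_params(16, 3, 768, num_patches))  # 742656
    print(attention_params(768))                            # 2362368
```

Note that the number of attention heads does not appear in the count: as long as the heads split d_model evenly, the four projection matrices have the same total size regardless of how many heads share them.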

Implementation Details

The code in this repository is structured to reflect the modular nature of the Transformer architecture. It includes:

A breakdown of all the components of a Transformer, so that the total parameter count can be calculated.
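
A minimal sketch of such a breakdown is shown here for a stack of encoder layers. The EncoderConfig class, its field names, and the per-layer layout (attention with biases, a two-layer feed-forward network, two LayerNorms) are hypothetical choices for illustration and may differ from what app.py actually computes. Embeddings and any output head are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class EncoderConfig:
    """Hypothetical configuration; field names are illustrative only."""
    d_model: int     # hidden size
    d_ff: int        # feed-forward inner size
    num_layers: int  # number of stacked encoder layers


def layer_params(cfg: EncoderConfig) -> dict[str, int]:
    """Per-component parameter counts for one encoder layer."""
    d, f = cfg.d_model, cfg.d_ff
    return {
        "attention": 4 * (d * d + d),               # Q, K, V, output projections + biases
        "feed_forward": (d * f + f) + (f * d + d),  # two linear layers + biases
        "layer_norm": 2 * (2 * d),                  # two LayerNorms: scale and shift each
    }


def total_params(cfg: EncoderConfig) -> int:
    """Total over all layers (embeddings and output head excluded)."""
    return cfg.num_layers * sum(layer_params(cfg).values())


if __name__ == "__main__":
    cfg = EncoderConfig(d_model=768, d_ff=3072, num_layers=12)
    for name, count in layer_params(cfg).items():
        print(f"{name}: {count:,}")
    print(f"total: {total_params(cfg):,}")  # total: 85,054,464
```

Returning the per-component counts as a dictionary keeps the breakdown visible, which is useful when comparing the total against a framework's own parameter count.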

Usage details

Install the packages listed in "requirements.txt", then run app.py. Because the app is built with Streamlit, it is normally started with: streamlit run app.py

About

Streamlit app that calculates Transformer model parameters
