Transformer Model Implementation

A comprehensive implementation of a transformer model in PyTorch for language translation tasks. This repository includes scripts for configuration, dataset preparation, model architecture, tokenization, and training. I followed and re-implemented the tutorial below and highly recommend checking it out:

https://youtu.be/ISNdQcPhsts?si=SiHKhcwJIsDb8-BR

Features

  • Configurable model parameters and paths.
  • Preprocessing of bilingual datasets (English–Italian, following the tutorial above).
  • Implementation of transformer model architecture, encoder, and decoder blocks.
  • Custom tokenization for source and target languages.
  • Training loop with logging and checkpointing.
  • Validation with various metrics.
  • I also added my notes with explanatory visual diagrams showing how words are converted into tokens and fed through the encoder/decoder blocks, how attention scores are calculated, and more (see the attention sketch below this list).
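As a quick reference for those notes, here is a minimal sketch of scaled dot-product attention in PyTorch. The function name, tensor shapes, and toy inputs are illustrative and not taken from this repository's code:

```python
import math
import torch

def scaled_dot_product_attention(query, key, value, mask=None):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V."""
    d_k = query.size(-1)
    scores = query @ key.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # Masked positions (mask == 0) get -inf so softmax gives them ~0 weight.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = scores.softmax(dim=-1)
    return weights @ value, weights

# Toy example: batch of 1, 4 tokens, 8-dimensional queries/keys/values.
q = k = v = torch.randn(1, 4, 8)
out, attn = scaled_dot_product_attention(q, k, v)
print(out.shape, attn.shape)  # torch.Size([1, 4, 8]) torch.Size([1, 4, 4])
```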

[Figure: Transformer implementation diagram]

Requirements

  • Python 3.x
  • Libraries: torch, transformers, datasets, tokenizers, tensorboard

You can install these libraries using pip:

pip install torch transformers datasets tokenizers tensorboard

Setup

1. Clone the Repository

git clone https://github.com/krmdel/Transformer-From-Scratch.git
cd Transformer-From-Scratch

2. Configuration

Update the config.py file with your desired settings. It includes parameters like batch size, number of epochs, learning rate, sequence length, model dimensions, and paths for saving models and tokenizers.
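As a rough illustration of what such a configuration might contain (the keys and values below are assumptions for the sketch, not copied from this repository's config.py):

```python
# config.py -- illustrative sketch; check the actual file for the real keys.
def get_config():
    return {
        "batch_size": 8,
        "num_epochs": 20,
        "lr": 1e-4,
        "seq_len": 350,      # maximum sequence length
        "d_model": 512,      # model (embedding) dimension
        "lang_src": "en",
        "lang_tgt": "it",
        "model_folder": "weights",               # where checkpoints are saved
        "tokenizer_file": "tokenizer_{0}.json",  # per-language tokenizer path
    }
```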

3. Dataset Preparation

Prepare your dataset in the required format (e.g., JSON with paired translations for the source and target languages) and specify its path in the configuration. Example vocabularies for English and Italian are included.
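Following the tutorial, each example pairs a source and target sentence keyed by language code. A sketch of loading such data with the datasets library (the dataset name and record layout below are assumptions based on the tutorial, not this repository):

```python
from datasets import load_dataset

# Load an English-Italian translation corpus (opus_books is what the
# tutorial uses; swap in your own dataset or path as configured).
raw = load_dataset("opus_books", "en-it", split="train")

# Each record pairs a source and target sentence under a language key:
# {"translation": {"en": "...", "it": "..."}}
print(raw[0]["translation"]["en"])
print(raw[0]["translation"]["it"])
```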

4. Tokenization

Tokenizers for the source and target languages are built if not already available. They convert text to token IDs and vice versa.
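A minimal sketch of how such a word-level tokenizer can be built with the tokenizers library, based on the tutorial; the special tokens, sample corpus, and output file name are assumptions:

```python
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import WordLevelTrainer

# Illustrative corpus; in practice, iterate over the dataset's sentences.
sentences = [
    "All happy families are alike.",
    "Every unhappy family is unhappy in its own way.",
]

tokenizer = Tokenizer(WordLevel(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()
trainer = WordLevelTrainer(special_tokens=["[UNK]", "[PAD]", "[SOS]", "[EOS]"])
tokenizer.train_from_iterator(sentences, trainer=trainer)
tokenizer.save("tokenizer_en.json")

# Round trip: text -> token IDs -> text.
ids = tokenizer.encode("All happy families are alike.").ids
print(ids, "->", tokenizer.decode(ids))
```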

Usage

Training the Model

Run the training script to start training the transformer model:

python train.py

This will:

  • Load the dataset and create data loaders.
  • Build the tokenizers if not already available.
  • Initialize the transformer model.
  • Train the model for the specified number of epochs.
  • Save model checkpoints after each epoch.
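For reference, saving and restoring a checkpoint in PyTorch typically looks like the sketch below; the file name, dictionary keys, and stand-in model are illustrative, not necessarily what train.py uses:

```python
import torch
import torch.nn as nn

# Illustrative stand-ins; in train.py these are the transformer and its optimizer.
model = nn.Linear(512, 512)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
epoch, global_step = 0, 0

checkpoint_path = f"tmodel_{epoch:02d}.pt"
torch.save(
    {
        "epoch": epoch,
        "model_state_dict": model.state_dict(),
        "optimizer_state_dict": optimizer.state_dict(),
        "global_step": global_step,
    },
    checkpoint_path,
)

# Resuming: restore both states before continuing training.
state = torch.load(checkpoint_path)
model.load_state_dict(state["model_state_dict"])
optimizer.load_state_dict(state["optimizer_state_dict"])
```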

Monitoring Training with TensorBoard

Start TensorBoard to monitor the training process:

tensorboard --logdir=runs/tmodel

Open a browser and navigate to http://localhost:6006 to view the training logs, loss, and other metrics.
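For reference, metrics appear under runs/tmodel when the training script writes them with torch.utils.tensorboard; a minimal sketch with illustrative tag names and dummy loss values:

```python
from torch.utils.tensorboard import SummaryWriter

# Write scalars to runs/tmodel so the tensorboard command above finds them.
writer = SummaryWriter("runs/tmodel")
for step, loss in enumerate([4.2, 3.1, 2.7]):  # dummy loss values
    writer.add_scalar("train/loss", loss, step)
writer.flush()
writer.close()
```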