dschulmeist/Poem-GPT

Eminem, Goethe and Shakespeare sing together

- A GPT model from scratch trained on Eminem, Goethe and Shakespeare lyrics -

This repository contains a minimal implementation of a GPT model, based on Andrej Karpathy's tutorial. I extended the dataset to include Goethe and Eminem texts, so the model is trained on the combined Shakespeare, Goethe and Eminem corpus. I also replaced the simple tokenizer from the tutorial with Byte-Pair Encoding (BPE) or WordPiece tokenization.
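To give a rough idea of what BPE tokenization does, here is a toy sketch in pure Python. It is not the code used in this repository (the project relies on the `tokenizers` library for this); the function names `learn_bpe`, `merge_pair` and `encode` are made up for illustration, and the tie-breaking rule is an arbitrary choice to keep the result deterministic.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy version)."""
    # Start with every word split into single characters, weighted by frequency.
    words = Counter(tuple(word) for word in corpus)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs in the corpus.
        pairs = Counter()
        for symbols, freq in words.items():
            for left, right in zip(symbols, symbols[1:]):
                pairs[(left, right)] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties are broken by comparing the pairs
        # themselves (an arbitrary choice made here for determinism).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Apply the new merge rule to every word.
        updated = Counter()
        for symbols, freq in words.items():
            updated[merge_pair(symbols, best)] += freq
        words = updated
    return merges


def encode(word, merges):
    """Split `word` into characters, then apply the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    merges = learn_bpe(["low", "low", "lower", "lowest"], num_merges=2)
    print(merges)
    print(encode("lowest", merges))
```

With this tiny corpus the two learned merges build the subword `low`, so `lowest` is encoded as `low`, `e`, `s`, `t`. A real tokenizer works the same way in principle, but on a much larger corpus and with many more merges.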

Usage

To train the model and follow the data preparation, open the notebook 'dev.ipynb'. The file gpt.py contains the model and the hyperparameters.

Note: this project is meant for fun; I created it to learn more about transformers and PyTorch. The model is not optimized and the training is not efficient.

Requirements

  • PyTorch
  • tokenizers
