✶ A decoder-only character-level GPT that generates infinite Shakespearean text ✶
Generate Shakespeare »
This is a decoder-only, character-level GPT trained on Shakespeare's complete works. It generates text one character at a time, using a transformer with self-attention to learn the patterns of the corpus and produce coherent, Shakespearean-style prose and poetry for as long as you let it run.
- Character-level generation: Generates text character by character for fine-grained control (see the tokenizer sketch below)
- Shakespearean style: Trained on complete Shakespeare works for authentic language patterns
- Infinite generation: Generate as much text as you want with customizable prompts
- Pre-trained model: Ready-to-use with pre-trained weights
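
To make the "character-level" part concrete, here is a minimal sketch of the kind of tokenizer such a model uses: a vocabulary built from every distinct character in the training text, with simple lookup tables in each direction. Names like `stoi`, `itos`, `encode`, and `decode` follow the tutorial this project is based on and are not necessarily the exact identifiers in gpt.py.

```python
# Minimal character-level tokenizer sketch (illustrative; not the exact gpt.py code).
with open("input.txt", "r", encoding="utf-8") as f:
    text = f.read()

chars = sorted(set(text))                       # every distinct character in the corpus
stoi = {ch: i for i, ch in enumerate(chars)}    # character -> integer id
itos = {i: ch for i, ch in enumerate(chars)}    # integer id -> character

def encode(s: str) -> list[int]:
    """Map a string to a list of character ids."""
    return [stoi[c] for c in s]

def decode(ids: list[int]) -> str:
    """Map a list of character ids back to a string."""
    return "".join(itos[i] for i in ids)

print(len(chars))                  # vocabulary size
print(decode(encode("HAMLET:")))   # round-trips to "HAMLET:"
```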
To generate text, run the generation script:

```bash
python3 generate.py
```

This starts generating Shakespearean text beginning with "HAMLET:" and continues indefinitely. You can modify the prompt in generate.py to start with different text.
You can also import and use the model in your own code:
```python
from generate import load_model, generate_text

# Load the pre-trained model
model = load_model()

# Generate text with a custom prompt
text = generate_text(model, prompt="ROMEO:", max_new_tokens=500)
```

The model follows the standard GPT architecture (summarized in the sketch after this list):
- 6 transformer layers with multi-head self-attention
- 384 embedding dimensions
- 6 attention heads
- 256 context window (block size)
- Character-level vocabulary based on the training text
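
These settings can be summarized as follows (a sketch only; the actual module definitions live in gpt.py):

```python
# Architecture hyperparameters as listed above (sketch; see gpt.py for the real model code).
n_layer    = 6      # transformer blocks
n_head     = 6      # attention heads per block
n_embd     = 384    # embedding dimension
block_size = 256    # context window, in characters

head_size = n_embd // n_head   # 384 // 6 = 64 dimensions per attention head
assert n_embd % n_head == 0, "embedding dim must split evenly across heads"
```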
- Training Hardware: Thunder Compute with A100 GPUs
- Training Cost: ~$3
- Training Time: ~20 minutes
- Dataset: Complete Shakespeare works (character-level tokenization)
The training configuration can be found in gpt_training.py (a sketch of how these settings fit together follows the list):
- Batch size: 64
- Learning rate: 3e-4
- Max iterations: 5000
- Dropout: 0.2
- Context length: 256 tokens
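
Below is a rough sketch, in the style of the tutorial this project follows, of how those numbers are typically wired into a training loop. gpt_training.py is the source of truth and may differ in details; the model and optimizer lines are shown as comments because they depend on the gpt.py API.

```python
import torch

# Hyperparameters from the list above (gpt_training.py is the source of truth).
batch_size    = 64
block_size    = 256
learning_rate = 3e-4
max_iters     = 5000
dropout       = 0.2

def get_batch(data: torch.Tensor):
    """Sample a random batch of (input, target) character-id sequences."""
    ix = torch.randint(len(data) - block_size, (batch_size,))
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i + 1:i + block_size + 1] for i in ix])
    return x, y

# Assuming `model` is the GPT from gpt.py and `train_data` holds the encoded text:
# optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate)
# for step in range(max_iters):
#     xb, yb = get_batch(train_data)
#     logits, loss = model(xb, yb)              # tutorial-style forward returning (logits, loss)
#     optimizer.zero_grad(set_to_none=True)
#     loss.backward()
#     optimizer.step()
```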
```
├── gpt.py              # Main GPT model implementation
├── gpt_training.py     # Training script with hyperparameters
├── generate.py         # Text generation script
├── input.txt           # Training dataset (Shakespeare works)
├── saved_models/       # Directory containing trained weights
│   └── model_params.pth  # Pre-trained model weights
└── README.md           # This file
```
- Python 3.7+
- PyTorch
- CUDA (optional, for GPU acceleration)
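
One common way to install the Python dependency is via pip (CUDA-enabled builds vary by platform; see pytorch.org for the right command):

```bash
pip install torch
```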
If you want to train the model on different text:
- Replace input.txt with your training data
- Adjust hyperparameters in gpt_training.py if needed
- Run the training script:

```bash
python3 gpt_training.py
```
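
Before retraining, it can help to sanity-check the new dataset. The snippet below is an illustrative helper (not part of this repo) that reports the dataset size and the character vocabulary it would produce; because the vocabulary is built from the training text, a different dataset generally means training from scratch rather than reusing the saved Shakespeare weights.

```python
# Illustrative sanity check for a replacement input.txt (not part of this repo).
with open("input.txt", "r", encoding="utf-8") as f:
    text = f.read()

chars = sorted(set(text))
print(f"dataset length: {len(text):,} characters")
print(f"vocab size:     {len(chars)} distinct characters")
```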
This implementation follows the excellent tutorial by Andrej Karpathy: Let's build GPT: from scratch, in code, spelled out
This project is open source and available under the MIT License.
"All the world's a stage, and all the men and women merely players." - Generated by this model