mmellau/microgpt-swift

microgpt-swift

Swift port of Karpathy's microgpt.py. Zero dependencies. Runs on macOS and Linux. ~4x faster than CPython.

About

I built this to understand how GPTs work, not by reading about transformers but by porting every operation from Python to Swift, line by line. Karpathy's blog post puts it well: "if you understand microgpt, you understand the algorithmic essence" of LLMs. I wanted to see if I actually did.

There are already dozens of ports out there (Rust, C, C++, Zig, even other Swift ones), but I still found it worthwhile to reimplement the algorithm in the language I work in daily.

Quick start

swift build -c release
.build/release/microgpt-swift

The first run downloads names.txt. Running with swift run also works, but it compiles in debug mode, which is ~10x slower.

Performance

| Implementation | Time (1000 steps) | vs. Swift |
| --- | --- | --- |
| Python (CPython 3.14.3) | 65.0 ± 0.6 s | 3.8x |
| Swift (release) | 17.0 ± 0.1 s | 1.0x |

Apple M1 Max. Swift 6.2.4, CPython 3.14.3. Default hyperparameters, data pre-downloaded. Measured with hyperfine (3 runs, 1 warmup).

Options

| Flag | Default | Description |
| --- | --- | --- |
| --steps | 1000 | training steps |
| --lr | 0.01 | learning rate |
| --seed | 42 | RNG seed |
| --temperature | 0.5 | sampling temperature |
| --samples | 20 | names to generate |

.build/release/microgpt-swift --steps 2000 --temperature 0.8
# or: swift run microgpt-swift -- --steps 2000 --temperature 0.8
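
To show what --temperature controls, here is a minimal sketch of temperature-scaled sampling. It assumes the usual approach of dividing the logits by the temperature before the softmax. The function names softmax and sampleIndex are hypothetical and are not taken from this repository's source.

```swift
import Foundation

/// Softmax over logits divided by a temperature (hypothetical helper).
/// Lower temperatures sharpen the distribution; higher ones flatten it.
func softmax(_ logits: [Double], temperature: Double) -> [Double] {
    precondition(temperature > 0, "temperature must be positive")
    let scaled = logits.map { $0 / temperature }
    let maxLogit = scaled.max() ?? 0            // subtract the max for numerical stability
    let exps = scaled.map { exp($0 - maxLogit) }
    let sum = exps.reduce(0, +)
    return exps.map { $0 / sum }
}

/// Picks an index from a categorical distribution, given a uniform draw u in [0, 1).
func sampleIndex(from probs: [Double], uniform u: Double) -> Int {
    var cumulative = 0.0
    for (i, p) in probs.enumerated() {
        cumulative += p
        if u < cumulative { return i }
    }
    return probs.count - 1                      // fallback for floating-point rounding
}

let logits = [2.0, 1.0, 0.0]
let sharp = softmax(logits, temperature: 0.5)   // the default from the table
let flat = softmax(logits, temperature: 1.0)
print(sharp[0] > flat[0])                       // prints "true"

let token = sampleIndex(from: sharp, uniform: Double.random(in: 0..<1))
print("sampled index:", token)
```

With the default of 0.5 the most likely character gets a larger share of the probability mass than at 1.0, so samples tend to be more conservative. Raising the value (for example to 0.8, as in the command above) allows more variety.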

Output

num docs: 32033
vocab size: 27
num params: 4192
step 1000 / 1000 | loss 2.6181
--- inference (new, hallucinated names) ---
sample  1: mari
sample  2: maren
sample  3: ran
sample  4: leynn
sample  5: amaron
sample  6: jaria
sample  7: kara
sample  8: orel
sample  9: tarili
sample 10: jarian
sample 11: fama
sample 12: arian
sample 13: araha
sample 14: ianda
sample 15: varria
sample 16: alinile
sample 17: darcan
sample 18: lare
sample 19: kareen
sample 20: radeia
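
The reported parameter count can be checked by hand. The sketch below is one layout that adds up to 4192. It assumes the default hyperparameters of upstream microgpt.py (embedding size 16, 1 layer, context length 16), which this README does not state. It also assumes weight matrices only, with no biases or normalization parameters. The vocabulary size of 27 comes from the output above, presumably the 26 lowercase letters plus one special token.

```swift
// Assumed hyperparameters: upstream microgpt.py defaults, not stated in this README.
let vocabSize = 27   // from the program output above
let nEmbd = 16       // embedding dimension (assumption)
let nLayer = 1       // transformer layers (assumption)
let blockSize = 16   // maximum context length (assumption)

let tokenEmbedding = vocabSize * nEmbd        // 27 * 16 = 432
let positionEmbedding = blockSize * nEmbd     // 16 * 16 = 256
let lmHead = vocabSize * nEmbd                // 27 * 16 = 432

// Per layer: four attention matrices (query, key, value, output), each nEmbd x nEmbd.
let attentionPerLayer = 4 * nEmbd * nEmbd     // 1024
// Per layer: an MLP that expands to 4 * nEmbd and projects back.
let mlpPerLayer = 2 * (4 * nEmbd) * nEmbd     // 2048

let total = tokenEmbedding + positionEmbedding + lmHead
    + nLayer * (attentionPerLayer + mlpPerLayer)
print("num params: \(total)")                 // prints "num params: 4192"
```

That the sum matches the printed value suggests these assumptions hold for the default configuration, but the source code remains the authority.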

Acknowledgments

License

MIT
