3-gram Language Model (NumPy)

A basic trigram (3-gram) language model implemented from scratch in Python and NumPy.
The project trains on a text corpus, builds unigram, bigram, and trigram statistics, and supports:

Sentence generation (eval.py)
Next word prediction (predict_one.py)
REST API serving with FastAPI (api.py)
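
As a rough illustration of the idea behind the trigram statistics, next-word prediction, and sentence generation, here is a minimal sketch. It is not the code in this repository: it uses only the standard library instead of NumPy, and the names `train_trigrams`, `predict_next`, and `generate` are hypothetical. It also decodes greedily and applies no smoothing.

```python
from collections import Counter, defaultdict


def train_trigrams(tokens):
    """Count how often each word follows each pair of preceding words."""
    counts = defaultdict(Counter)
    for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
        counts[(w1, w2)][w3] += 1
    return counts


def predict_next(counts, w1, w2):
    """Return the most frequent continuation of (w1, w2), or None if unseen."""
    followers = counts.get((w1, w2))
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(counts, w1, w2, max_words):
    """Greedily extend the two seed words by up to max_words words."""
    words = [w1, w2]
    for _ in range(max_words):
        nxt = predict_next(counts, words[-2], words[-1])
        if nxt is None:  # context never seen in training: stop early
            break
        words.append(nxt)
    return " ".join(words)


# Toy corpus, for illustration only.
corpus = "the quick brown fox jumps over the lazy dog".split()
model = train_trigrams(corpus)
print(predict_next(model, "the", "quick"))
print(generate(model, "the", "quick", 20))
```

On this toy corpus, generation stops as soon as it reaches a word pair that was never followed by anything in training, which is the gap that smoothing is meant to address.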

Usage

Train in Jupyter Notebook:

    jupyter notebook notebooks/3gram_model.ipynb

Run evaluation / prediction:

    python3 models/eval.py the quick 20
    python3 models/predict_one.py the quick

Start API:

    uvicorn api:app --reload --port 8000

Example Output:

    Input: the quick
    Generated: the quick brown fox jumps over the lazy dog ...

Future Work: Add Kneser–Ney smoothing, sampling with temperature, and top-k decoding.
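
For readers unfamiliar with two of these ideas, the sketch below shows one way temperature sampling and top-k decoding could be applied to a table of next-word counts. It is an assumption about how such a feature might look, not part of the current code: `sample_next` and its parameters are hypothetical, and it uses the standard library rather than NumPy.

```python
import math
import random


def sample_next(counts, temperature=1.0, top_k=None, rng=random):
    """Sample one word from a {word: count} mapping (all counts > 0).

    temperature < 1 sharpens the distribution, temperature > 1 flattens it.
    top_k, if given, keeps only the k most frequent candidates.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Most frequent candidates first, so top-k is a simple slice.
    items = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    if top_k is not None:
        items = items[:top_k]
    words = [word for word, _ in items]
    # Weight is count ** (1 / T), i.e. a softmax over log(count) / T.
    weights = [math.exp(math.log(count) / temperature) for _, count in items]
    return rng.choices(words, weights=weights, k=1)[0]


# Hypothetical follower counts for some two-word context.
followers = {"brown": 3, "red": 1, "blue": 1}
print(sample_next(followers, temperature=0.7, top_k=2))
```

With `top_k=1` this reduces to greedy decoding, and with `temperature=1.0` and no `top_k` it samples in proportion to the raw counts.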

Contributing

Contributions, issues, and feature requests are welcome!
If you’d like to improve this project:

Open an Issue describing the bug, feature, or enhancement.
Fork the repository and create a new branch.
Open a Pull Request (PR) with your changes.

License

This project is licensed under the Apache License 2.0.

© 2025 Ruskaruma. Licensed under the Apache License, Version 2.0 (the "License")
