Ph.D. student in Computational Linguistics with a specialization in character-level language modeling.
- New York City
- jhnwnstd.github.io
- @languagedoodad
Pinned
- corpus_toolkit (Public)
Python toolkit for corpus analysis: tokenization, lexical diversity, vocabulary growth prediction, entropy measures, and Zipf/Heaps visualizations.
Python 7
- wiktionary-audio-extension-chrome (Public)
Chrome extension to download Wiktionary pronunciations and convert them to WAV locally with FFmpeg.wasm.
JavaScript 1
- writing_direction (Public)
This script predicts language directionality (LTR or RTL) using Gini and entropy calculations on character distributions from Europarl and UDHR corpora.
Python 1