Its purpose is to keep an up-to-date list of the latest NLP libraries and frameworks. The main focus is on the Python programming language. I plan to add more detail to this repo on the pros and cons of each library.
Note: The ordering is random.
It supports multiple languages and is suitable for both industry and research.
It supports multiple languages and is suited mainly to industry applications because of its opinionated design. It also supports word embedding models.
It is one of the best libraries for classical word embeddings (context-independent, unlike transformers), with pretrained models for more than 150 languages. It also provides pretrained models for language identification and other text classification tasks.
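To make "context-independent" concrete, here is a minimal sketch in plain Python. It does not use any library's API: the three-dimensional vectors and the `embed` and `cosine` helpers are made up for illustration. The point is only that a classical embedding is a fixed lookup table, so a word gets the same vector whatever sentence it appears in.

```python
import math

# Toy 3-dimensional vectors; the values are invented for illustration only.
EMBEDDINGS = {
    "bank": [0.9, 0.1, 0.3],
    "river": [0.2, 0.8, 0.1],
    "money": [0.8, 0.2, 0.4],
}


def embed(word, sentence=None):
    """Return the static vector for `word`; the sentence is ignored on purpose."""
    return EMBEDDINGS[word]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


# The same word in two different contexts maps to the same vector.
v1 = embed("bank", "she sat on the river bank")
v2 = embed("bank", "he deposited money at the bank")
same_vector = v1 == v2
```

A transformer-based model would instead produce a different vector for "bank" in each sentence, which is the distinction the description above is drawing.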
It is probably one of the best libraries for using pretrained transformer-based models, as well as for training your own, across many NLP tasks.
https://github.com/huggingface/transformers
It is one of the most recent tokenization libraries and covers state-of-the-art as well as commonly used tokenizers. It prides itself on extremely fast tokenization.
https://github.com/huggingface/tokenizers/tree/master/bindings/python
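For readers new to subword tokenization, here is a rough sketch of the greedy longest-match idea behind WordPiece-style tokenizers, written in plain Python. It is not how the library above is implemented or called: the tiny `VOCAB` and the `wordpiece` and `tokenize` helpers are hypothetical and exist only to show the idea.

```python
# Hypothetical toy vocabulary; "##" marks a piece that continues a word.
VOCAB = {"token", "##ization", "##izer", "##s", "fast", "[UNK]"}


def wordpiece(word, vocab=VOCAB):
    """Split one word into the longest matching vocabulary pieces, left to right."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Shrink the candidate from the right until it is found in the vocabulary.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            # No piece matches at this position: give up on the whole word.
            return ["[UNK]"]
        pieces.append(piece)
        start = end
    return pieces


def tokenize(text):
    """Lowercase, split on whitespace, then apply the subword split per word."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(wordpiece(word))
    return tokens
```

With this toy vocabulary, `tokenize("Fast tokenizers")` returns `["fast", "token", "##izer", "##s"]`, and a word with no matching pieces becomes `["[UNK]"]`. Real tokenizers learn their vocabulary from data and handle punctuation, normalization, and special tokens, none of which this sketch attempts.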
It is built on top of NLTK and pattern and also supports multiple languages.
https://textblob.readthedocs.io/en/dev/
It is built on top of spaCy and adds some extra features.
https://chartbeat-labs.github.io/textacy/
https://www.tensorflow.org/tutorials/tensorflow_text/intro
It is a deep-learning-based NLP modeling framework built on PyTorch.
https://pytext.readthedocs.io/en/master/index.html
A very simple framework for state-of-the-art NLP, developed by Zalando Research. It also supports multiple languages and word embeddings.