forked from NatLibFi/Annif

Annif is a multi-algorithm automated subject indexing tool for libraries, archives and museums. This repository is used for developing a production version of the system, based on ideas from the initial prototype.


Annif is an automated subject indexing toolkit. It was originally created as a statistical automated indexing tool that used metadata from the Finna.fi discovery interface as a training corpus.

This repo contains a rewritten production version of Annif based on the prototype. It is a work in progress, but already functional for many common tasks.

Basic install

You will need Python 3.6+ to install Annif.
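
As a quick sanity check before installing, the version requirement can be tested from the shell. This is a minimal sketch, not part of Annif itself. It assumes the interpreter is available as python3 on your PATH, and the variable name PY_OK is only illustrative.

```shell
# Exit status is 0 when the interpreter is Python 3.6 or newer, non-zero otherwise
# (a missing python3 also takes the "no" branch).
if python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 6) else 1)'; then
    PY_OK=yes
else
    PY_OK=no
fi
echo "Python 3.6+ available: $PY_OK"
```

If this reports "no", install or select a newer Python before creating the virtual environment.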

The recommended way is to install Annif from PyPI into a virtual environment.

python3 -m venv annif-venv
source annif-venv/bin/activate
pip install annif

You will also need NLTK data files:

python -m nltk.downloader punkt

Start up the application:

annif

See Getting Started in the wiki for more details.

Docker install

Annif is also available as a pre-built Docker image, which you can run as a container. Please see the wiki documentation for details.

Development install

A development version of Annif can be installed by cloning the GitHub repository.

Installation and setup

Clone the repository.

Switch into the repository directory.
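
These first two steps might look like the following. The URL is an assumption based on the upstream project name NatLibFi/Annif that appears in the citation entry of this README. If you are working from a fork, substitute its URL. REPO_URL and REPO_DIR are illustrative variable names.

```shell
# Assumption: upstream repository URL inferred from the NatLibFi/Annif name;
# replace it with your fork's URL if needed.
REPO_URL="https://github.com/NatLibFi/Annif.git"

# Directory name derived from the URL (its basename without the ".git" suffix)
REPO_DIR="$(basename "$REPO_URL" .git)"

# Clone the repository and switch into it
git clone "$REPO_URL" "$REPO_DIR"
cd "$REPO_DIR"
```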

Create and activate a virtual environment (optional, but highly recommended):

python3 -m venv venv
. venv/bin/activate

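Install dependencies (including the development extras) and make the installation editable. The extras specifier is quoted so that shells such as zsh do not treat the square brackets as a glob pattern:

pip install ".[dev]"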
pip install -e .

You will also need NLTK data files:

python -m nltk.downloader punkt

Start up the application:

annif

Unit tests

Run . venv/bin/activate to enter the virtual environment and then run pytest. To have the test suite watch for changes in code and run automatically, use pytest-watch by running ptw.

Getting help

Many resources are available:

Publications / How to cite

An article about Annif has been published in the peer-reviewed Open Access journal LIBER Quarterly. The software itself is also archived on Zenodo and has a citable DOI.

Annif article

Suominen, O., 2019. Annif: DIY automated subject indexing using multiple algorithms. LIBER Quarterly, 29(1), pp.1–25. DOI: https://doi.org/10.18352/lq.10285

@article{suominen2019annif,
  title={Annif: DIY automated subject indexing using multiple algorithms},
  author={Suominen, Osma},
  journal={{LIBER} Quarterly},
  volume={29},
  number={1},
  pages={1--25},
  year={2019},
  doi = {10.18352/lq.10285},
  url = {https://doi.org/10.18352/lq.10285}
}

Citing the software itself

Zenodo DOI: https://doi.org/10.5281/zenodo.2578948

@misc{https://doi.org/10.5281/zenodo.2578948,
  doi = {10.5281/ZENODO.2578948},
  url = {https://doi.org/10.5281/zenodo.2578948},
  title = {NatLibFi/Annif},
  year = {2019}
}

License

The code in this repository is licensed under Apache License 2.0, except for the dependencies included under annif/static/css and annif/static/js, which have their own licenses. See the file headers for details.
