Evaluating pyannote.audio on endangered languages. Exporting segmented Praat TextGrid for linguists.
This project evaluates the performance of the pyannote.audio speaker diarization toolkit on endangered languages from the Pangloss collection. The goal is to export segmented Praat TextGrids that linguists can use for further analysis.
This work was partially funded by the DIAGNOSTIC project, supported by the Agence d’Innovation de Défense (contract no. 2022 65 007), and by the DEEPTYPO project, supported by the Agence Nationale de la Recherche (ANR-23-CE38-0003-01).
To clone this repository, run:
    git clone https://github.com/rfclara/diarization.git
To install the environment, run:
    curl -fsSL https://pixi.sh/install.sh | sh
    pixi install
Accept pyannote/segmentation-3.0 user conditions
Accept pyannote/speaker-diarization-3.1 user conditions
Create access token at hf.co/settings/tokens.
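The access token is needed because both models are gated on Hugging Face. As a rough illustration of where the token ends up, the sketch below loads the pipeline directly from Python. The helper names `get_hf_token` and `load_pipeline` and the use of an `HF_TOKEN` environment variable are assumptions made for this example; the repository's own script may pass the token differently.

```python
import os


def get_hf_token(env_var="HF_TOKEN"):
    """Read the Hugging Face access token from an environment variable.

    The variable name is an assumption for this sketch.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} to your Hugging Face access token.")
    return token


def load_pipeline(token):
    """Load the gated diarization pipeline (models are downloaded on first use)."""
    # Imported lazily so this file can be read and tested without pyannote.audio.
    from pyannote.audio import Pipeline

    # pyannote.audio 3.x takes `use_auth_token`; other releases may name
    # this argument differently, so check the installed version.
    return Pipeline.from_pretrained(
        "pyannote/speaker-diarization-3.1", use_auth_token=token
    )
```

Calling `load_pipeline(get_hf_token())` only succeeds once the user conditions of both models have been accepted with the account that owns the token.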
To run the diarization on your audio files, use the following command:
    pixi run python src/diarization/diarization.py <path_to_wav_file>
You can provide the number of speakers in advance (our experiments showed that this may not improve the results):
    pixi run python src/diarization/diarization.py <path_to_wav_file> --num_speakers <number_of_speakers>
You will get a segmented .TextGrid file (and an .rttm file). Enjoy!
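To show how the two output formats relate, here is a minimal sketch that turns RTTM speaker turns into a Praat TextGrid with one interval tier per speaker. It is not the code used in this repository: the function names (`parse_rttm`, `fill_gaps`, `to_textgrid`) are hypothetical, and it assumes standard RTTM lines where field 4 is the onset, field 5 the duration and field 8 the speaker label.

```python
from collections import defaultdict


def parse_rttm(text):
    """Return {speaker: [(start, end), ...]} from the contents of an RTTM file."""
    turns = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue  # skip blank or non-speaker lines
        onset, duration = float(fields[3]), float(fields[4])
        turns[fields[7]].append((onset, onset + duration))
    return {speaker: sorted(segs) for speaker, segs in turns.items()}


def fill_gaps(segments, label, total):
    """Build contiguous (start, end, text) intervals, as interval tiers require."""
    intervals, cursor = [], 0.0
    for start, end in segments:
        start = max(start, cursor)  # clip any overlap with the previous turn
        if end <= start:
            continue
        if start > cursor:
            intervals.append((cursor, start, ""))  # silence before the turn
        intervals.append((start, end, label))
        cursor = end
    if cursor < total:
        intervals.append((cursor, total, ""))  # trailing silence
    return intervals


def to_textgrid(turns, total=None):
    """Serialize speaker turns as a long-format TextGrid string."""
    if total is None:
        total = max((end for segs in turns.values() for _, end in segs), default=0.0)
    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        "",
        "xmin = 0",
        f"xmax = {total}",
        "tiers? <exists>",
        f"size = {len(turns)}",
        "item []:",
    ]
    for i, (speaker, segs) in enumerate(sorted(turns.items()), start=1):
        intervals = fill_gaps(segs, speaker, total)
        lines += [
            f"    item [{i}]:",
            '        class = "IntervalTier"',
            f'        name = "{speaker}"',
            "        xmin = 0",
            f"        xmax = {total}",
            f"        intervals: size = {len(intervals)}",
        ]
        for j, (start, end, text) in enumerate(intervals, start=1):
            lines += [
                f"        intervals [{j}]:",
                f"            xmin = {start}",
                f"            xmax = {end}",
                f'            text = "{text}"',
            ]
    return "\n".join(lines) + "\n"
```

One tier per speaker keeps overlapping speech from different speakers representable, since intervals within a single Praat tier cannot overlap. The sketch uses the last turn's end time as the tier length when `total` is not given; passing the real audio duration would be more accurate.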