A new evaluation of the multilingual capabilities of pre-trained neural speech models

Evaluating pyannote.audio on endangered languages and exporting segmented Praat TextGrids for linguists.

Introduction

This project evaluates the performance of the pyannote.audio speaker diarization toolkit on endangered languages from the Pangloss collection. The goal is to export segmented Praat TextGrids that linguists can use for further analysis.

This work was partially funded by the DIAGNOSTIC project, supported by the Agence d’Innovation de Défense (contract no. 2022 65 007), and by the DEEPTYPO project, supported by the Agence Nationale de la Recherche (ANR-23-CE38-0003-01).

Installation

To clone this repository, run:

git clone https://github.com/rfclara/diarization.git

To install the environment, first install Pixi (pixi.sh):

curl -fsSL https://pixi.sh/install.sh | sh

Then, from the root of the cloned repository, run:

pixi install

Permissions

Accept the pyannote/segmentation-3.0 user conditions on Hugging Face.

Accept the pyannote/speaker-diarization-3.1 user conditions on Hugging Face.

Create an access token at hf.co/settings/tokens.
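For readers who want to see where the access token ends up, here is a minimal sketch of loading the pretrained pipeline from Python. It is an illustration, not the code of this repository: the helper name `load_pipeline` and the use of an `HF_TOKEN` environment variable are assumptions, and the `use_auth_token` keyword is the one used by pyannote.audio 3.1 (other versions may name it differently).

```python
import os


def load_pipeline(token=None):
    """Load pyannote/speaker-diarization-3.1 with a Hugging Face access token.

    Hypothetical helper: the token is taken from the argument or, failing
    that, from the HF_TOKEN environment variable (an assumption).
    """
    token = token or os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError(
            "No Hugging Face token found: pass one explicitly or set HF_TOKEN."
        )
    # Imported lazily so the missing-token error is reported first.
    from pyannote.audio import Pipeline

    # Keyword name as in pyannote.audio 3.1.
    return Pipeline.from_pretrained(
        "pyannote/speaker-diarization-3.1",
        use_auth_token=token,
    )
```

Loading fails if the user conditions of both models listed above have not been accepted with the account that owns the token.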

Usage

To run the diarization on your audio files, use the following command:

pixi run python src/diarization/diarization.py <path_to_wav_file>

You can provide the number of speakers in advance (our experiments showed that this may not improve the results):

pixi run python src/diarization/diarization.py <path_to_wav_file> --num_speakers <number_of_speakers>
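As a rough illustration of what the optional speaker count does, the sketch below forwards it to a pyannote.audio pipeline only when it is given. The function name `diarize` is hypothetical and the `pipeline` argument is assumed to be an already loaded pyannote.audio diarization pipeline; how `diarization.py` handles the flag internally may differ.

```python
def diarize(pipeline, wav_path, num_speakers=None):
    """Run a diarization pipeline, passing num_speakers only if provided.

    `pipeline` is assumed to be a callable pyannote.audio pipeline;
    when num_speakers is None the pipeline estimates the count itself.
    """
    kwargs = {}
    if num_speakers is not None:
        kwargs["num_speakers"] = num_speakers
    return pipeline(wav_path, **kwargs)
```

pyannote.audio pipelines also accept `min_speakers` and `max_speakers` bounds, which could be forwarded the same way when only a range is known.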

You will get a segmented .TextGrid file (and an .rttm file). Enjoy!
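To show how the two output formats relate, here is a minimal sketch that parses RTTM text and renders the same segments as a long-format Praat TextGrid. It is not the exporter used by this project: the function names, the choice of one IntervalTier per speaker, and the use of the speaker label as interval text are all assumptions made for illustration.

```python
from collections import defaultdict


def parse_rttm(text):
    """Return (start, end, speaker) tuples from RTTM text."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        # RTTM: SPEAKER <file> <chan> <onset> <duration> <NA> <NA> <speaker> ...
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue
        start, duration = float(fields[3]), float(fields[4])
        segments.append((start, start + duration, fields[7]))
    return segments


def to_textgrid(segments, total_duration):
    """Render segments as a TextGrid string, one IntervalTier per speaker."""
    by_speaker = defaultdict(list)
    for start, end, speaker in segments:
        by_speaker[speaker].append((start, end))

    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        "",
        "xmin = 0",
        f"xmax = {total_duration}",
        "tiers? <exists>",
        f"size = {len(by_speaker)}",
        "item []:",
    ]
    for tier_index, speaker in enumerate(sorted(by_speaker), start=1):
        # Interval tiers must cover [xmin, xmax] without gaps, so
        # stretches without speech become intervals with empty text.
        intervals, cursor = [], 0.0
        for start, end in sorted(by_speaker[speaker]):
            start = max(start, cursor)  # clamp overlapping segments
            if end <= start:
                continue
            if start > cursor:
                intervals.append((cursor, start, ""))
            intervals.append((start, end, speaker))
            cursor = end
        if cursor < total_duration:
            intervals.append((cursor, total_duration, ""))
        lines += [
            f"    item [{tier_index}]:",
            '        class = "IntervalTier"',
            f'        name = "{speaker}"',
            "        xmin = 0",
            f"        xmax = {total_duration}",
            f"        intervals: size = {len(intervals)}",
        ]
        for i, (xmin, xmax, label) in enumerate(intervals, start=1):
            lines += [
                f"        intervals [{i}]:",
                f"            xmin = {xmin}",
                f"            xmax = {xmax}",
                f'            text = "{label}"',
            ]
    return "\n".join(lines) + "\n"
```

Calling `to_textgrid(parse_rttm(rttm_text), total_duration)` with the recording length in seconds yields a string that can be written to a `.TextGrid` file and opened in Praat.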

