NoCoLA

This repository accompanies the paper "NoCoLA: The Norwegian Corpus of Linguistic Acceptability" by Matias Jentoft and David Samuel of the Language Technology Group at the University of Oslo. NoCoLA consists of two datasets: "class", which contains Norwegian sentences with binary acceptability judgements, and "zero", which contains pairs of unacceptable sentences and their acceptable counterparts.

NoCoLA is also available on HuggingFace at https://huggingface.co/datasets/ltg/nocola

Both linguistic acceptability datasets are published here. For the "class" version, we provide a pre-made 80/10/10 split for training purposes.

If you wish to test a Norwegian language model for its competence in Norwegian grammar, all the necessary code is available in this repository.
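
To illustrate how sentence pairs like those in the "zero" dataset can be used, here is a minimal sketch of a common minimal-pair evaluation: a model counts as correct on a pair when it scores the acceptable sentence higher than the unacceptable one. This is not the repository's evaluation code. The names `pairwise_accuracy`, `toy_score`, `TOY_SCORES`, and `EXAMPLE_PAIRS` are hypothetical, the example sentences are invented for illustration and are not taken from the corpus, and the scores are made-up numbers standing in for a real language model's sentence log-probabilities.

```python
from typing import Callable, Iterable, Tuple

# Each pair is (acceptable sentence, unacceptable sentence).
Pair = Tuple[str, str]


def pairwise_accuracy(pairs: Iterable[Pair], score: Callable[[str], float]) -> float:
    """Fraction of pairs where the acceptable sentence gets the higher score."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one sentence pair is required")
    correct = sum(
        1 for acceptable, unacceptable in pairs
        if score(acceptable) > score(unacceptable)
    )
    return correct / len(pairs)


# Invented example pairs (assumption: NOT taken from NoCoLA).
EXAMPLE_PAIRS = [
    ("Jeg har lest boka.", "Jeg har lese boka."),
    ("Hun bor i Oslo.", "Hun bo i Oslo."),
]

# Made-up scores standing in for a model's sentence log-probabilities.
TOY_SCORES = {
    "Jeg har lest boka.": -12.0,
    "Jeg har lese boka.": -15.5,
    "Hun bor i Oslo.": -11.0,
    "Hun bo i Oslo.": -10.2,  # the toy "model" wrongly prefers this one
}


def toy_score(sentence: str) -> float:
    """Placeholder scorer; replace with a real language model's scoring function."""
    return TOY_SCORES[sentence]


accuracy = pairwise_accuracy(EXAMPLE_PAIRS, toy_score)
print(f"Pairwise accuracy: {accuracy:.2f}")
```

In practice, `toy_score` would be replaced by a function that returns the log-probability (or pseudo-log-probability) a language model assigns to a sentence; the scoring loop itself stays the same.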
