ltgoslo/nocola

NoCoLA

This repository supports the paper "NoCoLA: The Norwegian Corpus of Linguistic Acceptability" by Matias Jentoft and David Samuel of the Language Technology Group at the University of Oslo. NoCoLA consists of two datasets: "class", which contains Norwegian sentences with binary acceptability judgements, and "zero", which contains pairs of unacceptable sentences and their acceptable counterparts.

NoCoLA is also available on HuggingFace at https://huggingface.co/datasets/ltg/nocola

Both linguistic acceptability datasets are published here. For the "class" version, we provide a pre-made 80/10/10 split for training purposes.

If you wish to test a Norwegian language model for its competence in Norwegian grammar, all the necessary code is available in this repository.
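One common way to probe a language model with paired data such as the "zero" dataset is to count how often the model scores the acceptable sentence higher than its unacceptable counterpart. The sketch below only illustrates that idea and is not the evaluation code shipped in this repository. The function name `minimal_pair_accuracy`, the `score` callable, the toy scores, and the example sentences are all assumptions made up for illustration; none of them are taken from the corpus or the paper.

```python
from typing import Callable, Iterable, Tuple


def minimal_pair_accuracy(
    pairs: Iterable[Tuple[str, str]],
    score: Callable[[str], float],
) -> float:
    """Return the fraction of (acceptable, unacceptable) pairs in which
    the acceptable sentence receives a strictly higher score."""
    total = 0
    correct = 0
    for acceptable, unacceptable in pairs:
        total += 1
        if score(acceptable) > score(unacceptable):
            correct += 1
    if total == 0:
        raise ValueError("no sentence pairs given")
    return correct / total


# Hypothetical sentence pairs (invented here, not drawn from NoCoLA).
pairs = [
    ("Jeg liker kaffe.", "Jeg like kaffe."),
    ("Hun har lest boka.", "Hun har lese boka."),
]

# Stand-in for a real model: fixed, made-up log-probabilities per sentence.
toy_scores = {
    "Jeg liker kaffe.": -4.2,
    "Jeg like kaffe.": -7.9,
    "Hun har lest boka.": -6.5,
    "Hun har lese boka.": -6.1,
}

accuracy = minimal_pair_accuracy(pairs, toy_scores.__getitem__)
print(accuracy)  # 0.5: the toy scorer prefers the acceptable sentence in one of two pairs
```

In practice, `score` would be replaced by a function that returns a sentence-level score from the model under test, such as a summed token log-probability.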

About

Official repository for NoCoLA dataset
