Extensible Python-first benchmark comparing VLMs (CLIP-style and LLaVA-style) to children's behavioral data from LEVANTE. R is used for downloading trials (Redivis) and for statistical comparison (DevBench-style metrics); Python is used for config, data loaders, model adapters, and the evaluation runner.
```bash
python3 -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -r requirements.txt
pip install -e .            # install this package
# Optional: pip install -r requirements-transformers.txt  # for CLIP
```

Pinned dependencies: `requirements.txt`. Dev dependencies: `requirements-dev.txt`.
- Data (R): Install R and the `redivis` package, then run `Rscript scripts/download_levante_data.R` to fetch trials into `data/raw/<version>/`.
- Assets (Python): Run `python scripts/download_levante_assets.py [--version YYYY-MM-DD]` to download the corpus and images from the public LEVANTE bucket into `data/assets/<version>/`.
- Evaluate: Run `levante-bench list-tasks` and `levante-bench list-models` to see what is available, then `levante-bench run-eval --task egma-math --model clip_base [--version VERSION]`.
- Compare (R): Run `levante-bench run-comparison --task egma-math --model clip_base`, or run `Rscript comparison/compare_levante.R --task TASK --model MODEL` directly.
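The comparison step produces DevBench-style metrics in R. As a rough illustration of what that kind of model-to-child comparison involves (not the actual implementation, and with made-up function names and toy numbers), one common approach is to compare the model's per-trial choice distribution to children's response distribution, e.g. via a mean KL divergence:

```python
import math

def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) between two discrete distributions over the same answer options."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def mean_trial_kl(child_dists, model_dists):
    """Average per-trial divergence between child and model choice distributions."""
    pairs = list(zip(child_dists, model_dists))
    return sum(kl_divergence(p, q) for p, q in pairs) / len(pairs)

# Hypothetical 2-trial task with 3 answer options per trial.
children = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]]
model = [[0.6, 0.3, 0.1], [0.4, 0.4, 0.2]]
print(mean_trial_kl(children, model))  # lower = closer match to children
```

A lower score means the model's answer preferences track children's more closely; the actual metrics used here are the R implementation's, documented with the comparison scripts.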
See docs/README.md for data schema, releases, adding tasks/models, and secrets setup.
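To give a feel for what adding a model involves, here is a minimal sketch of a model adapter. The real interface lives in this package; the `Trial` shape, class name, and `score` method below are assumptions for illustration only, with a token-overlap stub in place of an actual CLIP checkpoint:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Trial:
    """One benchmark trial: a prompt and its candidate images (IDs or paths)."""
    prompt: str
    image_ids: List[str]

class DummyClipAdapter:
    """Illustrative adapter: a real adapter would load a CLIP model and
    return image-text similarity scores instead of this stub."""

    def score(self, trial: Trial) -> List[float]:
        # Stub: score each candidate by crude prompt/image-id token overlap.
        prompt_tokens = set(trial.prompt.lower().split())
        return [
            len(prompt_tokens & set(img.lower().replace("_", " ").split()))
            for img in trial.image_ids
        ]

trial = Trial(prompt="the red ball", image_ids=["red_ball", "blue_cup"])
adapter = DummyClipAdapter()
print(adapter.score(trial))  # → [2, 0]: first image matches the prompt best
```

The evaluation runner would then call `score` per trial and aggregate accuracy against the answer key; see docs/README.md for the actual adapter registration steps.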
Cite the LEVANTE manuscript and the DevBench (NeurIPS 2024) paper when using this benchmark.