allow providing vlm captions with surrounding page text by edknv · Pull Request #1389 · NVIDIA/nv-ingest

edknv · 2026-02-10T03:28:23Z

Description

This PR adds a context_text_max_chars parameter that enriches VLM image captions with surrounding page text. When enabled, nearby page text is prepended to each image's caption prompt, improving retrieval accuracy for documents where images and their surrounding text are semantically linked.
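
The prepending behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name build_caption_prompt, the wording of the context header, and the convention that a value of 0 disables the feature are all assumptions. Only the parameter name context_text_max_chars comes from the description.

```python
def build_caption_prompt(
    base_prompt: str,
    context_text: str,
    context_text_max_chars: int = 0,
) -> str:
    """Prepend nearby page text to a caption prompt (hypothetical helper).

    Assumption: a non-positive context_text_max_chars means the feature is off.
    """
    cleaned = context_text.strip()
    if context_text_max_chars <= 0 or not cleaned:
        # Feature disabled or no usable text: leave the prompt unchanged.
        return base_prompt

    # Cap the context so the prompt stays bounded in size.
    snippet = cleaned[:context_text_max_chars]

    # The header wording here is illustrative only.
    return f"Text near this image:\n{snippet}\n\n{base_prompt}"


if __name__ == "__main__":
    prompt = build_caption_prompt(
        "Caption the content of this image:",
        "Figure 3 shows quarterly revenue by region.",
        context_text_max_chars=8,
    )
    print(prompt)
```

Capping by character count keeps the prompt size predictable regardless of how much text surrounds an image; a real implementation might instead truncate on word or sentence boundaries.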

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.
If adjusting docker-compose.yaml environment variables, have you ensured those are mimicked in the Helm values.yaml file?

edknv and others added 5 commits February 9, 2026 19:25

allow providing vlm captions with surrounding page text

8f85a3a

lint

d768420

Merge branch 'main' into edwardk/vlm-caption-context-text

91dfafa

lint

8430ae0

Merge branch 'main' into edwardk/vlm-caption-context-text

3cd77e9

edknv wants to merge 5 commits into NVIDIA:main from edknv:edwardk/vlm-caption-context-text
