itell-volume-generation

Tests related to converting arbitrary PDFs into iTELL JSON format.

Proof of concept

Pipeline for Extracting Images & Embedding in ITELL JSON using Gemini API

Extract Images from PDF
- Use PyMuPDF to parse the PDF and extract all images.
- Store images locally for now, with possible migration to a hosted DB later.
- Extract and record metadata for each image:
  - Original position (coordinates) within the PDF page
  - Page number, image size, etc.
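The extraction step above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the `ImageRecord` fields, the `extract_images` and `records_to_json` helpers, and the file naming scheme are all hypothetical. It assumes PyMuPDF is installed (imported as `fitz`) and uses `Page.get_images`, `Document.extract_image`, and `Page.get_image_rects`.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import List, Optional


@dataclass
class ImageRecord:
    """Hypothetical metadata record for one extracted image."""
    page_number: int              # 1-based page number
    xref: int                     # PDF cross-reference number of the image
    path: str                     # where the image bytes were saved locally
    width: int                    # pixel width reported by PyMuPDF
    height: int                   # pixel height reported by PyMuPDF
    bbox: Optional[List[float]]   # [x0, y0, x1, y1] on the page, in PDF points


def extract_images(pdf_path: str, image_dir: str) -> List[ImageRecord]:
    """Save every embedded image and collect its position metadata."""
    import fitz  # PyMuPDF; imported lazily so the helpers below work without it

    out_dir = Path(image_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    records = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            for img in page.get_images(full=True):
                xref = img[0]
                info = doc.extract_image(xref)
                # An image can be placed more than once; keep the first placement.
                rects = page.get_image_rects(xref)
                bbox = [round(v, 1) for v in rects[0]] if rects else None
                target = out_dir / f"page{page.number + 1}_xref{xref}.{info['ext']}"
                target.write_bytes(info["image"])
                records.append(ImageRecord(page.number + 1, xref, str(target),
                                           info["width"], info["height"], bbox))
    return records


def records_to_json(records: List[ImageRecord]) -> str:
    """Serialize the metadata so it can be pasted into a prompt."""
    return json.dumps([asdict(r) for r in records], indent=2)
```

Calling `records_to_json(extract_images("src/resources/input.pdf", "results/extracted-images"))` would then yield a JSON list with one entry per image.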

Gemini API Integration
- Send the PDF file directly to the Gemini API.
- In the prompt, include:
  - Reference to the ITELL guide
  - Example ITELL JSON
  - The image metadata (positions, page numbers, etc.)
- Goal: Ensure Gemini embeds images within the ITELL JSON at their correct locations according to the extracted PDF positions.
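One way to assemble the prompt described above is sketched below. The `build_prompt` function, the section titles, and the instruction wording are assumptions made for illustration; the real prompt text lives in the repository and may differ. The sketch only builds the text part of the request and does not call any API.

```python
import json


def build_prompt(guide_text, reference_json, image_metadata):
    """Combine the guide, an example JSON, and image metadata into one prompt.

    All wording here is hypothetical; only the three ingredients come from
    the pipeline description.
    """
    sections = [
        "Convert the attached PDF into iTELL JSON.",
        "## iTELL guide\n" + guide_text.strip(),
        "## Example iTELL JSON\n" + json.dumps(reference_json, indent=2),
        "## Extracted image metadata\n" + json.dumps(image_metadata, indent=2),
        "Place each image in the JSON at the location that matches its "
        "page number and position in the PDF.",
    ]
    return "\n\n".join(sections)


# Placeholder inputs; real values would come from guide.md, reference.json,
# and the image extraction step.
prompt = build_prompt(
    guide_text="(contents of guide.md)",
    reference_json={"placeholder": "contents of reference.json"},
    image_metadata=[{"page_number": 1, "bbox": [72.0, 100.0, 300.0, 250.0]}],
)
```

Keeping the metadata as JSON inside the prompt lets the model match each image to a page and position without any extra parsing convention.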

Setup

Prerequisites

Python 3.8 or higher

Installation

Clone the repository (if not already done):

```
git clone <repository-url>
cd itell-volume-generation
```

Create a virtual environment:
```
python -m venv venv
```
Activate the virtual environment:
- On macOS/Linux:
```
source venv/bin/activate
```
- On Windows:
```
venv\Scripts\activate
```
Install dependencies:
```
pip install -r requirements.txt
```
Create a .env file in the project root:
```
touch .env
```

Configure environment variables in .env:

```
# Required: Choose one API provider
OPENAI_API_KEY=your_openai_api_key_here
# OR
OPENROUTER_API_KEY=your_openrouter_api_key_here

# Optional: Model configuration
OPENAI_MODEL=gpt-4o-mini
OPENROUTER_MODEL=google/gemini-2.5-flash

# Optional: Base URL overrides
OPENAI_BASE_URL=https://api.openai.com/v1
OPENROUTER_BASE_URL=https://openrouter.ai/api/v1

# Optional: OpenRouter-specific headers
OPENROUTER_SITE_URL=https://yoursite.com
OPENROUTER_APP_NAME=YourAppName
```
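To make the file format concrete, here is a minimal standard-library sketch of how such a .env file can be read. The `parse_env_text` helper is hypothetical; the project may well rely on a library such as python-dotenv instead. The sketch assumes plain `KEY=VALUE` lines with `#` comments and nothing more elaborate.

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so URLs containing "=" stay intact.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


sample = """
# Required: Choose one API provider
OPENROUTER_API_KEY=your_openrouter_api_key_here
# Optional: Model configuration
OPENROUTER_MODEL=google/gemini-2.5-flash
"""
settings = parse_env_text(sample)
```

Comment lines such as `# OR` are ignored, so the file can keep its explanatory notes.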

CLI usage

Run the pipeline:

```
python src/pipeline/main.py \
  --pdf src/resources/input.pdf \
  --guide src/resources/guide.md \
  --reference-json src/resources/reference.json \
  --image-dir results/extracted-images \
  --output results/itell.json
```
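For orientation, the flags above could be declared with `argparse` along these lines. This is a sketch, not the contents of src/pipeline/main.py: which flags are required, the help texts, and the `build_parser` name are assumptions. The `--model`, `--api-key`, and `--base-url` flags are the ones mentioned under OpenRouter setup.

```python
import argparse


def build_parser():
    """Declare the documented CLI flags (required/optional split is assumed)."""
    parser = argparse.ArgumentParser(
        description="Convert a PDF into iTELL JSON (illustrative sketch)."
    )
    parser.add_argument("--pdf", required=True, help="input PDF file")
    parser.add_argument("--guide", required=True, help="iTELL guide in Markdown")
    parser.add_argument("--reference-json", required=True,
                        help="example iTELL JSON used as a reference")
    parser.add_argument("--image-dir", required=True,
                        help="directory for extracted images")
    parser.add_argument("--output", required=True, help="path of the JSON to write")
    # Optional provider overrides.
    parser.add_argument("--model", default=None)
    parser.add_argument("--api-key", default=None)
    parser.add_argument("--base-url", default=None)
    return parser


args = build_parser().parse_args([
    "--pdf", "src/resources/input.pdf",
    "--guide", "src/resources/guide.md",
    "--reference-json", "src/resources/reference.json",
    "--image-dir", "results/extracted-images",
    "--output", "results/itell.json",
])
```

Note that `argparse` exposes hyphenated flags with underscores, so `--reference-json` becomes `args.reference_json`.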

OpenRouter setup

- Provide only OPENROUTER_API_KEY (omit OPENAI_API_KEY) to automatically target https://openrouter.ai/api/v1. Override with OPENROUTER_BASE_URL or --base-url if needed.
- Most OpenRouter models use provider-scoped names such as google/gemini-2.5-flash (the default when no model is specified) or openrouter/openai/gpt-4o-mini. Pass a custom name via --model or set OPENROUTER_MODEL.
- Optional headers recommended by OpenRouter can be set via OPENROUTER_SITE_URL (becomes HTTP-Referer) and OPENROUTER_APP_NAME (becomes X-Title).
- If you keep both OpenAI and OpenRouter keys, pass --api-key/--base-url explicitly so the correct provider is used.
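The selection rules above can be summarized in a small sketch. The `resolve_provider` function and its return shape are hypothetical, and two behaviors are assumptions rather than documented facts: raising an error when both keys are present without explicit flags, and falling back to gpt-4o-mini for OpenAI when OPENAI_MODEL is unset.

```python
OPENAI_DEFAULT_BASE = "https://api.openai.com/v1"
OPENROUTER_DEFAULT_BASE = "https://openrouter.ai/api/v1"


def resolve_provider(env, api_key=None, base_url=None, model=None):
    """Pick key, base URL, model, and headers from env vars and CLI overrides."""
    if api_key:
        # Explicit --api-key/--base-url win over anything in the environment.
        if not base_url:
            raise ValueError("pass --base-url together with --api-key")
        return {"api_key": api_key, "base_url": base_url,
                "model": model, "headers": {}}

    openai_key = env.get("OPENAI_API_KEY")
    openrouter_key = env.get("OPENROUTER_API_KEY")
    if openai_key and openrouter_key:
        # Assumption: refuse to guess rather than silently pick one provider.
        raise ValueError("both keys set; pass --api-key/--base-url explicitly")

    if openrouter_key:
        headers = {}
        if env.get("OPENROUTER_SITE_URL"):
            headers["HTTP-Referer"] = env["OPENROUTER_SITE_URL"]
        if env.get("OPENROUTER_APP_NAME"):
            headers["X-Title"] = env["OPENROUTER_APP_NAME"]
        return {
            "api_key": openrouter_key,
            "base_url": base_url or env.get("OPENROUTER_BASE_URL", OPENROUTER_DEFAULT_BASE),
            "model": model or env.get("OPENROUTER_MODEL", "google/gemini-2.5-flash"),
            "headers": headers,
        }

    if openai_key:
        return {
            "api_key": openai_key,
            "base_url": base_url or env.get("OPENAI_BASE_URL", OPENAI_DEFAULT_BASE),
            "model": model or env.get("OPENAI_MODEL", "gpt-4o-mini"),  # assumed default
            "headers": {},
        }

    raise ValueError("set OPENAI_API_KEY or OPENROUTER_API_KEY")
```

Passing `os.environ` as `env` would apply the same rules to the real environment; a plain dict works just as well for testing.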
