page2pod

Convert web pages and articles to chapter-based podcasts using OpenAI TTS.

Features

Extracts chapters from H2 headings
Generates MP3 with embedded chapter markers
Caches individual chapters and regenerates only those whose content changed
Outputs chapters.json for web players
Supports both local HTML files and URLs
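
To make the H2-based chapter split concrete, here is a minimal sketch that uses only the Python standard library. It is not page2pod's own code: the tool depends on beautifulsoup4, and `html.parser` is swapped in here so the example runs without extra packages. The names `H2ChapterParser` and `extract_chapters` are hypothetical, and dropping any text that appears before the first H2 is an assumption.

```python
from html.parser import HTMLParser


class H2ChapterParser(HTMLParser):
    """Collect chapters, starting a new one at every <h2> (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.chapters = []      # list of {"title": str, "text": str}
        self._in_h2 = False     # True while inside an <h2> element
        self._skip_depth = 0    # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "h2":
            self._in_h2 = True
            self.chapters.append({"title": "", "text": ""})

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "h2":
            self._in_h2 = False

    def handle_data(self, data):
        # Assumption: ignore scripts, styles, and text before the first <h2>.
        if self._skip_depth or not self.chapters:
            return
        key = "title" if self._in_h2 else "text"
        self.chapters[-1][key] += data


def extract_chapters(html):
    """Return a list of {"title", "text"} dicts, one per <h2> section."""
    parser = H2ChapterParser()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so each chapter is plain, speakable text.
    return [
        {"title": " ".join(c["title"].split()), "text": " ".join(c["text"].split())}
        for c in parser.chapters
    ]
```

Each returned entry corresponds to one chapter that would be sent to TTS and cached separately.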

Installation

pip install openai mutagen beautifulsoup4 requests

Or install it as a command-line tool from the repository root:

pip install -e .

Usage

# Convert a local HTML file
page2pod index.html

# Convert a URL
page2pod https://example.com/article

# Force regenerate everything
page2pod index.html --force

# Regenerate only chapter 3
page2pod index.html --chapter 3

# Just recombine existing chapters (no TTS calls)
page2pod index.html --combine

# List chapters without generating
page2pod index.html --list

# Custom output directory
page2pod index.html -o ./output

# Use different voice
page2pod index.html -v nova

Voices

OpenAI TTS voices: alloy, echo, fable, onyx (default), nova, shimmer
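
For batch jobs it can help to assemble the command line in code. The sketch below builds an argument list from the flags shown under Usage and rejects voices that are not in the list above. `build_command` is a hypothetical helper, not part of page2pod, and it only uses flags that this README documents.

```python
# Voice names copied from the list in this README.
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}


def build_command(source, voice=None, output_dir=None, force=False):
    """Return the argv list for one page2pod run (hypothetical helper)."""
    if voice is not None and voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    cmd = ["page2pod", source]
    if output_dir is not None:
        cmd += ["-o", output_dir]   # custom output directory
    if voice is not None:
        cmd += ["-v", voice]        # omit to keep the default voice
    if force:
        cmd.append("--force")       # regenerate everything
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once page2pod is installed and on the PATH.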

Cache Structure

Each page gets its own cache directory in ~/.cache/page2pod/<page-id>/:

~/.cache/page2pod/example-com-article/
├── meta.json              # Chapter hashes for change detection
├── source.html            # Original HTML
└── chapters/
    ├── 00-introduction.mp3
    ├── 00-introduction.txt
    ├── 01-getting-started.mp3
    ├── 01-getting-started.txt
    └── ...
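
The change detection behind meta.json can be pictured roughly as follows. This is a sketch under assumptions: the README only says that meta.json holds chapter hashes, so the SHA-256 choice, the index-keyed dictionary layout, and the function names are all hypothetical.

```python
import hashlib


def text_hash(text):
    """Stable fingerprint of a chapter's text (SHA-256 is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def chapters_to_regenerate(chapters, cached_hashes):
    """Return indexes of chapters whose text no longer matches the cache.

    chapters: list of chapter texts, in order.
    cached_hashes: dict mapping the chapter index as a string (JSON keys are
    strings) to the hash stored on the previous run. Hypothetical layout.
    """
    stale = []
    for index, text in enumerate(chapters):
        if cached_hashes.get(str(index)) != text_hash(text):
            stale.append(index)
    return stale
```

Only the chapters returned here would need a new TTS call; the rest can reuse their cached MP3 files.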

Output

./
├── example-com-article.mp3           # Combined MP3 with chapter markers
└── example-com-article.chapters.json # Chapter index for web players
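
The chapter index is essentially a running total of chapter durations. A minimal sketch, assuming each chapter's duration in seconds is already known (for example, read from the cached MP3 files); `build_chapter_index` is a hypothetical name, and the `chapters`, `title`, `start`, and `end` fields follow the Web Player example in this README.

```python
def build_chapter_index(chapters):
    """Build a chapters.json-style dict from (title, duration_seconds) pairs."""
    index = []
    start = 0.0
    for title, duration in chapters:
        end = start + duration
        index.append({"title": title, "start": start, "end": end})
        start = end  # the next chapter begins where this one ends
    return {"chapters": index}
```

Writing the result with `json.dump` would produce a file in the shape the web player snippet reads.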

Web Player

The .chapters.json file can be loaded by web players such as Plyr.js. Each chapter entry has a title plus start and end times in seconds:

fetch('example-com-article.chapters.json')
  .then(res => res.json())
  .then(data => {
    data.chapters.forEach(ch => {
      console.log(`${ch.title}: ${ch.start}s - ${ch.end}s`);
    });
  });
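
The same file can be read outside the browser. As a small sketch, the hypothetical helper below finds which chapter is playing at a given timestamp, assuming the `title`, `start`, and `end` fields used in the snippet above.

```python
def chapter_at(chapters, seconds):
    """Return the chapter dict playing at `seconds`, or None if out of range."""
    for chapter in chapters:
        # Start is inclusive and end is exclusive, so boundaries never overlap.
        if chapter["start"] <= seconds < chapter["end"]:
            return chapter
    return None
```

Pass it the `chapters` list loaded with `json.load` from the generated file.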

Environment

Requires the OPENAI_API_KEY environment variable to be set.
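
A wrapper script can check for the key up front instead of failing partway through a run. This is a sketch only; `require_api_key` is a hypothetical helper and says nothing about how page2pod itself reads the variable.

```python
import os


def require_api_key(env=os.environ):
    """Return the API key, or raise a clear error if it is missing or blank."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it before running page2pod")
    return key
```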

About Intelligrit Labs

page2pod is developed by Intelligrit Labs, the R&D arm of Intelligrit LLC. We build tools for ourselves and release them for everyone. Intelligrit delivers AI-driven IT modernization for federal agencies.

License

MIT
