Curate Docs For AI (with Claude Code)

Curate and index documentation from any website into collections like tailwind/, horses/, etc. Reference collection indexes in your AI chats (e.g. @tailwind/INDEX.xml what's a utility?) so that only relevant docs are analysed. Much cleaner than a web-fetch and more focussed than a web-search. Keep your AI context sharp.

Terminal showing three-step workflow: (1) Running /curate-doc biome command, (2) Curation success output showing scraped documentation and generated INDEX.xml entry, (3) Use /ask-docs to query docs. Handwritten annotations highlight each step.

Complete workflow: curate → auto scrape → "/ask-docs biome Validate my config file please"

📦 Repo Collections

Available collections in this repo:

| Collection | Collection Index | Description | Scraped | Source |
|---|---|---|---|---|
| 📦 `biome/` | 📄 `biome/INDEX.xml` | Fast linter/formatter | 2025-11-04 | Official |
| 📦 `claudecode/` | 📄 `claudecode/INDEX.xml` | Anthropic Claude Code | 2026-01-07 | Official |
| 📦 `claudeplat/` | 📄 `claudeplat/INDEX.xml` | Anthropic Claude Platform | 2026-01-07 | Official |
| 📦 `clerk/` | 📄 `clerk/INDEX.xml` | Authentication | 2025-12-03 | Official |
| 📦 `convex/` | 📄 `convex/INDEX.xml` | Reactive database | 2026-01-07 | Official |
| 🪝 `lefthook/` | 📄 `lefthook/INDEX.xml` | Git hooks manager | 2025-11-24 | Official |
| 📦 `marimo/` | 📄 `marimo/INDEX.xml` | Reactive Python notebooks | 2025-11-11 | Official |
| 📦 `nextjs/` | 📄 `nextjs/INDEX.xml` | React framework | 2025-12-02 | Official |
| 📦 `playwright/` | 📄 `playwright/INDEX.xml` | Browser testing | 2025-11-07 | Official |
| 📦 `shadcn/` | 📄 `shadcn/INDEX.xml` | React UI components | 2025-12-16 | Official, Guide |
| 📦 `shiny/` | 📄 `shiny/INDEX.xml` | Python web apps | 2025-11-02 | Official |
| 📦 `tailwind/` | 📄 `tailwind/INDEX.xml` | CSS framework | 2025-10-15 | Official |
| 📦 `tailwindplus/` | 📄 `tailwindplus/INDEX.xml` | Paid UI Components | 2025-11-16 | Official |
| 📦 `uv/` | 📄 `uv/INDEX.xml` | Python projects | 2026-01-16 | Official |
| 📦 `vercel/` | 📄 `vercel/INDEX.xml` | Deployment platform | 2025-10-20 | Official |
| 📦 `vitest/` | 📄 `vitest/INDEX.xml` | Testing framework | 2025-11-05 | Official |
| 📦 `zustand/` | 📄 `zustand/INDEX.xml` | State management | 2026-01-03 | Official |

Curate your own collections. The lefthook collection is non-standard: its docs were downloaded directly from GitHub rather than scraped. For Anthropic docs use this tool.

🚀 Setup

# 1. Install UV
# 👉 https://docs.astral.sh/uv/getting-started/installation/

# 2. Clone repository
git clone https://github.com/michellepace/docs-for-ai.git
cd docs-for-ai

# 3. Get free FireCrawl API key
# Visit: https://www.firecrawl.dev/app/api-keys

# 4. Add to your shell profile
echo 'export API_KEY_MCP_FIRECRAWL=your-api-key-here' >> ~/.zshrc
source ~/.zshrc  # Use ~/.bashrc if that's your shell
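
Before scraping, a script can confirm that the key from step 4 is visible to Python. This is a minimal sketch, not the repo's own code: the function name `firecrawl_key` is hypothetical, and only the variable name `API_KEY_MCP_FIRECRAWL` comes from the setup steps above.

```python
import os


def firecrawl_key(env=os.environ) -> str:
    """Return the FireCrawl API key, or fail with a hint (hypothetical helper)."""
    key = env.get("API_KEY_MCP_FIRECRAWL", "").strip()
    # Treat the literal placeholder from the setup snippet as "not set".
    if not key or key == "your-api-key-here":
        raise RuntimeError(
            "API_KEY_MCP_FIRECRAWL is not set; add it to your shell profile and reload it."
        )
    return key
```

If this raises in a new terminal, the export line most likely went into a profile file your shell does not read.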

📖 Usage via Slash Commands

Important

Edit the paths in .claude/commands/ask-docs.md to match your local setup. To use from anywhere, move it to ~/.claude/commands/.

| Slash Command | Purpose | .md Files | INDEX `<source>` |
|---|---|---|---|
| `/curate-doc <collection> <url>` | Add new or re-scrape | ✅ Write | ✅ Add/update INDEX.xml |
| `/rescrape-docs <collection>` | Re-scrape all docs | ✅ Write all | ✅ Selective update INDEX.xml |
| `/improve-index-xml <collection>` | Batch improve descriptions | 📖 Read | ✅ Update INDEX.xml |
| `/ask-docs <collection> <question>` | Query any collection | Docs analysed | Relevant docs identified |

💡 Usage Example

Assume tailwind was not already a collection in this repo:

# Start a new collection
/curate-doc tailwind https://tailwindcss.com/docs/customizing-colors
# → Creates tailwind/ collection directory, with README.md + INDEX.xml, and first curated doc

# Re-scrape existing doc (refresh content from same URL)
/curate-doc tailwind https://tailwindcss.com/docs/customizing-colors
# → Re-scrapes, writes .md file, replaces source in INDEX.xml

# Curate a new doc into collection
/curate-doc tailwind https://tailwindcss.com/docs/styling-with-utility-classes
# → Scrapes page into collection, writes .md file, adds source to INDEX.xml

# Re-scrape all docs in collection
/rescrape-docs tailwind
# → Re-scrapes all URLs in INDEX.xml, writes all .md files, updates descriptions for changed content

# ✨ Use the docs
/ask-docs tailwind Please evaluate my project for correct usage of utility classes?
# → Searches tailwind/INDEX.xml for relevant docs, analyses these, gives you an answer

🏗️ How This Repo Works

Workflow: Python script scrapes URL → writes .md file → creates INDEX.xml entry with PLACEHOLDER description → Claude Code generates semantic description. The /curate-doc command always regenerates the description, whereas /rescrape-docs only regenerates descriptions for files with content changes.
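
The write-then-index step of this workflow can be sketched roughly as follows. This is an illustration under stated assumptions, not the repo's actual script: the helper names `write_if_changed` and `upsert_source` are hypothetical, scraping itself is left out, and entries are assumed to follow the INDEX.xml schema shown in this section.

```python
import xml.etree.ElementTree as ET
from datetime import date
from pathlib import Path

PLACEHOLDER = "PLACEHOLDER"  # later replaced by Claude Code with a semantic description


def write_if_changed(path: Path, markdown: str) -> bool:
    """Write the scraped Markdown and report whether the content changed."""
    if path.exists() and path.read_text(encoding="utf-8") == markdown:
        return False
    path.write_text(markdown, encoding="utf-8")
    return True


def upsert_source(root: ET.Element, title: str, url: str, local_file: str,
                  reset_description: bool) -> ET.Element:
    """Add a <source> entry, or update the one that already has this URL."""
    entry = next(
        (s for s in root.findall("source") if s.findtext("source_url") == url),
        None,
    )
    if entry is None:
        entry = ET.SubElement(root, "source")
        for tag in ("title", "description", "source_url", "local_file", "scraped_at"):
            ET.SubElement(entry, tag)
        reset_description = True  # a new entry has no description yet
    entry.find("title").text = title
    entry.find("source_url").text = url
    entry.find("local_file").text = local_file
    entry.find("scraped_at").text = date.today().isoformat()
    if reset_description:
        entry.find("description").text = PLACEHOLDER
    return entry
```

In this sketch, a `/curate-doc`-style run would always pass `reset_description=True`, while a `/rescrape-docs`-style run would pass the result of `write_if_changed`, so unchanged files keep their existing description.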

Directory Structure:

uv/
├── INDEX.xml               # Index of all docs
├── README.md
├── api-reference.md        # Scraped doc
├── getting-started.md      # Scraped doc
└── ...

INDEX.xml Schema:

<docs_index>
  <source>
    <title>Hello Document Title</title>
    <description>20-30 word dense summary optimised for semantic search...</description>
    <source_url>https://docs.example.com/hello</source_url>
    <local_file>hello-document-title.md</local_file>
    <scraped_at>2025-10-15</scraped_at>
  </source>
  <!-- Multiple <source> entries, one per .md file -->
</docs_index>
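
To show how an index in this shape can be consumed, here is a small sketch that loads the `<source>` entries and ranks them against a question. The keyword-overlap scoring is only a crude stand-in for the judgement Claude Code applies to the descriptions; the function names and the second sample entry are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

FIELDS = ("title", "description", "source_url", "local_file", "scraped_at")

# Sample data only: the second entry is invented for this example.
SAMPLE_INDEX = """\
<docs_index>
  <source>
    <title>Hello Document Title</title>
    <description>Dense summary about greeting widgets and hello configuration.</description>
    <source_url>https://docs.example.com/hello</source_url>
    <local_file>hello-document-title.md</local_file>
    <scraped_at>2025-10-15</scraped_at>
  </source>
  <source>
    <title>Goodbye Document Title</title>
    <description>Summary of teardown steps and cleanup hooks.</description>
    <source_url>https://docs.example.com/goodbye</source_url>
    <local_file>goodbye-document-title.md</local_file>
    <scraped_at>2025-10-15</scraped_at>
  </source>
</docs_index>
"""


def load_sources(xml_text: str) -> list:
    """Parse INDEX.xml text into one dict per <source> entry."""
    root = ET.fromstring(xml_text)
    return [
        {field: (source.findtext(field) or "").strip() for field in FIELDS}
        for source in root.findall("source")
    ]


def rank_sources(sources: list, question: str) -> list:
    """Return local_file names ordered by naive keyword overlap with the question."""
    words = {w for w in re.findall(r"[a-z0-9]+", question.lower()) if len(w) > 3}
    scored = []
    for source in sources:
        haystack = f"{source['title']} {source['description']}".lower()
        score = sum(1 for word in words if word in haystack)
        if score:
            scored.append((score, source["local_file"]))
    # Highest score first; sorted() is stable, so ties keep index order.
    return [name for _, name in sorted(scored, key=lambda item: -item[0])]
```

Only the files returned by `rank_sources` would then need to be read, which is the point of keeping descriptions short and dense.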

The scripts use the FireCrawl Python SDK. An MCP server is also configured (.mcp.json, .claude/settings.json).

👉 Notes to Improve later

Old Idea

Instead of crawling, download the docs directly from GitHub and automate index creation. The source files are much cleaner than crawled pages. Keep .mdx files as-is; do not convert them to .md. Trade-off: bulk downloads bloat the index, whereas curating docs individually keeps it focused.

New Idea (2026.01.16) — use `llms.txt` + direct fetch

The instruction shown below was given to Claude Code and run successfully on the uv/ directory. It updated all documents via direct HTTP fetch (a Python script), so there is no scraping, the content is 100% clean, and no FireCrawl tokens are used.

Claude Code terminal showing user prompt to assess llms.txt approach: explains that instead of FireCrawl scraping (which isn't always clean), match INDEX.xml source_url entries to llms.txt markdown URLs and curl content directly. Shows Claude reading README.md, uv/llms.txt, and uv/INDEX.xml files.

Refactor to use llms.txt + direct fetch

This is a note to refactor to this method later. (The screenshot mentions curl, but we used Python's urllib.request.)
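
The matching step could look something like the sketch below. It is not the script that was actually run: the function names are hypothetical, and the URL normalisation (dropping a trailing slash, `.md`, or `/index.md`) is an assumption about how a site's llms.txt links relate to its page URLs, which may differ from site to site.

```python
import re
import urllib.request

# Markdown links as they typically appear in llms.txt: [Title](https://...)
LINK_RE = re.compile(r"\[[^\]]*\]\((https?://[^)\s]+)\)")


def normalise(url: str) -> str:
    """Reduce a URL to a comparable key (assumed convention, see lead-in)."""
    url = url.split("#", 1)[0].rstrip("/")
    for suffix in ("/index.md", ".md"):
        if url.endswith(suffix):
            url = url[: -len(suffix)]
            break
    return url.rstrip("/")


def match_sources(source_urls: list, llms_txt: str) -> dict:
    """Map each INDEX.xml source_url to its Markdown URL from llms.txt, if any."""
    md_by_key = {
        normalise(link): link
        for link in LINK_RE.findall(llms_txt)
        if link.endswith(".md")
    }
    return {
        src: md_by_key[normalise(src)]
        for src in source_urls
        if normalise(src) in md_by_key
    }


def fetch_markdown(url: str, timeout: float = 30.0) -> str:
    """Fetch the Markdown directly over HTTP, with no scraping involved."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Any `source_url` that finds no match in llms.txt is simply left out of the result, so those docs could still fall back to FireCrawl.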
