feat: generate llms.txt for packages from agent instruction files #1378

@lukeocodes

Summary

Add an /llms.txt route for every package on npmx.dev that generates an llms.txt file by aggregating well-known agent instruction files found in the package.

Example: https://npmx.dev/package/@deepgram/sdk/llms.txt

This would make npmx.dev the first npm registry browser to support the llms.txt standard — giving AI coding agents structured, LLM-friendly context for any npm package, directly from the registry.

Motivation

AI coding agents (Claude Code, Cursor, Windsurf, Copilot, Codex, etc.) increasingly rely on project-specific instruction files to understand codebases. The ecosystem has fragmented across many conventions:

| File | Tool |
| --- | --- |
| CLAUDE.md | Claude Code |
| AGENTS.md | Open standard (60k+ projects) |
| .cursorrules / .cursor/rules/** | Cursor |
| .windsurfrules / .windsurf/rules/** | Windsurf |
| .github/copilot-instructions.md | GitHub Copilot |
| .clinerules | Cline |
| AGENT.md | agent.md standard |
| README.md | Universal (always relevant) |

When an agent needs to understand a dependency, it currently has no standard way to discover these files from the registry. The llms.txt specification solves this — it's a markdown file that gives LLMs structured context about a resource, and it's already adopted by 844k+ websites including Anthropic, Cloudflare, and Stripe.

npmx.dev already has all the infrastructure to make this work — file tree listing, CDN file fetching, and README rendering.

Proposed behaviour

URL pattern

/package/{name}/llms.txt           → latest version
/package/{name}/v/{version}/llms.txt → specific version
/{name}/llms.txt                   → short URL (consistent with existing URL patterns)
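
The three URL forms above could be resolved to a package name and optional version roughly as follows. This is a minimal sketch, not npmx.dev's actual parsePackageParams(); the helper name parseLlmsTxtPath and the LlmsTxtTarget type are hypothetical, and it assumes scoped names always start with "@".

```typescript
// Hypothetical helper for illustration; not the existing parsePackageParams().
interface LlmsTxtTarget {
  name: string;           // e.g. "@deepgram/sdk" or "lodash"
  version: string | null; // null means "resolve to the latest version"
}

function parseLlmsTxtPath(path: string): LlmsTxtTarget | null {
  const segments = path.split("/").filter((s) => s.length > 0);

  // Every supported form ends in "llms.txt".
  if (segments.pop() !== "llms.txt") return null;

  // Drop the optional "package" prefix so the short URL form also matches.
  if (segments[0] === "package") segments.shift();

  // Scoped names span two segments ("@scope", "name"); unscoped names span one.
  const nameLength = segments[0]?.startsWith("@") ? 2 : 1;
  if (segments.length < nameLength) return null;
  const name = segments.slice(0, nameLength).join("/");

  // Whatever follows the name must be empty or exactly "v/{version}".
  const rest = segments.slice(nameLength);
  if (rest.length === 0) return { name, version: null };
  const [marker, version] = rest;
  if (rest.length === 2 && marker === "v" && version !== undefined) {
    return { name, version };
  }
  return null;
}
```

One edge case this sketch does not handle: a package literally named "package" is ambiguous in the short URL form, so a real implementation would need an explicit rule for it.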

Generation logic

  1. Fetch the file tree for the package version (existing /api/registry/files/ endpoint)
  2. Scan the file tree for well-known agent files at these paths, relative to the package root:
    • README.md (always include if present)
    • CLAUDE.md
    • AGENTS.md
    • AGENT.md
    • .github/copilot-instructions.md
    • .cursorrules
    • .windsurfrules
    • .clinerules
  3. Fetch each discovered file from jsDelivr CDN (existing pattern from code viewer)
  4. Assemble into llms.txt format per the specification:
# {package-name}

> {package description from package.json}

- Version: {version}
- License: {license}
- Homepage: {homepage}
- Repository: {repository url}

## README

{raw README.md content}

## Agent Instructions

{content of CLAUDE.md, AGENTS.md, etc. — each with a heading identifying the source file}
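
Step 4 could be sketched as a pure function that takes already-fetched metadata and file contents and returns the document in the template above. The PackageMeta and SourceFile types and the assembleLlmsTxt name are assumptions for illustration; the "###" heading per source file is one possible way to do the source attribution, not something the template fixes.

```typescript
// Hypothetical shapes; the real registry metadata has more fields.
interface PackageMeta {
  name: string;
  version: string;
  description?: string;
  license?: string;
  homepage?: string;
  repository?: string;
}

interface SourceFile {
  path: string;    // e.g. "CLAUDE.md"
  content: string; // raw file content
}

function assembleLlmsTxt(meta: PackageMeta, files: SourceFile[]): string {
  const lines: string[] = [`# ${meta.name}`, ""];
  if (meta.description) lines.push(`> ${meta.description}`, "");

  // Metadata bullets; optional fields are skipped when absent.
  lines.push(`- Version: ${meta.version}`);
  if (meta.license) lines.push(`- License: ${meta.license}`);
  if (meta.homepage) lines.push(`- Homepage: ${meta.homepage}`);
  if (meta.repository) lines.push(`- Repository: ${meta.repository}`);

  // README gets its own section; everything else is an agent instruction file.
  const readme = files.find((f) => f.path.toLowerCase() === "readme.md");
  if (readme) lines.push("", "## README", "", readme.content.trim());

  const agentFiles = files.filter((f) => f !== readme);
  if (agentFiles.length > 0) {
    lines.push("", "## Agent Instructions");
    for (const file of agentFiles) {
      // Assumed attribution style: one sub-heading per source file.
      lines.push("", `### ${file.path}`, "", file.content.trim());
    }
  }
  return lines.join("\n") + "\n";
}
```

Keeping assembly separate from fetching means the output format can be unit-tested without touching the registry or the CDN.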

Content type

Return Content-Type: text/markdown; charset=utf-8 with cache headers that match the other package routes: a long cache lifetime for versioned URLs (the content never changes) and a short TTL for latest.
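
As a rough illustration of that split, the response headers could be derived from whether a version was pinned in the URL. The function name llmsTxtHeaders and both constants are assumptions; the one-year value stands in for the project's CACHE_MAX_AGE_ONE_YEAR, whose actual value is not shown in this issue, and 60 seconds mirrors the ISR interval mentioned under "Technical approach".

```typescript
// Assumed values; the real constants live in the npmx.dev codebase.
const ONE_YEAR_SECONDS = 60 * 60 * 24 * 365;
const LATEST_TTL_SECONDS = 60;

function llmsTxtHeaders(version: string | null): Record<string, string> {
  // A pinned version never changes, so it can be cached as immutable.
  // "latest" can move at any publish, so it only gets a short TTL.
  const cacheControl =
    version !== null
      ? `public, max-age=${ONE_YEAR_SECONDS}, immutable`
      : `public, max-age=${LATEST_TTL_SECONDS}`;

  return {
    "Content-Type": "text/markdown; charset=utf-8",
    "Cache-Control": cacheControl,
  };
}
```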

Technical approach

This fits naturally into the existing architecture:

  • New API route: server/api/registry/llms/[...pkg].get.ts — follows the same parsePackageParams() + defineCachedEventHandler pattern as badge, readme, files, etc.
  • New page route or server route: Either a Nitro server route returning plain text at /package/{name}/llms.txt, or a route rule proxying to the API
  • File discovery: Reuse fetchFileTree() to scan for known filenames, then fetch() from jsDelivr CDN (same as code viewer)
  • Caching: Versioned URLs get maxAge: CACHE_MAX_AGE_ONE_YEAR (immutable). Latest gets ISR 60s (same as package pages)

The implementation is lightweight — it's essentially combining existing file-tree scanning and CDN fetching into a new output format.
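
A minimal sketch of the discovery step, assuming the file tree has already been flattened to a list of paths relative to the package root (the real fetchFileTree() return shape may differ). The names KNOWN_FILES, discoverAgentFiles, and jsDelivrUrl are hypothetical; the URL layout is jsDelivr's public npm endpoint.

```typescript
// Candidate files in output order: README first, then agent instruction files.
const KNOWN_FILES: readonly string[] = [
  "README.md",
  "CLAUDE.md",
  "AGENTS.md",
  "AGENT.md",
  ".github/copilot-instructions.md",
  ".cursorrules",
  ".windsurfrules",
  ".clinerules",
];

// Assumes treePaths are relative to the package root, e.g. "src/index.ts".
// Matching is exact and case-sensitive in this sketch.
function discoverAgentFiles(treePaths: string[]): string[] {
  const present = new Set(treePaths);
  return KNOWN_FILES.filter((path) => present.has(path));
}

// Builds the jsDelivr URL for one file of a published package version.
function jsDelivrUrl(name: string, version: string, path: string): string {
  return `https://cdn.jsdelivr.net/npm/${name}@${version}/${path}`;
}
```

Filtering the fixed candidate list against the tree, rather than the other way round, keeps the output order stable regardless of how the tree is sorted.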

Considerations

  • llms-full.txt: A common companion convention is /llms-full.txt, which inlines all content. For packages, the base llms.txt would already include full content (since we're assembling from individual files), so a separate route may not be needed, but it is worth considering for packages with extensive docs directories.
  • Size limits: Some packages have very large READMEs. Consider a reasonable size cap (e.g. 500 KB, matching the existing code viewer limit).
  • File priority: When multiple agent instruction files exist, include all of them with clear source attribution — agents can decide which is relevant to their tool.
  • Discovery: Consider adding a <link> tag or header pointing to the llms.txt URL from package pages, making it discoverable by agents browsing npmx.dev.
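
One possible way to apply the size cap is a shared budget across all included files, truncating the file that crosses the limit and dropping the rest. This is only a sketch: capSections and CappedSection are hypothetical names, and the budget is counted in string length (UTF-16 code units) as an approximation of bytes.

```typescript
interface CappedSection {
  path: string;
  content: string;
  truncated: boolean; // true when content was cut to fit the budget
}

// Assumption: 500 * 1024 string units approximates the 500 KB cap.
const MAX_CHARS = 500 * 1024;

function capSections(
  sections: { path: string; content: string }[],
  maxChars: number = MAX_CHARS,
): CappedSection[] {
  let remaining = maxChars;
  const out: CappedSection[] = [];

  for (const section of sections) {
    if (remaining <= 0) break; // budget exhausted: later files are dropped
    if (section.content.length <= remaining) {
      out.push({ ...section, truncated: false });
      remaining -= section.content.length;
    } else {
      // Note: slice() can split a surrogate pair at the cut point.
      out.push({
        path: section.path,
        content: section.content.slice(0, remaining),
        truncated: true,
      });
      remaining = 0;
    }
  }
  return out;
}
```

Because files are processed in order, placing README.md first means it is the last thing to be cut when the budget is tight.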

Prior art

  • llmstxt.org — the specification
  • llms-txt-hub — directory of sites implementing llms.txt
  • agents.md — open standard for agent instruction files (60k+ repos)
  • Context7 — MCP server that provides up-to-date library docs to agents

This would position npmx.dev as the go-to tool for AI-assisted development with npm packages.
