diff --git a/x402-proxy-template/AGENTS.md b/x402-proxy-template/AGENTS.md index 0228f703f..ae5cb82cb 100644 --- a/x402-proxy-template/AGENTS.md +++ b/x402-proxy-template/AGENTS.md @@ -6,91 +6,141 @@ Context for AI coding agents to help users set up x402-proxy for payment-gated c A Cloudflare Worker that adds payment gating to any origin using the x402 protocol. Users pay to access protected routes, then get a JWT cookie valid for 1 hour. ---- +**With Bot Management Filtering:** Requires Bot Management for Enterprise to enable bot filtering. With it enabled, x402-proxy can implement "default closed" - humans pass through free, only bots must pay. This is optional and enhances the base functionality. -## Deployment Modes +--- -There are three deployment modes. The choice depends on your origin and whether an existing worker already owns the target domain. +## Interactive Setup Flow -### 1. Standard Proxy Mode (DNS-Based) +When a user asks for help setting up x402-proxy, follow this discovery process: -- x402-proxy owns the route (e.g., `api.example.com/*`) -- All traffic flows through x402-proxy -- Protected paths require payment, others pass through to DNS origin -- **Use when:** Origin is a traditional server (VM, container) with DNS pointing to it +### Step 1: Verify Cloudflare Authentication -``` -User → x402-proxy (owns route) → Origin Server (via Cloudflare DNS) +```bash +npx wrangler whoami ``` -### 2. External Origin Mode +If not logged in, guide them to run `npx wrangler login`. -- x402-proxy owns the route and proxies to an external URL -- No code changes needed to the existing worker/service -- **Use when:** Origin is an external API, or an existing worker you don't want to modify +If they have multiple accounts, note them for Step 2. -``` -User → x402-proxy (owns route) → External Service (via ORIGIN_URL) -``` +--- -### 3. 
Service Binding Mode +### Step 2: Select Domain -- x402-proxy calls the origin worker directly via [Service Binding](https://developers.cloudflare.com/workers/runtime-apis/bindings/service-bindings/) -- Zero network overhead - both workers run on the same thread -- Origin worker doesn't need a public route -- **Use when:** Origin is another Worker in your account and you want optimal performance +Ask: **"Which domain do you want to add payment gating to?"** -``` -User → x402-proxy (owns route) → Origin Worker (via Service Binding) -``` +If the user has multiple Cloudflare accounts (from Step 1), also ask: **"Which account is this domain on?"** + +**Save the domain** - it scopes everything that follows. --- -## Interactive Setup Flow +### Step 3: Check for Bot Management (Optional Enhancement) -When a user asks for help setting up x402-proxy, follow this discovery process: +Ask: **"Do you have Bot Management enabled on `{domain}`?"** -### Step 1: Verify Cloudflare Authentication +Explain why you're asking: -```bash -npx wrangler whoami -``` +> With Bot Management, x402-proxy can implement "default closed" - blocking bot traffic by score threshold while letting humans through automatically. You can also make specific exceptions for bots like Googlebot or verified AI crawlers. +> +> Without Bot Management, x402-proxy still works perfectly - it just charges for protected routes without distinguishing between bots and humans. All traffic to protected routes must pay. -If not logged in, guide them to run `npx wrangler login`. +| Answer | Effect | +| ------- | ------------------------------------------------- | +| **Yes** | Enable Bot Management Filtering prompts in Step 4 | +| **No** | Skip threshold/exception prompts in Step 4 | -### Step 2: Discover Existing Workers +--- -```bash -npx wrangler deployments list -``` +### Step 4: Configure Protected Paths (Iterative) -This shows what workers are already deployed in the account. 
+Ask: **"What path on `{domain}` do you want to charge for?"** -### Step 3: Check Existing Routes +If the user provides multiple paths at once, queue them and configure each in sequence. -```bash -npx wrangler routes list --zone +**For EACH path, ask:** + +#### 4.1 Price + +Ask: **"What price (in USD) for `{path}`?"** + +Format: `$0.01`, `$0.10`, `$1.00`, etc. + +#### 4.2 Description + +Ask: **"What description for `{path}`?"** (shown to users explaining what they're paying for) + +Example: "Access to premium content for 1 hour" + +--- + +#### If User Has Bot Management (from Step 3): + +Continue with these additional prompts: + +#### 4.3 Bot Score Threshold + +Ask: **"What bot score threshold for `{path}`?"** + +**ALWAYS offer exactly these three options:** + +| Option | Threshold | What it means | +| -------------------- | --------- | -------------------------------------------------------------- | +| **1** | 1 | Very strict - only verified humans pass free | +| **2** | 2 | Strict - only clear human traffic passes free | +| **30 (Recommended)** | 30 | Balanced - likely automated traffic must pay, humans pass free | + +**Recommended: 30** - This is the typical starting point that blocks likely-automated traffic while letting humans through free. + +#### 4.4 Bot Exceptions + +Ask: **"Any bots that should get FREE access to `{path}`?"** + +**Offer these preset options:** + +| Preset | Bots Included | Use When | +| ------------------------- | ------------------------------------------------------------------------ | ---------------------------- | +| **Googlebot + BingBot** | Googlebot, BingBot | Allow major crawlers | +| **Above + AI assistants** | Above + ChatGPT-User, Claude-User, Perplexity-User, Meta-ExternalFetcher | Allow AI assistant citations | +| **None** | (empty) | All bots must pay | + +If the user selects a preset or names specific bots: + +1. Look up each bot name in the Bot Registry (see below) +2. Resolve to detection IDs +3. 
Write to config with inline comments + +**Example resolution:** + +- User says: "Googlebot and BingBot" +- Agent looks up: Googlebot → 120623194, BingBot → 117479730 +- Config output: + +```jsonc +"except_detection_ids": [ + 120623194, // Googlebot + 117479730 // BingBot +] ``` -This reveals if another worker already owns routes on that domain. +--- -**Decision guide:** +#### After Configuring Each Path -| Situation | Recommended Mode | -| ------------------------------------------- | ---------------------- | -| Origin is traditional server (VM/container) | Standard Proxy Mode | -| Origin is external API or service | External Origin Mode | -| Origin is another Worker in your account | Service Binding Mode | -| Existing worker owns `domain/*` | External Origin Mode\* | +Ask: **"Any more paths on `{domain}` to protect?"** -\*For existing workers with source code available, you can use Service Binding Mode for better performance. +- If **yes** → repeat Step 4 for the next path +- If **no** → continue to Step 5 -### Step 4: Gather Required Config +--- + +### Step 5: Wallet & Network Configuration -1. **Wallet address (PAY_TO)?** - Where payments go -2. **Which paths need payment?** - e.g., `/premium/*`, `/api/paid/*` -3. **Price for each path?** - e.g., `$0.01`, `$0.10` -4. **Network?** - `base-sepolia` (testing) or `base` (production) +Ask these together: + +1. **"What wallet address should receive payments (PAY_TO)?"** +2. **"Which network: `base-sepolia` (testing) or `base` (production)?"** #### If User Doesn't Have a Wallet Address @@ -105,190 +155,276 @@ For production, they'll need a real wallet: --- -## Standard Proxy Mode Setup +## Deployment Phase + +Now that configuration is complete, discover infrastructure and deploy. 
+ +### Step 6: Discover Existing Workers & Routes + +```bash +npx wrangler deployments list +npx wrangler routes list --zone {domain} +``` + +This reveals: + +- What workers are already deployed +- If another worker owns routes on the target domain + +**Determine deployment mode:** + +| Situation | Recommended Mode | +| ------------------------------------------- | ---------------------- | +| Origin is traditional server (VM/container) | Standard Proxy Mode | +| Origin is external API or service | External Origin Mode | +| Origin is another Worker in your account | Service Binding Mode | +| Existing worker owns `domain/*` | External Origin Mode\* | + +\*For existing workers with source code available, you can use Service Binding Mode for better performance. + +--- + +### Step 7: Generate wrangler.jsonc -Use this when no existing worker owns the target domain. +Based on gathered information, generate the complete configuration. -### Step 1: Configure wrangler.jsonc +**Example - Basic (no Bot Management Filtering):** ```jsonc { - "routes": [{ "pattern": "api.example.com/*", "zone_name": "example.com" }], + "routes": [ + { "pattern": "example.com/premium/*", "zone_name": "example.com" }, + ], "vars": { "PAY_TO": "0x000000000000000000000000000000000000dEaD", "NETWORK": "base-sepolia", "PROTECTED_PATTERNS": [ { "pattern": "/premium/*", - "price": "$0.01", - "description": "Premium access for 1 hour", + "price": "$0.10", + "description": "Access to premium content for 1 hour", }, ], }, } ``` -### Step 2: Set JWT Secret +**Example - With Bot Management Filtering:** + +Requires Bot Management for Enterprise to enable bot filtering. 
+ +```jsonc +{ + "routes": [ + { "pattern": "example.com/premium/*", "zone_name": "example.com" }, + ], + "vars": { + "PAY_TO": "0x000000000000000000000000000000000000dEaD", + "NETWORK": "base-sepolia", + "PROTECTED_PATTERNS": [ + { + "pattern": "/premium/*", + "price": "$0.10", + "description": "Access to premium content for 1 hour", + "bot_score_threshold": 30, + "except_detection_ids": [ + 120623194, // Googlebot + 117479730, // BingBot + ], + }, + ], + }, +} +``` + +--- + +### Step 8: Set JWT Secret ```bash node -e "console.log(require('crypto').randomBytes(32).toString('hex'))" | npx wrangler secret put JWT_SECRET ``` -### Step 3: Deploy +--- + +### Step 9: Deploy ```bash npm run deploy ``` -### Step 4: Verify +--- + +### Step 10: Verify ```bash -curl https://api.example.com/__x402/health +curl https://{domain}/__x402/health # Should return: {"status":"ok","timestamp":...} + +curl https://{domain}/__x402/config +# Should show protected patterns and Bot Management Filtering status ``` --- -## External Origin Mode Setup +## Deployment Modes -Use this when an existing worker already owns the domain and you don't want to modify its code. +### Standard Proxy Mode (DNS-Based) -### Step 1: Find the Existing Worker's workers.dev URL +- x402-proxy owns the route (e.g., `api.example.com/*`) +- All traffic flows through x402-proxy +- Protected paths require payment, others pass through to DNS origin +- **Use when:** Origin is a traditional server (VM, container) with DNS pointing to it -```bash -npx wrangler deployments list +``` +User → x402-proxy (owns route) → Origin Server (via Cloudflare DNS) ``` -Look for the existing worker's name. 
Its URL will be: `https://..workers.dev` - -### Step 2: Remove the Route from the Existing Worker +### External Origin Mode -Edit the existing worker's `wrangler.toml` or `wrangler.jsonc` to remove/comment out the route, then redeploy it: +- x402-proxy owns the route and proxies to an external URL +- No code changes needed to the existing worker/service +- **Use when:** Origin is an external API, or an existing worker you don't want to modify -```bash -npx wrangler deploy +``` +User → x402-proxy (owns route) → External Service (via ORIGIN_URL) ``` -The worker is now only accessible via its `workers.dev` URL. - -### Step 3: Configure x402-proxy with ORIGIN_URL +**Setup:** Add `ORIGIN_URL` to vars: ```jsonc -{ - "routes": [{ "pattern": "api.example.com/*", "zone_name": "example.com" }], - "vars": { - "ORIGIN_URL": "https://my-existing-worker.myaccount.workers.dev", - "PAY_TO": "0x000000000000000000000000000000000000dEaD", - "NETWORK": "base-sepolia", - "PROTECTED_PATTERNS": [ - { - "pattern": "/premium/*", - "price": "$0.01", - "description": "Premium access for 1 hour", - }, - ], - }, -} +"ORIGIN_URL": "https://my-existing-worker.myaccount.workers.dev" ``` -### Step 4: Set JWT Secret and Deploy +### Service Binding Mode -```bash -node -e "console.log(require('crypto').randomBytes(32).toString('hex'))" | npx wrangler secret put JWT_SECRET -npm run deploy +- x402-proxy calls the origin worker directly via [Service Binding](https://developers.cloudflare.com/workers/runtime-apis/bindings/service-bindings/) +- Zero network overhead - both workers run on the same thread +- Origin worker doesn't need a public route +- **Use when:** Origin is another Worker in your account and you want optimal performance + +``` +User → x402-proxy (owns route) → Origin Worker (via Service Binding) ``` -### Step 5: Verify +**Setup:** Add services binding: -```bash -curl https://api.example.com/__x402/health -curl https://api.example.com/premium/content # Should return 402 -curl 
https://api.example.com/public/content # Should proxy to original worker +```jsonc +"services": [{ "binding": "ORIGIN_SERVICE", "service": "my-origin-worker" }] ``` --- -## Service Binding Mode Setup - -Use this when the origin is another Worker in your account and you want the fastest possible performance. +## Bot Management Filtering Reference -### Step 1: Ensure Origin Worker is Deployed +Requires Bot Management for Enterprise to enable bot filtering. When enabled, users can configure payment exemptions based on bot score and detection IDs. -The origin worker must be deployed to your account. It doesn't need any routes - Service Bindings work without public access. +### How It Works -```bash -npx wrangler deployments list +``` +Request arrives at protected route + │ + ▼ + Bot Management Filtering configured? + ┌────────┴────────┐ + No Yes + │ │ + ▼ ▼ + All traffic Check bot score & exceptions + must pay │ + ┌────┴────┐ + │ │ + Human OR Bot (not excepted) + Excepted Bot │ + │ ▼ + ▼ Check cookie/payment + Pass FREE │ + to origin Valid? → Proxy + set cookie + Invalid? 
→ Return 402 ``` -### Step 2: Configure x402-proxy with Service Binding +### Bot Score Threshold Reference -```jsonc -{ - "routes": [{ "pattern": "api.example.com/*", "zone_name": "example.com" }], - "services": [{ "binding": "ORIGIN_SERVICE", "service": "my-origin-worker" }], - "vars": { - "PAY_TO": "0x000000000000000000000000000000000000dEaD", - "NETWORK": "base-sepolia", - "PROTECTED_PATTERNS": [ - { - "pattern": "/premium/*", - "price": "$0.01", - "description": "Premium access for 1 hour", - }, - ], - }, -} -``` +| Threshold | Meaning | Use Case | +| --------- | ------------------------------------------------------ | ---------------------------------------- | +| **1** | Very strict - only verified humans pass free | Maximum monetization | +| **2** | Strict - only clear human traffic passes free | High-value APIs | +| **30** | Balanced - likely automated must pay, humans pass free | **Recommended** - typical starting point | -**Key points:** +### Bot Registry Reference -- `binding`: Must be `"ORIGIN_SERVICE"` (this is what x402-proxy looks for) -- `service`: The deployed name of your origin worker +When configuring bot exceptions, use this registry to resolve bot names to detection IDs. 
-### Step 3: Set JWT Secret and Deploy +**Included operators:** Google, Microsoft, OpenAI, Anthropic, Perplexity, Meta -```bash -node -e "console.log(require('crypto').randomBytes(32).toString('hex'))" | npx wrangler secret put JWT_SECRET -npm run deploy -``` +For additional bots, users can find detection IDs in the Cloudflare dashboard: +AI Crawl Control → Crawlers → Actions menu → Copy detection ID -### Step 4: Verify +#### Google -```bash -curl https://api.example.com/__x402/health -curl https://api.example.com/__x402/config -# Should show: "hasOriginService": true -``` +| Bot Name | Detection ID | Notes | +| --------------------- | ------------ | ------------------ | +| Googlebot | 120623194 | Google Search | +| Google-CloudVertexBot | 133730073 | Google AI training | ---- +#### Microsoft -## Route Migration +| Bot Name | Detection ID | Notes | +| -------- | ------------ | -------------- | +| BingBot | 117479730 | Microsoft Bing | -When switching between modes, you may encounter route ownership conflicts. +#### OpenAI -### Understanding Route Ownership +| Bot Name | Detection ID | Notes | +| ------------- | ------------ | ---------------------- | +| GPTBot | 123815556 | OpenAI training | +| ChatGPT-User | 132995013 | ChatGPT browsing mode | +| ChatGPT agent | 129220581 | ChatGPT agents/plugins | +| OAI-SearchBot | 126255384 | OpenAI SearchGPT | -- Routes can only be owned by one worker at a time -- A route must be deleted from the old worker before the new worker can claim it -- Custom domains and routes are different mechanisms (check which one your worker uses) +#### Anthropic (Heuristics IDs) -### Migrating Routes Between Workers +| Bot Name | Heuristics ID | Notes | +| ---------------- | ------------- | ------------------ | +| ClaudeBot | 33563859 | Anthropic training | +| Claude-SearchBot | 33564301 | Anthropic search | +| Claude-User | 33564303 | Claude web access | -**Via wrangler.jsonc:** Remove the route from the old worker's config and redeploy. 
+#### Perplexity (Heuristics IDs) -**Via Dashboard:** +| Bot Name | Heuristics ID | Notes | +| --------------- | ------------- | ------------------ | +| PerplexityBot | 33563889 | Perplexity search | +| Perplexity-User | 33564371 | Perplexity answers | -1. Go to Workers & Pages → your worker -2. Click "Triggers" tab -3. Remove the route +#### Meta -See [Workers Routes documentation](https://developers.cloudflare.com/workers/configuration/routing/routes/) for more details. +| Bot Name | Detection ID | Notes | +| -------------------- | ---------------------- | ----------------- | +| Meta-ExternalAgent | 124581738 | Meta AI training | +| Meta-ExternalFetcher | 132272919 | Meta AI assistant | +| FacebookBot | (heuristics: 33563972) | Meta crawling | -### Order of Operations to Minimize Downtime +### Example Preset -1. Deploy new worker (without routes) and verify it works via `workers.dev` -2. Delete route from old worker -3. Immediately redeploy new worker with routes +```jsonc +"except_detection_ids": [ + 120623194, // Googlebot + 117479730, // BingBot + 132995013, // ChatGPT-User + 33564303 // Claude-User +] +``` + +### Finding Custom Detection IDs + +If a bot isn't in the registry, the user can find its detection ID in the dashboard: + +1. Go to **AI Crawl Control** in Cloudflare dashboard +2. Navigate to **Crawlers** +3. Find the crawler in the list +4. Click the **three dot menu** in the Actions column +5. 
Copy the detection ID --- @@ -301,7 +437,20 @@ See [Workers Routes documentation](https://developers.cloudflare.com/workers/con | `PAY_TO` | Wallet address to receive payments | | `NETWORK` | `"base-sepolia"` (test) or `"base"` (production) | | `JWT_SECRET` | Secret for signing tokens (64 hex chars) | -| `PROTECTED_PATTERNS` | Array of `{pattern, price, description}` | +| `PROTECTED_PATTERNS` | Array of protected route configurations | + +### Protected Pattern Schema + +```typescript +{ + pattern: string; // Route to protect (e.g., "/premium/*") + price: string; // Price in USD (e.g., "$0.01") + description: string; // Shown to users + // Bot Management Filtering (optional) + bot_score_threshold?: number; // 1, 2, or 30 + except_detection_ids?: number[]; // Bot detection IDs to allow free +} +``` ### Optional Variables @@ -314,7 +463,7 @@ See [Workers Routes documentation](https://developers.cloudflare.com/workers/con ### Debug Endpoints - `/__x402/health` - Health check -- `/__x402/config` - Current config (no secrets exposed) +- `/__x402/config` - Current config (no secrets exposed, shows Bot Management Filtering status) - `/__x402/protected` - Test payment flow ($0.01) ### Origin Auto-Detection @@ -329,6 +478,36 @@ x402-proxy automatically detects how to reach the origin: --- +## Route Migration + +When switching between modes, you may encounter route ownership conflicts. + +### Understanding Route Ownership + +- Routes can only be owned by one worker at a time +- A route must be deleted from the old worker before the new worker can claim it +- Custom domains and routes are different mechanisms (check which one your worker uses) + +### Migrating Routes Between Workers + +**Via wrangler.jsonc:** Remove the route from the old worker's config and redeploy. + +**Via Dashboard:** + +1. Go to Workers & Pages → your worker +2. Click "Triggers" tab +3. 
Remove the route + +See [Workers Routes documentation](https://developers.cloudflare.com/workers/configuration/routing/routes/) for more details. + +### Order of Operations to Minimize Downtime + +1. Deploy new worker (without routes) and verify it works via `workers.dev` +2. Delete route from old worker +3. Immediately redeploy new worker with routes + +--- + ## Common Issues ### "A route with the same pattern already exists" [code: 10020] @@ -394,6 +573,30 @@ The template includes static assets in the `public/` directory for standalone de // "assets": { "directory": "public" }, ``` +### Bot Management Filtering: "cf.botManagement not available" + +This warning appears in logs when Bot Management Filtering is configured but Bot Management data isn't present in the request. + +**Causes:** + +- Bot Management for Enterprise is not enabled +- Request is from local development (Bot Management not available locally) + +**Fix:** + +- Enable Bot Management for Enterprise in the Cloudflare dashboard +- For local testing, the warning is expected - filtering will work after deployment + +### Bot Management Filtering: Humans still getting 402 + +Check that: + +1. `bot_score_threshold` is set (e.g., 30) +2. The request actually has a bot score > threshold +3. Bot Management for Enterprise is enabled + +Use `npx wrangler tail` to see bot scores in logs after deployment. + --- ## Testing Locally @@ -407,31 +610,7 @@ curl http://localhost:8787/__x402/health # Should return 200 curl http://localhost:8787/__x402/protected # Should return 402 ``` ---- - -## Architecture - -``` -User Request → x402-proxy (owns route) - | - Is path in PROTECTED_PATTERNS? - / \ - No Yes - | | - Proxy to Check cookie/payment - origin | - Valid? → Proxy + set cookie - Invalid? → Return 402 -``` - -**Cookie flow after payment:** - -1. User requests protected path without valid cookie -2. x402-proxy returns 402 with payment requirements -3. User submits payment via X-PAYMENT header -4. 
x402 middleware verifies payment with facilitator -5. x402-proxy generates JWT, sets cookie, proxies to origin -6. Subsequent requests include cookie - no payment needed for 1 hour +**Note:** Bot Management data is not available in local development. Bot Management Filtering will only work after deployment to Cloudflare. --- @@ -446,6 +625,12 @@ Before running `npm run deploy`, verify: - [ ] `PROTECTED_PATTERNS` configured with correct paths and prices - [ ] `routes` configured with correct pattern and zone_name +**If using Bot Management Filtering:** + +- [ ] Bot Management for Enterprise is enabled +- [ ] `bot_score_threshold` is set on relevant patterns (typically 30) +- [ ] `except_detection_ids` are resolved from bot names + --- ## Additional Resources @@ -454,3 +639,5 @@ Before running `npm run deploy`, verify: - [Service Bindings](https://developers.cloudflare.com/workers/runtime-apis/bindings/service-bindings/) - Worker-to-Worker communication - [Custom Domains](https://developers.cloudflare.com/workers/configuration/routing/custom-domains/) - Alternative to routes - [Wrangler Commands](https://developers.cloudflare.com/workers/wrangler/commands/) - CLI reference for discovery +- [x402 Protocol](https://x402.org) - Payment protocol specification +- [Bot Management](https://developers.cloudflare.com/bots/) - Cloudflare Bot Management documentation diff --git a/x402-proxy-template/README.md b/x402-proxy-template/README.md index 6a8820df0..82bb704df 100644 --- a/x402-proxy-template/README.md +++ b/x402-proxy-template/README.md @@ -521,6 +521,38 @@ Cookies are configured with security best practices: - Payment amount validation - Network/token validation +## Bot Management Filtering (Optional) + +With **Bot Management for Enterprise** enabled on your domain, x402-proxy can distinguish between human and automated traffic: + +- Humans can access protected routes without payment +- Bots are charged unless explicitly exempted +- You can allow specific crawlers 
(e.g., Googlebot, search engines) free access + +### Configuration Example + +```jsonc +"PROTECTED_PATTERNS": [ + { + "pattern": "/api/premium/*", + "price": "$0.10", + "description": "Premium API access", + "bot_score_threshold": 30, // Lower score = more likely automated + "except_detection_ids": [ + 120623194, // Googlebot + 132995013 // ChatGPT-User + ] + } +] +``` + +The configuration uses two settings: + +- `bot_score_threshold` (1-99) - determines the cutoff for blocking bot traffic and allowing humans through. See [Bot Score](https://developers.cloudflare.com/bots/concepts/bot-score/) for how scores are calculated. +- `except_detection_ids` - array of bot detection IDs to whitelist. A sample list is available in [`src/bots.ts`](./src/bots.ts). + +Without Bot Management, all traffic to protected routes requires payment. + ## Deployment ### Production Deployment diff --git a/x402-proxy-template/src/auth.ts b/x402-proxy-template/src/auth.ts index 13ca59963..ecb33a45f 100644 --- a/x402-proxy-template/src/auth.ts +++ b/x402-proxy-template/src/auth.ts @@ -59,6 +59,12 @@ export interface ProtectedRouteConfig { price: string; /** Human-readable description of what the payment is for */ description: string; + /** + * Bot Management Filtering (optional) + * Requires Bot Management for Enterprise. See src/bot-management/ for details. + */ + bot_score_threshold?: number; + except_detection_ids?: number[]; } /** diff --git a/x402-proxy-template/src/bot-management/index.ts b/x402-proxy-template/src/bot-management/index.ts new file mode 100644 index 000000000..a7635db05 --- /dev/null +++ b/x402-proxy-template/src/bot-management/index.ts @@ -0,0 +1,76 @@ +/** + * Bot Management Filtering (Optional) + * Requires Bot Management for Enterprise. + * + * Non-Bot Management users can ignore this entire directory. 
+ * + * When Bot Management data is available and bot_score_threshold is configured: + * - Humans (bot score > threshold) pass through FREE + * - Excepted bots (detection ID in except_detection_ids) pass through FREE + * - All other traffic must pay + */ + +import type { ProtectedRouteConfig } from "../auth"; + +/** + * Check if request has a Bot Management exception (human or excepted bot). + * Returns false if Bot Management filtering is not configured or data unavailable. + * + * @param request - The incoming request (to extract cf.botManagement) + * @param config - Protected route configuration + * @returns true if request should bypass payment, false if payment required + */ +export function hasBotManagementException( + request: Request, + config: ProtectedRouteConfig +): boolean { + // No threshold configured = no Bot Management filtering (all traffic must pay) + if (config.bot_score_threshold === undefined) { + return false; + } + + // Access Bot Management data via Cloudflare's cf object + // Types are defined in worker-configuration.d.ts (IncomingRequestCfProperties) + const cf = (request as { cf?: IncomingRequestCfProperties }).cf; + const botManagement = cf?.botManagement; + + // No Bot Management data = can't evaluate, require payment + if (!botManagement) { + console.warn( + "[x402-proxy] Bot Management Filtering configured but cf.botManagement not available. " + + "Requires Bot Management for Enterprise. Falling back to payment requirement." + ); + return false; + } + + const botScore = botManagement.score; + + // No score available = can't evaluate, require payment (safe default) + if (botScore === undefined || botScore === null) { + console.warn( + "[x402-proxy] Bot Management data available but score is missing. " + + "Falling back to payment requirement." + ); + return false; + } + + const detectionIds = botManagement.detectionIds ?? []; + + // Check 1: Is this a human? 
(bot score ABOVE threshold) + if (botScore > config.bot_score_threshold) { + return true; // Human - bypass payment + } + + // Check 2: Is this an excepted bot? (detection ID in exception list) + if (config.except_detection_ids && config.except_detection_ids.length > 0) { + const isExcepted = detectionIds.some((id) => + config.except_detection_ids!.includes(id) + ); + if (isExcepted) { + return true; // Excepted bot - bypass payment + } + } + + // Neither human nor excepted bot - require payment + return false; +} diff --git a/x402-proxy-template/src/bot-management/reference.ts b/x402-proxy-template/src/bot-management/reference.ts new file mode 100644 index 000000000..1146081fa --- /dev/null +++ b/x402-proxy-template/src/bot-management/reference.ts @@ -0,0 +1,134 @@ +/** + * x402-proxy Template - Bot Registry (Agent Reference Only) + * + * Used by AI agents during setup to resolve bot names to detection IDs. + * NOT used at runtime - IDs are resolved during setup and stored in wrangler.jsonc. 
+ * + * INCLUDED OPERATORS: Google, Microsoft, OpenAI, Anthropic, Perplexity, Meta + * + * For additional bots, find detection IDs in Cloudflare dashboard: + * AI Crawl Control > Crawlers > Actions menu > Copy detection ID + */ + +export interface BotEntry { + name: string; + operator: string; + category: string; + detectionIds: number[]; +} + +export const BOTS: Record<string, BotEntry> = { + // ========================================================================= + // GOOGLE + // ========================================================================= + Googlebot: { + name: "Googlebot", + operator: "Google", + category: "Search Engine Crawler", + detectionIds: [120623194, 33554459], + }, + "Google-CloudVertexBot": { + name: "Google-CloudVertexBot", + operator: "Google", + category: "AI Crawler", + detectionIds: [133730073, 33564321], + }, + + // ========================================================================= + // MICROSOFT + // ========================================================================= + BingBot: { + name: "BingBot", + operator: "Microsoft", + category: "Search Engine Crawler", + detectionIds: [117479730, 33554461], + }, + + // ========================================================================= + // OPENAI + // ========================================================================= + GPTBot: { + name: "GPTBot", + operator: "OpenAI", + category: "AI Crawler", + detectionIds: [123815556, 33563875], + }, + "ChatGPT agent": { + name: "ChatGPT agent", + operator: "OpenAI", + category: "AI Assistant", + detectionIds: [129220581], + }, + "ChatGPT-User": { + name: "ChatGPT-User", + operator: "OpenAI", + category: "AI Assistant", + detectionIds: [132995013, 33563857], + }, + "OAI-SearchBot": { + name: "OAI-SearchBot", + operator: "OpenAI", + category: "AI Search", + detectionIds: [126255384, 33563986], + }, + + // ========================================================================= + // ANTHROPIC + //
========================================================================= + ClaudeBot: { + name: "ClaudeBot", + operator: "Anthropic", + category: "AI Crawler", + detectionIds: [33563859], + }, + "Claude-SearchBot": { + name: "Claude-SearchBot", + operator: "Anthropic", + category: "AI Search", + detectionIds: [33564301], + }, + "Claude-User": { + name: "Claude-User", + operator: "Anthropic", + category: "AI Assistant", + detectionIds: [33564303], + }, + + // ========================================================================= + // PERPLEXITY + // ========================================================================= + PerplexityBot: { + name: "PerplexityBot", + operator: "Perplexity", + category: "AI Search", + detectionIds: [33563889], + }, + "Perplexity-User": { + name: "Perplexity-User", + operator: "Perplexity", + category: "AI Assistant", + detectionIds: [33564371], + }, + + // ========================================================================= + // META + // ========================================================================= + FacebookBot: { + name: "FacebookBot", + operator: "Meta", + category: "AI Crawler", + detectionIds: [33563972], + }, + "Meta-ExternalAgent": { + name: "Meta-ExternalAgent", + operator: "Meta", + category: "AI Crawler", + detectionIds: [124581738, 33563982], + }, + "Meta-ExternalFetcher": { + name: "Meta-ExternalFetcher", + operator: "Meta", + category: "AI Assistant", + detectionIds: [132272919, 33563980], + }, +}; diff --git a/x402-proxy-template/src/index.ts b/x402-proxy-template/src/index.ts index 9265472b0..3c1316c8b 100644 --- a/x402-proxy-template/src/index.ts +++ b/x402-proxy-template/src/index.ts @@ -2,6 +2,7 @@ import { Hono } from "hono"; import { setCookie } from "hono/cookie"; import { createProtectedRoute, type ProtectedRouteConfig } from "./auth"; import { generateJWT } from "./jwt"; +import { hasBotManagementException } from "./bot-management"; import type { AppContext, Env } from "./env"; const app = 
new Hono();
@@ -139,6 +140,14 @@ app.use("*", async (c, next) => {
   // Check if this path is protected (including /__x402/protected)
   const protectedConfig = findProtectedRouteConfig(path, protectedPatterns);
   if (protectedConfig) {
+    // Bot Management Filtering: check if request has exception (human or excepted bot)
+    if (hasBotManagementException(c.req.raw, protectedConfig)) {
+      if (path === "/__x402/protected") {
+        return next();
+      }
+      return proxyToOrigin(c.req.raw, c.env);
+    }
+
     // Ensure JWT_SECRET is configured before processing protected routes
     if (!c.env.JWT_SECRET) {
       return c.json(
@@ -256,12 +265,27 @@ app.get("/__x402/health", (c) => {
  * Useful for debugging and verifying deployment
  */
 app.get("/__x402/config", (c) => {
+  const patterns = (c.env.PROTECTED_PATTERNS || []) as ProtectedRouteConfig[];
+  const botFilteringEnabled = patterns.some(
+    (p) => p.bot_score_threshold !== undefined
+  );
+
   return c.json({
     network: c.env.NETWORK,
     payTo: c.env.PAY_TO ? `***${c.env.PAY_TO.slice(-6)}` : null,
     hasOriginUrl: !!c.env.ORIGIN_URL,
     hasOriginService: !!c.env.ORIGIN_SERVICE,
-    protectedPatterns: c.env.PROTECTED_PATTERNS?.map((p) => p.pattern) || [],
+    protectedPatterns: patterns.map((p) => ({
+      pattern: p.pattern,
+      botManagementFiltering:
+        p.bot_score_threshold !== undefined
+          ? {
+              threshold: p.bot_score_threshold,
+              exceptionsCount: p.except_detection_ids?.length ?? 0,
+            }
+          : null,
+    })),
+    botManagementFiltering: botFilteringEnabled,
   });
 });
 
diff --git a/x402-proxy-template/wrangler.jsonc b/x402-proxy-template/wrangler.jsonc
index bf9d9926a..864bb48c6 100644
--- a/x402-proxy-template/wrangler.jsonc
+++ b/x402-proxy-template/wrangler.jsonc
@@ -72,12 +72,42 @@
     //
     // After payment, users get a JWT cookie valid for 1 hour.
     //
+    // ─────────────────────────────────────────────────────────────────────
+    // BOT MANAGEMENT FILTERING (Optional)
+    // Requires Bot Management for Enterprise to enable bot filtering.
+    // ─────────────────────────────────────────────────────────────────────
+    // With Bot Management enabled, you can add:
+    //   - bot_score_threshold: Score at or below which payment is required
+    //   - except_detection_ids: Detection IDs of bots that access FREE
+    //
+    // This enables "default closed" - humans pass free, bots must pay.
+    //
     "PROTECTED_PATTERNS": [
+      // ─────────────────────────────────────────────────────────────────
+      // Example: Basic - All traffic must pay
+      // ─────────────────────────────────────────────────────────────────
       {
         "pattern": "/premium/*",
         "price": "$0.01",
         "description": "Access to premium content for 1 hour",
       },
+
+      // ─────────────────────────────────────────────────────────────────
+      // Example: Bot Management Filtering
+      // Requires Bot Management for Enterprise to enable bot filtering.
+      // ─────────────────────────────────────────────────────────────────
+      // {
+      //   "pattern": "/content/*",
+      //   "price": "$0.25",
+      //   "description": "Content access for 1 hour",
+      //   "bot_score_threshold": 30,
+      //   "except_detection_ids": [
+      //     120623194, // Googlebot
+      //     117479730, // BingBot
+      //     132995013, // ChatGPT-User
+      //     33564303 // Claude-User
+      //   ]
+      // }
     ],
     // =====================================================================
     // ORIGIN_URL - External origin URL (OPTIONAL)
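
Reviewer note: the config comments in this hunk describe `bot_score_threshold` as the score at or below which payment is required, and `except_detection_ids` as bots that get free access. A minimal sketch of how those two fields might combine is below. It is an illustration only, not the contents of `./bot-management`, which this diff does not show. The `RouteFilter` and `BotSignals` shapes and the `isExemptFromPayment` name are hypothetical, and the sketch assumes that a request with no bot score is treated as non-exempt.

```typescript
// Hypothetical shapes for illustration; not the template's actual types.
interface RouteFilter {
  bot_score_threshold?: number; // score at or below which payment is required
  except_detection_ids?: number[]; // detection IDs of bots that access free
}

interface BotSignals {
  score?: number; // assumed to come from Bot Management, when available
  detectionIds?: number[];
}

// Returns true when the request may skip payment for this route.
function isExemptFromPayment(route: RouteFilter, signals: BotSignals): boolean {
  // No threshold configured: filtering is off, so all traffic must pay.
  if (route.bot_score_threshold === undefined) {
    return false;
  }

  // Excepted bots pass free regardless of their score.
  const excepted = route.except_detection_ids ?? [];
  const ids = signals.detectionIds ?? [];
  if (ids.some((id) => excepted.includes(id))) {
    return true;
  }

  // Assumption: without a score we cannot classify the request, so it pays.
  if (signals.score === undefined) {
    return false;
  }

  // Scores above the threshold are treated as human and pass free.
  return signals.score > route.bot_score_threshold;
}
```

With the commented-out `/content/*` example (`bot_score_threshold: 30`), a request scoring 80 would pass free, a request scoring 30 would have to pay, and a request carrying detection ID `33564303` would pass free whatever its score.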