An MCP server that provides eight tools: a fast SearXNG-powered web search (web_search), five Crawl4AI-powered tools (web_fetch, web_screenshot, web_pdf, web_execute_js, web_crawl), and two Wayback Machine tools (web_snapshots, web_archive).
```mermaid
graph LR
    Client["MCP Client<br/>(Claude, Cursor, etc.)"] -->|web_search| Server["MCP Server"]
    Client -->|web_fetch| Server
    Client -->|web_screenshot| Server
    Client -->|web_pdf| Server
    Client -->|web_execute_js| Server
    Client -->|web_crawl| Server
    Client -->|web_snapshots| Server
    Client -->|web_archive| Server
    Server --> SearXNG
    SearXNG --> Redis
    Server --> Crawl4AI
    Server --> Wayback["Wayback Machine"]
```
The web_search tool queries SearXNG for search results. The Crawl4AI tools handle content extraction, screenshots, PDFs, JS execution, and multi-URL crawling. The Wayback Machine tools list and retrieve archived pages.
The full stack deploys as 4 services: Redis, SearXNG, Crawl4AI, and this MCP server.
The server exposes eight MCP tools:
### web_search

Lightweight web search via SearXNG. Returns structured results.
| Parameter | Type | Description |
|---|---|---|
| `query` | string (required) | The search query |
| `limit` | number (optional) | Max results to return (default: 10, max: 20) |
Returns a JSON array of { url, title, description } results.
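As an illustration, the returned array can be consumed like this; the `{ url, title, description }` field names follow the shape above, and the concrete values are invented:

```python
import json

# Illustrative web_search response — the field names come from the docs;
# the URLs, titles, and descriptions here are made up.
raw = """[
  {"url": "https://example.com/a", "title": "Page A", "description": "First hit"},
  {"url": "https://example.com/b", "title": "Page B", "description": "Second hit"}
]"""

results = json.loads(raw)
for r in results:
    print(f"{r['title']}: {r['url']}")
```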
### web_fetch

Fetch a single URL and return its content as clean markdown via Crawl4AI.
| Parameter | Type | Description |
|---|---|---|
| `url` | string (required) | URL to fetch |
| `f` | enum (optional) | Content-filter strategy: `raw`, `fit`, `bm25`, or `llm` (default: `fit`) |
| `q` | string (optional) | Query string for BM25/LLM filters |
Returns the page content as markdown.
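For example, a hypothetical tool call that fetches a page and keeps only content relevant to a query via the BM25 filter might pass arguments like this (URL and query are invented):

```json
{
  "url": "https://en.wikipedia.org/wiki/Web_scraping",
  "f": "bm25",
  "q": "legal considerations"
}
```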
### web_screenshot

Capture a full-page PNG screenshot of a URL via Crawl4AI.
| Parameter | Type | Description |
|---|---|---|
| `url` | string (required) | URL to screenshot |
| `screenshot_wait_for` | number (optional) | Seconds to wait before capture (default: 2) |
Returns a base64-encoded PNG image.
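A client that wants the image as a file just needs to base64-decode the payload. A minimal sketch, with a fabricated stand-in for the real response:

```python
import base64

# web_screenshot returns the PNG as a base64 string; here we fabricate a
# stand-in payload by encoding the 8 PNG magic bytes plus filler.
payload = base64.b64encode(b"\x89PNG\r\n\x1a\n" + b"...").decode()

png_bytes = base64.b64decode(payload)
assert png_bytes.startswith(b"\x89PNG\r\n\x1a\n")  # every valid PNG starts with this magic
with open("screenshot.png", "wb") as fh:
    fh.write(png_bytes)
```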
### web_pdf

Generate a PDF document of a URL via Crawl4AI.
| Parameter | Type | Description |
|---|---|---|
| `url` | string (required) | URL to convert to PDF |
Returns a base64-encoded PDF.
### web_execute_js

Execute JavaScript snippets on a URL via Crawl4AI and return the full crawl result.
| Parameter | Type | Description |
|---|---|---|
| `url` | string (required) | URL to execute scripts on |
| `scripts` | string[] (required) | List of JavaScript snippets to execute in order |
Returns the full CrawlResult JSON including markdown, links, media, and JS execution results.
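For instance, a hypothetical call that scrolls a page to the bottom (to trigger lazy loading) and then reads its title might pass:

```json
{
  "url": "https://example.com",
  "scripts": [
    "window.scrollTo(0, document.body.scrollHeight)",
    "document.title"
  ]
}
```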
### web_crawl

Crawl one or more URLs and extract their content using Crawl4AI.
| Parameter | Type | Description |
|---|---|---|
| `urls` | string[] (required) | List of URLs to crawl |
| `browser_config` | object (optional) | Crawl4AI browser configuration |
| `crawler_config` | object (optional) | Crawl4AI crawler configuration |
Returns the extracted content from each URL.
### web_snapshots

List Wayback Machine snapshots for a URL.
| Parameter | Type | Description |
|---|---|---|
| `url` | string (required) | URL to check for snapshots |
| `from` | string (optional) | Start date in YYYYMMDD format |
| `to` | string (optional) | End date in YYYYMMDD format |
| `limit` | number (optional) | Max number of snapshots to return (default: 100) |
| `match_type` | enum (optional) | URL matching: `exact`, `prefix`, `host`, or `domain` (default: `exact`) |
| `filter` | string[] (optional) | CDX API filters (e.g. `["statuscode:200", "mimetype:text/html"]`) |
Returns a JSON array of snapshots with timestamps, status codes, and archive URLs.
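These parameters line up with the public Wayback CDX API. As a sketch (independent of how this server actually implements the tool), the equivalent CDX query URL can be built like this:

```python
from urllib.parse import urlencode

def cdx_query(url, from_=None, to=None, limit=100, match_type="exact", filters=()):
    """Build a Wayback CDX API query URL mirroring web_snapshots' parameters."""
    params = [
        ("url", url),
        ("output", "json"),
        ("limit", str(limit)),
        ("matchType", match_type),
    ]
    if from_:
        params.append(("from", from_))
    if to:
        params.append(("to", to))
    for f in filters:
        params.append(("filter", f))
    return "https://web.archive.org/cdx/search/cdx?" + urlencode(params)

print(cdx_query("example.com", from_="20200101", filters=["statuscode:200"]))
```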
### web_archive

Retrieve an archived page from the Wayback Machine.
| Parameter | Type | Description |
|---|---|---|
| `url` | string (required) | URL of the page to retrieve |
| `timestamp` | string (required) | Timestamp in YYYYMMDDHHMMSS format |
| `original` | boolean (optional) | Get original content without Wayback Machine banner (default: false) |
Returns the archived page content.
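Conceptually, `original: true` corresponds to the Wayback Machine's `id_` URL flag, which serves the raw archived bytes without the replay banner. A sketch of the underlying URL scheme (not necessarily how this server constructs it):

```python
def wayback_url(url, timestamp, original=False):
    # The "id_" flag asks the Wayback Machine for the raw archived content,
    # without the replay banner wrapped around it.
    flag = "id_" if original else ""
    return f"https://web.archive.org/web/{timestamp}{flag}/{url}"

print(wayback_url("https://example.com", "20230101000000", original=True))
# → https://web.archive.org/web/20230101000000id_/https://example.com
```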
All examples below assume your server is running at https://your-server.up.railway.app/mcp with an API key. Replace the URL and key with your own values.
```bash
claude mcp add web_tools \
  --transport http \
  https://your-server.up.railway.app/mcp \
  --header "Authorization: Bearer your-api-key"
```

Add to `.mcp.json` at the root of any project to make the tool available to all collaborators:
```json
{
  "mcpServers": {
    "web_tools": {
      "type": "http",
      "url": "https://your-server.up.railway.app/mcp",
      "headers": {
        "Authorization": "Bearer your-api-key"
      }
    }
  }
}
```

By default, Claude Code uses its own WebSearch and WebFetch tools. You can replace them with this server's web_search and web_fetch tools for privacy-respecting, self-hosted results.
1. Add the MCP server globally:
   ```bash
   claude mcp add web_tools --scope user \
     --transport http \
     https://your-server.up.railway.app/mcp \
     --header "Authorization: Bearer your-api-key"
   ```

2. Disable the built-in tools by editing `~/.claude/settings.json`:
   ```json
   {
     "permissions": {
       "deny": ["WebSearch", "WebFetch"]
     }
   }
   ```

3. Guide Claude via `~/.claude/CLAUDE.md` so it uses your tools:
   ```markdown
   ## Search & Fetch
   - Use the web_search MCP tool for all web searches
   - Use the web_fetch MCP tool to fetch and read web pages
   - Do not attempt to use the built-in WebSearch or WebFetch tools
   ```

4. Verify by running `/mcp` inside Claude Code to check the server is connected, then ask Claude to search for something or fetch a URL.
- Click Deploy on Railway: you'll see all 4 services listed (Redis, SearXNG, Crawl4AI, MCP Server)
- Click Deploy: Railway provisions everything and wires the services together automatically
- An `API_KEY` is auto-generated during deployment. Find it in your MCP Server service's Variables tab and use it as your Bearer token
```bash
git clone https://github.com/arnaudjnn/web-tools
cd web-tools
pnpm install
```

Copy the example environment file:

```bash
cp .env.example .env.local
```

Start the backing services:

```bash
docker compose up -d redis searxng crawl4ai
```

This starts Redis, SearXNG, and Crawl4AI. Then run the MCP server:

```bash
SEARXNG_URL=http://localhost:8080 CRAWL4AI_URL=http://localhost:11235 pnpm run start
```

The server is available at `http://localhost:3000/mcp`.

Alternatively, run the entire stack with Docker:

```bash
docker compose up
```

The `API_KEY` environment variable is required.
On Railway, the key is auto-generated at deploy time (via `${{secret()}}`). For local development, set it in your `.env.local` file.

Clients provide the key as a Bearer token in the `Authorization` header (shown in the examples above) or as an `?api_key=` query parameter. The `/health` endpoint is unauthenticated.
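The two client-side options can be sketched as follows; the server URL and key are the placeholder values used in the examples above:

```python
from urllib.parse import urlencode

BASE = "https://your-server.up.railway.app/mcp"
API_KEY = "your-api-key"

# Option 1: Bearer token in the Authorization header
headers = {"Authorization": f"Bearer {API_KEY}"}

# Option 2: api_key as a query parameter
url_with_key = f"{BASE}?{urlencode({'api_key': API_KEY})}"

print(headers["Authorization"])
print(url_with_key)
```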
MIT