Free OpenAI API proxy with automatic failover, token rotation, and multi-provider support. Deploy your own AI gateway on Cloudflare Workers in minutes - no server costs, and it scales automatically across Cloudflare's edge.
Universal AI API proxy that converts Anthropic Claude, Google Gemini, Cloudflare AI, and other providers into OpenAI-compatible endpoints. Perfect for developers who want one API for all AI models with built-in failover and load balancing.
- Anthropic Claude (Claude Opus, Sonnet, Haiku) - streaming + tools
- Google Gemini (Gemini Pro, Flash, Thinking models) - full support
- OpenAI (GPT-4, GPT-4o, o1, o3) - native compatibility
- Cloudflare AI Workers - free tier models
- Custom OpenAI APIs (NVIDIA NIM, Azure, OpenRouter, etc.)
- ✅ Automatic Failover - switches providers when one fails
- 🔑 Token Rotation - cycles through multiple API keys
- 📊 Model-Based Routing - use model names to route requests
- 🌊 Streaming Support - real-time SSE responses
- 🛠️ Function Calling - tools & MCP support
- 🔒 Built-in Auth - secure your proxy with tokens
- 💲 Free Hosting on Cloudflare Workers (100k requests/day)
- ⚡ Low Latency - served from Cloudflare's edge network worldwide
- 🔧 One-Click Deploy via GitHub Actions
- 🌐 CORS Enabled - works directly from browsers
- 📝 Drop-in Replacement for OpenAI SDK
git clone https://github.com/zxcloli666/AI-Worker-Proxy.git
cd AI-Worker-Proxy
npm install

Go to Cloudflare Dashboard → Workers → Settings → Variables:
PROXY_AUTH_TOKEN = your-secret-token-123
ANTHROPIC_KEY_1 = sk-ant-xxxxx
GOOGLE_KEY_1 = AIzaxxxxx
OPENAI_KEY_1 = sk-xxxxx
Add GitHub Secrets and push to main:
git push origin main

✅ Done! Your proxy is live at https://your-worker.workers.dev
📖 Detailed Setup Guide: See Installation & Configuration below
- 🤖 AI Chatbots with automatic provider fallback
- 📝 Content Generation tools with cost optimization
- 🔍 AI Search using multiple models simultaneously
- 🎨 Creative Apps with model mixing (Claude + GPT-4)
- 📊 Analytics Tools comparing AI model outputs
- 🌐 Browser Extensions with CORS-enabled AI access
- 📱 Mobile Apps using OpenAI SDK → your proxy URL
| Provider | Model Examples | Streaming | Function Calling | Notes |
|---|---|---|---|---|
| Anthropic | claude-opus-4, claude-sonnet-4.5 | ✅ | ✅ | Official SDK |
| Google | gemini-2.0-flash, gemini-thinking | ✅ | ✅ | Gemini API |
| OpenAI | gpt-4o, o1, o3-mini | ✅ | ✅ | Native support |
| Cloudflare AI | @cf/meta/llama-3.1-8b | ✅ | ✅ | Free tier |
| OpenAI-Compatible | NVIDIA NIM, Azure, OpenRouter | ✅ | ✅ | Custom base URL |
- Node.js 18+
- Cloudflare Workers account (free tier works)
- API keys for desired providers
# Install dependencies
npm install
# Create .dev.vars file
cp .dev.vars.example .dev.vars
# Add your keys to .dev.vars
PROXY_AUTH_TOKEN=test-token
ANTHROPIC_KEY_1=sk-ant-xxxxx
# Start dev server
npm run dev

1. Add Cloudflare Credentials (GitHub Settings → Secrets):
   - CLOUDFLARE_API_TOKEN - Get from here
   - CLOUDFLARE_ACCOUNT_ID - Find on dashboard
2. Add Route Configuration (GitHub Settings → Variables):
   - Variable name: ROUTES_CONFIG
   - Value:
     {
       "deep-think": [
         { "provider": "anthropic", "model": "claude-opus-4-20250514", "apiKeys": ["ANTHROPIC_KEY_1"] }
       ],
       "fast": [
         { "provider": "google", "model": "gemini-2.0-flash-exp", "apiKeys": ["GOOGLE_KEY_1"] }
       ]
     }
3. Add API Keys (Cloudflare Dashboard → Variables):
   - PROXY_AUTH_TOKEN
   - ANTHROPIC_KEY_1
   - GOOGLE_KEY_1
   - etc.
4. Push to deploy:
   git push origin main
   npm run deploy

from openai import OpenAI
client = OpenAI(
base_url="https://your-worker.workers.dev/v1",
api_key="your-secret-proxy-token"
)
# Use Claude via "deep-think" model name
response = client.chat.completions.create(
model="deep-think",
messages=[{"role": "user", "content": "Explain quantum computing"}],
stream=True
)
for chunk in response:
if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

const response = await fetch('https://your-worker.workers.dev/v1/chat/completions', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer your-secret-proxy-token'
},
body: JSON.stringify({
model: 'fast', // Routes to Google Gemini
messages: [
{ role: 'user', content: 'Write a haiku about AI' }
],
stream: false
})
});
const data = await response.json();
console.log(data.choices[0].message.content);

curl -X POST https://your-worker.workers.dev/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer your-secret-proxy-token" \
-d '{
"model": "deep-think",
"messages": [{"role": "user", "content": "Hello AI!"}]
  }'

tools = [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": {"type": "string", "description": "City name"}
},
"required": ["location"]
}
}
}
]
response = client.chat.completions.create(
model="deep-think",
messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
tools=tools
)

Routes map model names to provider chains with automatic failover:
{
"model-name": [
{
"provider": "anthropic",
"model": "claude-opus-4-20250514",
"apiKeys": ["ANTHROPIC_KEY_1", "ANTHROPIC_KEY_2"]
},
{
"provider": "google",
"model": "gemini-2.0-flash-exp",
"apiKeys": ["GOOGLE_KEY_1"]
}
]
}

Failover Logic:
- Try ANTHROPIC_KEY_1 → if it fails, try ANTHROPIC_KEY_2
- If all Anthropic keys fail → try the Google provider
- If all providers fail → return a 500 error
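For illustration, this failover and key-rotation loop can be sketched in TypeScript like so (a minimal sketch with hypothetical names; the actual logic lives in src/router.ts and src/token-manager.ts):

```typescript
// Minimal sketch of the failover loop described above. Function and
// type names are illustrative; the real logic lives in src/router.ts
// and src/token-manager.ts.
interface ProviderConfig {
  provider: string;
  model: string;
  apiKeys: string[]; // names of env vars holding the actual keys
}

async function completeWithFailover(
  chain: ProviderConfig[],
  env: Record<string, string | undefined>,
  callProvider: (cfg: ProviderConfig, key: string) => Promise<Response>
): Promise<Response> {
  for (const cfg of chain) {
    for (const keyName of cfg.apiKeys) {
      const key = env[keyName];
      if (!key) continue; // skip unset keys
      try {
        const res = await callProvider(cfg, key);
        if (res.ok) return res; // success: stop rotating
        // non-OK (e.g. 429): fall through to the next key
      } catch {
        // network error: try the next key, then the next provider
      }
    }
  }
  // every key of every provider in the chain failed
  return new Response(
    JSON.stringify({
      error: {
        message: "All providers failed",
        type: "proxy_error",
        code: "provider_failure",
      },
    }),
    { status: 500, headers: { "Content-Type": "application/json" } }
  );
}
```

The loop exhausts every key of a provider before moving to the next entry in the chain, which is why the order of apiKeys doubles as priority order.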
{
"provider": "anthropic",
"model": "claude-opus-4-20250514",
"apiKeys": ["ANTHROPIC_KEY_1"]
}

{
"provider": "google",
"model": "gemini-2.0-flash-thinking-exp-01-21",
"apiKeys": ["GOOGLE_KEY_1"]
}

{
"provider": "openai",
"model": "gpt-4o",
"apiKeys": ["OPENAI_KEY_1"]
}

{
"provider": "openai-compatible",
"baseUrl": "https://integrate.api.nvidia.com/v1",
"model": "nvidia/llama-3.1-nemotron-70b-instruct",
"apiKeys": ["NVIDIA_KEY_1", "NVIDIA_KEY_2"]
}

{
"provider": "cloudflare-ai",
"model": "@cf/meta/llama-3.1-8b-instruct",
"apiKeys": []
}

Note: Cloudflare AI requires the AI binding in wrangler.toml:

[ai]
binding = "AI"
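Once the binding exists, the provider can invoke models on it directly. A minimal sketch, assuming the standard Workers AI binding shape (the real call lives in src/providers/cloudflare-ai.ts):

```typescript
// Minimal sketch of calling Workers AI through the [ai] binding above.
// The Env interface approximates the binding's shape; actual code lives
// in src/providers/cloudflare-ai.ts.
interface Env {
  AI: { run(model: string, input: unknown): Promise<unknown> };
}

async function runCloudflareModel(env: Env): Promise<unknown> {
  const raw = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
    messages: [{ role: "user", content: "Hello!" }],
  });
  return raw; // still needs conversion to the OpenAI response shape
}
```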
The proxy requires authentication via the Authorization header:
Authorization: Bearer your-secret-proxy-token
# or
Authorization: your-secret-proxy-token

Set your token in the Cloudflare Dashboard (Workers → Settings → Variables):
PROXY_AUTH_TOKEN = your-random-secret-123
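For reference, the check itself is tiny; a hedged sketch (helper name is illustrative, the real check sits in src/index.ts):

```typescript
// Illustrative sketch of the token check: accepts both header formats
// shown above. Not the project's literal code.
function isAuthorized(request: Request, expectedToken: string): boolean {
  const header = request.headers.get("Authorization") ?? "";
  const token = header.startsWith("Bearer ")
    ? header.slice("Bearer ".length)
    : header;
  return token.length > 0 && token === expectedToken;
}
```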
- ❌ NEVER commit API keys to git
- ✅ Store secrets in the Cloudflare Dashboard (they persist across deploys)
- ✅ Store ROUTES_CONFIG in GitHub Variables (replaced during deploy)
- ✅ Use .dev.vars for local development (add it to .gitignore)
📖 See PRIVATE_CONFIG.md for a detailed security guide
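Inside the worker, these variables arrive on the env argument of the fetch handler (standard Workers behavior); a typed sketch with assumed names matching the configuration above:

```typescript
// Sketch: secrets and vars surface on `env` in a Cloudflare Worker.
// Key names mirror the configuration described above.
interface Env {
  PROXY_AUTH_TOKEN: string;
  ANTHROPIC_KEY_1?: string;
  GOOGLE_KEY_1?: string;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    if (!env.PROXY_AUTH_TOKEN) {
      return new Response("PROXY_AUTH_TOKEN not configured", { status: 500 });
    }
    // ... authenticate and route the request (see src/index.ts)
    return new Response("ok");
  },
};
```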
AI-Worker-Proxy/
├── src/
│ ├── index.ts # Main worker entry
│ ├── types.ts # TypeScript types
│ ├── router.ts # Route configuration & failover
│   ├── token-manager.ts # Token rotation logic (sketched below)
│ ├── providers/
│ │ ├── base.ts # Base provider interface
│ │ ├── anthropic.ts # Claude provider
│ │ ├── google.ts # Gemini provider
│ │ ├── openai.ts # OpenAI provider
│ │ └── cloudflare-ai.ts # Cloudflare AI provider
│ └── utils/
│ ├── error-handler.ts # Error handling
│ └── response-mapper.ts # OpenAI format conversion
├── .github/workflows/
│ ├── deploy.yml # Auto-deploy workflow
│ └── lint.yml # Code quality checks
├── wrangler.toml # Cloudflare config
└── README.md
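As a rough illustration of what token-manager.ts is responsible for, key rotation can be modeled as cycling through the configured key names while skipping keys that recently hit a rate limit (a hedged sketch, not the actual implementation):

```typescript
// Hedged sketch of token rotation. Class and method names are
// illustrative, not the real token-manager.ts API.
class TokenRotator {
  // keyName -> timestamp (ms) before which the key should be skipped
  private cooldowns = new Map<string, number>();

  constructor(
    private keyNames: string[],
    private env: Record<string, string | undefined>
  ) {}

  /** Returns the next usable key, or undefined if all are cooling down. */
  next(): { name: string; value: string } | undefined {
    const now = Date.now();
    for (const name of this.keyNames) {
      const value = this.env[name];
      if (value && (this.cooldowns.get(name) ?? 0) <= now) {
        return { name, value };
      }
    }
    return undefined;
  }

  /** Mark a key as rate-limited so subsequent calls rotate past it. */
  markRateLimited(name: string, cooldownMs = 60_000): void {
    this.cooldowns.set(name, Date.now() + cooldownMs);
  }
}
```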
npm install # Install dependencies
npm run dev # Start local dev server
npm run deploy # Deploy to Cloudflare
npm run type-check # TypeScript validation
npm run lint # ESLint
npm run format # Prettier

- deploy.yml: Auto-deploys to Cloudflare on push to main
- lint.yml: Runs linting and type checking on all branches/PRs
https://your-worker.workers.dev/v1
OpenAI-compatible chat completions endpoint.
Request:
{
"model": "deep-think",
"messages": [
{"role": "user", "content": "Hello!"}
],
"stream": false,
"temperature": 0.7,
"max_tokens": 1000
}

Response:
{
"id": "chatcmpl-xxx",
"object": "chat.completion",
"model": "deep-think",
"choices": [{
"message": {
"role": "assistant",
"content": "Hello! How can I help you?"
},
"finish_reason": "stop"
}]
}

Health check endpoint (no authentication required).
Response:
{
"status": "ok",
"timestamp": "2025-01-15T10:30:00Z"
}

The proxy returns OpenAI-compatible error responses:
{
"error": {
"message": "All providers failed",
"type": "proxy_error",
"code": "provider_failure"
}
}

- 401 - Unauthorized (invalid proxy token)
- 404 - Model configuration not found
- 429 - Rate limit exceeded (all API keys exhausted)
- 500 - All providers failed
- 502 - Provider unreachable
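Because the error shape is OpenAI-compatible, clients can branch on the status code. A hedged client-side sketch that retries on 429 with exponential backoff (URL and token are placeholders):

```typescript
// Illustrative client wrapper: retry on 429, surface other proxy errors.
async function chatWithRetry(body: unknown, maxRetries = 3): Promise<unknown> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await fetch(
      "https://your-worker.workers.dev/v1/chat/completions",
      {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: "Bearer your-secret-proxy-token",
        },
        body: JSON.stringify(body),
      }
    );
    if (res.ok) return res.json();
    if (res.status === 429 && attempt < maxRetries) {
      // all keys exhausted right now: back off, then retry
      await new Promise((r) => setTimeout(r, 2 ** attempt * 1000));
      continue;
    }
    const err = (await res.json()) as { error?: { message?: string } };
    throw new Error(err.error?.message ?? `Proxy error ${res.status}`);
  }
  throw new Error("retries exhausted");
}
```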
View logs at Cloudflare Dashboard → Workers → Logs:
- Request/response logs
- Provider failover events
- Token rotation attempts
- Error traces with stack traces
✅ Request: model=deep-think provider=anthropic key=KEY_1
⚠️ Failover: anthropic/KEY_1 → anthropic/KEY_2 (rate limit)
❌ Provider failed: anthropic → trying google
✅ Success: google/KEY_1 responded in 1.2s
- Request/response caching layer
- Per-user rate limiting
- Analytics dashboard (usage, costs, latency)
- Load balancing strategies (round-robin, least-loaded)
- Retry with exponential backoff
- Custom model name mappings
- Response transformation webhooks
- Multi-region deployment
- Cost tracking per API key
- Admin dashboard
Contributions are welcome! Here's how:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes
- Run checks: npm run lint && npm run type-check
- Commit your changes (git commit -m 'Add amazing feature')
- Push the branch (git push origin feature/amazing-feature)
- Open a Pull Request
MIT License - see LICENSE file for details.
- 🐛 Issues: GitHub Issues
- 💡 Discussions: GitHub Discussions
- 📧 Email: Create an issue instead for faster response
If this project helped you, please give it a ⭐️!
openai proxy, ai gateway, api proxy, cloudflare workers ai, anthropic proxy, claude proxy, gemini proxy, multi provider ai, ai load balancer, openai compatible api, ai failover, free ai proxy, serverless ai, ai token rotation, ai api gateway, llm proxy, gpt proxy, free openai alternative
Made with ❤️ for the AI community