Production-grade system for enforcing AI spending limits in real-time
Features • Quick Start • Architecture • API Reference • Contributing
B2B SaaS companies charge customers flat monthly fees (e.g., $500/month), but AI costs are variable: some customers cost $5/month in AI expenses, others $500/month. Without real-time enforcement, you discover which customers are unprofitable 30 days later, when the bill arrives.
Beam solves this by:
- ⚡ Pre-flight validation: Check if customer can afford a request before it starts
- 🔄 Streaming enforcement: Count tokens and deduct balance in real-time during streaming
- 🛑 Kill switch: Immediately terminate streaming when balance hits zero
- 💯 Perfect reconciliation: Use provider's exact token counts for final billing
- Sub-5ms balance checks via Redis Lua scripts
- Atomic operations prevent race conditions at scale
- Dual storage: Redis for speed, PostgreSQL for durability
- Auto-reconciliation between estimated and actual costs
- Multi-provider support: OpenAI, Anthropic, Google AI
- gRPC API with Protocol Buffers for efficiency
- REST API for easy integration without gRPC clients
- CLI tool for manual operations and testing
- Docker support with docker-compose for instant setup
- TimescaleDB integration for time-series analytics
- Prometheus metrics for observability
- Comprehensive logging with structured JSON output
- Standalone: Works without SDK - direct API calls
- Well documented with examples and guides
- Easy setup: One command to run locally
- Type-safe Protocol Buffer definitions
- Tested: Unit and integration tests included
- Docker & Docker Compose
- Go 1.25+ (for building from source)
- Make (optional, for convenience)
# Clone the repository
git clone https://github.com/kelpejol/beam
cd beam
# Start PostgreSQL and Redis
docker-compose up -d
# Wait for services to be ready (about 10 seconds)
docker-compose ps
# Build the binary
make build
# Run the server
./backend/bin/beam-api
# Or use Docker
docker-compose up -d beam-api
# Using the CLI tool
./backend/bin/beam-cli balance get --customer-id test_customer_1
# Or with grpcurl
grpcurl -plaintext \
-H "authorization: Bearer beam_test_key_1234567890" \
-d '{"customer_id": "test_customer_1"}' \
localhost:9090 beam.balance.v1.BalanceService/GetBalance
# Or with REST API
curl -H "Authorization: Bearer beam_test_key_1234567890" \
http://localhost:8080/v1/balance/test_customer_1
Response:
{
"balance": "100000000",
"reserved": "0",
"available": "100000000"
}
┌─────────────────────────────────────────────────────────────────┐
│ YOUR APPLICATION │
└───────────────────────┬─────────────────────────────────────────┘
│
│ gRPC/REST: CheckBalance()
↓
┌─────────────────────────────────────────────────────────────────┐
│ BEAM BACKEND (Go) │
│ │
│ ┌──────────────┐ ┌─────────────┐ ┌──────────────┐ │
│ │ gRPC/REST │───→│ Ledger │───→│ Redis (Hot) │ │
│ │ API │ │ (Atomic) │ │ <1ms ops │ │
│ └──────────────┘ └──────┬──────┘ └──────────────┘ │
│ │ │
│ ↓ │
│ ┌──────────────┐ │
│ │ PostgreSQL │ │
│ │ (Durable) │ │
│ │ +TimescaleDB │ │
│ └──────────────┘ │
└─────────────────────────────────────────────────────────────────┘
│
│ Forward to provider after approval
↓
┌─────────────────────────────────────────────────────────────────┐
│ OpenAI / Anthropic / Google AI │
└─────────────────────────────────────────────────────────────────┘
Ledger - The heart of Beam
- Atomic balance operations using Redis Lua scripts
- Prevents race conditions with reservation system
- Automatic reconciliation of estimates vs actuals
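Conceptually, the atomic check-and-reserve that the Lua script performs inside Redis looks like the following pure-Go sketch — the whole check-then-hold runs as one indivisible step, so two concurrent requests can never both reserve the same grains. Field and function names here are illustrative, not Beam's actual script:

```go
package main

import (
	"errors"
	"fmt"
)

// Account mirrors the per-customer state Beam keeps hot in Redis.
type Account struct {
	Balance  int64 // total grains
	Reserved int64 // grains held by in-flight requests
}

// CheckAndReserve is the logic the Lua script executes atomically:
// verify available balance, then take the reservation in one step.
func (a *Account) CheckAndReserve(grains int64) error {
	available := a.Balance - a.Reserved
	if grains > available {
		return errors.New("insufficient balance")
	}
	a.Reserved += grains
	return nil
}

func main() {
	acct := &Account{Balance: 100_000_000}
	if err := acct.CheckAndReserve(60_000); err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Println("reserved:", acct.Reserved)
	fmt.Println("available:", acct.Balance-acct.Reserved)
}
```

In Redis this runs as a single `EVAL`, which is what makes the operation race-free without application-level locks.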
Storage Layer
- Redis: Sub-millisecond balance checks, in-memory state
- PostgreSQL: Durable storage with complete audit trail
- TimescaleDB: Time-series optimizations for analytics
API Layer
- gRPC: High-performance binary protocol for production
- REST: HTTP/JSON for easy integration and testing
- CLI: Command-line tool for operations and debugging
1. CheckBalance - Pre-flight validation
- Your app calls Beam before making an AI request
- Beam checks if the customer has enough balance
- If yes, Beam reserves grains and returns an approval token
- Latency: 2-4ms
2. Make AI Request - Your responsibility
- Your app proceeds to call OpenAI/Anthropic/etc
- Stream the response to your end user
- Count tokens as they arrive
3. DeductTokens - Real-time deduction (optional but recommended)
- Call Beam every ~50 tokens during streaming
- Beam deducts from the balance atomically
- If the balance hits zero, Beam returns success: false → kill the stream
- Latency: 1-3ms per call
4. FinalizeRequest - Final reconciliation
- Call once with exact token counts from the provider
- Beam reconciles estimated vs actual costs
- Refunds overcharges, releases the reservation
- Latency: 3-8ms
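The grain arithmetic behind these steps can be sketched in Go. The numbers match the API examples later in this README (estimate 50,000 grains, buffer 1.2, actual cost 48,700 grains); the helper names are illustrative, not Beam's internal API:

```go
package main

import "fmt"

// reserve applies the buffer multiplier at CheckBalance time, so the
// hold covers moderate underestimates of the request's cost.
func reserve(estimatedGrains int64, buffer float64) int64 {
	return int64(float64(estimatedGrains) * buffer)
}

// refund is computed at FinalizeRequest: anything reserved beyond
// the provider's exact cost is returned to the customer.
func refund(reservedGrains, actualGrains int64) int64 {
	if actualGrains >= reservedGrains {
		return 0 // consumed the whole reservation: nothing to refund
	}
	return reservedGrains - actualGrains
}

func main() {
	reserved := reserve(50_000, 1.2) // 60000 grains held
	actual := int64(48_700)          // provider-reported cost
	fmt.Println("reserved:", reserved)
	fmt.Println("refund:", refund(reserved, actual)) // 11300
}
```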
Get Balance - Query current balance
GET /v1/balance/:customer_id
Authorization: Bearer <api_key>
Response:
{
"balance": "100000000",
"reserved": "5000000",
"available": "95000000"
}
Check Balance - Pre-flight validation
POST /v1/balance/check
Authorization: Bearer <api_key>
Content-Type: application/json
{
"customer_id": "cus_123",
"estimated_grains": 50000,
"buffer_multiplier": 1.2,
"request_id": "req_xyz",
"metadata": {
"model": "gpt-4",
"max_tokens": 1000
}
}
Response:
{
"approved": true,
"remaining_balance": "99950000",
"request_token": "secure_token_xyz",
"reserved_grains": 60000
}
Deduct Tokens - Real-time deduction
POST /v1/balance/deduct
Authorization: Bearer <api_key>
Content-Type: application/json
{
"customer_id": "cus_123",
"request_id": "req_xyz",
"request_token": "secure_token_xyz",
"tokens_consumed": 50,
"model": "gpt-4",
"is_completion": true
}
Response:
{
"success": true,
"remaining_balance": "99900000"
}
Finalize Request - Final reconciliation
POST /v1/balance/finalize
Authorization: Bearer <api_key>
Content-Type: application/json
{
"customer_id": "cus_123",
"request_id": "req_xyz",
"status": "COMPLETED_SUCCESS",
"actual_prompt_tokens": 234,
"actual_completion_tokens": 487,
"total_actual_cost_grains": 48700,
"model": "gpt-4"
}
Response:
{
"success": true,
"refunded_grains": 11300,
"final_balance": "99911300"
}
Full Protocol Buffer definitions in proto/balance/v1/balance.proto
service BalanceService {
rpc CheckBalance(CheckBalanceRequest) returns (CheckBalanceResponse);
rpc DeductTokens(DeductTokensRequest) returns (DeductTokensResponse);
rpc FinalizeRequest(FinalizeRequestRequest) returns (FinalizeRequestResponse);
rpc GetBalance(GetBalanceRequest) returns (GetBalanceResponse);
}
# Check balance
beam-cli balance get --customer-id cus_123
# Add balance (credit)
beam-cli balance add --customer-id cus_123 --amount 1000000 --description "Monthly top-up"
# Deduct balance (debit)
beam-cli balance deduct --customer-id cus_123 --amount 50000
# List recent requests
beam-cli requests list --customer-id cus_123 --limit 10
# Show request details
beam-cli requests show --request-id req_xyz
# Create new customer
beam-cli customers create --customer-id cus_new --name "New Customer" --balance 10000000
# Verify balance integrity
beam-cli admin verify-integrity --customer-id cus_123
# Sync Redis from PostgreSQL
beam-cli admin sync-all
customers - End customers with their balances
CREATE TABLE customers (
customer_id VARCHAR(255) PRIMARY KEY,
platform_user_id VARCHAR(255) NOT NULL,
current_balance_grains BIGINT NOT NULL DEFAULT 0,
lifetime_spent_grains BIGINT NOT NULL DEFAULT 0,
buffer_strategy VARCHAR(20) DEFAULT 'conservative',
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP NOT NULL DEFAULT NOW(),
CONSTRAINT positive_balance CHECK (current_balance_grains >= 0)
);
transactions - Append-only ledger (complete audit trail)
CREATE TABLE transactions (
transaction_id VARCHAR(255) PRIMARY KEY,
customer_id VARCHAR(255) NOT NULL,
amount_grains BIGINT NOT NULL, -- Positive=credit, Negative=debit
transaction_type VARCHAR(50) NOT NULL,
reference_id VARCHAR(255),
description TEXT,
created_at TIMESTAMP NOT NULL DEFAULT NOW()
);
requests - Detailed AI request tracking
CREATE TABLE requests (
request_id VARCHAR(255) PRIMARY KEY,
customer_id VARCHAR(255) NOT NULL,
model VARCHAR(100) NOT NULL,
estimated_cost_grains BIGINT NOT NULL,
reserved_grains BIGINT NOT NULL,
streaming_deducted_grains BIGINT DEFAULT 0,
actual_cost_grains BIGINT,
status VARCHAR(50) NOT NULL,
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
completed_at TIMESTAMP
);
Every request requires an API key:
Authorization: Bearer beam_sk_live_xxxxxxxxxxxxx
- Keys are hashed with SHA-256 before storage
- Stored in Redis for sub-millisecond authentication
- Plaintext keys never logged or stored
- Use different API keys for development and production
- Rotate keys regularly
- Enable TLS in production
- Set appropriate rate limits
- Monitor for unusual activity
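The hash-before-storage pattern described above can be sketched with Go's standard crypto/sha256 — only the digest is persisted, never the plaintext key. This is a sketch of the pattern, not Beam's exact implementation:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashAPIKey returns the hex-encoded SHA-256 digest of a plaintext key.
// The digest (64 hex characters) is what gets stored and compared;
// the plaintext key is never logged or persisted.
func hashAPIKey(plaintext string) string {
	sum := sha256.Sum256([]byte(plaintext))
	return hex.EncodeToString(sum[:])
}

func main() {
	digest := hashAPIKey("beam_sk_live_xxxxxxxxxxxxx")
	fmt.Println(len(digest)) // always 64 hex characters
	fmt.Println(digest)
}
```

On lookup, the incoming key is hashed the same way and the digest is used as the Redis lookup key, so authentication stays a single sub-millisecond read.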
- CheckBalance: < 5ms (typically 2-4ms)
- DeductTokens: < 3ms (typically 1-2ms)
- FinalizeRequest: < 10ms (typically 3-8ms)
- 10,000+ concurrent requests per server
- 100,000+ balance checks/second with horizontal scaling
- Sub-millisecond Redis operations via Lua scripts
Horizontal Scaling
# Run multiple instances
docker-compose up -d --scale beam-api=3
# Use load balancer (nginx, haproxy, etc)
# Configure health checks on /health endpoint
Redis Scaling
- Single Redis handles 100k+ operations/second
- If needed, shard by customer_id using Redis Cluster
- Use Redis Sentinel for high availability
PostgreSQL Scaling
- Read replicas for analytics queries
- Single primary for writes (sufficient for most use cases)
- Connection pooling prevents bottlenecks
beam/
├── backend/
│ ├── cmd/
│ │ ├── api/ # Main server (gRPC + REST)
│ │ └── cli/ # CLI tool
│ ├── internal/
│ │ ├── api/ # gRPC service implementation
│ │ ├── rest/ # REST API handlers
│ │ ├── auth/ # API key authentication
│ │ ├── ledger/ # Core balance logic
│ │ └── sync/ # Redis-PostgreSQL sync
│ ├── pkg/proto/ # Generated protobuf code
│ └── migrations/ # Database migrations
├── scripts/
│ ├── lua/ # Redis Lua scripts
│ └── load-test.js # k6 load testing script
├── docs/ # Detailed documentation
├── docker-compose.yml # Local development environment
├── Dockerfile # Production Docker image
└── Makefile # Build automation
# Install dependencies
go mod download
# Generate protobuf code (requires protoc)
make proto
# Build all binaries
make build
# Binaries created at:
# - backend/bin/beam-api
# - backend/bin/beam-cli
# Unit tests
make test
# Integration tests (requires Docker)
make test-integration
# Test coverage report
make test-coverage
# Benchmark tests
make benchmark
# 1. Start infrastructure
docker-compose up -d postgres redis
# 2. Run server in dev mode (with auto-reload)
make dev
# 3. In another terminal, test the API
./backend/bin/beam-cli balance get --customer-id test_customer_1
# 4. View logs
docker-compose logs -f beam-api
# 5. Clean up
make clean
Complete example workflow:
# 1. Check initial balance
curl -H "Authorization: Bearer beam_test_key_1234567890" \
http://localhost:8080/v1/balance/test_customer_1
# 2. Pre-flight check (reserve grains)
curl -X POST -H "Authorization: Bearer beam_test_key_1234567890" \
-H "Content-Type: application/json" \
-d '{
"customer_id": "test_customer_1",
"estimated_grains": 50000,
"buffer_multiplier": 1.2,
"request_id": "req_test_'$(date +%s)'",
"metadata": {"model": "gpt-4"}
}' \
http://localhost:8080/v1/balance/check
# Save the request_token from response, then:
# 3. Simulate streaming deductions (repeat as needed)
curl -X POST -H "Authorization: Bearer beam_test_key_1234567890" \
-H "Content-Type: application/json" \
-d '{
"customer_id": "test_customer_1",
"request_id": "req_test_'$(date +%s)'",
"request_token": "YOUR_TOKEN_HERE",
"tokens_consumed": 50,
"model": "gpt-4",
"is_completion": true
}' \
http://localhost:8080/v1/balance/deduct
# 4. Finalize with exact costs
curl -X POST -H "Authorization: Bearer beam_test_key_1234567890" \
-H "Content-Type: application/json" \
-d '{
"customer_id": "test_customer_1",
"request_id": "req_test_'$(date +%s)'",
"status": "COMPLETED_SUCCESS",
"actual_prompt_tokens": 234,
"actual_completion_tokens": 487,
"total_actual_cost_grains": 48700,
"model": "gpt-4"
}' \
http://localhost:8080/v1/balance/finalize
# 5. Verify final balance
curl -H "Authorization: Bearer beam_test_key_1234567890" \
http://localhost:8080/v1/balance/test_customer_1
# Install k6
brew install k6 # macOS
# or: https://k6.io/docs/getting-started/installation
# Run load test
k6 run scripts/load-test.js
# Custom scenario
k6 run --vus 100 --duration 30s scripts/load-test.js
Comprehensive guides available in docs/:
- API Reference - Complete API documentation with examples
- Architecture Deep Dive - System design and decisions
- Integration Guide - How to integrate Beam into your app
- Operations Guide - Production deployment and monitoring
- Performance Tuning - Optimization strategies
- Database Guide - Schema details and queries
We love contributions! See CONTRIBUTING.md for guidelines.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes
- Test thoroughly (make test)
- Commit (git commit -m 'Add amazing feature')
- Push (git push origin feature/amazing-feature)
- Open a Pull Request
- Follow Effective Go guidelines
- Write tests for new features (maintain >80% coverage)
- Update documentation for API changes
- Use conventional commit messages
- Add examples for new features
- None! (Please report any issues you find)
v1.0 (Current)
- ✅ Core balance operations
- ✅ gRPC and REST APIs
- ✅ CLI tool
- ✅ Docker support
- ✅ TimescaleDB integration
v1.1 (Planned)
- WebSocket API for real-time balance updates
- GraphQL API
- Multi-region deployment support
- Advanced analytics dashboard
v2.0 (Future)
- Multi-currency support
- Automatic cost optimization recommendations
- Machine learning for cost prediction
- Stripe/payment provider integrations
This project is licensed under the MIT License - see the LICENSE file for details.
- ✅ Commercial use allowed
- ✅ Modification allowed
- ✅ Distribution allowed
- ✅ Private use allowed
⚠️ No warranty provided
⚠️ No liability
Built with amazing open source tools:
- Go - Efficient, concurrent backend language
- gRPC - High-performance RPC framework
- Protocol Buffers - Type-safe serialization
- Redis - Lightning-fast in-memory database
- PostgreSQL - Rock-solid relational database
- TimescaleDB - Time-series superpowers for PostgreSQL
- zerolog - Zero-allocation structured logging
Special thanks to all contributors and users!
- GitHub Issues: Report bugs
- GitHub Discussions: Ask questions
- Documentation: Read the docs
- Examples: See examples
If you find Beam useful, please consider giving it a star on GitHub! It helps others discover the project.
Made with ⚡ by developers, for developers