A complete observability stack for monitoring Claude Code usage, costs, and performance using OpenTelemetry, Prometheus, Loki, and Grafana.
make up

This starts:
- OpenTelemetry Collector (ports 4317/4318) - receives telemetry from Claude Code
- Prometheus (port 9090) - stores metrics
- Loki (port 3100) - stores logs/events
- Grafana (port 3000) - visualizes everything
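
Once the containers are up, a quick way to confirm the HTTP services in the list above are responding is to poll their health endpoints. The sketch below is an illustration, not part of the project: `check_service` and `check_stack` are hypothetical helper names, and it assumes `curl` is installed and the default ports are unchanged. The OTel Collector is left out because it speaks OTLP on 4317/4318 and may not expose an HTTP health endpoint in this setup.

```shell
# Hypothetical helper: report whether one HTTP endpoint answers successfully.
check_service() {
  name="$1"
  url="$2"
  # -f: treat HTTP errors as failures, -s: silent, --max-time: do not hang
  if curl -fs --max-time 3 -o /dev/null "$url"; then
    echo "${name}: up"
  else
    echo "${name}: DOWN (${url})"
    return 1
  fi
}

# Hypothetical wrapper: check the three HTTP services started by `make up`.
check_stack() {
  status=0
  check_service "Prometheus" "http://localhost:9090/-/healthy"  || status=1
  check_service "Loki"       "http://localhost:3100/ready"      || status=1
  check_service "Grafana"    "http://localhost:3000/api/health" || status=1
  return "$status"
}
```

Run `check_stack` after `make up`. Loki can report not ready for a short while after startup, so a single early failure there is not necessarily a problem.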
Option A: Use the setup script (each terminal session)
source setup-env.sh
claude

Option B: Create a convenience alias (recommended)
Add this to your ~/.zshrc or ~/.bashrc:
alias claude-telemetry='source /path/to/claude_grafana/setup-env.sh && claude'

Then use:
- claude: normal Claude Code (no telemetry)
- claude-telemetry: Claude Code with telemetry enabled
This gives you control over when telemetry is collected without the overhead of permanent environment variables.
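
One caveat: an alias that uses `source` leaves the exported variables set for the rest of that shell session, so a later plain `claude` in the same terminal would still send telemetry. If that matters, a shell function whose body runs in a subshell is one way to scope the variables to a single invocation. The sketch below is an alternative, not what the project ships. The function name `claude_telemetry` is hypothetical, and the variables are copied from Option C rather than read from setup-env.sh.

```shell
# Hypothetical alternative to the alias: the parentheses make the body run in
# a subshell, so the exports below vanish when claude exits.
claude_telemetry() (
  export CLAUDE_CODE_ENABLE_TELEMETRY=1
  export OTEL_METRICS_EXPORTER=otlp
  export OTEL_LOGS_EXPORTER=otlp
  export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
  export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
  # Pass any arguments straight through to claude
  claude "$@"
)
```

Add the function to ~/.zshrc or ~/.bashrc in place of the alias and call `claude_telemetry` when you want telemetry for that run only.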
Option C: Set environment variables manually
export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_METRICS_EXPORTER=otlp
export OTEL_LOGS_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
claude

Open http://localhost:3000 (login: admin/admin)
When you're done, exit Claude normally. The Docker containers will continue running in the background. To stop them:
make down

This command can be run from any terminal as long as you're in the project root directory. Use make clean instead if you want to remove all collected data.
The pre-configured dashboard includes:
- Total sessions, cost, tokens, lines changed, commits, PRs
- Active time tracking (CLI and user time)
- Productivity ratio and leverage metrics
- Cost by model over time
- Token usage by type (input/output/cache read/creation)
- Token distribution by model
- Cache efficiency gauge (98%+ is typical)
- Cost per 1K output tokens
- Lines of code added/removed
- Git commits and pull requests
- Active time distribution
- CLI vs user time comparison
- Peak productivity leverage
- Real-time Claude Code event logs (via Loki)
make up # Start all services
make down # Stop all services
make restart # Restart all services
make status # Show service status
make logs # View all logs
make logs-collector # View OTel Collector logs
make clean # Remove containers and volumes
make validate # Validate configuration files
make setup # Show setup instructions

Claude Code ──OTLP──▶ OTel Collector ──▶ Prometheus (metrics)
                            │
                            └──────────▶ Loki (logs/events)
                                               │
                            Grafana ◀──────────┘
| Metric | Description |
|---|---|
| claude_code_session_count_total | CLI sessions started |
| claude_code_cost_usage_total | Cost in USD by model |
| claude_code_token_usage_total | Tokens by type (input/output/cache) |
| claude_code_lines_of_code_count_total | Lines added/removed |
| claude_code_commit_count_total | Git commits |
| claude_code_pull_request_count_total | Pull requests created |
| claude_code_code_edit_tool_decision_total | Tool accept/reject decisions |
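
To inspect any of these metrics outside Grafana, you can query Prometheus directly through its HTTP API (`/api/v1/query`). The helper below is a minimal sketch: `prom_query` and the `PROM_URL` variable are hypothetical names, and it assumes `curl` is installed and Prometheus is on its default port 9090.

```shell
# Hypothetical override point; defaults to the Prometheus port listed above.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Hypothetical helper: run an instant PromQL query and print the JSON response.
prom_query() {
  # -G sends the URL-encoded data as a GET query string
  curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}
```

For example, `prom_query 'claude_code_session_count_total'` returns the current session counters, and `prom_query 'sum by (model) (claude_code_cost_usage_total)'` should total cost per model, assuming the cost metric carries a `model` label as the table suggests.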
# Enable user prompt logging (disabled by default)
export OTEL_LOG_USER_PROMPTS=1

export OTEL_METRICS_INCLUDE_SESSION_ID=true # default: true
export OTEL_METRICS_INCLUDE_VERSION=false # default: false
export OTEL_METRICS_INCLUDE_ACCOUNT_UUID=true # default: true

export OTEL_RESOURCE_ATTRIBUTES="department=engineering,team.id=platform"

Edit setup-env.sh:
export OTEL_METRIC_EXPORT_INTERVAL=60000 # 60 seconds (production)
export OTEL_LOGS_EXPORT_INTERVAL=5000 # 5 seconds

Place JSON dashboard files in config/grafana/dashboards/
Edit config/otel-collector-config.yaml
- Check collector is running: make status
- View collector logs: make logs-collector
- Verify environment: echo $CLAUDE_CODE_ENABLE_TELEMETRY
- Try console exporter first: export OTEL_METRICS_EXPORTER=console
Claude Code sends delta temporality metrics, but Prometheus requires cumulative. The OTel Collector config includes a deltatocumulative processor to handle this conversion. If you see errors like:
Exporting failed. Dropping data. error: invalid temporality and type combination
Ensure your collector config includes:
- The deltatocumulative processor defined in the processors section
- The processor added to the metrics pipeline: processors: [memory_limiter, deltatocumulative, batch]
- OTel Collector version 0.111.0+ (earlier versions don't include this processor)
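
The first two items can be spot-checked with a text search. The sketch below is a rough heuristic, not a YAML parser: it only counts lines that mention the processor, expecting at least two (one definition, one pipeline entry). The function name `check_deltatocumulative` is hypothetical, and the default path is the collector config file named earlier in this README.

```shell
# Hypothetical helper: heuristic check that the collector config mentions the
# deltatocumulative processor at least twice (definition + pipeline entry).
check_deltatocumulative() {
  config_file="${1:-config/otel-collector-config.yaml}"
  # grep -c prints the number of matching lines; "|| true" keeps a zero-match
  # or missing-file result from aborting scripts that run with "set -e"
  matches=$(grep -c 'deltatocumulative' "$config_file" 2>/dev/null || true)
  if [ "${matches:-0}" -ge 2 ]; then
    echo "ok: deltatocumulative found on ${matches} lines"
  else
    echo "missing: expected deltatocumulative in processors and in the metrics pipeline" >&2
    return 1
  fi
}
```

Run `check_deltatocumulative` from the project root. A passing result does not prove the pipeline is wired correctly, so `make validate` and the collector logs remain the better check.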
- Ensure you've used Claude Code after enabling telemetry
- Check time range in Grafana (top right)
- Wait for export interval to elapse (default: 60s for metrics)
MIT
