🌱 Otel auto instrumentation extensions #105
Draft
Ladas wants to merge 10 commits into kagenti:main from Ladas:phoenix-autoinstrumentation
Implements comprehensive OTEL observability for weather service agent with
Phoenix integration, baggage propagation, and GenAI semantic conventions.
## Changes
### New Module: observability.py
- ObservabilityConfig: Reads OTEL config from environment variables
- setup_observability(): Configures OTEL tracer with proper resource attributes
- Baggage propagation functions for context tracking (user_id, request_id, etc.)
- Extract baggage from HTTP headers
- Phoenix project routing via resource attributes
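A minimal sketch of how a config object like `ObservabilityConfig` could read its settings from the environment. This is not the module's actual code: the field names, the defaults, and the `K8S_NAMESPACE_NAME` variable are assumptions made for illustration. `OTEL_SERVICE_NAME`, `OTEL_EXPORTER_OTLP_ENDPOINT`, and `OTEL_EXPORTER_OTLP_PROTOCOL` are standard OpenTelemetry environment variables.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ObservabilityConfig:
    """Illustrative config holder; field names and defaults are assumptions."""

    service_name: str
    namespace: str
    otlp_endpoint: str
    otlp_protocol: str

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "ObservabilityConfig":
        # Accept an explicit mapping so the function can be exercised
        # without touching the real process environment.
        env = os.environ if env is None else env
        return cls(
            service_name=env.get("OTEL_SERVICE_NAME", "weather-service"),
            # K8S_NAMESPACE_NAME is a hypothetical variable name.
            namespace=env.get("K8S_NAMESPACE_NAME", "default"),
            otlp_endpoint=env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4318"),
            otlp_protocol=env.get("OTEL_EXPORTER_OTLP_PROTOCOL", "http/protobuf"),
        )
```

Passing a plain dict, for example `ObservabilityConfig.from_env({"OTEL_SERVICE_NAME": "svc"})`, makes the defaults easy to check in a unit test.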
### Updated Dependencies (pyproject.toml)
- opentelemetry-api>=1.20.0
- opentelemetry-sdk>=1.20.0
- opentelemetry-exporter-otlp>=1.20.0
- opentelemetry-exporter-otlp-proto-grpc>=1.20.0
- opentelemetry-instrumentation>=0.41b0
- openinference-semantic-conventions>=0.1.0
### Updated __init__.py
- Replace basic tracer setup with comprehensive setup_observability()
- Ensures OpenInference instrumentation loads before agent code
### Updated agent.py
- Remove duplicate LangChainInstrumentor() call (now in observability.py)
- Extract baggage context from request headers
- Set baggage for each request (user_id, request_id, task_id, etc.)
- Log trace info for debugging
## Features
✅ Auto-instrumentation with OpenInference (LangChain LLM, Tool, Chain spans)
✅ OTEL baggage propagation across all services
✅ Phoenix project routing (team1-agents, etc.)
✅ GenAI semantic conventions compliance
✅ K8s metadata via resource attributes
✅ Comprehensive logging for observability debugging
## Resource Attributes
All traces include:
- service.name: weather-service
- service.namespace: {namespace}
- k8s.namespace.name: {namespace}
- phoenix.project.name: {namespace}-agents
- deployment.environment: kind-local
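The attribute set above can be sketched as a plain dictionary builder. The function name and its parameters are hypothetical; in an OpenTelemetry setup a dictionary like this would typically be handed to `Resource.create(...)` from `opentelemetry.sdk.resources`, which is left out here so the sketch runs without the SDK installed.

```python
from typing import Dict


def build_resource_attributes(
    namespace: str,
    service_name: str = "weather-service",
    environment: str = "kind-local",
) -> Dict[str, str]:
    """Return the resource attributes listed in this PR for a given namespace."""
    return {
        "service.name": service_name,
        "service.namespace": namespace,
        "k8s.namespace.name": namespace,
        # Phoenix project routing: one project per namespace.
        "phoenix.project.name": f"{namespace}-agents",
        "deployment.environment": environment,
    }
```

With `namespace="team1"` the Phoenix project name becomes `team1-agents`, matching the routing example in the feature list.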
## Baggage Context
Baggage propagates across all spans:
- user_id: User identifier
- request_id: Unique request ID
- task_id: A2A task ID
- context_id: A2A context ID
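To illustrate how request headers could be mapped to baggage entries and serialized, here is a stdlib-only sketch of the W3C `baggage` header format (comma-separated `key=value` members with percent-encoded values). The helper names and the header-to-key mapping are assumptions; the PR itself relies on the OpenTelemetry baggage API rather than hand-rolled parsing.

```python
from typing import Dict, Mapping
from urllib.parse import quote, unquote

# Assumed mapping from incoming HTTP header names to baggage keys.
HEADER_TO_BAGGAGE = {"user-id": "user_id", "request-id": "request_id"}


def baggage_from_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Pick the known headers (case-insensitively) and rename them to baggage keys."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {key: lowered[h] for h, key in HEADER_TO_BAGGAGE.items() if h in lowered}


def to_baggage_header(entries: Mapping[str, str]) -> str:
    """Serialize entries as a W3C baggage header value."""
    return ",".join(f"{quote(k, safe='')}={quote(v, safe='')}" for k, v in entries.items())


def parse_baggage_header(value: str) -> Dict[str, str]:
    """Parse a W3C baggage header value, ignoring member properties after ';'."""
    entries: Dict[str, str] = {}
    for member in value.split(","):
        key, sep, raw = member.strip().partition("=")
        if not sep or not key:
            continue  # skip empty or malformed members
        raw = raw.split(";", 1)[0]
        entries[unquote(key.strip())] = unquote(raw.strip())
    return entries
```

The `task_id` and `context_id` entries would come from the A2A task rather than from headers, so they are not part of the mapping in this sketch.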
## Testing
To test locally:
1. Deploy with OTEL environment variables set
2. Send request with user-id and request-id headers
3. Check Phoenix UI for traces with baggage attributes
## Related
- Part of Phoenix Integration (TODO_PHOENIX_INTEGRATION.md)
- Phase 1: Agent Instrumentation
- Prepares for E2E tests in Phase 2
Signed-off-by: Claude Code AI Assistant <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
- Replace baggage.get_current() with context.get_current() (API fix)
- Add configurable OTLP exporter supporting both gRPC and HTTP protocols
- Default to HTTP/protobuf for wider compatibility
- Update default endpoint to use correct kagenti-system namespace and port 4318

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
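As a rough illustration of protocol-dependent exporter configuration, the sketch below resolves the traces endpoint for each protocol. It follows the OTLP exporter convention that the HTTP traces signal lives under `/v1/traces` on the base endpoint, while gRPC uses the base endpoint directly. The function name is hypothetical and the collector hostname in the usage note is a placeholder, not the actual service name.

```python
def resolve_traces_endpoint(base_endpoint: str, protocol: str = "http/protobuf") -> str:
    """Return the URL a trace exporter would target for the given OTLP protocol."""
    if protocol == "grpc":
        # gRPC exporters talk to the base endpoint (conventionally port 4317).
        return base_endpoint
    if protocol == "http/protobuf":
        # HTTP exporters post to the /v1/traces path (conventionally port 4318).
        base = base_endpoint.rstrip("/")
        return base if base.endswith("/v1/traces") else base + "/v1/traces"
    raise ValueError(f"unsupported OTLP protocol: {protocol!r}")
```

For example, `resolve_traces_endpoint("http://collector.example:4318")` yields `http://collector.example:4318/v1/traces`, while the same call with `protocol="grpc"` returns the base endpoint unchanged.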
The OTEL Collector in kagenti-system is configured to listen on port 8335 (via --set override), not the standard 4318. Updated the default endpoint in observability.py to match.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
- Add create_agent_span() for root AGENT span with OI attributes
- Wrap graph execution with using_attributes for session/user tracking
- Add a2a.task_id, a2a.context_id, user.id to spans for filtering
- Set input.value and output.value on agent spans

This ensures all LangChain auto-instrumented spans are properly nested under an AGENT span and have session/user context.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
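The attributes named in this commit can be sketched as a dictionary that a helper such as `create_agent_span()` might attach to the root span. The function below is hypothetical and does not create a span itself. The keys `openinference.span.kind`, `input.value`, `output.value`, `session.id`, and `user.id` come from the OpenInference semantic conventions; reusing the A2A context ID as the session ID is an assumption.

```python
from typing import Dict, Optional


def agent_span_attributes(
    task_id: str,
    context_id: str,
    user_id: str,
    input_text: str,
    output_text: Optional[str] = None,
) -> Dict[str, str]:
    """Build the attribute set for a root AGENT span."""
    attrs = {
        "openinference.span.kind": "AGENT",
        "a2a.task_id": task_id,
        "a2a.context_id": context_id,
        "user.id": user_id,
        # Assumption: one A2A context corresponds to one Phoenix session.
        "session.id": context_id,
        "input.value": input_text,
    }
    if output_text is not None:
        # The output is usually only known once the graph has finished.
        attrs["output.value"] = output_text
    return attrs
```

In an instrumented agent, a dictionary like this would be passed as span attributes when the root span is started, with `output.value` set once the response is available.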
The previous commit removed these imports, but they're still used by the set_baggage_context function.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Use OpenInference-compatible tracer name "openinference.instrumentation.agent" so that root AGENT spans pass through the OTEL Collector's filter/phoenix processor, which only allows spans from "openinference.instrumentation.*". Previously used "weather_service.observability", which was filtered out.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
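The filtering rule described here can be modeled in a few lines of Python. This is only a model of the behavior, assuming the filter matches on the tracer (instrumentation scope) name prefix; the real rule lives in the OTEL Collector configuration, which is not shown in this PR.

```python
from typing import Iterable

# Prefix taken from the commit message; the tuple form is an illustrative choice.
ALLOWED_SCOPE_PREFIXES = ("openinference.instrumentation.",)


def passes_phoenix_filter(
    scope_name: str,
    allowed_prefixes: Iterable[str] = ALLOWED_SCOPE_PREFIXES,
) -> bool:
    """Return True if spans from this tracer name would be forwarded to Phoenix."""
    return scope_name.startswith(tuple(allowed_prefixes))
```

Under this model `passes_phoenix_filter("openinference.instrumentation.agent")` is true and `passes_phoenix_filter("weather_service.observability")` is false, which matches the symptom the commit fixes.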
- Add as_root=True parameter to create_agent_span()
- Use empty Context() to break parent inheritance from A2A SDK telemetry
- Ensures weather_agent_task appears as root span in Phoenix UI

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Instead of breaking parent context in create_agent_span(), we'll allow a2a.utils.telemetry spans through the OTEL Collector filter. This preserves the complete trace hierarchy.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Phase 1 of trace propagation implementation:

1. observability.py:
   - Configure W3C Trace Context and Baggage propagators
   - Add extract_trace_context() for incoming HTTP headers
   - Add inject_trace_context() for outgoing HTTP calls
   - Add trace_context_from_headers() context manager
2. agent.py:
   - Wrap entire execute method with trace_context_from_headers()
   - All spans now become children of incoming traceparent
   - Enables proper parent-child relationships across A2A calls

This enables:
- Single connected trace from HTTP request through LLM calls
- Multi-agent call flows with proper trace hierarchy
- Phoenix visibility into complete request flow

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
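To show what extracting and injecting trace context involves at the header level, here is a stdlib-only sketch of the W3C `traceparent` format (`version-traceid-parentid-flags`, all lowercase hex). The helper names are hypothetical and deliberately differ from the PR's `extract_trace_context()` and `inject_trace_context()`, which presumably delegate to the OpenTelemetry propagators instead of parsing headers by hand.

```python
import re
from typing import Dict, Mapping, NamedTuple, Optional

_TRACEPARENT_RE = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


class ParentInfo(NamedTuple):
    trace_id: str
    parent_span_id: str
    sampled: bool


def extract_traceparent(headers: Mapping[str, str]) -> Optional[ParentInfo]:
    """Parse the traceparent header; return None if it is absent or invalid."""
    lowered = {name.lower(): value for name, value in headers.items()}
    match = _TRACEPARENT_RE.match(lowered.get("traceparent", "").strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid per the W3C specification.
    if version == "ff" or trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return ParentInfo(trace_id, span_id, bool(int(flags, 16) & 0x01))


def inject_traceparent(
    headers: Dict[str, str], trace_id: str, span_id: str, sampled: bool = True
) -> Dict[str, str]:
    """Write a version-00 traceparent header for an outgoing HTTP call."""
    headers["traceparent"] = f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"
    return headers
```

A span started with the extracted trace ID and parent span ID becomes a child of the caller's span, which is what connects the request into a single trace across A2A calls.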
Summary
WIP: experiment with OTel auto-instrumentation for the weather agent
Related issue(s)
Relates to kagenti/kagenti#436