Conversation


@Ladas Ladas commented Dec 3, 2025

Summary

WIP: experiment with OTEL auto-instrumentation for the weather agent

Related issue(s)

Relates to kagenti/kagenti#436

Ladas and others added 10 commits November 27, 2025 15:43
Implements comprehensive OTEL observability for the weather service agent, with
Phoenix integration, baggage propagation, and GenAI semantic conventions.

## Changes

### New Module: observability.py
- ObservabilityConfig: Reads OTEL config from environment variables
- setup_observability(): Configures OTEL tracer with proper resource attributes
- Baggage propagation functions for context tracking (user_id, request_id, etc.)
- Extract baggage from HTTP headers
- Phoenix project routing via resource attributes
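
A minimal sketch of what this module could look like, assuming the helper names above; the exact env-var names, the default endpoint, and the LangChain instrumentor wiring are assumptions, not the final implementation:

```python
# observability.py (sketch) -- names mirror the PR description; defaults are illustrative.
import os
from dataclasses import dataclass, field

from openinference.instrumentation.langchain import LangChainInstrumentor
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor


@dataclass
class ObservabilityConfig:
    """Reads OTEL settings from environment variables."""
    # Collector endpoint; the port (4318 vs. 8335) depends on the deployment, see later commits.
    endpoint: str = field(default_factory=lambda: os.getenv(
        "OTEL_EXPORTER_OTLP_ENDPOINT", "http://otel-collector.kagenti-system:4318"))
    namespace: str = field(default_factory=lambda: os.getenv("POD_NAMESPACE", "default"))
    service_name: str = field(default_factory=lambda: os.getenv("OTEL_SERVICE_NAME", "weather-service"))


def setup_observability(config: ObservabilityConfig | None = None) -> trace.Tracer:
    """Configure the tracer provider with the resource attributes listed further below."""
    config = config or ObservabilityConfig()
    resource = Resource.create({
        "service.name": config.service_name,
        "service.namespace": config.namespace,
        "k8s.namespace.name": config.namespace,
        "phoenix.project.name": f"{config.namespace}-agents",  # Phoenix project routing
        "deployment.environment": "kind-local",
    })
    provider = TracerProvider(resource=resource)
    provider.add_span_processor(
        BatchSpanProcessor(OTLPSpanExporter(endpoint=f"{config.endpoint}/v1/traces"))
    )
    trace.set_tracer_provider(provider)
    # Auto-instrument LangChain (LLM/Tool/Chain spans) before any agent code runs.
    LangChainInstrumentor().instrument(tracer_provider=provider)
    return trace.get_tracer("openinference.instrumentation.agent")
```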

### Updated Dependencies (pyproject.toml)
- opentelemetry-api>=1.20.0
- opentelemetry-sdk>=1.20.0
- opentelemetry-exporter-otlp>=1.20.0
- opentelemetry-exporter-otlp-proto-grpc>=1.20.0
- opentelemetry-instrumentation>=0.41b0
- openinference-semantic-conventions>=0.1.0

### Updated __init__.py
- Replace basic tracer setup with comprehensive setup_observability()
- Ensures OpenInference instrumentation loads before agent code
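
A sketch of the intended ordering; the `weather_service` module path and `WeatherAgentExecutor` name are assumptions:

```python
# weather_service/__init__.py (sketch): configure observability before the agent
# module is imported, so OpenInference hooks LangChain before it is first used.
from .observability import setup_observability

tracer = setup_observability()

from .agent import WeatherAgentExecutor  # noqa: E402  -- deliberately after setup
```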

### Updated agent.py
- Remove duplicate LangChainInstrumentor() call (now in observability.py)
- Extract baggage context from request headers
- Set baggage for each request (user_id, request_id, task_id, etc.)
- Log trace info for debugging
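
Roughly, the executor change could look like the following; the header names, the request-context fields, and the `set_baggage_context()` helper are taken from this description, but their exact shapes are assumptions:

```python
# agent.py (sketch)
import logging

from opentelemetry import trace

from .observability import set_baggage_context  # helper described above

logger = logging.getLogger(__name__)


class WeatherAgentExecutor:
    async def execute(self, context, event_queue):
        headers = getattr(context, "headers", None) or {}
        # Attach per-request baggage so it propagates to every span below.
        set_baggage_context(
            user_id=headers.get("user-id"),
            request_id=headers.get("request-id"),
            task_id=context.task_id,
            context_id=context.context_id,
        )
        # Log trace info for debugging.
        span_ctx = trace.get_current_span().get_span_context()
        logger.info("trace_id=%032x span_id=%016x", span_ctx.trace_id, span_ctx.span_id)
        # ... run the LangChain graph as before ...
```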

## Features

✅ Auto-instrumentation with OpenInference (LangChain LLM, Tool, Chain spans)
✅ OTEL baggage propagation across all services
✅ Phoenix project routing (team1-agents, etc.)
✅ GenAI semantic conventions compliance
✅ K8s metadata via resource attributes
✅ Comprehensive logging for observability debugging

## Resource Attributes

All traces include:
- service.name: weather-service
- service.namespace: {namespace}
- k8s.namespace.name: {namespace}
- phoenix.project.name: {namespace}-agents
- deployment.environment: kind-local

## Baggage Context

Baggage propagates across all spans:
- user_id: User identifier
- request_id: Unique request ID
- task_id: A2A task ID
- context_id: A2A context ID
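
A minimal sketch of the baggage helper (assumed to live in observability.py); the keyword-argument interface is an assumption:

```python
from opentelemetry import baggage, context


def set_baggage_context(**values):
    """Write each non-empty value into OTEL baggage and attach it to the
    current context so downstream spans and outgoing calls can read it."""
    ctx = context.get_current()
    for key, value in values.items():
        if value:
            ctx = baggage.set_baggage(key, str(value), context=ctx)
    return context.attach(ctx)  # caller may detach the token when the request ends
```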

## Testing

To test locally:
1. Deploy with OTEL environment variables set
2. Send request with user-id and request-id headers
3. Check Phoenix UI for traces with baggage attributes
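
For example (hypothetical port and payload; the agent's actual A2A request shape may differ):

```python
import requests

resp = requests.post(
    "http://localhost:8000/",  # port-forwarded weather agent (assumed)
    headers={"user-id": "alice", "request-id": "req-123"},
    json={"query": "What's the weather in Prague?"},
)
print(resp.status_code, resp.text)
```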

## Related

- Part of Phoenix Integration (TODO_PHOENIX_INTEGRATION.md)
- Phase 1: Agent Instrumentation
- Prepares for E2E tests in Phase 2

Signed-off-by: Claude Code AI Assistant <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
- Replace baggage.get_current() with context.get_current() (API fix)
- Add configurable OTLP exporter supporting both gRPC and HTTP protocols
- Default to HTTP/protobuf for wider compatibility
- Update default endpoint to use correct kagenti-system namespace and port 4318
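
A sketch of that protocol switch, using the standard OTEL_EXPORTER_OTLP_PROTOCOL variable; the function name and defaults are assumptions:

```python
import os

from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import (
    OTLPSpanExporter as GrpcSpanExporter,
)
from opentelemetry.exporter.otlp.proto.http.trace_exporter import (
    OTLPSpanExporter as HttpSpanExporter,
)


def build_span_exporter(endpoint: str):
    """Pick the gRPC or HTTP/protobuf OTLP exporter, defaulting to HTTP."""
    protocol = os.getenv("OTEL_EXPORTER_OTLP_PROTOCOL", "http/protobuf")
    if protocol == "grpc":
        return GrpcSpanExporter(endpoint=endpoint, insecure=True)
    return HttpSpanExporter(endpoint=f"{endpoint}/v1/traces")
```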

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
The OTEL Collector in kagenti-system is configured to listen on port
8335 (via --set override), not the standard 4318. Updated the default
endpoint in observability.py to match.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
- Add create_agent_span() for root AGENT span with OI attributes
- Wrap graph execution with using_attributes for session/user tracking
- Add a2a.task_id, a2a.context_id, user.id to spans for filtering
- Set input.value and output.value on agent spans

This ensures all LangChain auto-instrumented spans are properly
nested under an AGENT span and have session/user context.
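
A sketch of how these pieces could fit together; the OpenInference attribute constants exist in openinference-semantic-conventions, while create_agent_span()'s signature and the a2a.* keys are assumptions:

```python
from contextlib import contextmanager

from openinference.instrumentation import using_attributes
from openinference.semconv.trace import OpenInferenceSpanKindValues, SpanAttributes


@contextmanager
def create_agent_span(tracer, name, *, user_id, task_id, context_id, input_value):
    with tracer.start_as_current_span(name) as span:
        span.set_attribute(SpanAttributes.OPENINFERENCE_SPAN_KIND,
                           OpenInferenceSpanKindValues.AGENT.value)
        span.set_attribute(SpanAttributes.INPUT_VALUE, input_value)
        span.set_attribute("user.id", user_id)
        span.set_attribute("a2a.task_id", task_id)
        span.set_attribute("a2a.context_id", context_id)
        yield span


def run_graph(tracer, graph, query, *, user_id, task_id, context_id):
    # Session/user context flows down into the auto-instrumented LangChain spans.
    with using_attributes(session_id=context_id, user_id=user_id):
        with create_agent_span(tracer, "weather_agent_task",
                               user_id=user_id, task_id=task_id,
                               context_id=context_id, input_value=query) as span:
            result = graph.invoke({"messages": [("user", query)]})
            span.set_attribute(SpanAttributes.OUTPUT_VALUE, str(result))
            return result
```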

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
The previous commit removed these imports, but they are still used by the
set_baggage_context function.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Use the OpenInference-compatible tracer name "openinference.instrumentation.agent"
so that root AGENT spans pass through the OTEL Collector's filter/phoenix
processor, which only allows spans from "openinference.instrumentation.*".

Previously used "weather_service.observability" which was filtered out.
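
In other words, the instrumentation scope name is what the collector filter matches on:

```python
from opentelemetry import trace

# Matches the collector's "openinference.instrumentation.*" allow-list.
tracer = trace.get_tracer("openinference.instrumentation.agent")
```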

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Add as_root=True parameter to create_agent_span()
- Use empty Context() to break parent inheritance from A2A SDK telemetry
- Ensures weather_agent_task appears as root span in Phoenix UI
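
A sketch of that mechanism (signature is an assumption); note the next commit reverts this in favour of a collector-side filter change:

```python
from opentelemetry import trace
from opentelemetry.context import Context


def start_root_span(tracer: trace.Tracer, name: str, as_root: bool = True):
    # An empty Context() carries no active span, so no parent is inherited.
    ctx = Context() if as_root else None
    return tracer.start_as_current_span(name, context=ctx)
```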

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Instead of breaking parent context in create_agent_span(), we'll allow
a2a.utils.telemetry spans through the OTEL Collector filter. This
preserves the complete trace hierarchy.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Phase 1 of trace propagation implementation:

1. observability.py:
   - Configure W3C Trace Context and Baggage propagators
   - Add extract_trace_context() for incoming HTTP headers
   - Add inject_trace_context() for outgoing HTTP calls
   - Add trace_context_from_headers() context manager (sketched below)

2. agent.py:
   - Wrap entire execute method with trace_context_from_headers()
   - All spans now become children of incoming traceparent
   - Enables proper parent-child relationships across A2A calls

This enables:
- Single connected trace from HTTP request through LLM calls
- Multi-agent call flows with proper trace hierarchy
- Phoenix visibility into complete request flow
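
A minimal sketch of these helpers, using the standard OTEL propagation APIs; the function names mirror the commit message, the bodies are assumptions:

```python
from contextlib import contextmanager

from opentelemetry import context as otel_context
from opentelemetry.baggage.propagation import W3CBaggagePropagator
from opentelemetry.propagate import extract, inject, set_global_textmap
from opentelemetry.propagators.composite import CompositePropagator
from opentelemetry.trace.propagation.tracecontext import TraceContextTextMapPropagator

# W3C Trace Context + Baggage as the global propagators.
set_global_textmap(CompositePropagator(
    [TraceContextTextMapPropagator(), W3CBaggagePropagator()]))


def extract_trace_context(headers: dict) -> otel_context.Context:
    """Build an OTEL context from incoming headers (traceparent, baggage)."""
    return extract(headers)


def inject_trace_context(headers: dict) -> dict:
    """Add traceparent/baggage headers to an outgoing HTTP call."""
    inject(headers)
    return headers


@contextmanager
def trace_context_from_headers(headers: dict):
    """Attach the extracted context so spans created inside become children
    of the incoming traceparent."""
    token = otel_context.attach(extract_trace_context(headers))
    try:
        yield
    finally:
        otel_context.detach(token)
```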

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@Ladas Ladas marked this pull request as draft December 3, 2025 09:23