Description
What would you like to be added:
We’d like a supported way (ideally upstream, no downstream fork) to observe HTTP/2 connection/health-check behavior without exposing tokens, either by:
- redacting sensitive headers in the relevant debug logging path, or
- providing a more targeted/safer “connection/health-check debug” mode that does not dump full requests/responses.
If there’s an existing supported knob in 0.20.1 that we missed, we’d love to use it.
If you think a downstream patch is unavoidable, any high-level guidance on maintaining a fork of the admiral repo would also be appreciated.
Why is this needed:
We’re running into intermittent service-discovery sync delays and are trying to debug whether lighthouse-agent is holding on to stale/broken HTTP/2 connections to the Kubernetes API server.
Context / what triggered this
- We’re on AKS and observed periods where EndpointSlice updates appear delayed (EndpointSlice/watch processing lag on the control plane), which then delays Lighthouse’s EndpointSlice-driven updates.
- We opened a case with the AKS team. Their suspicion is that the control plane may have delays, but they also asked us to validate whether the client connection might have been broken while the agent continued to use a stale connection (i.e., connection health / keepalive / HTTP/2 behavior).
- AKS suggested we “enable HTTP/2 health check logging” to confirm whether the agent’s HTTP/2 connection is still healthy during the incident window.
Important detail
- As far as we can tell, HTTP/2 health checks are already enabled by default in the agent; our problem is observability: we can’t see useful health-check logs without enabling very verbose HTTP/2 debug output.
What we tried
- We set GODEBUG=http2debug=2 in the lighthouse-agent pod to get HTTP/2 client debug logs.
- This does show health-check activity, but it also logs request/response details broadly (and includes Authorization: Bearer …), which is not acceptable for us because logs are shipped to a centralized system.
Problem
We need a way to troubleshoot/confirm HTTP/2 connection health (especially around health checks and reconnects) without dumping sensitive headers/tokens.