fix(gateway): return 404 instead of a fake 200 when count_tokens is unsupported #657
…kens endpoint

PR Wei-Shaw#635 returned HTTP 200 with `{"input_tokens": 0}` when upstream doesn't support count_tokens (404). This caused Claude Code CLI to trust the zero value, believing context uses 0 tokens, so auto-compression never triggers.

Fix: return 404 with proper error body so CLI falls back to its local tokenizer for accurate estimation. Return nil (not error) to avoid polluting ops error metrics with expected 404s.

Affected paths:
- Passthrough APIKey accounts: upstream 404 now passed through as 404
- Antigravity accounts: same fix (was also returning fake 200)
Pull request overview
Fixes count_tokens handling so clients (e.g., Claude Code CLI) correctly fall back to local tokenization when the upstream/platform doesn’t support the endpoint, by returning a non-2xx response instead of a fake 200 {"input_tokens":0}.
Changes:
- Return `404` + Anthropic-style error body for Antigravity accounts (and return `nil` error to avoid handler logging/ops context).
- For Anthropic API key passthrough, convert upstream `404` into a `404` response (also returning `nil` error).
- Update passthrough tests to validate the new `404` behavior.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `backend/internal/service/gateway_service.go` | Returns 404 for unsupported count_tokens cases (Antigravity + upstream passthrough 404) and avoids propagating an error to the handler. |
| `backend/internal/service/gateway_anthropic_apikey_passthrough_test.go` | Renames/adjusts the 404 test case to expect 404 instead of 200 fallback behavior. |
    + // When the relay does not support the count_tokens endpoint (404), pass the 404 through so the client falls back to local estimation.
    + // Return nil so the handler layer does not record an error, and do not set the ops upstream error context.
      if resp.StatusCode == http.StatusNotFound {
          logger.LegacyPrintf("service.gateway",
    -         "[count_tokens] Upstream does not support count_tokens (404), returning fallback: account=%d name=%s msg=%s",
    +         "[count_tokens] Upstream does not support count_tokens (404), passing through: account=%d name=%s msg=%s",
              account.ID, account.Name, truncateString(upstreamMsg, 512))
    -     c.JSON(http.StatusOK, gin.H{"input_tokens": 0})
    +     s.countTokensError(c, http.StatusNotFound, "not_found_error", "count_tokens endpoint is not supported by upstream")
          return nil
The log/comment wording says the upstream 404 is being “passed through”, but the implementation returns a synthesized Anthropic-style error body via countTokensError (it does not forward the upstream response body/headers). Consider adjusting the wording to “return 404” (or actually passthrough the upstream 404 body if that’s the intent) to avoid confusion when debugging.
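To make the distinction in this comment concrete, here is a minimal sketch of the "synthesize a 404 body and return nil" pattern using only the standard library. The body of `countTokensError` is not shown in this thread, so `writeCountTokensError`, `handleUpstreamStatus`, `simulate`, and the exact envelope fields (including `message`) are assumptions for illustration, not the project's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// errorEnvelope is an assumed Anthropic-style error shape:
// top-level type "error" plus a nested error.type and message.
type errorEnvelope struct {
	Type  string      `json:"type"`
	Error errorDetail `json:"error"`
}

type errorDetail struct {
	Type    string `json:"type"`
	Message string `json:"message"`
}

// writeCountTokensError is a hypothetical stand-in for countTokensError.
// It writes a synthesized body; nothing from the upstream response is forwarded.
func writeCountTokensError(w http.ResponseWriter, status int, errType, msg string) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	_ = json.NewEncoder(w).Encode(errorEnvelope{
		Type:  "error",
		Error: errorDetail{Type: errType, Message: msg},
	})
}

// handleUpstreamStatus answers an upstream 404 itself and returns nil, so the
// caller has no error to log. Every other status is simplified to an error here.
func handleUpstreamStatus(w http.ResponseWriter, upstreamStatus int) error {
	if upstreamStatus == http.StatusNotFound {
		writeCountTokensError(w, http.StatusNotFound, "not_found_error",
			"count_tokens endpoint is not supported by upstream")
		return nil
	}
	return fmt.Errorf("unexpected upstream status %d", upstreamStatus)
}

// simulate runs the handler against a recorder and reports what a client would see.
func simulate(upstreamStatus int) (int, string, error) {
	rec := httptest.NewRecorder()
	err := handleUpstreamStatus(rec, upstreamStatus)
	return rec.Code, rec.Body.String(), err
}

func main() {
	code, body, err := simulate(http.StatusNotFound)
	fmt.Println(code, err, body)
}
```

In this sketch the client sees a 404 with a locally built body while the caller sees a nil error, which is why "return 404" describes the behavior more accurately than "pass through".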
      if tt.wantPassthrough {
    +     // 404 passthrough: returns nil (not recorded as an error), but the HTTP status code is 404
          require.NoError(t, err)
    -     require.Equal(t, http.StatusOK, rec.Code)
    -     require.JSONEq(t, `{"input_tokens":0}`, rec.Body.String())
    +     require.Equal(t, http.StatusNotFound, rec.Code)
      } else {
The 404 passthrough test only asserts err == nil and the HTTP status code, but it no longer verifies the response body. Since the core behavior change is “return 404 with a proper Anthropic error body”, please assert the JSON envelope (e.g., top-level type: error and error.type: not_found_error) so regressions (empty body / wrong schema) are caught.
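One way to express the suggested body check is sketched below with the standard library only. `checkNotFoundEnvelope` is a hypothetical helper name, and the sample bodies in `main` are made up for illustration; the real test would more likely use the `require` assertions already present in the file.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// checkNotFoundEnvelope verifies the two fields named in the review:
// top-level type == "error" and error.type == "not_found_error".
func checkNotFoundEnvelope(body []byte) error {
	if len(body) == 0 {
		return errors.New("empty response body")
	}
	var env struct {
		Type  string `json:"type"`
		Error struct {
			Type string `json:"type"`
		} `json:"error"`
	}
	if err := json.Unmarshal(body, &env); err != nil {
		return fmt.Errorf("body is not valid JSON: %w", err)
	}
	if env.Type != "error" {
		return fmt.Errorf("top-level type = %q, want %q", env.Type, "error")
	}
	if env.Error.Type != "not_found_error" {
		return fmt.Errorf("error.type = %q, want %q", env.Error.Type, "not_found_error")
	}
	return nil
}

func main() {
	good := []byte(`{"type":"error","error":{"type":"not_found_error","message":"unsupported"}}`)
	fake := []byte(`{"input_tokens":0}`)
	fmt.Println("good body:", checkNotFoundEnvelope(good))
	fmt.Println("fake body:", checkNotFoundEnvelope(fake))
}
```

Checking the two discriminating fields rather than the whole body keeps the test stable if the human-readable message changes, while still catching an empty body or the old `{"input_tokens":0}` payload.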
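Thanks for the review. Both suggestions have been adopted (e6969acb).

@alfadb It still seems to record errors in ops monitoring, version v0.1.87.

For now it is downgraded to a P2 exception, a non-blocking error. Should it be changed to be ignored outright?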
…through fix(gateway): return 404 instead of a fake 200 when count_tokens is unsupported
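Regarding the 404 errors showing up in ops monitoring, I have submitted PR #735, which changes the default value to true. With this, these expected 404 errors are no longer recorded in ops monitoring by default, which avoids noise. Users who need to debug this behavior can still explicitly enable error recording through configuration.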
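As a rough illustration of the gating described here, the sketch below assumes the setting in question is the `IgnoreCountTokensErrors` option mentioned in the PR description and that it suppresses recording for every failed count_tokens request. Neither assumption is confirmed in this thread, and `opsConfig`, `defaultOpsConfig`, and `shouldRecordOpsError` are hypothetical names.

```go
package main

import "fmt"

// opsConfig is a hypothetical config holder. Only the option name
// IgnoreCountTokensErrors comes from the PR description.
type opsConfig struct {
	IgnoreCountTokensErrors bool
}

// defaultOpsConfig models the new default described above (assumed: true).
func defaultOpsConfig() opsConfig {
	return opsConfig{IgnoreCountTokensErrors: true}
}

// shouldRecordOpsError decides whether a finished request is recorded as an
// error in ops monitoring. Successful responses are never recorded; failed
// count_tokens requests are skipped while the ignore flag is set.
func shouldRecordOpsError(cfg opsConfig, isCountTokens bool, status int) bool {
	if status < 400 {
		return false
	}
	if isCountTokens && cfg.IgnoreCountTokensErrors {
		return false
	}
	return true
}

func main() {
	cfg := defaultOpsConfig()
	fmt.Println("count_tokens 404 recorded:", shouldRecordOpsError(cfg, true, 404))
	fmt.Println("other 404 recorded:", shouldRecordOpsError(cfg, false, 404))

	cfg.IgnoreCountTokensErrors = false // explicit opt-in for debugging
	fmt.Println("count_tokens 404 recorded after opt-in:", shouldRecordOpsError(cfg, true, 404))
}
```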


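Problem

PR #635 faked the response as `HTTP 200 + {"input_tokens": 0}` when the upstream does not support the `count_tokens` endpoint (returns 404). As a result, Claude Code CLI trusts the zero value and assumes the current conversation has consumed 0 tokens, so auto-compression never triggers and the context window eventually overflows silently.

Antigravity accounts have the same problem and likewise return a fake `{"input_tokens": 0}`.

Root cause

Claude Code CLI falls back to local tokenizer estimation only when `count_tokens` returns a non-2xx response. `200 + {"input_tokens": 0}` is treated as a valid, trusted result.

Fix

- Return `404` + a standard Anthropic error body instead of a fake `200`
- Return a `nil` error (rather than `fmt.Errorf`), so the handler layer does not log it as an error and no ops upstream error context is set, which avoids inflating the platform error rate
- The `IgnoreCountTokensErrors` config provides an additional safety net

Changes

- `gateway_service.go`: Passthrough path, where an upstream 404 is now passed through as 404
- `gateway_service.go`: Antigravity path, same fix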
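The client-side behavior described under the root cause can be sketched as follows. This is not Claude Code CLI's code: `countResult`, `effectiveTokens`, and `estimateLocally` are hypothetical names, and the four-characters-per-token heuristic is only a crude placeholder for a real local tokenizer.

```go
package main

import "fmt"

// countResult is a hypothetical summary of a count_tokens response.
type countResult struct {
	Status      int
	InputTokens int
}

// estimateLocally is a crude placeholder for a local tokenizer
// (assumption: roughly four bytes per token, rounded up).
func estimateLocally(prompt string) int {
	return (len(prompt) + 3) / 4
}

// effectiveTokens trusts any 2xx result as-is and falls back to the local
// estimate only for non-2xx responses, as described under the root cause.
func effectiveTokens(res countResult, prompt string) (tokens int, usedFallback bool) {
	if res.Status >= 200 && res.Status < 300 {
		return res.InputTokens, false
	}
	return estimateLocally(prompt), true
}

func main() {
	prompt := "hello world!"

	// Old gateway behavior: a fake 200 with zero tokens is trusted.
	tokens, fallback := effectiveTokens(countResult{Status: 200, InputTokens: 0}, prompt)
	fmt.Println("fake 200:", tokens, fallback)

	// New gateway behavior: a 404 triggers the local estimate.
	tokens, fallback = effectiveTokens(countResult{Status: 404}, prompt)
	fmt.Println("404:", tokens, fallback)
}
```

Under these assumptions the fake 200 yields a token count of zero with no fallback, while the 404 yields a non-zero local estimate, which is the difference that lets auto-compression trigger.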