fix(gateway): return 404 instead of a fake 200 when count_tokens is unsupported #657

Merged
Wei-Shaw merged 2 commits into Wei-Shaw:main from
alfadb:fix/count-tokens-404-passthrough
Feb 27, 2026

@alfadb alfadb commented Feb 26, 2026

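Problem

When the upstream does not support the count_tokens endpoint (returns 404), PR #635 forged the response as HTTP 200 + {"input_tokens": 0}. As a result, Claude Code CLI trusts the zero value and assumes the current conversation has consumed 0 tokens, so auto-compression never triggers and the context window eventually overflows silently.

Antigravity accounts have the same problem and also return the fabricated {"input_tokens": 0}.

Root cause

Claude Code CLI falls back to its local tokenizer estimate only when count_tokens returns a non-2xx response. A 200 + {"input_tokens": 0} is treated as a legitimate, trustworthy result.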

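Fix

  • Return 404 + a standard Anthropic error body instead of a forged 200
  • Return a nil error (rather than fmt.Errorf), so the handler layer neither logs it as an error nor sets the ops upstream error context; this avoids inflating the platform error rate
  • The existing IgnoreCountTokensErrors setting in the ops error logger provides an additional safety net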

Changes

  • gateway_service.go: Passthrough path, where an upstream 404 is now passed through as 404
  • gateway_service.go: Antigravity path, same fix
  • Updated tests to verify the 404 passthrough behavior

…kens endpoint

PR Wei-Shaw#635 returned HTTP 200 with {"input_tokens": 0} when upstream doesn't
support count_tokens (404). This caused Claude Code CLI to trust the zero
value, believing context uses 0 tokens, so auto-compression never triggers.

Fix: return 404 with proper error body so CLI falls back to its local
tokenizer for accurate estimation. Return nil (not error) to avoid
polluting ops error metrics with expected 404s.

Affected paths:
- Passthrough APIKey accounts: upstream 404 now passed through as 404
- Antigravity accounts: same fix (was also returning fake 200)
Copilot AI review requested due to automatic review settings February 26, 2026 15:37
@alfadb changed the title from "fix(gateway): return 404 instead of fake 200 for unsupported count_tokens endpoint" to "fix(gateway): count_tokens 不支持时返回 404 而非伪造的 200" on Feb 26, 2026

Copilot AI left a comment

Pull request overview

Fixes count_tokens handling so clients (e.g., Claude Code CLI) correctly fall back to local tokenization when the upstream/platform doesn’t support the endpoint, by returning a non-2xx response instead of a fake 200 {"input_tokens":0}.

Changes:

  • Return 404 + Anthropic-style error body for Antigravity accounts (and return nil error to avoid handler logging/ops context).
  • For Anthropic API key passthrough, convert upstream 404 into a 404 response (also returning nil error).
  • Update passthrough tests to validate the new 404 behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Files:

  • backend/internal/service/gateway_service.go: Returns 404 for unsupported count_tokens cases (Antigravity + upstream passthrough 404) and avoids propagating an error to the handler.
  • backend/internal/service/gateway_anthropic_apikey_passthrough_test.go: Renames/adjusts the 404 test case to expect 404 instead of 200 fallback behavior.

Comment on lines 6226 to 6233
  // When the relay does not support the count_tokens endpoint (404), pass the 404 through so the client falls back to local estimation.
  // Return nil so the handler layer does not log this as an error and does not set the ops upstream error context.
  if resp.StatusCode == http.StatusNotFound {
      logger.LegacyPrintf("service.gateway",
-         "[count_tokens] Upstream does not support count_tokens (404), returning fallback: account=%d name=%s msg=%s",
+         "[count_tokens] Upstream does not support count_tokens (404), passing through: account=%d name=%s msg=%s",
          account.ID, account.Name, truncateString(upstreamMsg, 512))
-     c.JSON(http.StatusOK, gin.H{"input_tokens": 0})
+     s.countTokensError(c, http.StatusNotFound, "not_found_error", "count_tokens endpoint is not supported by upstream")
      return nil

Copilot AI Feb 26, 2026

The log/comment wording says the upstream 404 is being “passed through”, but the implementation returns a synthesized Anthropic-style error body via countTokensError (it does not forward the upstream response body/headers). Consider adjusting the wording to “return 404” (or actually pass through the upstream 404 body if that’s the intent) to avoid confusion when debugging.

Comment on lines +348 to 352
  if tt.wantPassthrough {
      // 404 passthrough: returns nil (not recorded as an error), but the HTTP status code is 404
      require.NoError(t, err)
-     require.Equal(t, http.StatusOK, rec.Code)
-     require.JSONEq(t, `{"input_tokens":0}`, rec.Body.String())
+     require.Equal(t, http.StatusNotFound, rec.Code)
  } else {

Copilot AI Feb 26, 2026

The 404 passthrough test only asserts err == nil and the HTTP status code, but it no longer verifies the response body. Since the core behavior change is “return 404 with a proper Anthropic error body”, please assert the JSON envelope (e.g., top-level type: error and error.type: not_found_error) so regressions (empty body / wrong schema) are caught.


alfadb commented Feb 26, 2026

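Thanks for the review. Both suggestions have been adopted (e6969acb):

  1. Log wording: changed from "passing through" to "returning 404", which describes the actual behavior more accurately (a synthesized Anthropic error body is returned rather than the upstream's original response).

  2. Response body assertion in the test: added JSON structure validation to confirm the response has the form {"type":"error","error":{"type":"not_found_error",...}}.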
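
The kind of body check described in point 2 might look like the sketch below. It uses only the standard library instead of the project's testify assertions, and `checkNotFoundEnvelope` is a hypothetical helper name, not a function from the repository.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// checkNotFoundEnvelope (hypothetical helper) verifies that body is an
// Anthropic-style error envelope whose error.type is "not_found_error".
func checkNotFoundEnvelope(body []byte) error {
	var env struct {
		Type  string `json:"type"`
		Error struct {
			Type    string `json:"type"`
			Message string `json:"message"`
		} `json:"error"`
	}
	if err := json.Unmarshal(body, &env); err != nil {
		return fmt.Errorf("body is not valid JSON: %w", err)
	}
	if env.Type != "error" {
		return fmt.Errorf("top-level type = %q, want %q", env.Type, "error")
	}
	if env.Error.Type != "not_found_error" {
		return fmt.Errorf("error.type = %q, want %q", env.Error.Type, "not_found_error")
	}
	return nil
}

func main() {
	good := []byte(`{"type":"error","error":{"type":"not_found_error","message":"unsupported"}}`)
	fmt.Println(checkNotFoundEnvelope(good))                         // valid envelope
	fmt.Println(checkNotFoundEnvelope([]byte(`{"input_tokens":0}`))) // old fake body
	fmt.Println(checkNotFoundEnvelope(nil))                          // empty body
}
```

A check like this catches both regressions the reviewer mentioned: an empty body and a body with the wrong schema.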

@Wei-Shaw Wei-Shaw merged commit 1941b20 into Wei-Shaw:main Feb 27, 2026
4 checks passed
@alfadb alfadb deleted the fix/count-tokens-404-passthrough branch February 27, 2026 00:56
PMExtra commented Mar 2, 2026

@alfadb It still seems to record errors in ops monitoring, on version v0.1.87.

image

alfadb commented Mar 2, 2026

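> @alfadb It still seems to record errors in ops monitoring, on version v0.1.87.
>
> image

It is currently downgraded to a P2 exception, a non-blocking error. Should it be changed so that it is ignored outright?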

PMExtra commented Mar 2, 2026

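> > @alfadb It still seems to record errors in ops monitoring, on version v0.1.87.
>
> It is currently downgraded to a P2 exception, a non-blocking error. Should it be changed so that it is ignored outright?

This 404 is expected and fires at high frequency, so I think recording a large volume of 404 entries creates unnecessary noise for operations.

Personally, I would prefer to ignore this error outright, or at least to have a switch somewhere to ignore it.

@Wei-Shaw Do you have any thoughts?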

xuebkgithub pushed a commit to xuebkgithub/sub2api that referenced this pull request Mar 3, 2026
…through

fix(gateway): return 404 instead of a fake 200 when count_tokens is unsupported
alfadb commented Mar 3, 2026

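Regarding the 404 errors showing up in ops monitoring, I have submitted PR #735, which changes the default value to true.

With this change, these expected 404 errors are no longer recorded in ops monitoring by default, which avoids the noise. Users who need to debug this behavior can still explicitly enable error recording through configuration.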
