switchrpc: improve SendOnion error handling#10545
switchrpc: improve SendOnion error handling#10545calvinrzachman wants to merge 4 commits intolightningnetwork:elle-base-branch-payment-servicefrom
Conversation
The new structure uses a top-level `oneof` to provide a compile-time distinction between a successful payment (preimage) and a failed one. Additional information on a failed attempt can be found in FailureDetails. We now also use a structured ForwardingFailure type for communicating the failure index and wire message from failures which occur during htlc forwarding downstream in the route.
Update both proto and handler to communicate error information via gRPC status details.
Summary of ChangesHello @calvinrzachman, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly overhauls the error handling mechanism for the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This is an excellent pull request that significantly improves the error handling of the SendOnion RPC. The refactoring to use gRPC status codes and structured error details aligns with best practices, making the API more robust, explicit, and easier for clients to consume. The changes are implemented consistently across the codebase, including the protobuf definitions, server logic, and tests. The updated ErrorCode enum and the addition of client helper functions for error unmarshalling are particularly valuable. The test suite has been thoroughly updated to reflect the new API contract, with new tests ensuring the correctness of the new error handling logic. The release notes are clear and accurately describe the breaking changes. I have no suggestions for improvement.
🟠 PR Severity: HIGH
🟠 High (3 files)
🟢 Low (6 files)
AnalysisThis PR improves error handling in the SendOnion RPC endpoint, which is part of the switchrpc package that interfaces with the critical htlcswitch component. The changes focus on better error reporting and handling for onion packet sending operations. Severity Rationale:
Key Review Points:
To override, add a |
bitromortac
left a comment
There was a problem hiding this comment.
I like the approach, it seems cleaner this way. Do you know if the details mechanism can be used via REST as well?
| // which an rpc client is safe to retry the SendOnion | ||
| // RPC until an explicit acknowledgement of HTLC | ||
| // dispatch can be received from the server. | ||
| name: "idempotency anchor fails", |
There was a problem hiding this comment.
does that test name make sense? or do we also need a test for InitAttempt general errors?
| linkErr := htlcswitch.NewLinkError(wireMsg) | ||
| fwdErr := htlcswitch.NewForwardingError(wireMsg, 1) | ||
|
|
||
| // Create a forwarding error to be returned by the mock decrypter. |
There was a problem hiding this comment.
there seem to be changes that belong to the TrackOnion PR/commits
| * The `switchrpc.SendOnion` RPC has been overhauled to provide a more robust, | ||
| client-friendly, and forward-compatible API. Failures are no longer reported | ||
| in the response body but are instead communicated exclusively via gRPC status | ||
| codes with rich, structured `SendOnionFailureDetails` attached. The | ||
| `ErrorCode` enum has been redesigned to represent actionable client states, | ||
| and a new `CLEAR_TEXT_ERROR` code provides forward-compatibility for clients | ||
| when new definitive local errors are introduced. This is a **breaking change** | ||
| for any clients of the `SendOnion` RPC. |
There was a problem hiding this comment.
when merging to master, we can just reflect the final state, maybe
Change Description
This PR refactors the
SendOnionRPC's error handling to adhere strictly to gRPC best practices, moving away from error reporting within the response message itself.Problem: Previously,
SendOnionconveyed error information within its response body, even for successful gRPC calls (status.OK). This conflated success and failure states, limited the ability to send rich, structured error details, and complicated client-side error handling and observability.Solution:
SendOnionResponseis now an empty message. A gRPCstatus.OKexplicitly indicates a successful dispatch, while all failures result in a non-OK gRPC status.SendOnionFailureDetailsmessages attached to the gRPC status Details field (as per https://grpc.io/docs/guides/error/).ErrorCodeEnum: The ErrorCode enum has been slimmed down to represent high-level, actionable client states (e.g.,HTLC_STATUS_UNKNOWN,DUPLICATE_HTLC). TheCLEAR_TEXT_ERRORhas been re-added for crucial forward-compatibility, allowing clients to definitively interpret new errors, all of which can fall under the same clear text error classification.GetSendOnionFailureDetails(extracts raw failure details without any translation) andUnmarshallSendOnionError(translates into types expected byChannelRouter, are provided to simplify client interaction with these structured errors.Benefits:
SendOnionendpoint issues, offering better insight into remote router communication health.NOTE
This is a breaking change for any existing clients of the
SendOnionRPC. Given its current usage, this is deemed acceptable. This PR is a follow-up to the error handling improvements for theTrackOnionRPC.Continues error handling updates from: #10472
Steps to Test
make itest icase=send_oniongo test -v -tags switchrpc github.com/lightningnetwork/lnd/lnrpc/switchrpc