-
Notifications
You must be signed in to change notification settings - Fork 5
rfc: QUIC Changes #2494
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
rfc: QUIC Changes #2494
Conversation
a9dc9c6 to
5c9fead
Compare
0d2b0f5 to
f9ba0f0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This RFC proposes modifications to Solana validators' and transaction senders' use of QUIC protocol to enable FPGA-based edge filtering in the DoubleZero network. The changes aim to overcome QUIC's encryption, flow control, and packet formatting challenges while minimizing modifications to existing Solana validator and QUIC library code.
Key Changes:
- Introduction of in-band encrypted session key sharing mechanism via modified
HANDSHAKE_DONEpackets - Frame substitution approach using
RESET_STREAMto handle dropped traffic while maintaining flow control - Enforcement of specific packet formatting requirements including fixed 8-byte CIDs, 1232-byte stream frames, and standardized encryption
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| rfcs/rfcx-quic-changes-for-edge-filtering.md | New RFC document detailing QUIC protocol modifications for FPGA edge filtering support |
| CHANGELOG.md | Added changelog entry for the new RFC |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
alexpyattaev
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for putting this together. Some initial feedback below:
|
|
||
| - if Quinn receives a `RESET_STREAM` with a different `FINAL_LENGTH` then it has already determined, Quinn must not issue a `FNAL_SIZE_ERROR`. This is slightly different from the previous point, and addresses an edge case where the `RESET_STREAM` replaced a packet without the Fin but, but that packet has already been received by the server. | ||
|
|
||
| **4k Transactions & Fragmentation:** Transactions which are fragmented across multiple packets may not be dropped until the last packet depending on the criteria causing the drop. There isn’t a workaround for this other than storing and forwarding, which is not practicable for the amount of traffic a single edge filtering node might be handling. The FPGA must issue the `RESET_STREAM` as soon as it knows that a drop is desired. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not clear how fragmented would be handled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% on how we're going to handle it in the FPGA either, other than that we have to find a way to do so since fragmentation is inevitable. I have a few different ideas, but I think the handling system will partly depend on the fragmentation rules settled on. I'm coming around to a possible store-forward system being able to be performant enough if we place enough limits on fragmentation.
The point here is that the FPGA will issue the RESET_STREAM as soon as it knows the transaction is bad- which could be subbing for any frame in a multi-frame stream. So in making modifications to a receiving validator ahead of edge filtering support, software needs to account for three possibilities:
- FPGA subs reset_stream for first frame, but delivers the rest of the frames.
- FPGA subs reset_stream for a middle frame, having delivered some frames already, and delivering some additional one(s) after.
- FPGA subs reset_stream for last frame, having already delivered all preceding data in the stream.
The first and last are most likely scenarios.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
delivering anything for a stream after reset is a protocol breaking change. hacking the server to ignore it is possible but not desirable. I'd prefer if FPGA zeroed out the payload or just dropped the last fragment. Granted it would not help with bandwidth much, but the server is welcome to blacklist the peer to save on bandwidth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as the size of the data delivered after the reset doesn't violate the max_data provided in the reset, I don't believe it is protocol breaking.
RFC9000 Seciont 3.2: Similarly, it is possible for remaining stream data to arrive after receiving a RESET_STREAM frame (the "Reset Recvd" state). An implementation is free to manage this situation as it chooses.
However, on closer inspection, I think there is a risk of violating point 2 in the description of FINAL_SIZE_ERROR:
(2) an endpoint received a STREAM frame or a RESET_STREAM frame containing a final size that was lower than the size of stream data that was already received
A subsequent frame with a FIN bit would contain a smaller final size than the FINAL_SIZE included in the reset_stream.
If we don't implement as a store-forward in the FPGA, I'm concerned about having the FPGA have to track what streams its done resets for so that it can zero out subsequent frames. If we do implement as store-forward, then it's technically possible.
imo, I'd love to just require server-sides to never generate a final_size_error. Generating that error requires keeping state for closed streams, which is a waste anyway. I'm fairly certain Firedancer's QUIC implementation will throw away all state for stream as soon as it receives the reset or FIN, and so couldn't generate this error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A subsequent frame with a FIN bit would contain a smaller final size than the FINAL_SIZE included in the reset_stream.
yeah in this case its ok but then it will all get delivered to the application.... not ideal
imo, I'd love to just require server-sides to never generate a final_size_error. Generating that error requires keeping state for closed streams, which is a waste anyway.
maybe worth hacking it, yes,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think any bytes would be delivered after the reset is delivered to the app, but would be if the reset hadn't been passed up yet.
But we still don't want to generate final_size_errors, since we don't want to confuse txn senders.
|
|
||
| <br> | ||
|
|
||
| **Packet Fragmentation**: Stream frames must be 1232 bytes, except for the last frame in a stream. If a frame is shorter than 1232 bytes and does not have the FIN flag, then the FPGA must replace it with a `RESET_STREAM` to prevent abuse of the connection. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is way more than needed to parse tx header. why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imo, it makes reassembly and managing buffers simpler all around- max data until you don't have that much left, then the rest. On the Rx side, you have four windows into which new data gets slotted, and it always lands at one of those four addresses.
It's similar to the approach USB uses, but USB has to do it that way because there's no equivalent of the FIN flag. The short frame defines the end. We don't have to do it, but it does make for a clean RX side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
problem is, this requires client compliance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, but any requirement on framing will require client compliance, even if just leaving the existing "four frames maximum" requirement that already exists.
My understanding was that the 4 frame requirement may not have required transaction senders to make much or any changes so far, because the odds of fragmentation into more than four pieces was small. However with 4k transactions, I would (perhaps naively) expect senders that have not explicitly enforced framing will have higher odds of exceeding four frames per stream. If they have to start adding better framing support anyway, we could use the opportunity to make the requirement whatever makes the best engineering sense for all components involved.
This max size frames followed by one short frame would be easy to have the FPGA handle enforcement of too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah with larger txs we'd need to allow more fragmentation. still, forcing no fragmentation is probably non-viable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was the idea that with once 4k transactions exist there would be something like 16 allowable fragments (since ~1K transactions -> 4 fragments) to basically allow any normal fragmentation to happen, but prevent egregious cases?
For Quinn, in quinn-proto/src/connection/mod.rs, the populate_packet function fills in all other frames, then puts stream frame(s) in any leftover space. I'd suspect that as transactions get longer, it will lead to more fragmentation, since there will be housekeeping inserted into many packets containing bits of stream, shortening the space for stream data per packet.
Without sample wire data to look at, intuitively I see it being likely that streams will start with a small fragment in Quinn's implementation since a stream is likely to wind up tacked onto the end of a packet that contains housekeeping and the end of some other stream. Adjusting frame::Stream::SIZE_BOUND could be a quick hack for ensuring there's space for a larger fragment, but wouldn't let a whole transactions that is small fill in smaller spaces.
It seems like enforcing any kind of fragmentation rule that isn't just ruling out extreme edge cases is going to require more from txn senders that has previously been required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was the idea that with once 4k transactions exist there would be something like 16 allowable fragments (since ~1K transactions -> 4 fragments) to basically allow any normal fragmentation to happen, but prevent egregious cases?
Something like that, yes.
For Quinn, in quinn-proto/src/connection/mod.rs, the populate_packet function fills in all other frames, then puts stream frame(s) in any leftover space. I'd suspect that as transactions get longer, it will lead to more fragmentation, since there will be housekeeping inserted into many packets containing bits of stream, shortening the space for stream data per packet.
any quic implementation is likely to work similarly
It seems like enforcing any kind of fragmentation rule that isn't just ruling out extreme edge cases is going to require more from txn senders that has previously been required.
Yes, and that is a major concern when it comes to interop with other implementations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me ruminate on this one a little longer. I may need to get further into implementation to see what's necessary vs nice-to-have.
Perhaps the reasonable ask is something like a 200 byte minimum size. That's roughly the minimum transaction size, In Quinn it could be implemented with frame::Stream::SIZE_BOUND being set to 225 instead of 25.
|
|
||
| > 💡 This ensures the FPGA has the most information possible as early as possible, that transactions are broken up into a predictable pattern, and that a malicious sender does not break a transaction into tiny pieces to get around filtering. Currently Agave allows smaller frames, but only four total fragments. Since there are already rules about fragmentation, this adjustment of those rules allows the Edge Filtration to be more useful and efficient. This is already usually met by a normal sender. | ||
|
|
||
| > 👉 There is an option to instead enforce some smaller (but reasonable) minimum size for the first frame in a stream. For example, requiring all the signatures, or signatures + header of a transaction to be in the first frame. However once some size constraint must be enforced, we might as well enforce something that will make both Validator and FPGA code paths more efficient (thus optimizing for processor time, rather than network bandwidth). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
far more reasonable!
|
|
||
| **Coalescing:** Short header packets must not be coalesced, even after a long-header packet. If the UDP datagram contains a QUIC short header packet, then that must be the first and only thing in the packet. | ||
|
|
||
| **Frame Ordering:** Stream Frames must be the first thing in a QUIC packet. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this may be very hard to enforce...
|
|
||
| **Frame Ordering:** Stream Frames must be the first thing in a QUIC packet. | ||
|
|
||
| **Single Stream Frame per Packet**: There must be only one Stream frame in a QUIC packet. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
smallest TX is <200 bytes. epic waste of IOPS...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was one that had been suggested (don't recall by who) back in September and everyone seemed on board with it. In a bare-metal embedded world or an FPGA world, the added processor overhead is workable, but perhaps running on top of Linux makes that concerning?
Were transactions in separate UDP packets before QUIC?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it was 1 tx per packet. wasting iops is fine i guess, just not desirable for a niche feature. + requires hacking client side code. i'd try to avoid.
Copilot fixes typos Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Summary of Changes
RFC defines changes required to Solana validators and transaction senders' use of QUIC to support edge filtering.