rfc: QUIC Changes #2494

ben-dz · 2025-12-19T16:45:15Z

Summary of Changes

RFC defines changes required to Solana validators and transaction senders' use of QUIC to support edge filtering.

Copilot

Pull request overview

This RFC proposes modifications to Solana validators' and transaction senders' use of QUIC protocol to enable FPGA-based edge filtering in the DoubleZero network. The changes aim to overcome QUIC's encryption, flow control, and packet formatting challenges while minimizing modifications to existing Solana validator and QUIC library code.

Key Changes:

Introduction of in-band encrypted session key sharing mechanism via modified HANDSHAKE_DONE packets
Frame substitution approach using RESET_STREAM to handle dropped traffic while maintaining flow control
Enforcement of specific packet formatting requirements including fixed 8-byte CIDs, 1232-byte stream frames, and standardized encryption

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.

File	Description
rfcs/rfcx-quic-changes-for-edge-filtering.md	New RFC document detailing QUIC protocol modifications for FPGA edge filtering support
CHANGELOG.md	Added changelog entry for the new RFC

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

rfcs/rfcx-quic-changes-for-edge-filtering.md

alexpyattaev

Thanks for putting this together. Some initial feedback below:

rfcs/rfcx-quic-changes-for-edge-filtering.md

alexpyattaev · 2025-12-26T23:08:36Z

rfcs/rfcx-quic-changes-for-edge-filtering.md

+
+- if Quinn receives a `RESET_STREAM` with a different `FINAL_LENGTH` then it has already determined, Quinn must not issue a `FNAL_SIZE_ERROR`. This is slightly different from the previous point, and addresses an edge case where the `RESET_STREAM` replaced a packet without the Fin but, but that packet has already been received by the server.
+
+**4k Transactions & Fragmentation:** Transactions which are fragmented across multiple packets may not be dropped until the last packet depending on the criteria causing the drop. There isn’t a workaround for this other than storing and forwarding, which is not practicable for the amount of traffic a single edge filtering node might be handling. The FPGA must issue the `RESET_STREAM` as soon as it knows that a drop is desired.


not clear how fragmented would be handled

I'm not 100% on how we're going to handle it in the FPGA either, other than that we have to find a way to do so since fragmentation is inevitable. I have a few different ideas, but I think the handling system will partly depend on the fragmentation rules settled on. I'm coming around to a possible store-forward system being able to be performant enough if we place enough limits on fragmentation.

The point here is that the FPGA will issue the RESET_STREAM as soon as it knows the transaction is bad- which could be subbing for any frame in a multi-frame stream. So in making modifications to a receiving validator ahead of edge filtering support, software needs to account for three possibilities:

FPGA subs reset_stream for first frame, but delivers the rest of the frames.

FPGA subs reset_stream for a middle frame, having delivered some frames already, and delivering some additional one(s) after.

FPGA subs reset_stream for last frame, having already delivered all preceding data in the stream.

The first and last are most likely scenarios.

delivering anything for a stream after reset is a protocol breaking change. hacking the server to ignore it is possible but not desirable. I'd prefer if FPGA zeroed out the payload or just dropped the last fragment. Granted it would not help with bandwidth much, but the server is welcome to blacklist the peer to save on bandwidth.

As long as the size of the data delivered after the reset doesn't violate the max_data provided in the reset, I don't believe it is protocol breaking.

RFC9000 Seciont 3.2: Similarly, it is possible for remaining stream data to arrive after receiving a RESET_STREAM frame (the "Reset Recvd" state). An implementation is free to manage this situation as it chooses.

However, on closer inspection, I think there is a risk of violating point 2 in the description of FINAL_SIZE_ERROR:

(2) an endpoint received a STREAM frame or a RESET_STREAM frame containing a final size that was lower than the size of stream data that was already received

A subsequent frame with a FIN bit would contain a smaller final size than the FINAL_SIZE included in the reset_stream.

If we don't implement as a store-forward in the FPGA, I'm concerned about having the FPGA have to track what streams its done resets for so that it can zero out subsequent frames. If we do implement as store-forward, then it's technically possible.

imo, I'd love to just require server-sides to never generate a final_size_error. Generating that error requires keeping state for closed streams, which is a waste anyway. I'm fairly certain Firedancer's QUIC implementation will throw away all state for stream as soon as it receives the reset or FIN, and so couldn't generate this error.

A subsequent frame with a FIN bit would contain a smaller final size than the FINAL_SIZE included in the reset_stream.

yeah in this case its ok but then it will all get delivered to the application.... not ideal

imo, I'd love to just require server-sides to never generate a final_size_error. Generating that error requires keeping state for closed streams, which is a waste anyway.

maybe worth hacking it, yes,

I don't think any bytes would be delivered after the reset is delivered to the app, but would be if the reset hadn't been passed up yet.

But we still don't want to generate final_size_errors, since we don't want to confuse txn senders.

alexpyattaev · 2025-12-26T23:10:44Z

rfcs/rfcx-quic-changes-for-edge-filtering.md

+
+<br>
+
+**Packet Fragmentation**: Stream frames must be 1232 bytes, except for the last frame in a stream. If a frame is shorter than 1232 bytes and does not have the FIN flag, then the FPGA must replace it with a `RESET_STREAM` to prevent abuse of the connection.


this is way more than needed to parse tx header. why?

imo, it makes reassembly and managing buffers simpler all around- max data until you don't have that much left, then the rest. On the Rx side, you have four windows into which new data gets slotted, and it always lands at one of those four addresses.

It's similar to the approach USB uses, but USB has to do it that way because there's no equivalent of the FIN flag. The short frame defines the end. We don't have to do it, but it does make for a clean RX side.

problem is, this requires client compliance

Indeed, but any requirement on framing will require client compliance, even if just leaving the existing "four frames maximum" requirement that already exists.

My understanding was that the 4 frame requirement may not have required transaction senders to make much or any changes so far, because the odds of fragmentation into more than four pieces was small. However with 4k transactions, I would (perhaps naively) expect senders that have not explicitly enforced framing will have higher odds of exceeding four frames per stream. If they have to start adding better framing support anyway, we could use the opportunity to make the requirement whatever makes the best engineering sense for all components involved.

This max size frames followed by one short frame would be easy to have the FPGA handle enforcement of too.

yeah with larger txs we'd need to allow more fragmentation. still, forcing no fragmentation is probably non-viable.

Was the idea that with once 4k transactions exist there would be something like 16 allowable fragments (since ~1K transactions -> 4 fragments) to basically allow any normal fragmentation to happen, but prevent egregious cases?

For Quinn, in quinn-proto/src/connection/mod.rs, the populate_packet function fills in all other frames, then puts stream frame(s) in any leftover space. I'd suspect that as transactions get longer, it will lead to more fragmentation, since there will be housekeeping inserted into many packets containing bits of stream, shortening the space for stream data per packet.

Without sample wire data to look at, intuitively I see it being likely that streams will start with a small fragment in Quinn's implementation since a stream is likely to wind up tacked onto the end of a packet that contains housekeeping and the end of some other stream. Adjusting frame::Stream::SIZE_BOUND could be a quick hack for ensuring there's space for a larger fragment, but wouldn't let a whole transactions that is small fill in smaller spaces.

It seems like enforcing any kind of fragmentation rule that isn't just ruling out extreme edge cases is going to require more from txn senders that has previously been required.

Was the idea that with once 4k transactions exist there would be something like 16 allowable fragments (since ~1K transactions -> 4 fragments) to basically allow any normal fragmentation to happen, but prevent egregious cases?

Something like that, yes.

For Quinn, in quinn-proto/src/connection/mod.rs, the populate_packet function fills in all other frames, then puts stream frame(s) in any leftover space. I'd suspect that as transactions get longer, it will lead to more fragmentation, since there will be housekeeping inserted into many packets containing bits of stream, shortening the space for stream data per packet.

any quic implementation is likely to work similarly

It seems like enforcing any kind of fragmentation rule that isn't just ruling out extreme edge cases is going to require more from txn senders that has previously been required.

Yes, and that is a major concern when it comes to interop with other implementations.

Let me ruminate on this one a little longer. I may need to get further into implementation to see what's necessary vs nice-to-have.

Perhaps the reasonable ask is something like a 200 byte minimum size. That's roughly the minimum transaction size, In Quinn it could be implemented with frame::Stream::SIZE_BOUND being set to 225 instead of 25.

alexpyattaev · 2025-12-26T23:11:09Z

rfcs/rfcx-quic-changes-for-edge-filtering.md

+
+> 💡 This ensures the FPGA has the most information possible as early as possible, that transactions are broken up into a predictable pattern, and that a malicious sender does not break a transaction into tiny pieces to get around filtering. Currently Agave allows smaller frames, but only four total fragments. Since there are already rules about fragmentation, this adjustment of those rules allows the Edge Filtration to be more useful and efficient. This is already usually met by a normal sender.
+
+> 👉 There is an option to instead enforce some smaller (but reasonable) minimum size for the first frame in a stream. For example, requiring all the signatures, or signatures + header of a transaction to be in the first frame. However once some size constraint must be enforced, we might as well enforce something that will make both Validator and FPGA code paths more efficient (thus optimizing for processor time, rather than network bandwidth).


far more reasonable!

alexpyattaev · 2025-12-26T23:14:46Z

rfcs/rfcx-quic-changes-for-edge-filtering.md

+
+**Coalescing:** Short header packets must not be coalesced, even after a long-header packet. If the UDP datagram contains a QUIC short header packet, then that must be the first and only thing in the packet.
+
+**Frame Ordering:** Stream Frames must be the first thing in a QUIC packet.


this may be very hard to enforce...

alexpyattaev · 2025-12-26T23:15:26Z

rfcs/rfcx-quic-changes-for-edge-filtering.md

+
+**Frame Ordering:** Stream Frames must be the first thing in a QUIC packet.
+
+**Single Stream Frame per Packet**: There must be only one Stream frame in a QUIC packet.


smallest TX is <200 bytes. epic waste of IOPS...

This was one that had been suggested (don't recall by who) back in September and everyone seemed on board with it. In a bare-metal embedded world or an FPGA world, the added processor overhead is workable, but perhaps running on top of Linux makes that concerning?

Were transactions in separate UDP packets before QUIC?

yes, it was 1 tx per packet. wasting iops is fine i guess, just not desirable for a niche feature. + requires hacking client side code. i'd try to avoid.

rfcs/rfcx-quic-changes-for-edge-filtering.md

Copilot fixes typos Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

ben-dz force-pushed the bdz/quic-rfc branch 2 times, most recently from a9dc9c6 to 5c9fead Compare December 19, 2025 17:03

ben-dz marked this pull request as ready for review December 19, 2025 17:12

ben-dz force-pushed the bdz/quic-rfc branch 2 times, most recently from 0d2b0f5 to f9ba0f0 Compare December 19, 2025 22:30

snormore requested a review from Copilot December 19, 2025 22:32

Copilot AI reviewed Dec 19, 2025

View reviewed changes

ben-dz force-pushed the bdz/quic-rfc branch from 44d6988 to fbfaeed Compare December 19, 2025 22:45

alexpyattaev reviewed Dec 26, 2025

View reviewed changes

ben-dz force-pushed the bdz/quic-rfc branch from fbfaeed to b42312c Compare January 5, 2026 17:43

martinsander00 reviewed Jan 5, 2026

View reviewed changes

rfcs/rfcx-quic-changes-for-edge-filtering.md Outdated Show resolved Hide resolved

First draft of quic chagnes for edge filtering rfc.

fb707df

Copilot fixes typos Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

ben-dz force-pushed the bdz/quic-rfc branch from b42312c to fb707df Compare January 5, 2026 22:27


		- if Quinn receives a `RESET_STREAM` with a different `FINAL_LENGTH` then it has already determined, Quinn must not issue a `FNAL_SIZE_ERROR`. This is slightly different from the previous point, and addresses an edge case where the `RESET_STREAM` replaced a packet without the Fin but, but that packet has already been received by the server.

		4k Transactions & Fragmentation: Transactions which are fragmented across multiple packets may not be dropped until the last packet depending on the criteria causing the drop. There isn’t a workaround for this other than storing and forwarding, which is not practicable for the amount of traffic a single edge filtering node might be handling. The FPGA must issue the `RESET_STREAM` as soon as it knows that a drop is desired.


		<br>

		Packet Fragmentation: Stream frames must be 1232 bytes, except for the last frame in a stream. If a frame is shorter than 1232 bytes and does not have the FIN flag, then the FPGA must replace it with a `RESET_STREAM` to prevent abuse of the connection.


		> 💡 This ensures the FPGA has the most information possible as early as possible, that transactions are broken up into a predictable pattern, and that a malicious sender does not break a transaction into tiny pieces to get around filtering. Currently Agave allows smaller frames, but only four total fragments. Since there are already rules about fragmentation, this adjustment of those rules allows the Edge Filtration to be more useful and efficient. This is already usually met by a normal sender.

		> 👉 There is an option to instead enforce some smaller (but reasonable) minimum size for the first frame in a stream. For example, requiring all the signatures, or signatures + header of a transaction to be in the first frame. However once some size constraint must be enforced, we might as well enforce something that will make both Validator and FPGA code paths more efficient (thus optimizing for processor time, rather than network bandwidth).


		Coalescing: Short header packets must not be coalesced, even after a long-header packet. If the UDP datagram contains a QUIC short header packet, then that must be the first and only thing in the packet.

		Frame Ordering: Stream Frames must be the first thing in a QUIC packet.


		Frame Ordering: Stream Frames must be the first thing in a QUIC packet.

		Single Stream Frame per Packet: There must be only one Stream frame in a QUIC packet.

rfc: QUIC Changes #2494

Are you sure you want to change the base?

rfc: QUIC Changes #2494

Uh oh!

Conversation

ben-dz commented Dec 19, 2025

Summary of Changes

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

alexpyattaev left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants