-
Notifications
You must be signed in to change notification settings - Fork 5
Application Level Acknowledgements #112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
trond-snekvik
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are slightly too many corner cases in the sliding window mechanism for me to confidently say whether everything works as it should, but overall, this looks good to me.
| const uint32_t modulus = 1 << (CHAR_BIT * sizeof(uint8_t)); | ||
| return (modulus + atomic_get(&sender->last_sent) - atomic_get(&sender->last_acknowledged)) | ||
| % modulus; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The compiler is probably able to optimize this into a bitmask, but even so, this feels like it might be a little easier to understand, and still works across the wraparound:
| const uint32_t modulus = 1 << (CHAR_BIT * sizeof(uint8_t)); | |
| return (modulus + atomic_get(&sender->last_sent) - atomic_get(&sender->last_acknowledged)) | |
| % modulus; | |
| return (atomic_get(&sender->last_sent) - atomic_get(&sender->last_acknowledged)) & UINT8_MAX; |
| } | ||
|
|
||
| enum pouch_gatt_ack_code code; | ||
| if (pouch_gatt_packetizer_is_fin(data, length, &code)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this set receiver->complete = true? And potentially call receiver->push with empty data to trigger functionality like downlink_finish()? Presumably, if we get FIN, we could still be in an open context
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think that's right. The FIN handling in general is a little incomplete/inconsistent, so I'll clean that up. The reason I put this here originally was mostly for debugging, because at first I was handling the FIN message in the caller, so that the receiver could respond to a FIN received while it was idle. But I think it's better to handle FINs received while the receiver is active within the receiver itself, as you suggest. So the caller only handles received messages directly when the receiver is idle (NACKs in response to packets, nothing in response to FINs). We don't respond to FINs, because the sender will send a FIN if it receives an ACK while idle, so we need to be careful to not get into a loop of the sender and receiver acknowledging each other.
Anyway, this comment is only half responding to you and half me getting my thoughts out for myself 😅. But yes, I think when the receiver is active, we should handle the FIN here and I'll update the PR to reflect that. In thinking about this, I made a bunch of sequence diagrams to work the various scenarios, and those will make good figures to include in the design doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@trond-snekvik The handling of FINs (and NACKs) should be more robust and consistent now. I'd appreciate another look.
634bcd7 to
b207674
Compare
b207674 to
36b7b89
Compare
36b7b89 to
4d796f8
Compare
Signed-off-by: Sam Friedman <sam@golioth.io>
Signed-off-by: Sam Friedman <sam@golioth.io>
Signed-off-by: Sam Friedman <sam@golioth.io>
4d796f8 to
8655791
Compare
trond-snekvik
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sam-golioth I realize I didn't get a review request here, so I appreciate that some pieces are still moving, but since you suggested I take another look at the deadlock scenarios, I went ahead anyway.
I made a sequence diagram for myself to understand the logic here, and I have attached it with some annotations to illustrate the three scenarios that concern me:
- The sender's window size is 0 until it receives an ack, so it can't start sending
- No one initiates sending, and there's a potential deadlock in the sender's send_data function that will have to be resolved by whoever initiates (and resumes) the data sending
- No one resumes sending after an empty payload packet
It's actually not entirely clear to me what the function of the dynamic window size is for pouch. The receivers don't allocate any resources based on window size, and I don't think they'd need to anyway, as the packetizer just pushes data until it's done.
The real need for a dynamic window size is on the sender side, where the senders could/should keep buffers allocated until they've been acked or nacked, so we can retry sending them. At the moment though, the sender just bails whenever it receives a NACK, so it doesn't need it either at the moment.
From what I can deduce, the acks just serve two purposes at the moment:
- If something goes wrong in the receiver, it can tell the sender to stop (by NACK) without breaking the connection. The sender currently breaks the connection whenever this happens though, and the receiver will NACK an out-of-order packet (which would include potential retries).
- At the end of the transfer, the sender will wait for the receiver to ack the messages before it tells pouch that the transfer is done. This is good and necessary, but it doesn't require a sliding window. Could this be the only mechanism we need for now?
|
|
||
| send_ack(receiver, POUCH_GATT_ACK); | ||
|
|
||
| reset_ack_timer(receiver); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Redundant, technically.
| receiver->last_acknowledged = receiver->last_received; | ||
| } | ||
|
|
||
| reset_ack_timer(receiver); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this really be reset if the ack send fails? If that only happens on success, the reset_ack_timer call in the ack_timer_handler below could serve a purpose.
| ctx->indicate_params.len = 0; | ||
|
|
||
| ctx->state = UPLINK_INDICATE_IN_PROGRESS; | ||
| ctx->sender = pouch_gatt_sender_create(ctx->packetizer, send_uplink_data, conn, mtu); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should also have a NULL check, I think
| return err; | ||
| } | ||
|
|
||
| int pouch_gatt_sender_data_available(struct pouch_gatt_sender *sender) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is never called, but the caller will be responsible for getting us out of the deadlock discussed in send_data
|
|
||
|
|
||
| finish: | ||
| err = send_ack(receiver, err ? POUCH_GATT_NACK_UNKNOWN : POUCH_GATT_ACK); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem right to me. We'll end up here every time we receive data, but we don't actually want to send an ack on every packet -- that would defeat the whole purpose of the sliding window, right?
If the receiver push above is successful, we should just return, as the ack has been scheduled as a timer.
| goto finish; | ||
| } | ||
|
|
||
| atomic_set(&sender->last_sent, pouch_gatt_packetizer_get_sequence(sender->buffer, len)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be reset if the send fails
| packets_sent++; | ||
| } | ||
|
|
||
| finish: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for gotos here, better to consistently break from the loop
| goto finish; | ||
| } | ||
|
|
||
| if (POUCH_GATT_PACKETIZER_EMPTY_PAYLOAD == ret) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know we're handling this as a special case in the ack-triggered sending, but I wonder if we might just want to return success here? We're going to hit this every time we have less than a window's worth of data to send, so it shouldn't really be an error IMO.
There could be an argument for returning ENODATA if packets_sent is 0, but even then, that's not really an issue, I think.
| { | ||
| uint8_t ack[POUCH_GATT_ACK_SIZE]; | ||
| pouch_gatt_ack_encode(ack, sizeof(ack), code, receiver->last_received, receiver->window); | ||
| int err = receiver->send_ack(receiver->send_ack_arg, ack, sizeof(ack)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should use size returned from ack_endcode()
|
|
||
| atomic_set(&sender->last_sent, UINT8_MAX); | ||
| atomic_set(&sender->last_acknowledged, UINT8_MAX); | ||
| atomic_set(&sender->offered_window, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing gets sent until we have a non-zero window size, but the window size doesn't get set until we get an ack, which we won't get until we send something 🤔
This changes the Pouch GATT transport to use application-level sliding window acknowledgments on top of unreliable Bluetooth GATT transfers. Doing so eliminates the need to wait for round-trip acknowledgements on each packet, and results in significant speedups, as much as 10x in testing. The window size is configurable, and severely constrained devices can lower the window all the way back down to 1, in which case the transfer speed is essentially equivalent to the previous speed when using reliable GATT writes.