Skip to content

Reduce resample cropping for online deployment#429

Merged
wbenoit26 merged 6 commits intoML4GW:mainfrom
wbenoit26:reduce-resample-buffer
Mar 3, 2026
Merged

Reduce resample cropping for online deployment#429
wbenoit26 merged 6 commits intoML4GW:mainfrom
wbenoit26:reduce-resample-buffer

Conversation

@wbenoit26
Copy link
Contributor

Crop only half the filter size from the buffer after resampling to remove latency. With this, we crop only 1/512th of a second, rather than a full second. Below is a comparison of the whitened data at different cropping sizes. The bottom two are 1/512 and 1/1024, respectively, with the latter there to demonstrate that we can't crop less.

image image image

@wbenoit26 wbenoit26 requested a review from EthanMarx February 24, 2026 15:26
@EthanMarx
Copy link
Contributor

So currently I think the main latency is from the fact that we have to wait for the next frame to perform the resampling. Is that avoided here?

@wbenoit26
Copy link
Contributor Author

Yeah, what's different here is that we don't need to wait for the next frame for most of the data in the current frame. Only the final few samples require waiting the additional second.

@EthanMarx
Copy link
Contributor

EthanMarx commented Feb 24, 2026

Okay cool I think I see it in the code now.

So for the latest frame at time T (which ends at time T + 1), we will get uncorrupted whitened data from T - crop_size to T + 1 - crop_size.
And then the data from T + 1 - crop_size to T + 1 needs to wait for the next frame to be whitened?

This is nice. So if the event maximum occurs between T and T + 1 - crop_size, we'll save a second.

@wbenoit26
Copy link
Contributor Author

Yeah, that's exactly it. For most signals, we won't need to wait for extra data. I've tested this briefly on the O3 MDC to make sure everything runs, and it seems okay.

@wbenoit26
Copy link
Contributor Author

@EthanMarx Any other thoughts on this?

dur = block_buffer.shape[-1] / STRAIN_SAMPLE_RATE
# Need at least 3 blocks to be able to crop out edge effects
# from resampling and just yield the middle second
if dur >= 3 * BLOCK_DURATION:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was using BLOCK_DURATION here a bug before?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this was correct. The comment was wrong, it wasn't yielding the middle second, but rather the middle block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see why you asked this question now. Using the duration was correct, and using the size like I've done in this PR is incorrect. Fixing that now.

# removing `crop_size` samples on the right and, because
# frames come in 1-second chunks, `sample_rate - crop_size`
# samples on the left.
buffer_slc = slice(-int(crop_size + sample_rate), -int(crop_size))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we un-hard code the assumption that frames come in at 1 second frames here? I know this was in the code before but might be a good time to either add a comment or do what we do in the ngdd dataloader

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, let me copy the setup from the ngdd dataloader.

@EthanMarx
Copy link
Contributor

@EthanMarx Any other thoughts on this?

Left a few comments yesterday that I accidentally didnt "submit" until now

@wbenoit26
Copy link
Contributor Author

Here's the first event with this PR: https://gracedb-test.ligo.org/events/G1585761/view/

The latency is definitely lower, and I'll let this run for a few days to make sure nothing else goes wrong.

@wbenoit26
Copy link
Contributor Author

Here's the summary page for the run: https://ldas-jobs.ligo.caltech.edu/~aframe.online/o3_mdc/summary.html

Latency looks lower, and the event rate looks as expected for the MDC.

@wbenoit26
Copy link
Contributor Author

@EthanMarx I think this is good to go.

@wbenoit26 wbenoit26 merged commit f36ce8e into ML4GW:main Mar 3, 2026
5 checks passed
@wbenoit26 wbenoit26 deleted the reduce-resample-buffer branch March 3, 2026 13:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants