The OnlineAverager model for streaming server-side online averaging has been accelerated by implementing it entirely with tensor operations, but this comes at the expense of the first few predictions being incorrect. This should be fixed by either:
- Implementing an `update_idx` scalar state that keeps track of the number of updates that have been performed, and scaling the normalization during averaging accordingly.
- Documenting this drawback in all relevant places (including in `EnsembleModel.add_streaming_output`) so that users are aware and can rescale the outputs themselves.
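The first option could look something like the following. This is a minimal NumPy sketch, not the actual ml4gw/Triton implementation; the class name, `update_idx` state, and the snapshot/roll mechanics are assumptions made for illustration. The key point is that the emitted segment is divided by the number of windows that have actually contributed to it, `min(update_idx, num_updates)`, rather than always by `num_updates`:

```python
import numpy as np


class OnlineAverager:
    """Hypothetical sketch of streaming online averaging with corrected
    normalization for the first few outputs (not ml4gw's actual code)."""

    def __init__(self, update_size: int, num_updates: int):
        self.update_size = update_size  # samples emitted per inference step
        self.num_updates = num_updates  # windows overlapping each output sample
        # running sum of overlapping predictions for the next num_updates segments
        self.snapshot = np.zeros(update_size * num_updates)
        # scalar state counting how many updates have been applied so far
        self.update_idx = 0

    def __call__(self, prediction: np.ndarray) -> np.ndarray:
        # accumulate the newest prediction into the running sum
        self.snapshot += prediction
        self.update_idx += 1
        # the oldest segment has only received min(update_idx, num_updates)
        # contributions, so normalize by that instead of num_updates
        weight = min(self.update_idx, self.num_updates)
        output = self.snapshot[: self.update_size] / weight
        # shift the snapshot forward and zero the newly exposed tail
        self.snapshot = np.roll(self.snapshot, -self.update_size)
        self.snapshot[-self.update_size:] = 0
        return output
```

With a constant input, every output (including the very first) now comes out at the correct steady-state value, instead of the first `num_updates - 1` outputs being scaled down.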
In general, these predictions will be thrown away anyway, since in the context of streaming input they represent predictions on the past, but it's still worth making users aware of it.
The OnlineAverager model itself also suffers from a lack of generality that might be worth addressing. In particular, `num_channels` and `batch_size` could both be inferred at inference time with minor adjustments to the code, at the expense of some performance. Since this implementation is meant only for Triton streaming output, performance matters more than generality here, but it could be worth implementing a more general version in the ml4gw library that fixes these problems.
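One way the generality gap could be closed is to allocate the snapshot state lazily, inferring `batch_size` and `num_channels` from the first input rather than requiring them at construction time. The sketch below is a hypothetical illustration of that pattern (names and mechanics are assumptions, not ml4gw's API); the per-call shape handling is the performance cost mentioned above:

```python
import numpy as np


class GeneralOnlineAverager:
    """Hypothetical sketch: infer batch_size and num_channels from the
    input at inference time instead of fixing them at construction."""

    def __init__(self, update_size: int, num_updates: int):
        self.update_size = update_size
        self.num_updates = num_updates
        self.snapshot = None  # allocated lazily once input shapes are known

    def __call__(self, prediction: np.ndarray) -> np.ndarray:
        if self.snapshot is None:
            # infer batch and channel dimensions from the input itself;
            # this branch and the per-call shape inspection are the
            # performance cost traded for generality
            batch_size, num_channels, _ = prediction.shape
            self.snapshot = np.zeros(
                (batch_size, num_channels, self.update_size * self.num_updates)
            )
        self.snapshot += prediction
        output = self.snapshot[..., : self.update_size] / self.num_updates
        self.snapshot = np.roll(self.snapshot, -self.update_size, axis=-1)
        self.snapshot[..., -self.update_size :] = 0
        return output
```

Note this sketch keeps the fixed `1 / num_updates` normalization, so it still exhibits the incorrect-early-predictions behavior described above; the two fixes are orthogonal and could be combined.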