We extract g for the training video clips by applying [2] to each frame, followed by a simple histogram-based clustering algorithm. Code: https://github.com/Ahmednull/L2CS-Net; paper: https://arxiv.org/abs/2203.03339.
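
The text does not spell out the clustering step, so the following is only a minimal sketch of one plausible reading: bin the per-frame estimates into a coarse 2D histogram, take the most populated bin as the cluster for the clip, and average the samples in it. The function name `dominant_mode`, the (yaw, pitch) input format in degrees, and the 10-degree bin width are all assumptions made for illustration and are not taken from the source.

```python
from collections import defaultdict
from statistics import mean


def dominant_mode(values, bin_width=10.0):
    """Return the mean of the samples in the most populated histogram bin.

    `values` is a list of per-frame (yaw, pitch) pairs in degrees
    (assumed format). `bin_width` is an assumed bin size in degrees.
    """
    if not values:
        raise ValueError("need at least one per-frame estimate")

    # Assign each frame to a 2D histogram bin by flooring its angles.
    bins = defaultdict(list)
    for yaw, pitch in values:
        key = (int(yaw // bin_width), int(pitch // bin_width))
        bins[key].append((yaw, pitch))

    # Treat the bin with the most frames as the cluster for the clip.
    largest = max(bins.values(), key=len)

    # Summarise the clip by the mean of the frames in that bin.
    return (mean(y for y, _ in largest), mean(p for _, p in largest))


# Three frames agree closely; one outlier frame falls in a different bin.
frames = [(1.0, 2.0), (3.0, 4.0), (2.0, 3.0), (45.0, -20.0)]
clip_value = dominant_mode(frames)
```

Taking the largest bin rather than the mean over all frames makes the clip-level value insensitive to a few outlier frames, which is one common reason to prefer a histogram-based step; whether this matches the original procedure is not stated.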