Observed in DDP mode with 2x3 GPUs (not tested with single-node training).
This is surprising, because the per-class IoUs are computed with a bespoke method from the confusion matrix of the same MultiClassJaccardIndex object that computes the mean IoU.
`self.log("val/iou", iou_epoch, on_step=False, on_epoch=True, prog_bar=True)`
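
For reference, here is a minimal pure-Python sketch of how per-class IoU and mean IoU both follow from one confusion matrix, and of one way the two could diverge under DDP. It is not the project's code and not a confirmed diagnosis. The helper names (`per_class_iou`, `mean_iou`, `sum_confmats`) and the two per-rank matrices are hypothetical, and it assumes rows are targets and columns are predictions.

```python
def per_class_iou(confmat):
    """Per-class IoU from a square confusion matrix (rows: target, cols: prediction)."""
    n = len(confmat)
    ious = []
    for c in range(n):
        tp = confmat[c][c]
        fn = sum(confmat[c]) - tp                         # rest of row c
        fp = sum(confmat[r][c] for r in range(n)) - tp    # rest of column c
        denom = tp + fp + fn
        ious.append(tp / denom if denom else 0.0)
    return ious


def mean_iou(confmat):
    """Unweighted (macro) mean of the per-class IoUs."""
    ious = per_class_iou(confmat)
    return sum(ious) / len(ious)


def sum_confmats(a, b):
    """Element-wise sum, i.e. what a sum-reduction across ranks would produce."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Hypothetical confusion matrices accumulated on two ranks (made-up numbers).
rank0 = [[8, 2], [1, 9]]
rank1 = [[1, 4], [0, 15]]

# Option A: sum the confusion matrices across ranks, then compute IoU once.
global_cm = sum_confmats(rank0, rank1)
miou_from_global = mean_iou(global_cm)

# Option B: compute mean IoU on each rank, then average the scalars.
miou_rank_averaged = (mean_iou(rank0) + mean_iou(rank1)) / 2

print("per-class IoU (global):", per_class_iou(global_cm))
print("mean IoU from summed confusion matrix:", miou_from_global)
print("mean IoU averaged over ranks:", miou_rank_averaged)
```

With a single shared confusion matrix, the mean of the per-class values equals the mean IoU by construction. The two options above give different numbers, so if the logged scalar and the per-class values are reduced differently across ranks, a mismatch like the one reported here would be expected.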
Mean IoU:

Per-class:

Might be linked to #108