Hello, I would like to ask why the loss fluctuates so dramatically during training. Does this affect training, and is the model actually converging?