Hi, this is nice work. I'd like to use VisualQuality-R1 as the reward model for reinforcement learning. However, I've found that VisualQuality-R1 gives a different score for the same image each time it is run. This is unacceptable for GRPO training. Do you have any solutions?