Hi! I modified the SOLD2 training code to run on multiple GPUs. The model I trained on the synthetic dataset reaches a repeatability similar to, or even better than, your publicly released sold2_synthetic.tar. However, the model I trained on the Wireframe dataset scores much lower than your publicly released sold2_wireframe.tar. I'm not sure where the issue is. Could you please help me check whether the problem lies in my training approach?
First, I trained the model with fixed loss weights set to 1. On the synthetic dataset, I trained for 30 epochs on two RTX 2080 Ti GPUs with a batch size of 16 and all other settings unchanged. The resulting model achieved a Rep-5 (structural distance) of approximately 0.351 on the Wireframe dataset. In comparison, the publicly released sold2_synthetic.tar achieved 0.300 on our machine.
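For readers unfamiliar with the metric, here is a minimal sketch of how a Rep-5 score under a structural distance can be computed. It is not the SOLD2 evaluation code: the helper names, the averaging over the two endpoints, and the symmetric counting are assumptions made for illustration, and the segments of the second image are assumed to be already warped into the first image's frame.

```python
import math

def structural_distance(seg_a, seg_b):
    """Endpoint-based distance between two segments ((x1, y1), (x2, y2)).

    Assumption: the distance is the mean endpoint distance under the better
    of the two possible endpoint pairings, so endpoint order does not matter.
    """
    (a1, a2), (b1, b2) = seg_a, seg_b
    direct = math.dist(a1, b1) + math.dist(a2, b2)
    swapped = math.dist(a1, b2) + math.dist(a2, b1)
    return min(direct, swapped) / 2.0

def repeatability(segs_ref, segs_warped, threshold=5.0):
    """Fraction of segments that have a counterpart within `threshold` pixels.

    Assumption: matches are counted in both directions and divided by the
    total number of segments in the two sets.
    """
    if not segs_ref or not segs_warped:
        return 0.0

    def matched(src, dst):
        # A segment counts as repeated if its nearest neighbour is close enough.
        return sum(
            1 for s in src
            if min(structural_distance(s, d) for d in dst) <= threshold
        )

    total = len(segs_ref) + len(segs_warped)
    return (matched(segs_ref, segs_warped) + matched(segs_warped, segs_ref)) / total

# Hypothetical toy data: one segment pair agrees, the other two do not.
ref = [((0, 0), (10, 0)), ((0, 20), (10, 20))]
warped = [((10, 1), (0, 1)), ((50, 50), (60, 60))]
rep5 = repeatability(ref, warped, threshold=5.0)
```

When comparing numbers across machines, the threshold, the distance definition, and the set of image pairs all have to be identical for the scores to be comparable.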
To train the detector on the real Wireframe dataset, I used a batch size of 10 with all other settings unchanged. I fine-tuned the model for 200 epochs on two GPUs; an intermediate checkpoint around epoch 90 achieved the highest metric, approximately 0.508. However, your publicly released sold2_wireframe.tar achieved 0.587 on our machine.
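One thing worth pinning down in a multi-GPU port is whether the configured batch size is per GPU or in total, since that changes the effective batch size seen by the optimizer. The sketch below only illustrates the arithmetic; `batch_layout` and `scaled_lr` are hypothetical helpers, `base_lr` is a placeholder rather than the value from the SOLD2 config, and the linear learning-rate scaling rule is a common heuristic, not something the SOLD2 authors prescribe.

```python
def batch_layout(batch_size, num_gpus, is_per_gpu):
    """Return (per_gpu_batch, effective_batch) for a data-parallel run."""
    if is_per_gpu:
        return batch_size, batch_size * num_gpus
    if batch_size % num_gpus != 0:
        raise ValueError("total batch size must be divisible by num_gpus")
    return batch_size // num_gpus, batch_size

def scaled_lr(base_lr, base_batch, effective_batch):
    """Linear scaling heuristic: grow the learning rate with the batch size."""
    return base_lr * effective_batch / base_batch

# The two readings of "batch size 10 on two GPUs":
per_gpu_reading = batch_layout(10, 2, is_per_gpu=True)    # 10 per GPU, 20 in total
total_reading = batch_layout(10, 2, is_per_gpu=False)     # 5 per GPU, 10 in total

# Placeholder learning rate, for illustration only.
base_lr = 1e-3
lr_if_per_gpu = scaled_lr(base_lr, base_batch=10, effective_batch=per_gpu_reading[1])
```

If the effective batch size differs from the single-GPU setup, the learning rate and any per-batch statistics (for example batch normalization) may behave differently from the original training run.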
Next, I trained with learnable loss weights. On the synthetic dataset, with the same settings as before, the metric was approximately 0.315. On the real Wireframe dataset, the metric was 0.505.
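To make the comparison between fixed and learnable weights concrete, here is a small sketch of one common formulation of learnable loss weighting (uncertainty-style weighting, where each loss is scaled by `exp(-w)` and `w` itself is added as a regularizer). That this matches the exact form in the SOLD2 code is an assumption, and the function names are hypothetical. It is written in plain Python so the arithmetic can be checked by hand.

```python
import math

def weighted_total(losses, log_weights):
    """Total loss: sum_i exp(-w_i) * L_i + w_i (assumed formulation)."""
    if len(losses) != len(log_weights):
        raise ValueError("need one weight per loss term")
    return sum(math.exp(-w) * loss + w for loss, w in zip(losses, log_weights))

def grad_wrt_log_weights(losses, log_weights):
    """Analytic gradient of the total with respect to each w_i."""
    return [1.0 - math.exp(-w) * loss for loss, w in zip(losses, log_weights)]

# Hypothetical loss values for two loss terms.
losses = [2.0, 0.5]

# With w_i = 0 every term has weight exp(0) = 1, i.e. the fixed-weight setup.
fixed_equivalent = weighted_total(losses, [0.0, 0.0])

# The gradient vanishes at w_i = ln(L_i), so each weight drifts toward 1 / L_i.
stationary = [math.log(loss) for loss in losses]
grads_at_stationary = grad_wrt_log_weights(losses, stationary)
```

Under this formulation the weights must be registered as trainable parameters and passed to the optimizer; in a multi-GPU port it is worth checking that they are actually updated and synchronized across replicas.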
Could you please help me identify where the issue lies in the training approach and suggest ways to improve the repeatability metric of the model?