Hi, thanks for the great work! I have a question regarding the geometric constraints used in your training process.
As I understand it:
- The model predicts an affine-invariant pointmap, which allows for arbitrary scale ($s$) and shift ($t$).
- During training, the model predicts a single scale factor to recover the metric geometry.
My confusion is: if the pointmap is truly affine-invariant, then to recover the original metric geometry, shouldn’t the model predict both a scale and a shift?
Alternatively, should the pointmap itself be trained to be scale-invariant only, so that predicting a single scale is sufficient?
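To make the question concrete, here is a toy sketch of what I mean. It is my own illustration, not code from this repo. I am assuming the metric points relate to the affine-invariant prediction as $P = s \cdot Q + t$, with a scalar $s$ and a 3D shift $t$. The function names and sample values are all made up. It fits the prediction back to the ground truth once with a scale only and once with scale plus shift, using closed-form least squares:

```python
# Toy example (hypothetical values): can a single scale undo an affine ambiguity?

def fit_scale_only(pred, gt):
    """Least-squares scalar s minimizing sum ||s * pred - gt||^2."""
    num = sum(p * g for P, G in zip(pred, gt) for p, g in zip(P, G))
    den = sum(p * p for P in pred for p in P)
    return num / den

def fit_scale_shift(pred, gt):
    """Least-squares scalar s and 3D shift t minimizing sum ||s * pred + t - gt||^2."""
    n = len(pred)
    pred_mean = [sum(P[k] for P in pred) / n for k in range(3)]
    gt_mean = [sum(G[k] for G in gt) / n for k in range(3)]
    num = sum((P[k] - pred_mean[k]) * (G[k] - gt_mean[k])
              for P, G in zip(pred, gt) for k in range(3))
    den = sum((P[k] - pred_mean[k]) ** 2 for P in pred for k in range(3))
    s = num / den
    t = [gt_mean[k] - s * pred_mean[k] for k in range(3)]
    return s, t

def max_error(pred, gt, s, t=(0.0, 0.0, 0.0)):
    """Largest absolute coordinate error after applying s and t to pred."""
    return max(abs(s * P[k] + t[k] - G[k])
               for P, G in zip(pred, gt) for k in range(3))

# Made-up metric ground-truth points (x, y, z).
gt = [(0.5, -0.2, 2.0), (1.0, 0.3, 3.5), (-0.7, 0.8, 5.0), (0.2, -0.9, 4.2)]

# Simulate an affine-invariant prediction: the true scale and shift are unknown to the fit.
true_s, true_t = 2.5, (0.1, -0.3, 1.2)
pred = [tuple((g[k] - true_t[k]) / true_s for k in range(3)) for g in gt]

s_only = fit_scale_only(pred, gt)
s_aff, t_aff = fit_scale_shift(pred, gt)

err_scale_only = max_error(pred, gt, s_only)
err_scale_shift = max_error(pred, gt, s_aff, t_aff)

print("scale only    :", err_scale_only)
print("scale + shift :", err_scale_shift)
```

In this toy setup the scale-plus-shift fit recovers the metric points up to floating-point error, while the scale-only fit leaves a clearly nonzero residual whenever $t \neq 0$. That is why I would expect either a predicted shift or a pointmap that is already shift-free.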
Thanks in advance for your time!