Hi!
Thanks for this little piece of juicy code!
Just out of curiosity: I've noticed that your implementation uses nn.LayerNorm with the default eps=1e-5 (the constant added to the variance in the denominator), whereas other implementations (DINO [here] and the ViT in timm [here]) explicitly set this parameter to eps=1e-6.
I know it is a small detail, but small details are sometimes very important for getting better models.
Do you think the model is sensitive to this kind of parameter change? Have you ever tried/noticed it?
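To make the question concrete, here is a rough sketch of where eps enters the normalization. It is plain Python rather than the actual nn.LayerNorm code, it leaves out the learnable affine parameters, and the function names and sample activations are hypothetical values I made up for illustration:

```python
import math

def layer_norm(x, eps):
    """Normalize a list of floats as (x - mean) / sqrt(var + eps), no affine params."""
    mean = sum(x) / len(x)
    # Biased (population) variance, which is what LayerNorm uses.
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def max_abs_diff(x):
    """Largest element-wise gap between the eps=1e-5 and eps=1e-6 outputs."""
    a = layer_norm(x, eps=1e-5)
    b = layer_norm(x, eps=1e-6)
    return max(abs(p - q) for p, q in zip(a, b))

# Hypothetical activations: one vector with ordinary scale,
# and the same vector scaled down so its variance is comparable to eps.
normal_scale = [0.5, -1.2, 2.0, 0.3]
tiny_scale = [v * 1e-3 for v in normal_scale]

diff_normal = max_abs_diff(normal_scale)
diff_tiny = max_abs_diff(tiny_scale)
print(diff_normal, diff_tiny)
```

If this sketch is right, the two eps values give almost identical outputs when the variance is of order 1, and only diverge noticeably when the variance of the normalized features gets close to eps itself.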
Thanks!