Open
Description
Hi, thank you for sharing your work and providing the implementation!
While reviewing the code, I noticed a few differences between the implementation and the description in the paper. The paper states that, during the 2D-to-3D construction process, a 3D model is used to extract features for each point. In the code, however, CLIP features appear to be used instead.
Additionally, the paper describes processing features for top views (as outlined in OpenMask3D), but in the code the CLIP features appear to be computed for the entire frame.
Could you clarify whether I am misunderstanding something here? Thank you!