Hi GRAM authors,
Thanks for your excellent work.
I'm trying to reproduce your results with your code.
Since I don't have the VAST27M 150k subset, I tried fine-tuning on MSRVTT starting from each of the three checkpoints (the VAST model, 4modal, and 5modal). However, regardless of which dataset or base checkpoint I use, I cannot reach your reported 64% R@1.
Could you let me know which base model you started from when fine-tuning on MSRVTT?
Here are my reproduced results:
| Base model | T2D R@1 (%) | D2T R@1 (%) |
|---|---|---|
| VAST model | 59 | 59 |
| 4modal | 60.5 | 61 |
| 5modal | 60.6 | 60.9 |
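
For reference, here is a minimal sketch of how R@1 is commonly computed in both retrieval directions from a similarity matrix. It is a generic illustration, not the repository's evaluation code: the function names `recall_at_k` and `transpose` are hypothetical, the toy numbers are made up, and it assumes a one-to-one pairing in which the correct match for query `i` is candidate `i`.

```python
def recall_at_k(sim, k=1):
    """Percentage of queries whose ground-truth candidate ranks in the top k.

    sim[i][j] is the similarity between query i and candidate j.
    Assumption: the correct candidate for query i has index i.
    """
    hits = 0
    for i, row in enumerate(sim):
        # Rank of the ground truth = number of candidates scoring strictly higher.
        rank = sum(1 for score in row if score > row[i])
        if rank < k:
            hits += 1
    return 100.0 * hits / len(sim)


def transpose(sim):
    """Swap queries and candidates to evaluate the opposite direction."""
    return [list(col) for col in zip(*sim)]


# Toy 3x3 similarity matrix (made-up values): rows are texts, columns are data items.
sim = [
    [0.90, 0.10, 0.20],
    [0.30, 0.40, 0.80],
    [0.20, 0.10, 0.85],
]

t2d_r1 = recall_at_k(sim, k=1)             # text -> data direction
d2t_r1 = recall_at_k(transpose(sim), k=1)  # data -> text direction
print(f"T2D R@1: {t2d_r1:.1f}  D2T R@1: {d2t_r1:.1f}")
```

The two directions can differ because ranking along rows and ranking along columns of the same matrix are separate comparisons, so if the evaluation protocol differs from this sketch (for example in how ties or multiple captions per item are handled), the reported numbers may not be directly comparable.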