Question regarding CLIP loss #8

@whikwon

Thank you for sharing this nice work.

In the paper, it is stated that "The CLIP loss incentivizes the video and text encoders to make the embeddings of paired videos and reports as similar as possible, while making the embeddings of unpaired videos and reports as different as possible (Fig. 1a)."

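Given the specificity of the medical imaging field, I assume that patients with the same disease may have very similar reports. A batch could then contain unpaired videos and reports that the loss treats as negatives even though the reports nearly match. How did you address this when sampling batch data or calculating the loss function?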
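
To make the concern concrete, here is a minimal sketch of a standard symmetric CLIP-style contrastive loss in plain Python. It is an assumption about the general technique, not the code used in this repository; the function name `clip_loss`, the `temperature=0.07` default, and the toy embeddings are all hypothetical.

```python
import math


def clip_loss(video_emb, text_emb, temperature=0.07):
    """Symmetric CLIP-style loss for a batch of paired (video, report) embeddings.

    Hypothetical sketch: row i of video_emb is assumed to be paired with
    row i of text_emb; every other combination is treated as a negative.
    """
    def normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec]

    v = [normalize(e) for e in video_emb]
    t = [normalize(e) for e in text_emb]
    n = len(v)

    # logits[i][j]: temperature-scaled cosine similarity of video i and report j
    logits = [
        [sum(a * b for a, b in zip(v[i], t[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    def diagonal_cross_entropy(rows):
        # Cross-entropy where the correct class for row i is column i
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    video_to_text = diagonal_cross_entropy(logits)
    text_to_video = diagonal_cross_entropy([list(col) for col in zip(*logits)])
    return 0.5 * (video_to_text + text_to_video)


# Toy batch A: two patients with clearly different videos and reports.
distinct = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])

# Toy batch B: two patients with the same disease, so videos and reports coincide.
# The off-diagonal pairs are still counted as negatives.
duplicated = clip_loss([[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]])
```

In this toy setup `distinct` is close to zero, while `duplicated` cannot go below `log(2)`, because each video is asked to prefer its own report over an identical one. That is the situation the question is about.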
