Pretrained Prophet checkpoint fails to load due to embedding size mismatch #9

@vsarsam

Description

When initializing a Prophet model with one of the pretrained checkpoints from HuggingFace and the published iv/cell-line embedding files from the Mendeley dataset (as referenced in your README), loading fails with a RuntimeError caused by an embedding-matrix size mismatch.

I am following the official example from your tutorial:

pretrained_checkpoint_path = '../epoch=29-step=45360.ckpt'
model = Prophet(
    iv_emb_path='../prophet/embeddings/global_iv_scaledv3.csv',
    cl_emb_path='../prophet/embeddings/cell_line_embedding_full_ccle_300_scaled.csv',
    ph_emb_path=None,
    model_pth=pretrained_checkpoint_path,
)

Error message

RuntimeError: Error(s) in loading state_dict for TransformerPredictor:
    size mismatch for learnable_embedding.weight:
        copying a param with shape torch.Size([1000, 512]) from checkpoint,
        the shape in current model is torch.Size([2000, 512]).

The model construction fails due to a size mismatch in learnable_embedding.weight (see the diagnostic sketch after this list), suggesting that:

  • the checkpoint expects 1000 embeddings,
  • while the provided embedding files define 2000 embeddings.
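To narrow down which side is off, a minimal diagnostic sketch like the following can help (not from the repo; it assumes the checkpoint is a standard PyTorch Lightning .ckpt and that each embedding CSV has one row per entity):

import torch
import pandas as pd

# Inspect the embedding weight stored inside the checkpoint.
ckpt = torch.load('../epoch=29-step=45360.ckpt', map_location='cpu')
state_dict = ckpt.get('state_dict', ckpt)  # Lightning nests weights under 'state_dict'
for key, tensor in state_dict.items():
    if 'learnable_embedding' in key:  # exact key prefix may vary
        print(key, tuple(tensor.shape))  # reports (1000, 512) here

# Count how many entities the published CSVs define.
for path in ('../prophet/embeddings/global_iv_scaledv3.csv',
             '../prophet/embeddings/cell_line_embedding_full_ccle_300_scaled.csv'):
    print(path, len(pd.read_csv(path)), 'rows')

If the checkpoint stores 1000 embedding rows while the CSVs lead the model to allocate 2000, the published embedding files and this particular checkpoint were presumably produced for different vocabulary sizes.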

This makes it unclear whether:

  1. the published embedding CSVs (https://data.mendeley.com/datasets/g7z3pw3bfw/1) match the pretrained models on HuggingFace, or

  2. the pretrained checkpoints were trained with different internal embedding sizes.
