Issues to predict from a pretrained 'base' model for new dataset #10

@sreerampeela

Hi,

I am using Prophet to predict IC50 for a given set of drugs (different from the ones the model has been trained on) and cell lines (from DepMap data). I have been trying to follow the insilico_screening.ipynb notebook as a reference.

import pandas as pd

# `model` and `full_input_data` are defined earlier in the session (not shown).
iv_list = ['COCCOC1=C(C=C2C(=C1)C(=NC=N2)NC3=CC=CC(=C3)C#C)OCCOC', 'cs(=o)c']
ph_list = ["DepMap"]
cell_lines_list = full_input_data.index.unique().tolist()
# One row per (drug, cell line) pair
input_df = pd.MultiIndex.from_product([iv_list, cell_lines_list], names=["iv1", "cell_line"])
input_df = input_df.to_frame(index=False).reset_index(drop=True)
input_df["iv2"] = "cs(=o)c"  # DMSO
input_df["phenotype"] = "DepMap"
df = model.predict(input_df, save=False)
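As a sanity check on the frame built above, the expected number of rows is simply the number of drugs times the number of cell lines. The following is a minimal, dependency-free sketch of the same Cartesian product; the helper name `build_screen_rows` and the drug and cell line labels are hypothetical and used only for illustration, not part of Prophet.

```python
from itertools import product


def build_screen_rows(iv_list, cell_lines, control="cs(=o)c", phenotype="DepMap"):
    """Return one dict per (iv1, cell_line) pair, mirroring the columns used above."""
    return [
        {"iv1": iv, "cell_line": cl, "iv2": control, "phenotype": phenotype}
        for iv, cl in product(iv_list, cell_lines)
    ]


# Hypothetical labels, for illustration only.
rows = build_screen_rows(["drug_a", "drug_b"], ["line_1", "line_2", "line_3"])
print(len(rows))  # 2 drugs x 3 cell lines
```

Comparing `len(rows)` against `len(input_df)` is a quick way to confirm that no pair was dropped or duplicated before calling `model.predict`.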

Below are the errors:

  1. When trying to import set_config, I am getting:

ImportError: cannot import name 'set_config' from 'prophet' (/home/sreeramp/prophet/prophet/__init__.py)

  2. How can I use my own gene expression data (300 PCs) and drug SMILES to make predictions? The PCs were computed with sklearn's PCA using default parameters, except for the number of components. The RDKit2D embeddings were computed with rdkit v2025.09.2 and have length 217 for each drug, which differs from the 200 or 195 described in the paper. Which descriptors were used for model training?
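For the import error in the first item, one way to narrow things down is to check which file Python actually loads for a package and whether the expected name is exposed at the top level. This is a minimal sketch using only the standard library; the helper `inspect_import` is a hypothetical name introduced here, and it is demonstrated on `json` so that it runs without Prophet installed.

```python
import importlib


def inspect_import(module_name, attr):
    """Report where a module was loaded from and whether it exposes `attr`."""
    module = importlib.import_module(module_name)
    return {
        "file": getattr(module, "__file__", None),
        "has_attr": hasattr(module, attr),
        "public_names": sorted(n for n in dir(module) if not n.startswith("_")),
    }


# Demonstrated on the standard library so the snippet runs anywhere;
# for this issue the call would be inspect_import("prophet", "set_config").
info = inspect_import("json", "dumps")
print(info["file"], info["has_attr"])
```

If `has_attr` comes back `False` for `set_config`, the `public_names` list shows what the installed version does export, and `file` confirms whether the local clone or another installed package named `prophet` is being picked up.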

Env specs attached as yml file.

prophet_env.yml
