🧬 STORM — Spatial Transcriptomics Optimization by Resolution via Matrix-Factorization

STORM is a tensor-factorization framework for reconstructing missing or low-resolution gene expression in spatial transcriptomics data. It combines biologically informed regularization with dynamic λ-scaling, which keeps the loss terms from the different data modalities balanced during optimization.

📂 Project Folder Structure

.
├── run.sh                              # One‑command runner (bash)
├── requirements.txt
├── src/
│   ├── main_STORM.py                   # CLI entry (used by run.sh)
│   ├── train_STORM.py                  # Training + evaluation pipeline
│   ├── compute_initial_losses.py       # Dynamic λ initialization
│   └── utils_preprocessing.py          # Gene filtering utilities
├── data/
│   ├── gene_name_interactions.npz      # STRING-like interactions (gene1, gene2, combined_score)
│   └── <SAMPLE>/
│       ├── <SAMPLE>.h5ad               # Full-resolution ground truth
│       ├── <SAMPLE>.tif                # Whole-slide image (WSI)
│       └── <DOWN>.h5ad                 # Downsampled input (e.g., MEND90_1234_0.3.h5ad)
└── output/
    └── model_results/
        └── <file_name_root>/           # Metrics + reconstructions per run

🚀 Quick Start

1) Install dependencies

pip install -r requirements.txt

2) Prepare data

Put the full-resolution ground truth (<SAMPLE>.h5ad), the WSI (<SAMPLE>.tif), and the downsampled input (<DOWN>.h5ad) together under data/<SAMPLE>/.
Keep gene_name_interactions.npz directly under data/.
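Before launching a run, it can help to confirm that the inputs sit where the folder structure above expects them. The following is a minimal sketch, not part of the repository: check_sample_layout is a hypothetical helper, and the file names follow the placeholders in the tree above.

```python
from pathlib import Path

def check_sample_layout(data_dir, sample, down_file):
    """Return the expected input files that are missing (hypothetical helper)."""
    data_dir = Path(data_dir)
    sample_dir = data_dir / sample
    expected = [
        data_dir / "gene_name_interactions.npz",  # gene-gene interactions
        sample_dir / f"{sample}.h5ad",            # full-resolution ground truth
        sample_dir / f"{sample}.tif",             # whole-slide image (WSI)
        sample_dir / down_file,                   # downsampled input
    ]
    return [str(p) for p in expected if not p.is_file()]

missing = check_sample_layout("data", "MEND90", "MEND90_1234_0.3.h5ad")
print("missing files:", missing if missing else "none")
```

An empty list means all four inputs were found.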

3) Run the full pipeline

./run.sh

Edit the top variables inside run.sh to change sample name, file name, or paths.

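Direct Python alternative (shown with Windows cmd line continuations; in bash, replace each ^ with \ and use forward slashes in the paths):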

python -m src.main_STORM ^
    --sample MEND90 ^
    --file_name MEND90_1234_0.3.h5ad ^
    --data_dir .\data ^
    --output_dir .\output ^
    --string_npz_path gene_name_interactions.npz

🧠 Model Details

Tensor factorization (CP) with rank R: factors A ∈ R^{I×R}, B ∈ R^{J×R}, C ∈ R^{K×R}.
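To make the CP form concrete, the sketch below rebuilds a full I × J × K tensor from three factor matrices with NumPy. It illustrates the standard CP reconstruction only; the function name, the toy sizes, and the random factors are assumptions, and STORM's own implementation in src/ may differ.

```python
import numpy as np

def cp_reconstruct(A, B, C):
    """X_hat[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

# Toy sizes chosen for illustration only.
rng = np.random.default_rng(0)
I, J, K, R = 4, 5, 3, 2
A = rng.normal(size=(I, R))
B = rng.normal(size=(J, R))
C = rng.normal(size=(K, R))

X_hat = cp_reconstruct(A, B, C)
print(X_hat.shape)  # (4, 5, 3)
```

Each entry of the reconstruction is a sum of R rank-one contributions, so the rank R controls how much structure the model can express.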

Loss = weighted MSE (by mean expression within in‑tissue regions) + λ₁R₁ + λ₂R₂ + λ₃R₃ + λ₄R₄

Term	Description
R₁	L2 regularization
R₂	Alignment between spatial and WSI-derived embeddings
R₃	Spatial Laplacian smoothness regularization
R₄	Gene–gene interaction regularization (STRING network)
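As an illustration of how such terms can be combined, the sketch below implements an L2 penalty (in the spirit of R₁), a graph-Laplacian penalty over gene factors (one common way to encode a gene–gene network such as R₄), and the weighted sum. All function names and the toy matrices are assumptions; the exact definitions used by STORM are in src/train_STORM.py and may differ.

```python
import numpy as np

def l2_penalty(factors):
    """Sum of squared entries over all factor matrices."""
    return sum(float(np.sum(F ** 2)) for F in factors)

def graph_laplacian_penalty(C, W):
    """tr(C^T L C) with L = D - W, for a symmetric weight matrix W.

    Equals the sum over unordered gene pairs (g, h) of
    W[g, h] * ||C[g] - C[h]||^2, so connected genes are pulled together.
    """
    L = np.diag(W.sum(axis=1)) - W
    return float(np.trace(C.T @ L @ C))

def total_loss(wmse, penalties, lambdas):
    """Data term plus the lambda-weighted sum of regularizers."""
    return wmse + sum(lam * r for lam, r in zip(lambdas, penalties))

# Toy example: 3 genes with rank-2 factors; genes 0 and 1 interact (weight 0.9).
C = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
W = np.zeros((3, 3))
W[0, 1] = W[1, 0] = 0.9

r_l2 = l2_penalty([C])
r_graph = graph_laplacian_penalty(C, W)  # approximately 0.9 * 2 = 1.8
print(total_loss(0.5, [r_l2, r_graph], [0.01, 0.1]))
```

The spatial smoothness term R₃ can be written in the same Laplacian form, with the graph built over neighbouring spots instead of interacting genes.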

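Dynamic λ‑scaling is computed at epoch 0 so that each regularizer starts on the same scale as the data term: λᵢ ∝ WMSE / Rᵢ, where WMSE is the weighted MSE above and Rᵢ is the initial value of the i-th regularizer.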
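The scaling rule can be sketched in a few lines. This example assumes a proportionality constant of 1 and a small eps guard against division by zero; both are illustrative choices, as are the function name and the toy values, and none of them are taken from compute_initial_losses.py.

```python
def init_lambdas(wmse0, penalties0, eps=1e-12):
    """Set lambda_i = WMSE / R_i from the epoch-0 values of each term."""
    return [wmse0 / max(r, eps) for r in penalties0]

# Toy epoch-0 values for the data term and the four regularizers.
wmse0 = 2.0
penalties0 = [4.0, 0.5, 100.0, 8.0]

lambdas = init_lambdas(wmse0, penalties0)
print(lambdas)  # [0.5, 4.0, 0.02, 0.25]
```

With these weights every product λᵢRᵢ equals the initial WMSE, so no single term dominates the gradient at the start of training.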

📤 Outputs

Saved under output/model_results/<file_name_root>/ (or outputs/... if set):

loss_history.csv — Total loss per epoch
metrics.csv — Pearson, MSE, MSE on non‑zero GT
<file_name_root>_predicted_expression.csv — Reconstructed expression at tissue spots
tensor_hat_full.npy — Full reconstructed tensor (I × J × K)
model_params.pt — Learned factors and projection matrix (A, B, C, U)
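The three metrics in metrics.csv can be recomputed from a ground-truth array and a prediction array along these lines. This is a sketch under assumptions: it pools all values into one global score, whereas the README does not state whether STORM aggregates per gene or globally, and the function name is hypothetical.

```python
import numpy as np

def evaluate(gt, pred):
    """Pearson, MSE, and MSE restricted to non-zero ground-truth entries."""
    gt = np.asarray(gt, dtype=float).ravel()
    pred = np.asarray(pred, dtype=float).ravel()
    nonzero = gt != 0
    return {
        "pearson": float(np.corrcoef(gt, pred)[0, 1]),
        "mse": float(np.mean((gt - pred) ** 2)),
        "mse_nonzero_gt": (
            float(np.mean((gt[nonzero] - pred[nonzero]) ** 2))
            if nonzero.any() else float("nan")
        ),
    }

# Toy values: the only error is on the zero-valued ground-truth entry.
scores = evaluate([0.0, 1.0, 2.0, 3.0], [0.5, 1.0, 2.0, 3.0])
print(scores)
```

Reporting the MSE on non-zero ground truth separately is useful because sparse expression data can make the overall MSE look good even when expressed genes are reconstructed poorly.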

📬 Contact

Questions or suggestions? Open an issue or a pull request.


denizgurarslan/STORM
