LuChenLab/GWPIS

GWPIS: Geometric Weighted Pathway Interaction Score

GWPIS (Geometric Weighted Pathway Interaction Score) is a framework for exploring interactions between SARS-CoV-2 proteins and host immune pathways. It integrates protein-protein interaction predictions from the D-SCRIPT protein language model (available at https://d-script.readthedocs.io) with weighted pathway activity scores from PROGENy (available at https://saezlab.github.io/progeny/) to quantify how SARS-CoV-2 proteins interact with host immune responses at the pathway level.

📦 Required Packages and Versions

This project requires both R (4.0.2) and Python (3.7.12) environments.

R packages required:

  • dplyr (1.1.3)
  • ggplot2 (3.4.4)
  • reshape2 (1.4.4)
  • tidyr (1.3.0)
  • igraph (1.5.1)
  • Matrix (1.6-1.1)
  • DESeq2 (1.42.0)
  • Mfuzz (2.48.0)
  • PROGENy (1.10.0)
  • clusterProfiler (4.9.0.002)
  • GenomicRanges (1.52.1)
  • SummarizedExperiment (1.30.2)
  • e1071 (1.7-4)

You can install the necessary packages with the following commands in R:

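if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
install.packages(c("dplyr", "ggplot2", "reshape2", "tidyr", "igraph", "Matrix", "e1071"))
# The Bioconductor package for PROGENy is named "progeny" (lowercase)
BiocManager::install(c("DESeq2", "Mfuzz", "clusterProfiler", "progeny", "GenomicRanges", "SummarizedExperiment"))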

Python packages required:

  • D-SCRIPT (0.2.8)

You can install D-SCRIPT with the following commands in a system shell:

git clone https://github.com/samsledje/D-SCRIPT.git
cd D-SCRIPT
conda env create --file environment.yml
conda activate dscript
pip install dscript

📂 Repository Structure

This repository is organized into three main folders: script, data, and analysis. Below is a breakdown of the files contained in each folder:

1. script Folder 📁

Contains R scripts used for analysis, including data processing, clustering, and pathway analysis.

  • 01_Fig1_Bulk_RNAseq.Rmd: R Markdown script for processing Bulk RNA-seq data.
  • 02_Fig1_Mfuzz.Rmd: R Markdown script for performing gene clustering using the Mfuzz package.
  • 03_Fig1_PROGENy.Rmd: R Markdown script for conducting pathway analysis of the samples using PROGENy.
  • 04_Fig2_GWPIS.Rmd: R Markdown script for calculating the interaction scores between SARS-CoV-2 proteins and immune pathways using the GWPIS method.
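
The README does not spell out the GWPIS formula, so the following is only a rough illustration of one way the name could be read: a weighted geometric mean of D-SCRIPT interaction probabilities, using the absolute values of PROGENy gene weights as the weights. The function names, the use of absolute weights, and the toy gene values are all assumptions made for illustration; the actual calculation is the one in 04_Fig2_GWPIS.Rmd.

```python
import math


def weighted_geometric_mean(scores, weights, eps=1e-12):
    """Weighted geometric mean of scores in [0, 1] with non-negative weights."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Clamp scores at eps so that a zero probability does not give log(0).
    log_sum = sum(w * math.log(max(s, eps)) for s, w in zip(scores, weights))
    return math.exp(log_sum / total)


def pathway_score(interactions, pathway_weights):
    """Score one viral protein against one pathway (hypothetical helper).

    interactions:    {host_gene: predicted interaction probability}
    pathway_weights: {host_gene: pathway weight, possibly negative}
    Only genes present in both mappings contribute.
    """
    genes = sorted(set(interactions) & set(pathway_weights))
    if not genes:
        return float("nan")
    return weighted_geometric_mean(
        [interactions[g] for g in genes],
        [abs(pathway_weights[g]) for g in genes],
    )


# Toy values, not taken from the repository's data.
toy_interactions = {"STAT1": 0.8, "IRF9": 0.2}
toy_weights = {"STAT1": 3.0, "IRF9": -1.0, "ISG15": 2.0}
toy_score = pathway_score(toy_interactions, toy_weights)
print(round(toy_score, 4))
```

A geometric mean penalizes low interaction probabilities more strongly than an arithmetic mean would, which is one reason a score of this kind might be preferred; whether the authors made that choice for this reason is not stated in the README.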

2. data Folder 🗂️

Contains the data files used in the analysis.

  • 01_RawCount.txt: Raw RNA-seq count data used for Figure 1 (Bulk RNA-seq).
  • 04_Predict.tsv: Protein-protein interaction data generated by the D-SCRIPT method, used for Figure 2 (protein interaction analysis).
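
As a quick way to inspect the prediction file outside R, the sketch below parses a tab-separated table of protein pairs and scores. It assumes each row holds two protein identifiers followed by a numeric score, which should be checked against the actual layout of 04_Predict.tsv; the helper names, the 0.5 cutoff, and the toy rows are hypothetical.

```python
import csv
import io


def read_predictions(handle):
    """Parse (protein_a, protein_b, score) rows from a tab-separated handle."""
    rows = []
    for record in csv.reader(handle, delimiter="\t"):
        if len(record) < 3:
            continue  # skip blank or malformed lines
        try:
            score = float(record[2])
        except ValueError:
            continue  # skip a header line, if one is present
        rows.append((record[0], record[1], score))
    return rows


def top_pairs(rows, threshold=0.5):
    """Return pairs scoring at or above threshold, highest score first."""
    kept = [r for r in rows if r[2] >= threshold]
    return sorted(kept, key=lambda r: r[2], reverse=True)


# Toy stand-in for the real file; these rows are illustrative only.
example = io.StringIO("NSP1\tSTAT1\t0.91\nORF6\tIRF3\t0.12\nN\tMAVS\t0.64\n")
rows = read_predictions(example)
for protein_a, protein_b, score in top_pairs(rows):
    print(protein_a, protein_b, score)
```

To run it on the repository data, the in-memory example can be swapped for an open file handle, for instance `open("data/04_Predict.tsv", newline="")`, assuming the folder layout described above.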

3. analysis Folder 📊

Contains processed analysis results and intermediate data objects.

  • 01_ddsres.Rds: DESeq2 object created using the raw RNA-seq counts.
  • 02_cl.Rds/02_df.Rds: Results of gene clustering using Mfuzz.
  • 03_Progeny.Rds: PROGENy pathway activity scores.

📚 References

  1. Sledzieski S, Singh R, Cowen L, Berger B. D-SCRIPT translates genome to phenome with sequence-based, structure-aware, genome-scale predictions of protein-protein interactions. Cell Syst. 2021 Oct 20;12(10):969-982.e6. doi: 10.1016/j.cels.2021.08.010. Epub 2021 Oct 9. PMID: 34536380; PMCID: PMC8586911.

  2. Schubert M, Klinger B, Klünemann M, Sieber A, Uhlitz F, Sauer S, Garnett MJ, Blüthgen N, Saez-Rodriguez J. Perturbation-response genes reveal signaling footprints in cancer gene expression. Nat Commun. 2018 Jan 2;9(1):20. doi: 10.1038/s41467-017-02391-6. PMID: 29295995; PMCID: PMC5750219.

🔗 Cite This Repository

If you use the data or code from this repository in your work, please cite this repository.

It accompanies a manuscript currently under peer review. Citation details will be updated upon publication.
