RSC-RP/parers

Pipeline Analyzing RNA Editing RNA Sequencing (PARERS)

Authors: Rodshagen, Tyler; Tracy, Maxwell; Davidge, Brittney; Carnes, Jason; Morton, Glenn
Maintainer: Rodshagen, Tyler
Stuart Lab, Center for Global Infectious Disease
Research Scientific Computing
Seattle Children's Research Institute

Environmental Dependencies

  • Python (3.12+ with pandas, python-docx, biopython, xlsxwriter)
  • R (4.4 with tidyverse)
  • BBMap for BBMerge
  • MUSCLE alignment software
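
Before running the pipeline, it can help to confirm that the Python side of the list above is in place. The following is a minimal sketch, not part of PARERS: the helper name `missing_modules` is hypothetical, and the import names assume the usual mapping (python-docx imports as `docx`, biopython as `Bio`).

```python
import importlib.util
import sys

# Import names for the packages listed above. Assumption: the standard
# mapping, where python-docx imports as "docx" and biopython as "Bio".
REQUIRED_MODULES = ["pandas", "docx", "Bio", "xlsxwriter"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The README asks for Python 3.12 or newer.
    if sys.version_info < (3, 12):
        print("Python 3.12+ is recommended; found", sys.version.split()[0])
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing Python modules:", ", ".join(missing))
    else:
        print("All required Python modules were found.")
```

This only checks that the modules can be located; it does not verify R, BBMap, or MUSCLE.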

You also need:

  • AmpliconsRaw input file (fasta)
  • R1 and R2 sequencing files
    • Example data is provided in test_data.
    • curated contains synthetic examples around the A6 gene that demonstrate each type of editing event.
    • truncated contains 200 of the original ~1,000,000 sequences from the MURF2 gene.
  • parers.cfg - a configuration file that tells the pipeline how to run the sample(s)

Configure parers.cfg

It is recommended to use either of these as a template and modify for your samples:

Note: You may add as many pairs of R1 and R2 sequences as you want, as long as they are all configured the same way, i.e. don't mix genes or primers.

Apptainer

Build

The repository provides parers.def, an Apptainer definition file.

sudo apptainer build parers.sif parers.def

Open interactive shell inside the container

Once the image is built, launch the container and make sure the directories referenced in parers.cfg are included in the container's bind paths.

apptainer shell --bind /path/to/include parers.sif

Run the pipeline on your configured sample in the container

The PARERS pipeline can be run by going to the directory with your parers.cfg and then running python3 /parers/parers.

cd /path/to/parers.cfg
python3 /parers/parers

Conda/Mamba

Setup

Also provided is parers.yml, which you can use to create the parers conda or mamba environment.

To set up the script to run in your environment, you need to tell it where the following are located:

  • bbmerge.sh script
  • muscle binary
  • Rscript binary
  • R_for_cmd directory

Activate the parers conda environment that you created from parers.yml in the previous step. You can then use whereis to find the full paths to the bbmerge.sh script and to the muscle and Rscript binaries.

whereis bbmerge.sh
whereis muscle
whereis Rscript

You also need to note the full path of the R_for_cmd directory.

You will need to edit the following variables in parers.py with the appropriate paths:

path_to_bbmerge = "/path/to/bbmerge.sh"
path_to_muscle = "/path/to/muscle"
path_to_r = "/path/to/Rscript"
path_to_r_scripts = "/path/to/R_for_cmd"
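
As an alternative to copying the whereis output by hand, the lookup can be sketched in Python with the standard library's `shutil.which`. This is an illustrative helper, not part of PARERS: the function name `assignment_lines` is hypothetical, and the executable names searched for on PATH are assumptions taken from the commands above. Run it inside the activated parers environment.

```python
import shutil

# Variable names match those in parers.py. The executable names looked up
# on PATH are assumptions based on the whereis commands above.
TOOLS = {
    "path_to_bbmerge": "bbmerge.sh",
    "path_to_muscle": "muscle",
    "path_to_r": "Rscript",
}


def assignment_lines(tools, which=shutil.which):
    """Build one assignment line per tool, or a comment if it is not found."""
    lines = []
    for variable, executable in tools.items():
        path = which(executable)
        if path is None:
            lines.append(f"# {executable} not found on PATH; set {variable} by hand")
        else:
            lines.append(f'{variable} = "{path}"')
    return lines


if __name__ == "__main__":
    for line in assignment_lines(TOOLS):
        print(line)
```

The printed lines can be pasted into parers.py. path_to_r_scripts is not covered here because R_for_cmd is a directory rather than an executable on PATH, so it still has to be set manually.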

Run the pipeline on your configured sample in the conda/mamba environment

The PARERS pipeline can be run by going to the directory with your parers.cfg and then running parers.py with the Python interpreter from your environment.

cd /path/to/parers.cfg
python /path/to/parers.py
