Nextflow RNA-seq pipeline

Introduction

This pipeline was created as a project for the course "Computational Workflows". It aims to provide minimal functionality for analyzing RNA-seq data while remaining reproducible and modular. The pipeline is inspired by, and uses code from, the nf-core/rnaseq pipeline (version 3.16.0).

Steps

The FastQC module of nf-core is used to generate quality reports of the input reads.
The nf-core module for Trim Galore is used to trim the raw reads.
HISAT2-build is used to index the reference genome (.fasta and .gtf files), and HISAT2-align is used to align the reads to that reference.
The SAMtools-sort module is used to sort the output alignment (.bam) of HISAT2. This step is required for subsequent feature counting.
The nf-core subread/featurecounts module is used to count the alignments per genomic feature.

Usage

The pipeline can be used with the following parameters:

nextflow run main.nf \
    --input <PATH_TO_SAMPLESHEET.csv> \
    --fasta <PATH_TO_REFERENCE_GENOME.fa> \
    --gtf <PATH_TO_GTF.gtf> \
    --outdir <PATH_TO_OUT_DIRECTORY> \
    --threads <#_OF_THREADS> \
    --ram <#_OF_RAM_GB> \
    -profile <conda/docker>

--input path to the samplesheet. An example samplesheet can be found in the tests folder. The samplesheet must have the following format:

sample,fastq_1,fastq_2,strandedness
SRR23195516_oxy_sni,./tests/data/reduced_SRX19144486_SRR23195516_1.fastq.gz,./tests/data/reduced_SRX19144486_SRR23195516_2.fastq.gz,auto
SRR23195511_oxy_sham,./tests/data/reduced_SRX19144488_SRR23195511_1.fastq.gz,./tests/data/reduced_SRX19144488_SRR23195511_2.fastq.gz,auto
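
A samplesheet in this format can be checked before starting a run. The following is a minimal Python sketch, not part of the pipeline: the function name `read_samplesheet` is hypothetical, and the set of accepted strandedness values is an assumption taken from nf-core/rnaseq samplesheets (only `auto` appears in the example above).

```python
import csv
import io

REQUIRED_COLUMNS = ["sample", "fastq_1", "fastq_2", "strandedness"]
# Assumption: same values as accepted by nf-core/rnaseq samplesheets.
ALLOWED_STRANDEDNESS = {"auto", "forward", "reverse", "unstranded"}


def read_samplesheet(handle):
    """Parse a samplesheet from an open text handle and return its rows as dicts.

    Raises ValueError on a wrong header, a missing value, or an unknown
    strandedness value.
    """
    reader = csv.DictReader(handle)
    if reader.fieldnames != REQUIRED_COLUMNS:
        raise ValueError(f"expected header {REQUIRED_COLUMNS}, got {reader.fieldnames}")

    rows = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        # fastq_2 is not checked: the README does not state whether
        # single-end input (empty fastq_2) is supported.
        for column in ("sample", "fastq_1", "strandedness"):
            if not (row[column] or "").strip():
                raise ValueError(f"line {line_no}: column '{column}' is empty")
        if row["strandedness"] not in ALLOWED_STRANDEDNESS:
            raise ValueError(
                f"line {line_no}: unknown strandedness {row['strandedness']!r}"
            )
        rows.append(row)

    if not rows:
        raise ValueError("samplesheet contains no samples")
    return rows


# Usage with an in-memory sheet; for a real file, pass open(path, newline="").
example = io.StringIO(
    "sample,fastq_1,fastq_2,strandedness\n"
    "sampleA,sampleA_1.fastq.gz,sampleA_2.fastq.gz,auto\n"
)
for row in read_samplesheet(example):
    print(row["sample"], row["strandedness"])
```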

--fasta path to the reference genome FASTA file.

--gtf path to the annotation (GTF) file of the reference genome.

--outdir path to a folder where all the outputs should be stored.

--threads maximum number of threads to use for the computation.

--ram maximum amount of RAM to use for the computation, for example 8GB.

-profile profile to run the pipeline with (conda or docker).

Example run

The following example command runs the pipeline with the test files included in the tests directory. The reference genome files (FASTA and GTF) had to be excluded from the GitHub repository because they are too large. They can be downloaded from NCBI and must be extracted before use.

nextflow run main.nf \
    --input ./tests/samplesheet_test.csv \
    --fasta ./tests/data/GCF_000001635.27_GRCm39_genomic.fna \
    --gtf ./tests/data/GCF_000001635.27_GRCm39_genomic.gtf \
    --outdir ./tests/out_test/ \
    --threads 12 \
    --ram 8GB \
    -profile docker
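
The same call can also be assembled from a script, for instance when launching several runs in a row. The following Python sketch only builds the argument list from the parameters described in the usage section. The function name `build_run_command` and its checks are illustrative assumptions and are not part of the pipeline.

```python
ALLOWED_PROFILES = ("conda", "docker")  # the two profiles named in the usage section


def build_run_command(samplesheet, fasta, gtf, outdir, threads, ram, profile,
                      script="main.nf"):
    """Return the `nextflow run` invocation as a list of arguments."""
    if profile not in ALLOWED_PROFILES:
        raise ValueError(f"profile must be one of {ALLOWED_PROFILES}, got {profile!r}")
    if int(threads) < 1:
        raise ValueError("threads must be a positive integer")
    return [
        "nextflow", "run", script,
        "--input", str(samplesheet),
        "--fasta", str(fasta),
        "--gtf", str(gtf),
        "--outdir", str(outdir),
        "--threads", str(threads),
        "--ram", str(ram),       # passed through unchanged, e.g. "8GB"
        "-profile", profile,
    ]


command = build_run_command(
    samplesheet="./tests/samplesheet_test.csv",
    fasta="./tests/data/GCF_000001635.27_GRCm39_genomic.fna",
    gtf="./tests/data/GCF_000001635.27_GRCm39_genomic.gtf",
    outdir="./tests/out_test/",
    threads=12,
    ram="8GB",
    profile="docker",
)
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` to start the run; passing a list instead of a single string avoids shell quoting problems with paths that contain spaces.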

Pipeline Output

The output of every pipeline step is written to the folder specified with --outdir. Example output files can be found in the tests/out_test/ folder.

FastQC generates quality reports in HTML format.
Trim Galore generates FASTQ files with the trimmed reads, as well as trimming reports.
The HISAT2 alignments and summary files are stored in the hisat2 folder.
The sorted alignment files generated by SAMtools can be found in the samtools folder.
The feature counts from Subread featureCounts can be found in the counts folder.
The versions of all tools used via the nf-core modules are recorded in the versions/versions.yml file.
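
The counts table can be read with a few lines of Python, for example as input for a plot. This sketch assumes the standard featureCounts layout: a leading comment line starting with `#`, then a tab-separated header with the columns Geneid, Chr, Start, End, Strand, and Length, followed by one count column per alignment file. The function names are hypothetical, and integer counts are assumed.

```python
import csv
import io

ANNOTATION_COLUMNS = ["Geneid", "Chr", "Start", "End", "Strand", "Length"]


def read_feature_counts(handle):
    """Return {gene_id: {count_column: count}} from a featureCounts table."""
    # Skip the comment line(s) that record the program version and command.
    lines = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(lines, delimiter="\t")
    count_columns = [c for c in (reader.fieldnames or []) if c not in ANNOTATION_COLUMNS]
    counts = {}
    for row in reader:
        counts[row["Geneid"]] = {c: int(row[c]) for c in count_columns}
    return counts


def assigned_totals(counts):
    """Sum the counts per column, i.e. the number of assigned alignments per file."""
    totals = {}
    for per_gene in counts.values():
        for column, value in per_gene.items():
            totals[column] = totals.get(column, 0) + value
    return totals


# Usage with a tiny in-memory table; for a real file, pass open(path, newline="").
example = io.StringIO(
    "# Program:featureCounts; Command:...\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tsampleA.bam\n"
    "geneA\tchr1\t1\t100\t+\t100\t7\n"
    "geneB\tchr1\t200\t300\t-\t101\t3\n"
)
table = read_feature_counts(example)
print(assigned_totals(table))
```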

Q-Bach/computational_workflows_project
