Compute sample coverage from multiple input BAM, CRAM, or bedGraph files.
Repository: samplecov
Related project: tiebrush
samplecov is a lightweight toolkit designed to compute per-sample coverage from multiple CRAM/BAM/bedGraph files.
It outputs compressed and indexed bedGraph files suitable for further analysis or visualization.
# Compute per-sample coverage for Tissue_1
samplecov.sh -o Tissue_1.sample.bedGraph.gz Tissue_1/*.cram
# Use a common reference for Tissue_2 (if all samples used the same reference)
samplecov.sh -r ref.ids -o Tissue_2.sample.bedGraph.gz Tissue_2/*.cram
# Merge sample coverage across multiple tissues
samplecov.sh -r ref.ids -o Tissues.sample.bedGraph.gz -p 16 Tissue_*.sample.bedGraph.gz
Sample/Tissue files:
*.bam - BAM alignment files
*.cram - CRAM alignment files
*.gz - bedGraph coverage files
ref.ids
A file containing a list of reference regions (e.g., from samtools faidx).
Use this file if all input files were aligned to the same reference.
Each line can be:
A chromosome (chr1)
A chromosome region (chr1:100000-200000)
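
One possible way to build ref.ids is to take the sequence names from a samtools faidx index, whose first tab-separated column holds the sequence name. The sketch below is an illustration only: the ref.fa.fai file name and its toy contents are assumptions, and a real index would come from running samtools faidx on your own reference.

```shell
# Toy .fai index written by hand so the sketch runs without samtools.
# In practice this file would be produced by: samtools faidx ref.fa
# Columns: name, length, byte offset, bases per line, bytes per line.
printf 'chr1\t1000\t6\t60\t61\nchr2\t800\t1029\t60\t61\n' > ref.fa.fai

# Keep only the first column (sequence names), one chromosome per line.
cut -f1 ref.fa.fai > ref.ids

cat ref.ids
```

To restrict the run to part of a chromosome, a line can be edited by hand into the region form shown above (for example chr1:100000-200000).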
Tissues file:
*.gz - Compressed bedGraph file with total sample coverage
Example:
$ zcat Tissues.sample.bedGraph.gz | head
chr1 9999 10003 1 # chr1:9999-10003 region covered by a single sample (multiple reads?)
chr1 10003 10004 3 # chr1:10003-10004 region covered by 3 samples
chr1 10004 10010 5 # chr1:10004-10010 region covered by 5 samples
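
Because the fourth column is a sample count, the output can be summarized with standard text tools. As a rough sketch (not part of samplecov itself), the snippet below counts how many bases are covered by at least a given number of samples. The file name example.bedGraph.gz and the min_samples threshold are hypothetical; the three intervals are the ones from the example above.

```shell
# Recreate the three example intervals as a small gzipped bedGraph
# (example.bedGraph.gz is a made-up name used only for this sketch).
printf 'chr1\t9999\t10003\t1\nchr1\t10003\t10004\t3\nchr1\t10004\t10010\t5\n' \
    | gzip -c > example.bedGraph.gz

# Sum interval lengths (end - start) where the sample count is >= min.
min_samples=3
covered=$(gzip -dc example.bedGraph.gz \
    | awk -v min="$min_samples" '$4 >= min { total += $3 - $2 } END { print total + 0 }')

echo "$covered"   # prints 7 for the three intervals above
```

Here the intervals chr1:10003-10004 (1 base) and chr1:10004-10010 (6 bases) pass the threshold, giving 7 bases in total.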
The following tools must be installed and available in your system $PATH:
To install most of these on a Debian-based system:
sudo apt update
sudo apt install samtools tabix parallel coreutils pypy3 python3
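
After installation, a quick check that the commands are actually on $PATH can save a failed run. This is a minimal sketch, not part of samplecov: missing_tools is a hypothetical helper, and the command names are taken from the apt line above (coreutils is a package rather than a single command, so it is not checked here).

```shell
# Print the name of every tool that cannot be found on $PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Command names assumed from the packages installed above.
missing_tools samtools tabix parallel pypy3 python3
```

An empty result means every listed command was found; any name printed still needs to be installed or added to $PATH.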