Not all of the samples requested have provided input #40

@umasstr

After preparing the required input, the pipeline can't seem to find the specified files or the output directory. The log files don't show whether my sample file is recognized. I am hoping that there is an obvious issue with my file paths or the config, but I'm just not seeing it. Any help would be much appreciated.

Also, the pipeline ran to completion on the sample data.

Log file:
(snakemake) root@c844f1072fc5:/varCA# cat out/*

/varCA/Snakefile:51: UserWarning: Not all of the samples requested have provided input. Proceeding with as many samples as is possible...
rule all:
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 8
Rules claiming more threads will be scaled down.
Singularity containers: ignored
Job counts:
count jobs
1 all
1
[Thu Sep 9 11:59:06 2021]
localrule all:
jobid: 0
[Thu Sep 9 11:59:06 2021]
Finished job 0.
1 of 1 steps (100%) done
Complete log: /varCA/.snakemake/log/2021-09-09T115906.454263.snakemake.log

My config (note that the output directory I specify is ignored):

(snakemake) root@c844f1072fc5:/varCA# grep -vF '#' configs/config.yaml | sed '/^$/d'

sample_file: DATA/data/samples.tsv
SAMP_NAMES: [2294, 2296]
genome: DATA/bwa/genome.fa
out: DATA/out_01
snp_callers: [gatk-snp, varscan-snp, vardict-snp]
indel_callers: [gatk-indel, varscan-indel, vardict-indel, pindel, illumina-strelka]
snp_filter: ['gatk-snp~DP>10']
indel_filter: ['gatk-indel~DP>10']
snp_model: DATA/data/snp.rda
indel_model: DATA/data/indel.rda

Sample file:

(snakemake) root@c844f1072fc5:/varCA# cat DATA/data/samples.tsv

2294 DATA/2294.dup.fix.bam DATA/2294.bed
2296 DATA/2296.dup.fix.bam DATA/2296.bed
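
One thing that might be worth ruling out, given the config and sample file above: a YAML loader reads the unquoted entries in `SAMP_NAMES: [2294, 2296]` as integers, while names read from a text file such as samples.tsv are strings. The sketch below only illustrates that kind of mismatch. The helper `missing_samples` is hypothetical, the columns are assumed to be tab-separated, and whether the Snakefile actually compares names this way is an assumption that would need to be checked against its source.

```python
import csv
import io


def missing_samples(requested, samples_tsv_text):
    """Return the requested sample names that have no row in the samples file."""
    reader = csv.reader(io.StringIO(samples_tsv_text), delimiter="\t")
    # The first column of each non-empty row is the sample name (always a str).
    available = {row[0] for row in reader if row}
    return [name for name in requested if name not in available]


# Contents of samples.tsv as posted, assuming tab-separated columns.
samples_text = (
    "2294\tDATA/2294.dup.fix.bam\tDATA/2294.bed\n"
    "2296\tDATA/2296.dup.fix.bam\tDATA/2296.bed\n"
)

# What a YAML loader yields for the unquoted list [2294, 2296]: integers.
as_parsed = [2294, 2296]
print(missing_samples(as_parsed, samples_text))   # [2294, 2296]

# What it yields if the names are quoted in the config: strings.
as_strings = [str(name) for name in as_parsed]
print(missing_samples(as_strings, samples_text))  # []
```

If this turns out to be relevant, quoting the names in the config (`SAMP_NAMES: ['2294', '2296']`) would make them load as strings.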

The BAM and BED files referenced in samples.tsv are present:

(snakemake) root@c844f1072fc5:/varCA# ls DATA/229* | xargs -n 1 basename

2294.bam
2294.bam.bai
2294.bed
2294.dup.bam
2294.dup.bam.bai
2294.dup.fix.bam
2294.dup.fix.bam.bai
2294_peaks.narrowPeak
2296.bam
2296.bam.bai
2296.bed
2296.dup.bam
2296.dup.bam.bai
2296.dup.fix.bam
2296.dup.fix.bam.bai
2296_peaks.narrowPeak

As are indexes:

root@c844f1072fc5:/varCA# ls DATA/bwa | xargs -n 1 basename

genome.dict
genome.fa
genome.fa.amb
genome.fa.ann
genome.fa.bwt
genome.fa.fai
genome.fa.pac
genome.fa.sa

And models:

(snakemake) root@c844f1072fc5:/varCA# ls DATA/data | xargs -n 1 basename

README.md
indel.rda
indel.tsv.gz
samples.tsv
snp.rda
snp.tsv.gz
