mismatched line lengths at line 3 within sequence #147

@vivekruhela

I am trying to call somatic mutations using the speedseq somatic function, but every time I get a mismatched line length error. I am using the hg19 FASTA file from UCSC. I have also tried the reference file mentioned in the speedseq README (human_g1k_v37.fasta.gz) but still get the same error. The command and detailed error message are shown below:

/home/akansha/speedseq/bin/speedseq somatic /home/akansha/vivekruhela/refs/hg19/ucsc.hg19.fasta /home/akansha/vivekruhela/ega_data_1901/bam_files/CR-MGUS-10_10/CR-MGUS-10_10-PB_dedup.realigned.bam /home/akansha/vivekruhela/ega_data_1901/bam_files/CR-MGUS-10_10/CR-MGUS-10_10-BM_dedup.realigned.bam -o /home/akansha/vivekruhela/ega_data_1901/speedseq_analysis/test

Error Message:

Sourcing executables from /home/akansha/speedseq/bin/speedseq.config ...
Calling somatic variants...

    create temporary directory

    /home/akansha/speedseq//bin/sambamba view -H /home/akansha/vivekruhela/ega_data_1901/sequenza_delly_analysis/CR-MGUS-10_10/CR-MGUS-10_10-PB_dedup_filtered.bam | grep "^@SQ" | cut -f 2- | awk '{ gsub("^SN:","",$1); gsub("^LN:","",$2); print $1"\t0\t"$2; }' > CR-MGUS-10_10-BM_dedup_filtered.bam.CNz74wZ1KYmM/windows.bed

    /home/akansha/speedseq//bin/freebayes -f /home/akansha/vivekruhela/ega_data_1901/sequenza_delly_analysis/CR-MGUS-10_10/CR-MGUS-10_10-BM_dedup_filtered.bam \
        --pooled-discrete \
        --min-repeat-entropy 1 \
        --genotype-qualities \
        --min-alternate-fraction 0.05 \
        --min-alternate-count 2 \
        --region $chrom:$start..$end \
        /home/akansha/vivekruhela/ega_data_1901/sequenza_delly_analysis/CR-MGUS-10_10/CR-MGUS-10_10-PB_dedup_filtered.bam /home/akansha/vivekruhela/ega_data_1901/sequenza_delly_analysis/CR-MGUS-10_10/CR-MGUS-10_10-BM_dedup_filtered.bam \
        | somatic_filter 1e-5 18 0 \
        > CR-MGUS-10_10-BM_dedup_filtered.bam.CNz74wZ1KYmM/CR-MGUS-10_10-BM_dedup_filtered.bam.$chrom:$start..$end.vcf

    cat CR-MGUS-10_10-BM_dedup_filtered.bam.CNz74wZ1KYmM/var_command.txt | /home/akansha/speedseq//bin/parallel -j 1
index file /home/akansha/vivekruhela/ega_data_1901/sequenza_delly_analysis/CR-MGUS-10_10/CR-MGUS-10_10-BM_dedup_filtered.bam.fai not found, generating...
ERROR: mismatched line lengths at line 3 within sequence 
File not suitable for fasta index generation.
index file /home/akansha/vivekruhela/ega_data_1901/sequenza_delly_analysis/CR-MGUS-10_10/CR-MGUS-10_10-BM_dedup_filtered.bam.fai not found, generating...
ERROR: mismatched line lengths at line 3 within sequence 
File not suitable for fasta index generation.
index file /home/akansha/vivekruhela/ega_data_1901/sequenza_delly_analysis/CR-MGUS-10_10/CR-MGUS-10_10-BM_dedup_filtered.bam.fai not found, generating...
ERROR: mismatched line lengths at line 3 within sequence 
File not suitable for fasta index generation.
index file /home/akansha/vivekruhela/ega_data_1901/sequenza_delly_analysis/CR-MGUS-10_10/CR-MGUS-10_10-BM_dedup_filtered.bam.fai not found, generating...
ERROR: mismatched line lengths at line 3 within sequence 
File not suitable for fasta index generation.
index file /home/akansha/vivekruhela/ega_data_1901/sequenza_delly_analysis/CR-MGUS-10_10/CR-MGUS-10_10-BM_dedup_filtered.bam.fai not found, generating...
ERROR: mismatched line lengths at line 3 within sequence 
File not suitable for fasta index generation.
index file /home/akansha/vivekruhela/ega_data_1901/sequenza_delly_analysis/CR-MGUS-10_10/CR-MGUS-10_10-BM_dedup_filtered.bam.fai not found, generating...
ERROR: mismatched line lengths at line 3 within sequence 
File not suitable for fasta index generation.
index file /home/akansha/vivekruhela/ega_data_1901/sequenza_delly_analysis/CR-MGUS-10_10/CR-MGUS-10_10-BM_dedup_filtered.bam.fai not found, generating...
ERROR: mismatched line lengths at line 3 within sequence 
File not suitable for fasta index generation.
index file /home/akansha/vivekruhela/ega_data_1901/sequenza_delly_analysis/CR-MGUS-10_10/CR-MGUS-10_10-BM_dedup_filtered.bam.fai not found, generating...
ERROR: mismatched line lengths at line 3 within sequence 
File not suitable for fasta index generation.
index file /home/akansha/vivekruhela/ega_data_1901/sequenza_delly_analysis/CR-MGUS-10_10/CR-MGUS-10_10-BM_dedup_filtered.bam.fai not found, generating...
ERROR: mismatched line lengths at line 3 within sequence 
File not suitable for fasta index generation.
index file /home/akansha/vivekruhela/ega_data_1901/sequenza_delly_analysis/CR-MGUS-10_10/CR-MGUS-10_10-BM_dedup_filtered.bam.fai not found, generating...
grep "^##" CR-MGUS-10_10-BM_dedup_filtered.bam.CNz74wZ1KYmM/CR-MGUS-10_10-BM_dedup_filtered.bam.1:0..249250621.vcf \
    | cat - <(echo '##INFO=<ID=SSC,Number=1,Type=Float,Description="Somatic score">') <(grep "^#CHROM" CR-MGUS-10_10-BM_dedup_filtered.bam.CNz74wZ1KYmM/CR-MGUS-10_10-BM_dedup_filtered.bam.1:0..249250621.vcf) > CR-MGUS-10_10-BM_dedup_filtered.bam.CNz74wZ1KYmM/header.txt

    cat CR-MGUS-10_10-BM_dedup_filtered.bam.CNz74wZ1KYmM/CR-MGUS-10_10-BM_dedup_filtered.bam."$chrom:$start..$end".vcf | grep -v "^#" \
        | sort -k1,1 -k2,2n | cat CR-MGUS-10_10-BM_dedup_filtered.bam.CNz74wZ1KYmM/header.txt - \
        | /home/akansha/speedseq//bin/bgzip -c > CR-MGUS-10_10-BM_dedup_filtered.bam.vcf.gz

  /home/akansha/speedseq//bin/tabix -f -p vcf CR-MGUS-10_10-BM_dedup_filtered.bam.vcf.gz
# Make PED file
echo -e "1\tCR-MGUS-10_10-PB\tNone\tNone\t0\t1\n1\tCR-MGUS-10_10-BM\tNone\tNone\t0\t2" > CR-MGUS-10_10-BM_dedup_filtered.bam.ped

    rm -r CR-MGUS-10_10-BM_dedup_filtered.bam.CNz74wZ1KYmM
Done
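
Note that in the log above, `freebayes -f` and the `.fai` lookup both point at a `.bam` path, not at the FASTA given in the command. A quick way to check whether a given file would pass FASTA indexing is sketched below. This is only a rough approximation of the line-length rule that the error message refers to, not the indexer's actual code, and the function name `check_fasta_lines` is hypothetical. It assumes an uncompressed, plain-text FASTA; a BAM or gzip-compressed file fails the first check because its first byte is not `>`.

```shell
# check_fasta_lines FILE (hypothetical helper)
# Returns 0 if FILE looks like a FASTA whose sequence lines are uniform
# in length within each record (only the last line of a record may be
# shorter). Returns 1 and prints the first offending line otherwise.
check_fasta_lines() {
    # A FASTA file must start with a ">" header; BAM and gzip files do not.
    if [ "$(head -c 1 "$1")" != ">" ]; then
        echo "$1: first byte is not '>', this does not look like a plain FASTA"
        return 1
    fi
    awk '
        /^>/ { name = substr($0, 2); width = -1; short = 0; next }
        {
            len = length($0)
            # A line after a short line, or a line longer than the first
            # line of the record, breaks the fixed-width layout.
            if (short || (width >= 0 && len > width)) {
                printf "mismatched line length at file line %d in sequence %s\n", NR, name
                bad = 1
                exit
            }
            if (width < 0) width = len
            else if (len < width) short = 1
        }
        END { exit bad + 0 }
    ' "$1"
}
```

For example, `check_fasta_lines /home/akansha/vivekruhela/refs/hg19/ucsc.hg19.fasta` could be run on the reference, and the same call on the path shown after `freebayes -f` in the log, to see which file the indexer is actually rejecting.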

Kindly suggest how to deal with this issue. Thanks.
