[BUG] Macaca fascicularis sqanti_qc.py error #501

@sfodel

Description

Is there an existing issue for this?

  • I have searched the existing issues

Have you loaded the SQANTI3.env conda environment?

  • I have loaded the SQANTI3.env conda environment

Problem description

Following Pablo's recommendation, I am opening an issue here as a follow-up to the original question: #498

In short, sqanti3_qc.py fails on a newly generated Macaca fascicularis transcriptome (the output of isoseq collapse; the original data were produced with PacBio Kinnex):

conda run sqanti3_qc.py hq_transcripts_CynoH_uniques.fa Macaca_fascicularis.Macaca_fascicularis_6.0.114.chr.gtf Macaca_fascicularis.Macaca_fascicularis_6.0.dna_sm.toplevel.fa -o CynoH_SQANTI -t 60 --force_id_ignore --report skip

All entries in my hq_transcripts_CynoH_uniques.fa are unique, with headers formatted like:

transcript/2146606 sample:BioSample_2

That's why I use the --force_id_ignore parameter.
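
The uniqueness claim above can be double-checked with a short script. This is only a sketch, and it assumes that the ID is the first whitespace-delimited token of each FASTA header (a common convention, not something confirmed for SQANTI3). The function name `duplicated_ids` and the second example header are hypothetical.

```python
from collections import Counter


def duplicated_ids(fasta_lines):
    """Return the IDs that occur more than once.

    Assumption: the ID is the first whitespace-delimited token
    after '>' in each header line.
    """
    ids = [
        line[1:].split()[0]
        for line in fasta_lines
        if line.startswith(">") and line[1:].strip()
    ]
    return sorted(i for i, n in Counter(ids).items() if n > 1)


# Illustrative records; only the first header comes from the report above.
example = [
    ">transcript/2146606 sample:BioSample_2",
    "ACGT",
    ">transcript/2146607 sample:BioSample_2",  # hypothetical second record
    "GGCC",
]
print(duplicated_ids(example))  # an empty list means no duplicated IDs
```

To run it on the real file, pass an open file handle, for example `duplicated_ids(open("hq_transcripts_CynoH_uniques.fa"))`.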

Any idea what might be causing this error?

Thank you in advance,
Stelios

Code sample

sudo docker run -it -u $(id -u):$(id -g) -v /home/Cyno/:/data2 sqanti3 sqanti3_qc.py hq_transcripts_CynoH_uniques.fa Macaca_fascicularis.Macaca_fascicularis_6.0.114.chr.gtf Macaca_fascicularis.Macaca_fascicularis_6.0.dna_sm.toplevel.fa -o CynoH_SQANTI -t 60 --force_id_ignore --report skip

Error

This is the traceback:

**** Predicting ORF sequences...
Running ORF prediction on /data2/CynoH_SQANTI_corrected.fasta
**** Parsing Reference Transcriptome....
./refAnnotation_CynoH_SQANTI.genePred already exists. Using it.
**** Parsing Isoforms....
No short-reads or coverage provided. Skipping short-read coverage calculation.
**** TSS ratio will not be calculated since SR information was not provided
**** Performing Classification of Isoforms....
Traceback (most recent call last):
  File "/opt2/sqanti3/5.3.6/SQANTI3-5.3.6/sqanti3_qc.py", line 84, in <module>
    main()
  File "/opt2/sqanti3/5.3.6/SQANTI3-5.3.6/sqanti3_qc.py", line 68, in main
    run(args)
  File "/opt2/sqanti3/5.3.6/SQANTI3-5.3.6/src/qc_pipeline.py", line 92, in run
    isoforms_info, ratio_TSS_dict = isoform_classification_pipeline(
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt2/sqanti3/5.3.6/SQANTI3-5.3.6/src/classification_main.py", line 62, in isoform_classification_pipeline
    assign_genomic_coordinates(isoform_hit, rec)
  File "/opt2/sqanti3/5.3.6/SQANTI3-5.3.6/src/classification_steps.py", line 99, in assign_genomic_coordinates
    isoform_hit.CDS_genomic_start = m[isoform_hit.CDS_start-1] + 1 # make it 1-based
                                    ~^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 2482
ERROR conda.cli.main_run:execute(125): conda run sqanti3_qc.py hq_transcripts_CynoH_uniques.fa Macaca_fascicularis.Macaca_fascicularis_6.0.114.chr.gtf Macaca_fascicularis.Macaca_fascicularis_6.0.dna_sm.toplevel.fa -o CynoH_SQANTI -t 60 --force_id_ignore --report skip failed. (See above for error)
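
The failing line looks up `m[isoform_hit.CDS_start-1]`, so `m` behaves like a mapping keyed by transcript position, and key 2482 is missing from it. One possible reading, which is an assumption and not checked against the SQANTI3 source, is that the predicted CDS start lies beyond the positions covered by the isoform's aligned exons. The sketch below is not SQANTI3 code. The helper `transcript_to_genome_map` and the exon coordinates are hypothetical, and it only illustrates how such a lookup raises a `KeyError`.

```python
def transcript_to_genome_map(exons):
    """Map 0-based transcript positions to 0-based genomic positions.

    `exons` is a list of (start, end) half-open genomic intervals,
    ordered along a plus-strand transcript (hypothetical layout).
    """
    mapping = {}
    t_pos = 0
    for start, end in exons:
        for g_pos in range(start, end):
            mapping[t_pos] = g_pos
            t_pos += 1
    return mapping


# Two exons of 10 nt and 5 nt give a 15 nt transcript (keys 0..14).
m = transcript_to_genome_map([(100, 110), (200, 205)])

cds_start = 11           # 1-based CDS start inside the transcript
print(m[cds_start - 1] + 1)  # 1-based genomic coordinate of the CDS start

cds_start = 20           # 1-based CDS start beyond the 15 mapped positions
try:
    m[cds_start - 1]
except KeyError as err:
    print("KeyError:", err)  # the missing key is cds_start - 1
```

If this reading applies, comparing the sequence length of the offending isoform in CynoH_SQANTI_corrected.fasta with its predicted ORF start would show whether the two disagree.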

Anything else?

No response

Metadata

Labels: triage (For developers to check)