Retrieving the entire query sequence from a blast, not just the 'local' aligned HSP  

Hi there

I am using PyBlast to find similar sequences (obviously) in a different genome. None of these are model organisms.
The queries are FASTA files, one per gene, each holding that gene's sequence from several strains. In total I have about 12 genes and 10 strains.
I need to get the length of the entire query, not just the length of the part of the query that aligned to the database. Do you know if that's accessible?

If not, I will have to blast each sequence in each FASTA file separately, which will be slower than blasting a whole file at a time.
This is what my code looks like now:

````
for q in qdir.glob('*.fasta'):
    bcl = BCLine6("blastn", query=q, subject=db, word_size=11,
                  evalue=0.01, outfmt="evalue sstrand")
    res = bcl.run(ncore=8, quiet=True)
    # Fails: q is the path to a FASTA file, so it has no .seq attribute
    print(f'query length = {len(q.seq)}')
````
Of course, `q` here is a file name, not an actual sequence record, so `q.seq` does not exist.
And `qlen`, as far as I can tell, is the length of the aligned part of the query, not the length of the whole query sequence.
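
One workaround that would still let me blast a whole file at a time is to read the full lengths straight from the FASTA files and look them up by query ID afterwards. Below is a minimal sketch of that idea using only the standard library. `fasta_lengths` and `all_query_lengths` are hypothetical helpers, not part of PyBlast, and the sketch assumes the query ID reported in the BLAST results is the first whitespace-delimited token of each FASTA header.

```python
from pathlib import Path


def fasta_lengths(path):
    """Return {record_id: sequence_length} for every record in a FASTA file."""
    lengths = {}
    current = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Assumption: record ID = first whitespace-delimited header token
                header = line[1:].split()
                current = header[0] if header else ""
                lengths[current] = 0
            elif current is not None:
                # Sequence may be wrapped over several lines; sum them all
                lengths[current] += len(line)
    return lengths


def all_query_lengths(qdir):
    """Map each FASTA file name in qdir to its per-record full lengths."""
    return {q.name: fasta_lengths(q) for q in sorted(Path(qdir).glob("*.fasta"))}
```

The resulting dictionary could then be joined to the BLAST hits on the query ID, so each file is still blasted only once.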
