It's unclear to me when to use the --no_force_align option to ProGraph. The README describes this as "do not force alignment of initial Methionine".
What's the scientific motivation for skipping initial M by default?
I ask because of a potential bug in the interaction with the --repeat option, which matches the sequences to a T-REKS output alignment. These files reference sequence positions, so stripping the initial M causes an off-by-one error.
I can think of several possible solutions:
- Default to --no_force_align when the --repeat option is also specified
- For each sequence, store a flag indicating whether it has been truncated. If so, account for that when reading in the repeats file
- Be more permissive when verifying the FASTA/T-REKS alignment. Automatically recover from off-by-one errors in the coordinates. (This would have the side benefit of supporting malformed T-REKS files that used 0-based indices rather than the correct 1-based positions.)
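
To make the second and third options concrete, here is a rough Python sketch. It is not ProGraph's actual code, and every name in it (`strip_initial_met`, `adjust_repeat`, `recover_shift`) is hypothetical. It assumes a repeat record boils down to 1-based inclusive start/end positions plus the residues expected in that range; the real T-REKS file layout may differ.

```python
from typing import Optional, Tuple


def strip_initial_met(seq: str) -> Tuple[str, int]:
    """Drop a leading M and report how many residues were removed (0 or 1)."""
    if seq.startswith("M"):
        return seq[1:], 1
    return seq, 0


def adjust_repeat(start: int, end: int, dropped: int) -> Tuple[int, int]:
    """Option 2: shift 1-based inclusive coordinates by the stored truncation count."""
    new_start, new_end = start - dropped, end - dropped
    if new_start < 1:
        raise ValueError("repeat overlaps the stripped residue")
    return new_start, new_end


def recover_shift(seq: str, start: int, end: int, expected: str) -> Optional[int]:
    """Option 3: try shifts 0, -1, +1 and return the first one where the
    residues at the shifted 1-based inclusive range equal `expected`."""
    for shift in (0, -1, 1):
        lo = start + shift - 1  # 0-based slice start
        hi = end + shift        # 0-based slice end (exclusive)
        if lo >= 0 and hi <= len(seq) and seq[lo:hi] == expected:
            return shift
    return None  # coordinates are wrong by more than one position


full = "MKTAYAKTAY"                      # toy sequence; KTAY sits at positions 2-5
stripped, dropped = strip_initial_met(full)
print(adjust_repeat(2, 5, dropped))      # coordinates valid for the stripped sequence
print(recover_shift(stripped, 2, 5, "KTAY"))
```

One caveat with the permissive approach: for low-complexity repeats more than one shift can match, so this sketch prefers a shift of 0 and would silently accept a wrong offset in that case. The explicit flag avoids that ambiguity.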