Description
Hi @hasindu2008 ,
Thank you for developing f5c!
I would like to know whether f5c eventalign can output the basecalled kmer.
Currently, f5c eventalign gives me output with the following columns.
contig position reference_kmer read_index strand event_index event_level_mean event_stdv event_length model_kmer model_mean model_stdv standardized_level start_idx end_idx samples
transcript 9 TTATA 0 t 27 93.44 2.558 0.00465 TTATA 93.29 2.63 0.05 143742 143756 94.4323,94.7009,93.4921,94.5666,96.3125,94.7009,96.1782,92.9549,93.6264,94.8352,91.4775,94.8352,89.4629,86.6424
Could a basecalling_kmer column be added at the end?
Like this:
contig position reference_kmer read_index strand event_index event_level_mean event_stdv event_length model_kmer model_mean model_stdv standardized_level start_idx end_idx samples basecalling_kmer
transcript 9 TTATA 0 t 27 93.44 2.558 0.00465 TTATA 93.29 2.63 0.05 143742 143756 94.4323,94.7009,93.4921,94.5666,96.3125,94.7009,96.1782,92.9549,93.6264,94.8352,91.4775,94.8352,89.4629,86.6424 TTATA
The reason I ask is that the reads in the fastq file contain basecalling errors (for example an A-->T substitution, or a deletion), so the basecalled sequence can differ from the reference kmer.
Here is an example.
Here is my eventalign file.

Here is the tsv file I generated with sam2tsv.

At position 39, the reference kmer is CTTTC.
But the sequence at that position in the fastq file is TTTTC.
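To make the request concrete, here is a rough Python sketch of the column I have in mind. It is not f5c code, and the names `basecalled_kmer`, `add_basecalling_kmer` and `read_base_at` are made up for illustration. It assumes I already have a per-read lookup from reference position to the aligned read base (for example built from the sam2tsv output), that `position` in eventalign uses the same coordinate system as that lookup, and that the kmer starts at `position`. Strand handling and insertions are ignored here.

```python
def basecalled_kmer(read_base_at, position, k=5, gap="-"):
    """Return the read bases aligned to reference positions
    position .. position + k - 1.

    read_base_at: hypothetical dict {reference position: read base}.
    Positions with no aligned read base (e.g. deletions) become `gap`.
    """
    return "".join(read_base_at.get(position + i, gap) for i in range(k))


def add_basecalling_kmer(row, read_base_at):
    """Append a basecalling_kmer field to one tab-separated eventalign row.

    Assumes column 2 is `position` and column 3 is `reference_kmer`,
    as in the header shown above.
    """
    fields = row.rstrip("\n").split("\t")
    position = int(fields[1])
    k = len(fields[2])  # use the same kmer length as reference_kmer
    fields.append(basecalled_kmer(read_base_at, position, k))
    return "\t".join(fields)


# Illustrative only: read bases for the TTTTC example at position 39,
# and a shortened eventalign row (remaining columns left out).
read_base_at = {39: "T", 40: "T", 41: "T", 42: "T", 43: "C"}
row = "transcript\t39\tCTTTC\t0\tt"
print(add_basecalling_kmer(row, read_base_at))
```

With a column like this, rows where reference_kmer and basecalling_kmer disagree could be filtered out or inspected directly.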
Best wishes,
Kirito