Base quality score in new NovaSeqX software #658

@meixilin

Hi,

Thank you for making this great software! I am analyzing data that were sequenced both before and after October 2023, and we found a strong batch effect in the PCA (also in the beagle output).

I think this is caused by the increase in the maximum base quality score from Q37 to Q40, as detailed in this message:

link

Other than manually changing the quality scores, what would you recommend to mitigate this effect? Thank you!

Example read in the new scoring system (quality string tops out at `I`, i.e. Q40):

LH00132:319:2257HKLT4:8:2106:18876:9895 163     NW_025814823.1  14495   21      65M     =       14495   65      CTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAGCCCTAACCCTAACCCTAAC        IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII-IIIIIIIIIIIIIIIIIII       MC:Z:65M        MD:Z:65 PG:Z:MarkDuplicates      RG:Z:LH00132.319.2257HKLT4.8    NM:i:0  MQ:i:21 UQ:i:0  AS:i:65

Example read in the old scoring system (quality string tops out at `F`, i.e. Q37):

LH00132:109:225GLJLT3:8:1102:48618:4653 99      NW_025814823.1  14421   31      141M    =       14439   159     CCCTAACCCTAACCCTAACCCTAGCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAGCCCTAACCCTAACCCTAACCC    FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF5FFFFFFFFFFFFFFF-FFF-FFFFFFFFFFFFFFFFFF5FFFF-F--F    MC:Z:141M       MD:Z:23a117     PG:Z:MarkDuplicates.C    RG:Z:LH00132.109.225GLJLT3.8    NM:i:1  MQ:i:31 UQ:i:37 AS:i:136
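For reference, here is a minimal sketch of what the manual workaround would amount to, assuming the quality strings use the standard Phred+33 encoding (so `I` decodes to Q40 and `F` to Q37). The helper names `phred33_to_scores` and `cap_quality` are hypothetical and are not part of any existing tool; this only illustrates the idea on a single quality string, not a full BAM rewrite.

```python
def phred33_to_scores(qual: str) -> list[int]:
    """Decode a Phred+33 quality string into integer quality scores."""
    return [ord(c) - 33 for c in qual]


def cap_quality(qual: str, max_q: int = 37) -> str:
    """Cap every base quality at max_q (assumed old maximum: Q37)."""
    cap_char = chr(max_q + 33)
    # Single characters compare by code point, so min() keeps the lower quality.
    return "".join(min(c, cap_char) for c in qual)


# Shortened, made-up quality string in the style of the new-scoring read above.
new_qual = "IIIII-IIII"
capped = cap_quality(new_qual)

print(phred33_to_scores(new_qual))
print(capped)
print(phred33_to_scores(capped))
```

Capping only harmonizes the maximum value; it does not make the two scoring schemes otherwise equivalent, so some batch effect may remain.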
