Hi,
Thank you for making this great software! I am analyzing data that were sequenced both before and after October 2023, and we found a strong batch effect in the PCA (also in the beagle output).
I think this is caused by the increase in the maximum base quality score from Q37 to Q40, detailed in this message here:
Other than manually changing the quality scores, what would you recommend to mitigate this effect? Thank you!
Example read in the new scoring system:
LH00132:319:2257HKLT4:8:2106:18876:9895 163 NW_025814823.1 14495 21 65M = 14495 65 CTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAGCCCTAACCCTAACCCTAAC IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII-IIIIIIIIIIIIIIIIIII MC:Z:65M MD:Z:65 PG:Z:MarkDuplicates RG:Z:LH00132.319.2257HKLT4.8 NM:i:0 MQ:i:21 UQ:i:0 AS:i:65
Example read in the old scoring system:
LH00132:109:225GLJLT3:8:1102:48618:4653 99 NW_025814823.1 14421 31 141M = 14439 159 CCCTAACCCTAACCCTAACCCTAGCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAGCCCTAACCCTAACCCTAACCC FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF5FFFFFFFFFFFFFFF-FFF-FFFFFFFFFFFFFFFFFF5FFFF-F--F MC:Z:141M MD:Z:23a117 PG:Z:MarkDuplicates.C RG:Z:LH00132.109.225GLJLT3.8 NM:i:1 MQ:i:31 UQ:i:37 AS:i:136
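For reference, with the standard Phred+33 encoding the `I` characters in the new read decode to Q40 and the `F` characters in the old read decode to Q37. Below is a minimal Python sketch of the manual workaround mentioned above: capping every base quality at Q37 so both batches share the same maximum. The helper names (`phred_scores`, `cap_quality`, `cap_sam_line`) are hypothetical and not part of this software, and the sketch assumes uncompressed, tab-separated SAM text with the quality string in column 11.

```python
def phred_scores(qual, offset=33):
    """Decode a Phred+33 quality string into integer scores."""
    return [ord(c) - offset for c in qual]


def cap_quality(qual, max_q=37, offset=33):
    """Replace every quality character above max_q with the max_q character."""
    cap_char = chr(max_q + offset)  # 'F' for Q37 under Phred+33
    return "".join(min(c, cap_char) for c in qual)


def cap_sam_line(line, max_q=37):
    """Cap the QUAL field (column 11) of one SAM record.

    Header lines (starting with '@') and records without stored
    qualities ('*') are returned unchanged. The trailing newline is
    stripped, so the caller adds it back when writing.
    """
    line = line.rstrip("\n")
    if line.startswith("@"):
        return line
    fields = line.split("\t")
    if len(fields) >= 11 and fields[10] != "*":
        fields[10] = cap_quality(fields[10], max_q)
    return "\t".join(fields)


if __name__ == "__main__":
    new_qual = "IIIII-IIII"   # fragment in the style of the new read
    old_qual = "FFFF5FFF-F"   # fragment in the style of the old read
    print(max(phred_scores(new_qual)), max(phred_scores(old_qual)))
    print(cap_quality(new_qual))
```

Something like this could sit between `samtools view -h` and `samtools view -b` in a pipe. It only removes the difference in the maximum score; whether that is enough to remove the batch effect in the PCA is untested here.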