
Validation errors #733

@lcoutos


Hi, I'm working with a large variant dataset, and some of the variants trigger validation errors. I have four types of issues (I can open a separate issue for each of them if that is easier for you):

NM_012234.7 errors

I have three variants on NM_012234.7 that VariantValidator normalizes as 5'UTR variants, whereas Mutalyzer and VEP report them as missense variants:

| Input | Assembly | URL | Output | Expected (Mutalyzer) |
| --- | --- | --- | --- | --- |
| NC_000003.12:g.72446581A>C | GRCh38 | https://rest.variantvalidator.org/VariantValidator/variantvalidator/GRCh38/NC_000003.12:g.72446581A%3EC/select?content-type=application%2Fjson | NM_012234.7:c.-141= | NC_000003.12(NM_012234.7):c.43T>G, p.(Trp15Gly) |
| NC_000003.12:g.72446533A>C | GRCh38 | https://rest.variantvalidator.org/VariantValidator/variantvalidator/GRCh38/NC_000003.12:g.72446533A%3EC/select?content-type=application%2Fjson | NM_012234.7:c.-93= | NC_000003.11(NM_012234.7):c.91T>G |
| NC_000003.12:g.72446553T>C | GRCh38 | https://rest.variantvalidator.org/VariantValidator/variantvalidator/GRCh38/NC_000003.12:g.72446553T%3EC/select?content-type=application%2Fjson | NM_012234.7:c.-113= | NC_000003.12(NM_012234.7):c.71A>G |
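For anyone wanting to reproduce these three queries, here is a minimal sketch that rebuilds the REST URLs listed above. It only assumes the URL pattern visible in this report; the helper name `build_url` is mine and is not part of any VariantValidator package.

```python
from urllib.parse import quote

# Base path copied from the URLs in this report.
BASE = "https://rest.variantvalidator.org/VariantValidator/variantvalidator"


def build_url(assembly: str, variant: str, transcripts: str = "select") -> str:
    """Rebuild a query URL following the pattern used in this issue.

    build_url is a hypothetical helper, not a VariantValidator function.
    """
    # Keep ':' literal (as in most URLs above); '>' is encoded as %3E.
    encoded = quote(variant, safe=":")
    return f"{BASE}/{assembly}/{encoded}/{transcripts}?content-type=application%2Fjson"


variants = [
    "NC_000003.12:g.72446581A>C",
    "NC_000003.12:g.72446533A>C",
    "NC_000003.12:g.72446553T>C",
]
for v in variants:
    print(build_url("GRCh38", v))
```

The response can then be fetched with any HTTP client; I have left that out so the snippet runs without network access.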

"N" as Ref base

For two variants on NM_001017915.3, the validated result has "N" as the reference base instead of the input ref:

| Input | Assembly | URL | Output | Expected (Mutalyzer) |
| --- | --- | --- | --- | --- |
| chr2:233998842:C:T | GRCh37 | https://rest.variantvalidator.org/VariantValidator/variantvalidator/GRCh37/chr2:233998842:C:T/select?content-type=application%2Fjson | NC_000002.11:g.234052046N>T - NM_001017915.3:c.739-2749N>T | NC_000002.11:g.233998842C>T - NM_001017915.3:c.663+3486C>T |
| chr2:233998852:G:A | GRCh37 | https://rest.variantvalidator.org/VariantValidator/variantvalidator/GRCh37/chr2:233998852:G:A/select?content-type=application%2Fjson | NC_000002.11:g.234052056N>A - NM_001017915.3:c.739-2739N>A | NC_000002.11:g.233998852G>A - NM_001017915.3:c.663+3496G>A |
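A rough sketch of the sanity check I used to spot these cases: it compares the colon-separated input with the returned genomic description. In the rows above, the genomic position in the output also differs from the input, in addition to the "N" ref base. The regexes and the function name `check_substitution` are my own assumptions and only cover simple substitutions.

```python
import re

# Colon-separated input such as "chr2:233998842:C:T" (simple SNVs only).
VCF_RE = re.compile(r"^(?:chr)?(\w+):(\d+):([ACGT]):([ACGT])$")
# Genomic HGVS substitution such as "NC_000002.11:g.233998842C>T".
HGVS_G_RE = re.compile(r"^(NC_\d+\.\d+):g\.(\d+)([ACGTN])>([ACGTN])$")


def check_substitution(vcf_like: str, hgvs_g: str) -> list[str]:
    """Return a list of discrepancies between the input and the output."""
    m_in = VCF_RE.match(vcf_like)
    m_out = HGVS_G_RE.match(hgvs_g)
    if m_in is None or m_out is None:
        raise ValueError("only simple substitutions are supported")
    _, pos, ref, alt = m_in.groups()
    _, out_pos, out_ref, out_alt = m_out.groups()
    problems = []
    if int(pos) != int(out_pos):
        problems.append(f"position {pos} became {out_pos}")
    if ref != out_ref:
        problems.append(f"ref {ref} became {out_ref}")
    if alt != out_alt:
        problems.append(f"alt {alt} became {out_alt}")
    return problems


print(check_substitution("chr2:233998842:C:T", "NC_000002.11:g.234052046N>T"))
```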

Correct validation by VariantValidator but not VariantFormatter

I have a set of dup variants for which the HGVSp prediction is correct in VariantValidator but not in VariantFormatter (the alt amino acids are appended after "dup"):

| Input | Assembly | URL | Output (VariantFormatter) | Expected (VariantValidator) |
| --- | --- | --- | --- | --- |
| NC_000012.12:g.50086346_50086348dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000012.12%3Ag.50086346_50086348dup/refseq/NM_003076.4/False?content-type=application%2Fjson | NP_003067.3:p.(Asn122dupN) | NP_003067.3:p.(Asn122dup) |
| NC_000012.12:g.50086346_50086348dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000012.12:g.50086346_50086348dup/refseq/NM_003076.4/False?content-type=application%2Fjson | NP_003067.3:p.(Asn122dupN) | NP_003067.3:p.(Asn122dup) |
| NC_000012.12:g.132643541_132643543dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000012.12:g.132643541_132643543dup/refseq/NM_006231.2/False?content-type=application%2Fjson | NP_006222.2:p.(Ala1437dupA) | NP_006222.2:p.(Ala1437dup) |
| NC_000001.11:g.26697128_26697130dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000001.11:g.26697128_26697130dup/refseq/NM_006015.4/False?content-type=application%2Fjson | NP_006006.3:p.(Gly242dupG) | NP_006006.3:p.(Gly242dup) |
| NC_000005.10:g.168762604_168762612dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000005.10:g.168762604_168762612dup/refseq/NM_003062.3/False?content-type=application%2Fjson | NP_003053.1:p.(Gly513_Ile515dupGTI) | NP_003053.1:p.(Gly513_Ile515dup) |
| NC_000023.11:g.67546511_67546519dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000023.11:g.67546511_67546519dup/refseq/NM_000044.3/False?content-type=application%2Fjson | NP_000035.2:p.(Gly471_Gly473dupGGG) | NP_000035.2:p.(Gly471_Gly473dup) |
| NC_000013.11:g.109785992_109785994dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000013.11:g.109785992_109785994dup/refseq/NM_003749.2/False?content-type=application%2Fjson | NP_003740.2:p.(Asn28dupN) | NP_003740.2:p.(Asn28dup) |
| NC_000014.9:g.37592322_37592324dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000014.9:g.37592322_37592324dup/refseq/NM_004496.3/False?content-type=application%2Fjson | NP_004487.2:p.(Gly157dupG) | NP_004487.2:p.(Gly157dup) |
| NC_000001.11:g.26696529_26696531dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000001.11:g.26696529_26696531dup/refseq/NM_006015.4/False?content-type=application%2Fjson | NP_006006.3:p.(Ala45dupA) | NP_006006.3:p.(Ala45dup) |
| NC_000012.12:g.25245353_25245358dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000012.12:g.25245353_25245358dup/refseq/NM_004985.3%7CNM_033360.2/False?content-type=application%2Fjson | NP_004976.2:p.(Ala11_Gly12dupAG) | NP_004976.2:p.(Ala11_Gly12dup) |
| NC_000012.12:g.45849629_45849631dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000012.12:g.45849629_45849631dup/refseq/NM_152641.2/False?content-type=application%2Fjson | NP_689854.2:p.(Asn589dupN) | NP_689854.2:p.(Asn589dup) |
| NC_000007.14:g.55181309_55181317dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000007.14:g.55181309_55181317dup/refseq/NM_005228.3/False?content-type=application%2Fjson | NP_005219.2:p.(Ala767_Val769dupASV) | NP_005219.2:p.(Ala767_Val769dup) |
| NC_000023.11:g.67546558_67546566dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000023.11:g.67546558_67546566dup/refseq/NM_000044.3/False?content-type=application%2Fjson | NP_000035.2:p.(Gly471_Gly473dupGGG) | NP_000035.2:p.(Gly471_Gly473dup) |
| NC_000017.11:g.7673790_7673792dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000017.11:g.7673790_7673792dup/refseq/NM_001126112.2/False?content-type=application%2Fjson | NP_001119584.1:p.(Cys277dupC) | NP_001119584.1:p.(Cys277dup) |
| NC_000023.11:g.154428803_154428808dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000023.11:g.154428803_154428808dup/refseq/NM_001183.4/False?content-type=application%2Fjson | NP_001174.2:p.(Ala40_Ala41dupAA) | NP_001174.2:p.(Ala40_Ala41dup) |
| NC_000009.12:g.21974775_21974780dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000009.12:g.21974775_21974780dup/refseq/NM_000077.4/False?content-type=application%2Fjson | NP_000068.1:p.(Thr18_Ala19dupTA) | NP_000068.1:p.(Thr18_Ala19dup) |
| NC_000007.14:g.140753340_140753342dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000007.14:g.140753340_140753342dup/refseq/NM_004333.4/False?content-type=application%2Fjson | NP_004324.2:p.(Thr599dupT) | NP_004324.2:p.(Thr599dup) |
| NC_000007.14:g.140753338_140753340dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000007.14:g.140753338_140753340dup/refseq/NM_004333.4/False?content-type=application%2Fjson | NP_004324.2:p.(Thr599dupT) | NP_004324.2:p.(Thr599dup) |
| NC_000017.11:g.7670666_7670671dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000017.11:g.7670666_7670671dup/refseq/NM_001126112.2/False?content-type=application%2Fjson | NP_001119584.1:p.(Ala347_Leu348dupAL) | NP_001119584.1:p.(Ala347_Leu348dup) |
| NC_000007.14:g.98910180_98910212dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000007.14:g.98910180_98910212dup/refseq/NM_001244580.1/False?content-type=application%2Fjson | NP_001231509.1:p.(Ala492_Pro502dupAAPGPAPSPAP) | NP_001231509.1:p.(Ala492_Pro502dup) |
| NC_000005.10:g.112841957_112841959dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000005.10:g.112841957_112841959dup/refseq/NM_000038.5/False?content-type=application%2Fjson | NP_000029.2:p.(Ala2122dupA) | NP_000029.2:p.(Ala2122dup) |
| NC_000004.12:g.54726014_54726019dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000004.12:g.54726014_54726019dup/refseq/NM_000222.2/False?content-type=application%2Fjson | NP_000213.1:p.(Ala502_Tyr503dupAY) | NP_000213.1:p.(Ala502_Tyr503dup) |
| NC_000012.12:g.25245346_25245348dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000012.12:g.25245346_25245348dup/refseq/NM_004985.3%7CNM_033360.2/False?content-type=application%2Fjson | NP_004976.2:p.(Gly13dupG) | NP_004976.2:p.(Gly13dup) |
| NC_000017.11:g.7674245_7674250dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000017.11:g.7674245_7674250dup/refseq/NM_001126112.2/False?content-type=application%2Fjson | NP_001119584.1:p.(Asn239_Ser240dupNS) | NP_001119584.1:p.(Asn239_Ser240dup) |
| NC_000017.11:g.1656537_1656545dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000017.11:g.1656537_1656545dup/refseq/NM_006445.3/False?content-type=application%2Fjson | NP_006436.3:p.(Asn1881_Val1883dupNIV) | NP_006436.3:p.(Asn1881_Val1883dup) |
| NC_000012.12:g.25245355_25245357dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000012.12:g.25245355_25245357dup/refseq/NM_004985.3%7CNM_033360.2/False?content-type=application%2Fjson | NP_004976.2:p.(Gly10dupG) | NP_004976.2:p.(Gly10dup) |
| NC_000016.10:g.31185099_31185101dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000016.10:g.31185099_31185101dup/refseq/NM_004960.3/False?content-type=application%2Fjson | NP_004951.1:p.(Gly231dupG) | NP_004951.1:p.(Gly231dup) |
| NC_000019.10:g.11012988_11012990dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000019.10:g.11012988_11012990dup/refseq/NM_001128844.1%7CNM_001128849.1%7CNM_003072.3/False?content-type=application%2Fjson | NP_001122316.1:p.(Asn772dupN) | NP_001122316.1:p.(Asn772dup) |
| NC_000004.12:g.54726014_54726019dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000004.12:g.54726014_54726019dup/refseq/NM_000222.2/False?content-type=application%2Fjson | NP_000213.1:p.(Ala502_Tyr503dupAY) | NP_000213.1:p.(Ala502_Tyr503dup) |
| NC_000004.12:g.54727489_54727527dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000004.12:g.54727489_54727527dup/refseq/NM_000222.2/False?content-type=application%2Fjson | NP_000213.1:p.(Thr574_Arg586dupTQLPYDHKWEFPR) | NP_000213.1:p.(Thr574_Arg586dup) |
| NC_000004.12:g.54727487_54727522dup | GRCh38 | https://rest.variantvalidator.org/VariantFormatter/variantformatter/GRCh38/NC_000004.12:g.54727487_54727522dup/refseq/NM_000222.2/False?content-type=application%2Fjson | NP_000213.1:p.(Thr574_Pro585dupTQLPYDHKWEFP) | NP_000213.1:p.(Thr574_Pro585dup) |
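As a temporary workaround on my side, the extra residues can be stripped before comparing the two tools. This is only a sketch for post-processing the strings shown above, not a proposal for how VariantFormatter should be fixed; `strip_dup_suffix` is a name I made up.

```python
import re

# Matches one-letter residues appended after "dup" at the end of a
# p. description, with or without the closing parenthesis.
DUP_SUFFIX_RE = re.compile(r"dup[A-Za-z*]+(\)?)$")


def strip_dup_suffix(hgvs_p: str) -> str:
    """Drop residues appended after 'dup' (hypothetical helper)."""
    return DUP_SUFFIX_RE.sub(r"dup\1", hgvs_p)


pairs = [
    ("NP_003067.3:p.(Asn122dupN)", "NP_003067.3:p.(Asn122dup)"),
    ("NP_003053.1:p.(Gly513_Ile515dupGTI)", "NP_003053.1:p.(Gly513_Ile515dup)"),
]
for formatter_out, validator_out in pairs:
    print(strip_dup_suffix(formatter_out) == validator_out)
```

Descriptions that already end in `dup)` are left unchanged, so the function can be applied to the output of both tools.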

Simple SNP to large delins

The GRCh37 variant chr11:118650341:C:T is normalized as a simple SNP for most transcripts, but as a very large delins for some of them (including newer versions of transcripts that do give the SNP): https://rest.variantvalidator.org/VariantValidator/variantvalidator/GRCh37/chr11%3A118650341%3AC%3AT/select?content-type=application%2Fjson
Mutalyzer gives me a correct SNP for all transcripts (input: "GRCh37(chr11):g.118650341C>T" | output: "NC_000011.9(NM_004397.6):c.369G>A")
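To find the affected transcripts in a long response, I filter the transcript-level descriptions for anything that is not a plain substitution. A minimal sketch follows; the regex is my own simplification of HGVS c. substitutions, and the delins string in the example is a hypothetical placeholder, not actual VariantValidator output.

```python
import re

# Simplified pattern for a coding substitution, e.g. ":c.369G>A",
# ":c.-12C>T" or ":c.663+3486C>T".
SUB_RE = re.compile(r":c\.[-*]?\d+(?:[+-]\d+)?[ACGT]>[ACGT]$")


def non_substitutions(descriptions: list[str]) -> list[str]:
    """Return the descriptions that are not simple substitutions."""
    return [d for d in descriptions if not SUB_RE.search(d)]


descriptions = [
    "NC_000011.9(NM_004397.6):c.369G>A",  # Mutalyzer output quoted above
    "NM_000000.0:c.1_50delinsACGT",       # hypothetical placeholder
]
print(non_substitutions(descriptions))
```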

All of these errors occur with both the Python library and the REST API.
Thanks a lot for your help!
