This repository was archived by the owner on Jul 10, 2024. It is now read-only.
Non-stereo Encoding Problem with Explicit Hydrogens #7
Open
Description
In certain cases, explicit hydrogens seem to cause trouble for the atom-labelling layer of the hash. In these cases, the SMILES generated by the standardizer appears to produce a different hash than the input molfile itself.
Consider the following poorly laid-out structure:
[molfile below]
Generating the hash directly from this molfile gives this Std_SMILES:
[H][C@@]12CC3=C(C(O)=C(OC)C(C)=C3)[C@@]([H])(N1C)[C@@]4([H])N([C@H]2O)[C@@]5([H])COC(=O)[C@]8(CS[C@]4([H])C6=C5C7=C(OCO7)C(C)=C6OC(C)=O)NCCC9=C8C=C(OC)C(O)=C9
And this hash:
DCLRH149F-FGAV2BD6PA-FA8DSLTXL4L-FALJX635AFC5
However, when that same SMILES is fed into the standardizer, I get:
DCLRH149F-FFMPLZ16VC-FC1Y2MQMGXU-FCUZ42LBF8VB
If the explicit hydrogens are removed entirely, the output hash is now compatible with the SMILES:
DCLRH149F-FFMPLZ16VC-FC1Y2MQMGXU-FCU1SY5C8458
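To make it easier to see where the hashes quoted above diverge, here is a minimal sketch that compares two hash strings block by block. It assumes only that the hash is a sequence of blocks separated by `-`, as the examples suggest; the function name `differing_blocks` is hypothetical and not part of the hashing tool.

```python
def differing_blocks(hash_a, hash_b):
    """Return the zero-based positions of the '-'-separated blocks that differ."""
    blocks_a = hash_a.split("-")
    blocks_b = hash_b.split("-")
    if len(blocks_a) != len(blocks_b):
        raise ValueError("hashes have a different number of blocks")
    return [i for i, (a, b) in enumerate(zip(blocks_a, blocks_b)) if a != b]


# Hashes copied from this report.
from_molfile = "DCLRH149F-FGAV2BD6PA-FA8DSLTXL4L-FALJX635AFC5"
from_smiles = "DCLRH149F-FFMPLZ16VC-FC1Y2MQMGXU-FCUZ42LBF8VB"
without_explicit_h = "DCLRH149F-FFMPLZ16VC-FC1Y2MQMGXU-FCU1SY5C8458"

print(differing_blocks(from_molfile, from_smiles))
print(differing_blocks(from_smiles, without_explicit_h))
```

For the values above, the molfile hash and the SMILES hash share only the first block, while the hydrogen-free hash matches the SMILES hash in the first three blocks and differs in the last one.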
Molfile for explicit hydrogen version:
Ketcher 12201304332D 1 1.00000 0.00000 0
59 67 0 1 0 999 V2000
-2.2321 -1.8660 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.7321 -1.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.5981 -0.5000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.5981 0.5000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-3.4641 1.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-3.4641 2.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-4.3301 2.5000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.5981 2.5000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.5981 3.5000 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
-3.4641 4.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.7321 2.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.8660 2.5000 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
-1.7321 1.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.8660 0.5000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.4740 1.2647 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.7321 0.0000 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
-0.9071 -0.4750 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.0000 1.0000 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
0.0000 -1.0000 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
-0.8660 -1.5000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.8660 -2.5000 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
0.8660 -1.5000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.8561 -2.3746 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
2.4488 -3.1947 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
4.5544 -3.0234 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
5.3132 -1.2000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
6.5741 -1.3179 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
4.8632 0.2250 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
4.0294 0.9234 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
2.9488 1.1197 0.0000 S 0 0 0 0 0 0 0 0 0 0 0 0
0.8660 0.5000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.8811 1.3246 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
1.7321 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
2.5981 0.5000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
2.4244 1.4848 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
1.6097 2.0768 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.7419 1.9858 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1.7927 2.9165 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
3.4641 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
3.9301 0.3000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
3.4641 -1.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
4.2072 -1.6691 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
3.8005 -2.5827 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
2.8060 -2.4781 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
2.5981 -1.5000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1.7321 -1.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
5.6506 -0.2222 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
6.4172 0.1894 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
6.4966 1.1232 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
5.8342 1.6954 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
6.0136 2.6792 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
5.2512 3.3264 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
5.4306 4.3102 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
4.3096 2.9897 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
3.5473 3.6370 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
3.7266 4.6207 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
4.1303 2.0060 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
4.8676 1.3838 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
2 1 1 1 0 0
2 3 1 0 0 0
3 4 1 0 0 0
4 5 1 0 0 0
5 6 2 0 0 0
6 7 1 0 0 0
6 8 1 0 0 0
8 9 1 0 0 0
9 10 1 0 0 0
8 11 2 0 0 0
11 12 1 0 0 0
11 13 1 0 0 0
4 13 2 0 0 0
13 14 1 0 0 0
14 15 1 1 0 0
14 16 1 0 0 0
2 16 1 0 0 0
16 17 1 0 0 0
14 18 1 0 0 0
18 19 1 1 0 0
18 20 1 0 0 0
20 21 1 0 0 0
2 21 1 0 0 0
21 22 1 1 0 0
20 23 1 0 0 0
23 24 1 1 0 0
23 25 1 0 0 0
25 26 1 0 0 0
26 27 1 0 0 0
27 28 2 0 0 0
29 27 1 0 0 0
29 30 1 6 0 0
30 31 1 0 0 0
31 32 1 0 0 0
18 32 1 0 0 0
32 33 1 1 0 0
32 34 1 0 0 0
34 35 1 0 0 0
35 36 1 0 0 0
36 37 1 0 0 0
37 38 1 0 0 0
37 39 2 0 0 0
35 40 2 0 0 0
40 41 1 0 0 0
40 42 1 0 0 0
42 43 1 0 0 0
43 44 1 0 0 0
44 45 1 0 0 0
45 46 1 0 0 0
42 46 2 0 0 0
46 47 1 0 0 0
23 47 1 0 0 0
34 47 2 0 0 0
29 48 1 0 0 0
48 49 1 0 0 0
49 50 1 0 0 0
50 51 1 0 0 0
51 52 1 0 0 0
52 53 2 0 0 0
53 54 1 0 0 0
53 55 1 0 0 0
55 56 1 0 0 0
56 57 1 0 0 0
55 58 2 0 0 0
58 59 1 0 0 0
29 59 1 0 0 0
51 59 2 0 0 0
M END
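For anyone trying to reproduce this, the following sketch lists which atoms in a V2000 molfile are explicit hydrogens. It assumes the whitespace-separated layout shown above (a strict V2000 file uses fixed-width columns, where fields can run together), and the helper name `explicit_hydrogen_indices` is hypothetical.

```python
def explicit_hydrogen_indices(molfile_text):
    """Return 1-based atom numbers whose element symbol is H."""
    lines = [ln.strip() for ln in molfile_text.splitlines() if ln.strip()]
    # The counts line is the one ending in the V2000 version tag.
    counts_idx = next(i for i, ln in enumerate(lines) if ln.endswith("V2000"))
    n_atoms = int(lines[counts_idx].split()[0])
    atom_lines = lines[counts_idx + 1 : counts_idx + 1 + n_atoms]
    # Atom line fields: x, y, z, element symbol, then further flags.
    return [
        number
        for number, ln in enumerate(atom_lines, start=1)
        if ln.split()[3] == "H"
    ]
```

Applied to the molfile above, this should report atoms 1, 15, 19, 24 and 33, each of which is the far end of a wedge bond in the bond block.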

