Fix some problems with preprocessed datasets #761

beckobert · 2024-12-20T16:25:30Z

There seems to be a problem when using preprocessed datasets in combination with multiheads.
When a structure is read from an HDF5Dataset, it is first loaded into a Configuration. The head is not specified when the Configuration is initialized, so it is set to "Default". The correct head saved in the HDF5Dataset is currently applied only if configuration.head is None, which is never the case.

This pull request should fix that by always setting the head to the value saved in the HDF5Dataset, falling back to "Default" if none is specified (in line with how heads are set when turning the configuration into AtomicData).
In principle, this assignment could also be moved into the initialization of the Configuration.
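
To illustrate the difference, here is a minimal sketch of the old and new assignment logic. It is not the actual code from this repository: the `Configuration` class below is a stripped-down stand-in, and `assign_head_old`, `assign_head_new`, and the `stored` dictionary (standing in for the attributes read from the HDF5 file) are hypothetical names used only for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Configuration:
    # Simplified stand-in: the head defaults to "Default" when not given.
    head: Optional[str] = "Default"


def assign_head_old(config: Configuration, stored: dict) -> Configuration:
    # Old behaviour: the stored head is applied only if head is None,
    # which never happens because the default is already "Default".
    if config.head is None:
        config.head = stored.get("head")
    return config


def assign_head_new(config: Configuration, stored: dict) -> Configuration:
    # New behaviour: always take the stored head, and fall back to
    # "Default" if the dataset does not provide one.
    head = stored.get("head")
    config.head = head if head is not None else "Default"
    return config


old = assign_head_old(Configuration(), {"head": "dft"})
new = assign_head_new(Configuration(), {"head": "dft"})
fallback = assign_head_new(Configuration(), {})
print(old.head, new.head, fallback.head)
```

With the old logic the stored head `"dft"` is silently dropped, while the new logic keeps it and still yields `"Default"` when nothing is stored.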

There is also a problem with preprocessed test sets, independent of multiheads, when they are preprocessed with multiple processes. Contrary to what the documentation says and run_train.py expects, they were not saved in their own directory but in the same directory under different file names.
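
The intended layout can be sketched as follows. This is only an illustration under assumptions: the directory structure (`<prefix>/test/<name>/`), the file naming, and the helpers `shard_path` and `find_test_sets` are hypothetical and are not taken from the preprocessing script or from run_train.py.

```python
import glob
import os


def shard_path(prefix: str, test_name: str, process_index: int) -> str:
    # Hypothetical layout: each test set gets its own directory, and each
    # process writes one file into it.
    return os.path.join(
        prefix, "test", test_name, f"{test_name}_{process_index}.h5"
    )


def find_test_sets(prefix: str) -> dict:
    # Collect one entry per test-set directory, with all of its shard files.
    test_sets = {}
    for folder in sorted(glob.glob(os.path.join(prefix, "test", "*"))):
        if os.path.isdir(folder):
            name = os.path.basename(folder)
            test_sets[name] = sorted(glob.glob(os.path.join(folder, "*.h5")))
    return test_sets
```

Grouping the files by directory lets the reader treat every shard in a folder as part of one test set, regardless of how many processes wrote it.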

beckobert added 3 commits December 20, 2024 16:22

Correct assignment of head

157c7b3

fix preprocessed test sets

321e0a6

import glob correctly

15c2231

beckobert changed the title ~~Correct assignment of head~~ Fix some problems with preprocessed datasets Dec 27, 2024
