One of the derived parameters is hash_structure, which collects structural information and hashes it so that structures can be matched. I've noticed that it does not produce the same hash for a structure before and after a round trip through ASE's extxyz write & read.
This seems to make the hash function somewhat useless, since we can reasonably expect use cases where structures are read, written, and calculated many times.
As a demonstration, see this failing test:
import io

import ase
import ase.io
import numpy as np
import pytest
from ase.calculators.lj import LennardJones

from abcd.model import AbstractModel


@pytest.fixture
def rng():
    return np.random.default_rng(seed=42)


def test_hash_structure(rng):
    # create atoms & add a calculator
    atoms = ase.Atoms(
        "H3",
        positions=rng.random(size=(3, 3)),
        pbc=True,
        cell=[2, 2, 2],
    )
    atoms.calc = LennardJones()
    atoms.calc.calculate(atoms)

    # dump to XYZ
    buffer = io.StringIO()
    ase.io.write(buffer, atoms, format="extxyz")

    # read back
    buffer.seek(0)
    atoms_read = ase.io.read(buffer, format="extxyz")

    # read in both of them
    abcd_data = AbstractModel.from_atoms(atoms)
    abcd_data_after_read = AbstractModel.from_atoms(atoms_read)

    assert abcd_data["hash_structure"] == abcd_data_after_read["hash_structure"]

which fails with
> assert abcd_data["hash_structure"] == abcd_data_after_read["hash_structure"]
E AssertionError: assert '8bb52aa8ab4be61a8ad0f76b32a0c810' == '4cf0d1cbde645832830e6f6d60f86c9a'
E
E - 4cf0d1cbde645832830e6f6d60f86c9a
E + 8bb52aa8ab4be61a8ad0f76b32a0c810
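A possible contributing factor, not confirmed against the abcd implementation, is that a text format such as extxyz stores floats with a fixed number of decimals, so values read back can differ from the originals in their low-order digits. Hashing the raw float values would then give a different digest. The sketch below illustrates one way a hash could tolerate that: round every float to a fixed number of decimals before hashing. The function name `stable_structure_hash`, its arguments, the choice of SHA-256, and the `"%16.8f"` format used to mimic the precision loss are all assumptions made for illustration; none of them is taken from abcd.

```python
import hashlib


def stable_structure_hash(numbers, positions, cell, pbc, decimals=6):
    """Hypothetical helper: hash structural data after rounding floats.

    Not abcd's hash_structure; a sketch of a tolerance-based alternative.
    """

    def fmt(x):
        # Round first, then add 0.0 so that -0.0 is rendered like 0.0.
        return f"{round(float(x), decimals) + 0.0:.{decimals}f}"

    parts = [
        ",".join(str(int(z)) for z in numbers),
        ",".join(fmt(x) for row in positions for x in row),
        ",".join(fmt(x) for row in cell for x in row),
        ",".join("1" if flag else "0" for flag in pbc),
    ]
    return hashlib.sha256("|".join(parts).encode("ascii")).hexdigest()


# Arbitrary example positions with full double precision.
positions = [
    [0.7739560485559633, 0.4388784397520523, 0.8585979199113825],
    [0.6973680290593639, 0.09417734788764953, 0.9756223516367559],
    [0.761139701990353, 0.7860643052769538, 0.12811363267554587],
]
cell = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]
pbc = [True, True, True]

# Assumption: mimic a fixed-precision text round trip with 8 decimals.
positions_read = [[float("%16.8f" % x) for x in row] for row in positions]

h_before = stable_structure_hash([1, 1, 1], positions, cell, pbc)
h_after = stable_structure_hash([1, 1, 1], positions_read, cell, pbc)
print(h_before == h_after)
```

Rounding only narrows the problem: a value that sits very close to a rounding boundary can still land on different sides before and after the round trip, so this is a mitigation rather than a guarantee. It also does not address any difference in which properties are present on the two Atoms objects, which would need to be checked separately.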