Skip to content

Conversation

@CompRhys
Copy link
Contributor

@CompRhys CompRhys commented Jun 9, 2025

This is a fairly hefty set of changes that remove the use of global dtype setting from MACE codebase. This is done by adding custom to methods to change the dtypes of buffers. dtype is also made into a kwarg for all dataset objects.

All tests pass apart from those that have expected values that I believe are seed dependent and something must have been affected here that changes those values.

A bunch of minor code consistency changes are also included in this PR. Notable things are using tmp_path fixtures rather than manually creating and cleaning tmp folders, using the cli commands for the cli tests, adding some directories to the tests to make it more clear which tests are cli tests and which are core, adding concurrency to tests to reduce ci minute waste on PRs pushed in quick succession, using consistent docstring styles, using npt.assert_allclose consistently

The one major remaining source of confusion is in the test_calculators.py where using float32 causes everything to return very different/wrong values (perhaps this failure is also related to what we see in TorchSim/torch-sim#93?).

Related: #877, #328

Comment on lines 150 to 197
layout=cueq_config.layout,
shared_weights=shared_weights,
internal_weights=internal_weights,
dtype=torch.get_default_dtype(),
math_dtype=torch.get_default_dtype(),
)
if (
OEQ_AVAILABLE
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure the dtype is necessary here, as the kernels might be different depending on dtype

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PASSIONLab/OpenEquivariance#118 are implementing necessary features now hence why I left it for oeq, I did just assume that it would already be in cueq so removed, will add a test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does turn out to be problematic which is maybe poor design from cuequivariance (especially given PASSIONLab/OpenEquivariance#118 was responsive and implemented a working .to method that does update the kernels according to maintainers)

@CompRhys
Copy link
Contributor Author

This is blocked by getting a solution to NVIDIA/cuEquivariance#124 and then updating the values in various tests that assert against seeded answers.

@CompRhys
Copy link
Contributor Author

.to method has been added in 0.6.0 cueq release so will try revive this at the weekend. It will require updating the pin to be >0.6.0 for cueq which may cascade other issues.

@CompRhys CompRhys force-pushed the isolate-dtypes branch 2 times, most recently from f6fd344 to 6685858 Compare August 17, 2025 14:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants