@LucaMantani LucaMantani commented Oct 9, 2025

This PR improves batching, addressing issue #404.

The idea is to precompute the inverse covariance matrices when the batch partition is fixed. It also adds an option to reshuffle the batches at each epoch.

Data batches can be controlled from the runcard with the following options:

batch_size: 128
batch_seed: 3
shuffle_each_epoch: False
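For intuition, batch generation driven by these options might look like the sketch below. The helper name `make_batches` and its signature are hypothetical, not the actual implementation; only the seeded shuffle and the fixed partition are taken from the description above.

```python
import numpy as np

def make_batches(n_data, batch_size, seed):
    """Partition dataset indices into batches after one seeded shuffle.

    Hypothetical helper: name and signature are illustrative only.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_data)
    return [idx[i:i + batch_size] for i in range(0, n_data, batch_size)]

# With shuffle_each_epoch: False, this partition is computed once and
# reused every epoch, so per-batch quantities (like the inverse of the
# covariance block) can be precomputed instead of rebuilt every step.
batches = make_batches(n_data=300, batch_size=128, seed=3)
```

With 300 data points and batch_size 128 this gives three batches of sizes 128, 128 and 44, identical across epochs.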

Main implementations:

  • The stream of DataBatches now yields BatchSpec objects containing both the batch indices and, when available, the precomputed inv_cov.
  • The data_batches node now lets the user decide whether the batches should be reshuffled at each epoch. If they are not, and a fit_covariance matrix is provided, the inv_cov corresponding to each batch is precomputed.
  • The likelihood class call method can now receive a BatchSpec object, performing the slicing and using the precomputed inv_cov when available. If no batch is provided, it behaves as before. Note: currently only the MC method uses batching, but with this modification any method could use it; e.g. the Hessian method might benefit in principle, since it also uses gradient_descent.
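The bullets above can be sketched as follows. The class name BatchSpec and the idea of carrying a precomputed inv_cov come from the description; the exact fields, helper names, and chi-square form here are assumptions for illustration, not the PR's actual API.

```python
from dataclasses import dataclass
from typing import Optional

import numpy as np

@dataclass
class BatchSpec:
    """Indices of one batch plus, optionally, its precomputed inverse covariance."""
    indices: np.ndarray
    inv_cov: Optional[np.ndarray] = None

def precompute_specs(cov, batches):
    """Invert each batch's covariance block once, up front.

    Only valid when the partition is fixed (shuffle_each_epoch: False).
    """
    return [
        BatchSpec(idx, np.linalg.inv(cov[np.ix_(idx, idx)]))
        for idx in batches
    ]

def chi2(residuals, cov, spec):
    """Chi-square on one batch, reusing the precomputed inverse if present."""
    r = residuals[spec.indices]
    inv = spec.inv_cov
    if inv is None:
        # Shuffled batches: invert the sliced covariance block on the fly.
        inv = np.linalg.inv(cov[np.ix_(spec.indices, spec.indices)])
    return r @ inv @ r
```

Precomputing the inversions once per fit, instead of once per gradient step, is the plausible source of the large speedup reported below.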

@LucaMantani LucaMantani marked this pull request as draft October 9, 2025 14:21
@LucaMantani LucaMantani marked this pull request as ready for review November 10, 2025 11:31

codecov bot commented Nov 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.57%. Comparing base (cf194c6) to head (5debb01).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #409      +/-   ##
==========================================
+ Coverage   95.54%   95.57%   +0.02%     
==========================================
  Files          29       29              
  Lines        1438     1468      +30     
==========================================
+ Hits         1374     1403      +29     
- Misses         64       65       +1     


@LucaMantani LucaMantani linked an issue Nov 10, 2025 that may be closed by this pull request
@LucaMantani
Member Author

Running the attached card (full DIS) with

les_houches_exe lh_batching.yaml -rep 1

in main takes:

[INFO]: MONTE CARLO RUNNING TIME: 389.940048 s

while in this PR it takes

[INFO]: MONTE CARLO RUNNING TIME: 6.005040 s

The results are identical.

lh_batching.yaml

@LucaMantani LucaMantani requested a review from comane November 10, 2025 12:19
Collaborator

@vschutze-alt vschutze-alt left a comment


I've made some small changes to the documentation. Other than that, it looks good to me.


Development

Successfully merging this pull request may close these issues.

Optimise batching

3 participants