feat: avoid excessive low divergence iteration #73

Merged

p-e-w merged 3 commits into p-e-w:master from spikymoth:cleanup-objective on Dec 14, 2025
Conversation

@spikymoth (Contributor) commented Dec 5, 2025

Technically 2 features and some cleanup:

  1. Make direction scope configurable
  2. Adjust objective to discourage iteration near 0 KL divergence
  3. Enable grouping in sampler
  4. Clean up the objective function

The 1st change is self-explanatory, although compared to the commit I pushed to #52 I simplified it a little by not using an enum. It's a bit more error-prone this way because the same strings are repeated multiple times, but the enum also makes things more verbose.

The 2nd change implements your suggestion from #52 (comment), but I then added a utility function that creates a smooth transition using a sigmoid function. Without this smoothing, best_trials seemed to get a bit confused and was less effective at avoiding low divergence values. With this change, I see a lot more trials that get the refusals score down, though mostly when turning up n_trials or cheating by manually narrowing the parameter range of direction_index.
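
The smoothing idea can be sketched like this. The function name, sharpness constant, and blending formula below are illustrative assumptions, not the PR's actual code; the point is that a sigmoid replaces a hard cutoff, so the optimizer sees a gradual shift from the KL term to the refusal count near the threshold:

```python
import math

def smooth_step(x: float, threshold: float, sharpness: float = 50.0) -> float:
    """Sigmoid rising from 0 to 1 as x crosses threshold (hypothetical helper)."""
    return 1.0 / (1.0 + math.exp(-sharpness * (x - threshold)))

def score(kl_divergence: float, refusals: int, kl_min: float = 0.01) -> float:
    # Near and below kl_min, weight shifts smoothly from the KL term toward
    # the refusal count, discouraging iteration at meaninglessly low divergence.
    w = smooth_step(kl_divergence, kl_min)
    return w * kl_divergence + (1.0 - w) * refusals
```

Without the smoothing, a trial sitting just below the threshold would see a discontinuous jump in its score, which is presumably what confused the sampler.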

The 3rd and 4th changes are together in 1 commit. By enabling grouping, TPESampler will automatically divide the search space across categorical parameters. Since direction_index is only used for direction_scope == "global", that's exactly what we need when both direction scopes are enabled. It also means we don't have to set direction_index unconditionally (although I think that even if you don't set group=True, it still works, it just doesn't divide the search space).

The other changes are mostly cleanup: I introduced variables num_layers and last_layer_index so they aren't computed each time, and we can use one or the other depending on the application - last_layer_index when we want to generate an index, num_layers when we want some fraction of the total size.

I experimented a lot with different parameterizations, like turning max_weight_position into an offset from direction_index or splitting min_weight and min_weight_distance up into forward and backward parts (making the weight asymmetric). I think the latter has some merit, but I'm not sure it's worth the cost of 2 extra parameters to optimize. Either way it was too speculative for this PR, so I limited the objective function changes to just cleanup.

I've tested this independently from #52. I expect the change in scoring to help even more there, but I think it's a good change regardless.

@p-e-w (Owner) left a comment:

Thanks for the PR!

@spikymoth force-pushed the cleanup-objective branch 2 times, most recently from 0d0aeaf to 2b129fe on December 9, 2025
@spikymoth (Contributor, Author) commented:

The patch stack was getting very messy, so I rebased and merged the commits into logical units.

Scoring now uses a hard transition from KL divergence to (scaled) refusal count. I renamed the setting to kl_divergence_min and updated the comment.
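
A minimal sketch of the hard transition, assuming a threshold default of 0.01 and a simple linear refusal scaling (the PR's exact formula may differ):

```python
KL_DIVERGENCE_MIN = 0.01  # assumed default for the kl_divergence_min setting

def optimization_score(kl_divergence: float, refusals: int,
                       refusal_scale: float = 1.0) -> float:
    """Below the threshold, the score becomes the (scaled) refusal count,
    so pushing divergence further toward zero no longer improves it."""
    if kl_divergence < KL_DIVERGENCE_MIN:
        return refusal_scale * refusals
    return kl_divergence
```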

Before showing the trial results, we now filter out any trials from best_trials that have a worse divergence with the same refusal count.

@p-e-w (Owner) commented Dec 11, 2025

I thought about this some more, and we actually can't use study.best_trials at all with this approach.

The problem is that best_trials contains the Pareto front of the study – but that Pareto front is based on an objective that doesn't match our actual preferences. Note that we still want to rank trials by (kld, refusals) for the purpose of picking the best one, we just modify the optimization score to encourage convergence towards lower refusals.

As currently implemented, it's possible for best_trials to exclude trials that are on the Pareto front we care about, because it is based on the Pareto front the optimizer sees.

@spikymoth (Contributor, Author) commented:

Alright, it looks like Pareto front calculation in Optuna (and maybe generally) is actually really simple. These are the relevant parts for our purposes:

Here loss_values are just the scores, since we're minimizing.

So it does the following:

  1. Get the unique, lexicographically sorted values (in our case, sorted by KL divergence score, then by refusal score)
  2. Get the cumulative minima for the 2nd objective
  3. Mask off trials that don't improve the minimum

I implemented the same thing without numpy, which makes it even simpler. Performance should be fine for the number of trials we're dealing with here.
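
The three steps above translate to a short pure-Python function. This is a sketch of the approach described, not the PR's actual code; `points` are `(kld_score, refusal_score)` pairs, both minimized:

```python
def pareto_front(points: list[tuple[float, float]]) -> list[tuple[float, float]]:
    """Pareto front for two minimized objectives: sort unique points
    lexicographically, then keep each point that strictly improves the
    running minimum of the second objective."""
    front = []
    best_second = float("inf")
    for first, second in sorted(set(points)):
        if second < best_second:
            front.append((first, second))
            best_second = second
    return front
```

Sorting dominates the cost at O(n log n), which is negligible for the trial counts involved here.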

@p-e-w (Owner) commented Dec 12, 2025

Btw, feel free to join the Discord (link in README) for more real-time communication. I can often respond there more quickly than on GitHub.

@spikymoth changed the title from "feat: make direction scope configurable, improve scoring" to "feat: avoid excessive low divergence iteration" on Dec 12, 2025
@spikymoth (Contributor, Author) commented:

> Btw, feel free to join the Discord (link in README) for more real-time communication. I can often respond there more quickly than on GitHub.

Sure, joined just now :) I don't think I'll be super active, but I'll try to keep an eye on notifications and such.

Commits:

- Adjusts the scoring function to avoid targeting meaninglessly low KL divergences. Below a threshold value, the KL divergence score switches to the refusal count. Adds config option kl_divergence_target (defaulting to 0.01).
- Create variables for num_layers and last_layer_index: improves readability and makes choices explicit.
@p-e-w p-e-w merged commit 9d17348 into p-e-w:master Dec 14, 2025
4 checks passed
@p-e-w (Owner) commented Dec 14, 2025

Merged! I appreciate your patience in seeing this through until it was correct in every way, that's super valuable.

At some point in the future, I intend to test whether the kl_divergence_scale parameter actually carries its weight. Scale does matter with multi-objective TPE because it uses hypervolumes to determine overall improvement, but in practice, the KLD is usually on the scale of 1 for the type of optimization we do, so we might not need this to be configurable after all.

@spikymoth (Contributor, Author) commented:

Yes, I also wonder about other changes to the scoring function, like applying a power law (with an exponent smaller than 1) to the refusal count, so reducing the number of refusals from 2 to 1 has more weight than a reduction from 20 to 19. But it's quite tricky to go from my little experiments to something production-ready.

@spikymoth spikymoth deleted the cleanup-objective branch January 3, 2026 02:07