
Auto-differentiable numerical inverse & efficient materialization of BNAF masks #234

Open
noahewolfe wants to merge 9 commits into danielward27:main from noahewolfe:diffable-inverse-bnaf

Conversation


@noahewolfe noahewolfe commented Feb 9, 2026

Here, we make two updates which are particularly relevant for block-neural autoregressive flows:

  1. Auto-differentiable numerical inverse via the implicit function theorem. (Fixes #176, "Autodiff problem with block_neural_autoregressive_flow".)
  • Using the implicit function theorem (I followed https://arxiv.org/abs/2111.00254 in particular), we define a custom Jacobian-vector product for NumericalInverse transforms; a rough sketch of the idea follows after this list.
  • Adds lineax as a dependency for memory-efficient computation of the JVP.
  • I've tested that this works with block_neural_autoregressive_flow and the default greedy bisection search (in a publication that should appear on arXiv in the next week or two).
  • I added very light unit tests, which pass.
  2. Efficient materialization of block masks with Kronecker products (see the second sketch after this list).
  • I found that it was slow to initialize block-neural autoregressive flows with large nn_block_dim (i.e., just constructing the model in memory, with no training at all).
  • I traced this to block_diag_mask and block_tril_mask, and reimplemented them with Kronecker products.
  • These changes pass the unit tests in test_masks.py.
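
For intuition, here is a minimal self-contained sketch of the idea behind (1). To be clear, this is *not* the code in the PR: `forward`, `solve_inverse`, and the scalar toy map are stand-ins for a flowjax bijection and its inverter, and the real implementation handles vector-valued bijections (where the linear system in the JVP is solved with lineax rather than by a scalar division).

```python
import jax
import jax.numpy as jnp


def forward(params, x):
    # Toy strictly increasing map, standing in for a bijection's forward transform.
    return params["scale"] * x + jnp.tanh(x) + params["shift"]


def solve_inverse(params, y, n_iter=100):
    # Plain bisection; we never differentiate through this loop.
    lower, upper = jnp.asarray(-1e3), jnp.asarray(1e3)
    for _ in range(n_iter):
        mid = 0.5 * (lower + upper)
        too_low = forward(params, mid) < y
        lower = jnp.where(too_low, mid, lower)
        upper = jnp.where(too_low, upper, mid)
    return 0.5 * (lower + upper)


@jax.custom_jvp
def inverse(params, y):
    return solve_inverse(params, y)


@inverse.defjvp
def inverse_jvp(primals, tangents):
    params, y = primals
    dparams, dy = tangents
    x = inverse(params, y)
    # Implicit function theorem: forward(params, x) = y implies
    # (dforward/dx) dx + (dforward/dparams) dparams = dy, so
    # dx = (dforward/dx)^{-1} (dy - (dforward/dparams) dparams).
    dfdx = jax.grad(forward, argnums=1)(params, x)
    _, df_dparams_dot = jax.jvp(lambda p: forward(p, x), (params,), (dparams,))
    dx = (dy - df_dparams_dot) / dfdx
    return x, dx


params = {"scale": jnp.asarray(2.0), "shift": jnp.asarray(0.5)}
y = jnp.asarray(1.3)
print(inverse(params, y))
print(jax.grad(inverse, argnums=1)(params, y))  # gradient through the numerical inverse
```

The key point is that the tangent of the inverse comes from a single linear relation at the converged point, so the derivative is exact (up to solver tolerance) and never requires unrolling the root-finder.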
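And a sketch of the Kronecker-product construction for (2). The signatures mirror block_diag_mask and block_tril_mask conceptually, but the exact conventions here (dtype, strict vs. non-strict lower triangle) are illustrative assumptions rather than a copy of the PR's code.

```python
import jax.numpy as jnp


def block_diag_mask(block_shape, n_blocks):
    # Ones on the block diagonal: identity pattern expanded to the block size.
    return jnp.kron(jnp.eye(n_blocks, dtype=jnp.int32), jnp.ones(block_shape, jnp.int32))


def block_tril_mask(block_shape, n_blocks):
    # Ones strictly below the block diagonal.
    strict_tril = jnp.tril(jnp.ones((n_blocks, n_blocks), jnp.int32), k=-1)
    return jnp.kron(strict_tril, jnp.ones(block_shape, jnp.int32))
```

Each mask is a single jnp.kron of a small n_blocks × n_blocks pattern with an all-ones block, so nothing is built index-by-index, which is where the initialization speedup for large nn_block_dim comes from.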

Let me know what you think, happy to answer any questions and take any feedback!

@danielward27 (Owner)

Nice! Cheers. I'm a bit busy at the moment but will try to check it out before the end of the week.


@danielward27 (Owner) left a comment


Thank you so much again for the contribution! It looks good to me; I've just left a few comments with minor suggestions, let me know what you think.

(1, block_dim),
]

def make_layer(inp):

This change seems to be unused?

bijection: AbstractBijection,
inverter: Callable[[AbstractBijection, Array, Array | None], Array],
diffable_inverter: bool = False,
raise_old_error: bool = False,

I think we can remove raise_old_error. If we aren't going to keep the legacy behavior around for a deprecation cycle, we should probably just not include it, to simplify the code a bit.

self,
bijection: AbstractBijection,
inverter: Callable[[AbstractBijection, Array, Array | None], Array],
diffable_inverter: bool = False,

To me it would feel less confusing to replace the diffable_inverter argument with use_implicit_diff or use_implicit_differentiation, as a keyword-only argument defaulting to True.
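
Roughly something like this (just a sketch of the signature I have in mind, not code from the PR):

```python
def __init__(
    self,
    bijection: AbstractBijection,
    inverter: Callable[[AbstractBijection, Array, Array | None], Array],
    *,
    use_implicit_differentiation: bool = True,
):
    ...
```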
