Adding parameterizations of Hurwitz matrices#47
Conversation
…of matrix eigenvalues, modifying skew class to implement a right inverse and using _register_manifold_ instead of register_parameterization. Now skew constructor needs a size argument.
|
Oh, the first external contribution to the project! Sadly, the project is a bit out of date, but given that it's useful for people, I'll try to spend some time one of these days bringing it up to speed (supporting new PyTorch versions, perhaps even torch.compile, updating the action runners, etc) This may take a bit of time, and if I don't answer in a couple weeks feel free to ping me again, but yeah, let's upgrade geotorch a bit! |
|
Alright I won't hesitate to reach out in a few weeks, you can contact me if you want me to do some adjustments on the package I'll be happy to help ! |
|
Hello, Mario ! I wonder if you have any spare time in following weeks to work on the project ? I may be very interested in making the library work with torch.compile but it seems the current main issue is the cache manager but it is so convenient for me which is training rnns that I do not know if it is worth to drop it for compile. Do you want me to look on it ? Cheers Alexandre |
lezcano
left a comment
There was a problem hiding this comment.
First review, will do another pass this weekend
geotorch/constraints.py
Outdated
| r"""Adds an alpha_stability parametrization to the matrix ``module.tensor_name``. | ||
|
|
||
| When accessing ``module.tensor_name``, the module will return the parametrized | ||
| version :math:`A` so that :math:`\min_{\lambda \in Sp(A)} \Re(\lambda)<= -\alpha`. |
There was a problem hiding this comment.
Note that we don't use the spectrum notation here. Can you rewrite this following the notation we use in other manifolds where we discuss eigenvalues=
There was a problem hiding this comment.
Here is the revised docstring :
I only talk about eigenvalues in plain words now.
r"""Adds an :math:`alpha`-stability constraint to the matrix ``module.tensor_name``.
When accessing ``module.tensor_name``, the module will return the parametrized
version :math:`X` so that all of its eigenvalues have a negative real part lower
than :math: `-\alpha`, with :math:`\alpha \geq 0`.
If the tensor has more than two dimensions, the parametrization will be
applied to the last two dimensions.
Examples::
>>> layer = nn.Linear(30, 30)
>>> alpha_stable(layer, "weight", alpha=2)
>>> torch.all(torch.real(torch.linalg.eigvals(layer.weight))<=-layer.alpha)
True
Args:
module (nn.Module): module on which to register the parametrization
tensor_name (string): name of the parameter, buffer, or parametrization
on which the parametrization will be applied. Default: ``"weight"``
alpha (float): Upper bound on the real part of the eigevenalues of :math: `X`
"""
There was a problem hiding this comment.
nit. "Absolute value of the upper bound on the real part of the eigevenalues of :math: X", but otherwise LGTM
geotorch/constraints.py
Outdated
| on which the parametrization will be applied. Default: ``"weight"`` | ||
| alpha (float): Bound on the decay rate and eigevenalues of matrix A | ||
| """ | ||
| return _register_manifold(module, tensor_name, Hurwitz, alpha) No newline at end of file |
There was a problem hiding this comment.
Add an end of line (the linter should have fixed this, but it's not well maintained...)
geotorch/hurwitz.py
Outdated
| def __init__(self, size, alpha: float = 0.0, triv="expm"): | ||
| r""" | ||
| Manifold of matrices with eigenvalues with negative real parts, | ||
| also called Hurwitz matrices. They represend linear stable linear dynamical systems. |
There was a problem hiding this comment.
| also called Hurwitz matrices. They represend linear stable linear dynamical systems. | |
| also called Hurwitz matrices. |
no need to discuss applications in the docs.
geotorch/hurwitz.py
Outdated
| Manifold of matrices with eigenvalues with negative real parts, | ||
| also called Hurwitz matrices. They represend linear stable linear dynamical systems. | ||
|
|
||
| We have the following result : A is Hurwitz with prescribed decay rate \alpha |
There was a problem hiding this comment.
Decay rate has not been defined. Please write self-contained definitions.
geotorch/hurwitz.py
Outdated
|
|
||
| We have the following result : A is Hurwitz with prescribed decay rate \alpha | ||
| if and only if it can be written as | ||
| :math:`A = P^{-1}(-\frac{Q}{2} +S) - \alpha I_n` |
There was a problem hiding this comment.
| :math:`A = P^{-1}(-\frac{Q}{2} +S) - \alpha I_n` | |
| :math:`A = P^{-1}(-\frac{Q}{2} +S) - \alpha \mathrm{I}_n` |
geotorch/hurwitz.py
Outdated
| We have the following result : A is Hurwitz with prescribed decay rate \alpha | ||
| if and only if it can be written as | ||
| :math:`A = P^{-1}(-\frac{Q}{2} +S) - \alpha I_n` | ||
| with $P > 0, Q > 0$ and ^S^skew-symmetric |
There was a problem hiding this comment.
What does P > 0 mean here? Do you mean P \in PSD(n)? Please follow the notation in the rest of the library.
There was a problem hiding this comment.
Also, Fix the skew-symmetric notation.
| with $P > 0, Q > 0$ and ^S^skew-symmetric | ||
| Args: | ||
| size (torch.size): Size of the tensor to be parametrized | ||
| alpha (float): the upper bound on the matrix's eigenvalues real part |
There was a problem hiding this comment.
This should be defined in the main docs, and perhaps here just say it's the upper bound in absolute value of the eigen values or smth
There was a problem hiding this comment.
Here is the new docstring :
r"""
Manifold of matrices with eigenvalues with negative real parts,
also called Hurwitz matrices.
We have the following result : A is Hurwitz with prescribed decay rate \alpha
if and only if it can be written as
..math::
A = P^{-1}(-\frac{Q}{2} +S) - \mathrm{I}_n
with :math:`P \in \operatorname{PSD}(n), Q \in \operatorname{PSD}(n)` and :math: `S \in \operatorname{Skew}(n)`
Args:
size (torch.size): Size of the tensor to be parametrized
alpha (float): the upper bound on the matrix's eigenvalues real part
triv (str or callable): Optional.
A map that maps skew-symmetric matrices onto the orthogonal matrices
surjectively. This is used to optimize the :math:`Q` in the eigenvalue
decomposition. It can be one of ``["expm", "cayley"]`` or a custom
callable. Default: ``"expm"``
"""
I am not yet very familiar with how Sphinx works so I added \oepratorname to define the manifold as you did in the docs.
There was a problem hiding this comment.
When you mean in the main docs do you want me to write something in Sphinx somewhere in the docs too ?
There was a problem hiding this comment.
by "the main doc" I meant "not in the Args part, but above it".
The docs above LGTM, just drop the " We have the following result :" part, and surround in :math: the first A in "A is Hurwitz with prescribed..."
There was a problem hiding this comment.
Also, in this equation you are missing a dependency on \alpha?
There was a problem hiding this comment.
Side note, generally it's better to just push a commit with the changes, as it's easier for me to review them in github rather than in the comments.
geotorch/hurwitz.py
Outdated
| """ | ||
| with torch.no_grad(): | ||
|
|
||
| X_p = torch.empty((self.n, self.n)) |
There was a problem hiding this comment.
Any chance you can reuse the samplers for each of the other manifolds and simply compose them via the trivialisation?
There was a problem hiding this comment.
Honestly very good question ! In terms of code I think we can yes indeed.
In terms of sampling I do not know... my knowledge of random matrix theory is limited and I have not found (yet !) a result on sampling in this manifold. I'll change the code to use the previous samplers.
There was a problem hiding this comment.
There is not much random matrix theory involved here. The way we sample in geotorch is by sampling on the base spaces and mapping these samples by the trivialisation, so I'd hope this can be done compositionally (to be fair, now that I think about it, we should have been able to write this generically, in the base class. Not sure why I didn't do this).
Anyway, could you do this for this one class?
In terms of what this means mathematically, you are just choosing the pushforward of the measures on the base manifolds along the trivialisation. Not that this is super important here, but yeah.
geotorch/skew.py
Outdated
| *(self.tensorial_size + (self.n, self.n))) | ||
| init_(X) | ||
| if lower: | ||
| tril_X = X.tril(-1) |
There was a problem hiding this comment.
might want to do this in-place iwth tril_? Same for the triu_ below.
|
Also an issue I did not talk about : tensorial_size is not supported since the resolution of the vectorized Lyapunov equation is annoying the kroenecker product I would have to do a for loop I think, so the submersion_inv function may take time. If I have time to think about I will work this arround (maybe with einsum or so) |
and Hurwitz class. Hurwitz.sample now uses PSD.sample() and Skew.sample() internally.
geotorch/hurwitz.py
Outdated
|
|
||
|
|
||
| P_inv = torch.inverse(P) | ||
| A = P_inv @ (-0.5 * Q + S) - self.alpha * self.In |
There was a problem hiding this comment.
can you simply call self.submersion here?
|
tensorial_size is not supported since the resolution of the vectorized Lyapunov equation is annoying the kroenecker product I am not sure how this is related? |
Co-authored-by: Mario Lezcano Casado <3291265+lezcano@users.noreply.github.com>
lezcano
left a comment
There was a problem hiding this comment.
Looking good, just two minor points.
Tomorrow I'll do a final pass, and I think we can merge. Thank you!
geotorch/hurwitz.py
Outdated
| I_single = torch.eye(self.n, device=A.device, dtype=A.dtype) | ||
| return torch.kron(I_single, A_single) + torch.kron(A_single, I_single) | ||
|
|
||
| flat_batch_size = torch.prod(torch.tensor(self.tensorial_size)) if self.tensorial_size else 1 |
There was a problem hiding this comment.
| flat_batch_size = torch.prod(torch.tensor(self.tensorial_size)) if self.tensorial_size else 1 | |
| flat_batch_size = math.prod(self.tensorial_size) |
lezcano
left a comment
There was a problem hiding this comment.
At last I had time to look into this and it looks great!
Thank you for the contribution! I just left a tiny perf comment, but otherwise LGTM
I added a parameterizations of Hurwitz matrices, i.e. the one that have eigenvalues with negative real parts they represent stable linear dynamical systems. Optionally one can enforce a specific bound on the matrix decay rate named alpha, it is related to the concept of alpha-stability in Lyapunov linear theory.
In order to build this parameterization, I changed the constructor of the Skew class that now takes a size parameter.
Now the parameterization is instantiated with _register_manifold instead of register_parameterization.
The test_skey.py file has been modified accordingly.
(This is my first PR ever do not hesitate to come back to me if something does not meet the requirements)