Change UnitVector transform to use normalization #138
Co-authored-by: David Widmann <[email protected]>
Thanks! The docs also need to be updated, but I can take care of that myself.
If you have the time, I would appreciate an explanation of why we need to correct by the Chi distribution.
```julia
lj_transform = logdet(J' * J) / 2
# lp_prior
# un-normalized Chi distribution prior on r
lp_prior = (K - 1) * log(r) - r^2 / 2
```
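As an illustrative aside (not part of this PR's diff), the connection can be checked numerically: if `x ~ N(0, I_K)`, then `r = norm(x)` follows a Chi(K) distribution, and the `lp_prior` line above is exactly the un-normalized Chi log-density.

```julia
# Illustrative sanity check (not part of the diff): draw x ~ N(0, I_K) and
# compare the sample mean of r = ‖x‖ against the known Chi(K) mean.
using LinearAlgebra, Random, Statistics

Random.seed!(1)
K, N = 3, 10^6
rs = [norm(randn(K)) for _ in 1:N]

# The Chi(3) mean is 2 * sqrt(2 / pi) ≈ 1.5958; the sample mean should agree
# up to Monte Carlo error. The matching un-normalized log-density is
#   (K - 1) * log(r) - r^2 / 2
println(mean(rs))
```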
Can you please explain why this correction is needed?
```julia
x = randn(dimension(tt))
y = transform(tt, x)
x′ = inverse(tt, y)
@test x ≈ x′
m = sum(2:N)
@test x[1:end-m] ≈ x′[1:end-m]
```
We might as well remove these two lines; the one below is sufficient.
Apologies if the question is stupid, but could this be made a bijection by adding the norm? I.e., the user would need to put a prior on the norm `r`.
Yes, that's exactly what's implicitly going on here. There are two equivalent perspectives we can take, but I'll focus on the one that I find most intuitive. To make the transform bijective almost everywhere, we expand the parameter space so the transform is `x ↦ (y, r) = (x / ‖x‖, ‖x‖)`. After applying the transform we have an extra parameter `r`, which needs a prior. That choice of prior on `r`, the un-normalized Chi distribution, is what makes `y` uniform on the sphere when `x` is standard normal.
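A minimal sketch of that expanded view (illustrative only; the function name is mine, not the package's):

```julia
using LinearAlgebra

# Expanded, almost-everywhere-bijective view of the normalization transform:
# x in R^K maps to (y, r), with y on the unit sphere and r = ‖x‖ the extra
# parameter, which then receives the un-normalized Chi(K) prior.
function normalize_expanded(x::AbstractVector)
    K = length(x)
    r = norm(x)
    y = x / r
    lp_r = (K - 1) * log(r) - r^2 / 2   # un-normalized Chi(K) log-density on r
    return y, r, lp_r
end

y, r, lp = normalize_expanded([3.0, 4.0])
# y == [0.6, 0.8], r == 5.0
```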
Yes, that's right. Cases like this were my motivation for the syntax … A practical issue with the user picking a prior themselves is that …
One more comment: if the approach in this PR is taken, then not every transform here will be bijective/invertible, and the log-density correction is not just a logdetjac. So maybe some documentation would need to be updated. It would still be the case that every transform should be right-invertible.
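For instance (a toy sketch; `t` and `tinv` are hypothetical names, not package API), normalization is right-invertible but not left-invertible:

```julia
using LinearAlgebra

t(x) = x / norm(x)   # the normalization transform
tinv(y) = y          # one valid right inverse: a unit vector maps to itself

y = t([3.0, 4.0])
@assert t(tinv(y)) ≈ y                       # right inverse: y is recovered
@assert !(tinv(t([3.0, 4.0])) ≈ [3.0, 4.0])  # not a left inverse: norm 5 is lost
```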
Thanks for the intuitive explanation, this now makes sense.
Isn't there an extra …?
Yes, I see your point. Also, in the future, we might consider adding similar mappings where an unconstrained …
But returning …
I am thinking about the following API: …

Comments welcome.
No, the …
Yeah, that makes sense. I also like its explicitness.
A simpler alternative would be a boolean flag defaulting to …
Suggestion: if also returning …
This fixes #66 and relates to #86.

Length-1 unit vectors are no longer supported. While they are technically possible, they would likely suffer from numerical issues: the transform is undefined at `x = 0`, and for a Markov chain to travel from `y = [-1]` to `y = [1]`, it would have to leap over the origin, which is only possible at all due to discretization and will often fail.

Note that the current implementation is bare-bones and has the limitation that it is not appropriate for optimization when the target distribution may include a unit vector whose distribution is uniform on the sphere. I think it would be useful to accept a `logpr` function corresponding to `log(r^(1-n) p(r))`, where `p(r)` is an (unnormalized) prior on the discarded norm `r`. As noted in #86 (comment) and https://discourse.mc-stan.org/t/a-better-unit-vector/26989/30, this would support providing an alternative prior whose `logpr` is not maximized at `r = 0` in these rare cases. If this is welcome, I'll add it to the PR.