Conversation

@ErikQQY
Collaborator

@ErikQQY ErikQQY commented May 5, 2025

Add the Riemannian manifold HMC from "Riemann manifold Langevin and Hamiltonian Monte Carlo methods", which is implemented in the research directory but has not been merged into the package. Supersedes #432 and #437.

@ErikQQY ErikQQY requested a review from yebai May 9, 2025 10:03
@github-actions
Contributor

AdvancedHMC.jl documentation for PR #439 is available at:
https://TuringLang.github.io/AdvancedHMC.jl/previews/PR439/

@codecov

codecov bot commented Jun 29, 2025

Codecov Report

Attention: Patch coverage is 0% with 33 lines in your changes missing coverage. Please review.

Project coverage is 78.31%. Comparing base (5a562e0) to head (567e2a8).

Files with missing lines        Patch %   Lines
src/riemannian/metric.jl         0.00%    21 Missing ⚠️
src/riemannian/hamiltonian.jl    0.00%    12 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #439      +/-   ##
==========================================
+ Coverage   75.44%   78.31%   +2.86%     
==========================================
  Files          21       22       +1     
  Lines        1230     1185      -45     
==========================================
  Hits          928      928              
+ Misses        302      257      -45     

☔ View full report in Codecov by Sentry.

@ErikQQY ErikQQY closed this Jun 29, 2025
@ErikQQY ErikQQY reopened this Jun 29, 2025
@nsiccha
Contributor

nsiccha commented Dec 4, 2025

I've had a quick look - this PR clearly still needs some work. The sampling of the random momentum seems to be numerically unstable and inefficient; see the implementation in this PR:

function rand_momentum(
    rng::Union{AbstractRNG,AbstractVector{<:AbstractRNG}},
    metric::DenseRiemannianMetric{T},
    kinetic,
    θ::AbstractVecOrMat,
) where {T}
    r = _randn(rng, T, size(metric)...)
    G⁻¹ = inv(metric.map(metric.G(θ)))  # explicit inverse of G(θ)
    chol = cholesky(Symmetric(G⁻¹))     # then a Cholesky of that inverse
    ldiv!(chol.U, r)                    # r ← chol.U \ r
    return r
end
and maybe also compare with the one in that PR:
function rand_momentum(
    rng::Union{AbstractRNG,AbstractVector{<:AbstractRNG}},
    metric::DenseRiemannianMetric{T},
    kinetic,
    θ::AbstractVecOrMat,
) where {T}
    r = _randn(rng, T, size(metric)...)
    chol = cholesky(Symmetric(metric.map(metric.G(θ))))  # factor G(θ) directly
    r = chol.L * r                                       # r ~ N(0, G)
    return r
end

For the SoftAbs map, which internally does an eigendecomposition (see

function softabs(X, α=20.0)
    F = eigen(X) # ReverseDiff cannot diff through `eigen`
    Q = hcat(F.vectors)
    λ = F.values
    softabsλ = λ .* coth.(α * λ)  # λ coth(αλ) smoothly approximates |λ|
    return Q * diagm(softabsλ) * Q', Q, λ, softabsλ
end

), not even the Cholesky decomposition will be required if we hold onto the eigendecomposition.
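For instance, if we held onto the eigendecomposition, the momentum draw could skip any factorization entirely. A minimal sketch (the function name is hypothetical; Q and softabsλ are the outputs of softabs above):

using LinearAlgebra, Random

# G = Q * diagm(softabsλ) * Q', so Q * Diagonal(sqrt.(softabsλ)) is a square
# root of G, and r = Q * (sqrt.(softabsλ) .* z) with z ~ N(0, I) gives r ~ N(0, G).
function rand_momentum_from_eigen(rng::AbstractRNG, Q::AbstractMatrix, softabsλ::AbstractVector)
    z = randn(rng, length(softabsλ))
    return Q * (sqrt.(softabsλ) .* z)
end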

@THargreaves

not even the Cholesky decomposition will be required if we hold onto the eigendecomposition.

This is a good point. Though for code modularity it would be nice if we didn't have to redefine H and its derivatives specifically for the eigendecomposition case.

I have two thoughts on this:

  1. We could write H and its derivatives in a generic way using solves — the only issue is that E = eigen(A) isn't treated as a proper factorisation in Julia, so things like E \ x don't work.
  2. You can compute the Cholesky of G from the decomposition by taking the QR decomposition of the eigenvector matrix (see the sketch below). I did a back-of-the-envelope FLOP calculation: the conversion ends up being slower if you only use it to perform vector solves, and a bit faster (though both are O(D^3)) when performing matrix solves.
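For reference, a sketch of the eigen-to-Cholesky conversion from point 2 (the helper name is hypothetical; assumes all eigenvalues are positive, as they are after softabs):

using LinearAlgebra

# G = V Λ V' = B B' with B = V * Diagonal(sqrt.(λ)). Writing B' = Q R gives
# G = R' Q' Q R = R' R, so R (with its diagonal signs flipped to be positive)
# is the upper Cholesky factor of G.
function eigen_to_cholesky(V::AbstractMatrix, λ::AbstractVector)
    B = V * Diagonal(sqrt.(λ))
    R = qr(Matrix(B')).R
    return UpperTriangular(Diagonal(sign.(diag(R))) * R)
end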

@nsiccha
Contributor

nsiccha commented Dec 4, 2025

We could write H and its derivatives in a generic way using solves — the only issue is that E = eigen(A) isn't treated as a proper factorisation in Julia so things like E \ x don't work.

Yeah, we'd either have to find a wrapper for this somewhere in the wild (I'm sure it exists somewhere) or write our own.
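Writing our own could be fairly small. A sketch (the wrapper type is hypothetical and assumes a symmetric matrix, so the eigenvector matrix is orthogonal; a separate type avoids pirating Base.\ for LinearAlgebra.Eigen):

using LinearAlgebra

# Wrap an eigendecomposition A = V Λ V' so it behaves like a factorization.
struct EigenFact{TV<:AbstractMatrix,Tλ<:AbstractVector}
    vectors::TV
    values::Tλ
end

# A \ b = V Λ⁻¹ V' b: reuse the decomposition instead of refactorizing.
Base.:\(F::EigenFact, b::AbstractVecOrMat) = F.vectors * ((F.vectors' * b) ./ F.values)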

You can compute the Cholesky of G from the decomposition by taking the QR decomposition of the eigenvector matrix.

What would you compute the Cholesky factor for? You can just compute the matrix's square root using the eigendecomposition, which you already have; that should be all you need.

@THargreaves

What would you compute the Cholesky factor for? You can just compute the matrix's square root using the eigendecomposition, which you already have; that should be all you need.

Hypothetically, if you could convert eigen to Cholesky for free, it would be more efficient to compute H and its derivatives using the lower triangular L than the dense V from the eigendecomposition. But FLOP-wise, the overhead of doing the conversion only ends up being worth it if you are doing O(D) vector multiplies/solves.

This doesn't apply to us in the dHdp. It sort of does for the trace term in dHdθ though.

@nsiccha
Contributor

nsiccha commented Dec 4, 2025

This doesn't apply to us in the dHdp. It sort of does for the trace term in dHdθ though.

The one in here?

function ∂H∂θ(
    h::Hamiltonian{<:DenseRiemannianMetric{T,<:IdentityMap},<:GaussianKinetic},
    θ::AbstractVecOrMat{T},
    r::AbstractVecOrMat{T},
) where {T}
    ℓπ, ∂ℓπ∂θ = h.∂ℓπ∂θ(θ)
    G = h.metric.map(h.metric.G(θ))
    invG = inv(G)
    ∂G∂θ = h.metric.∂G∂θ(θ)
    d = length(∂ℓπ∂θ)
    return DualValue(
        ℓπ,
        #! Eq (15) of Girolami & Calderhead (2011)
        -mapreduce(vcat, 1:d) do i
            ∂G∂θᵢ = ∂G∂θ[:, :, i]
            ∂ℓπ∂θ[i] - 1 / 2 * tr(invG * ∂G∂θᵢ) + 1 / 2 * r' * invG * ∂G∂θᵢ * invG * r
            # Gr = G \ r
            # ∂ℓπ∂θ[i] - 1 / 2 * tr(G \ ∂G∂θᵢ) + 1 / 2 * Gr' * ∂G∂θᵢ * Gr
            # 1 / 2 * tr(invG * ∂G∂θᵢ)
            # 1 / 2 * r' * invG * ∂G∂θᵢ * invG * r
        end,
    )
end

(Edited for in-place display of code.)

@nsiccha
Contributor

nsiccha commented Dec 4, 2025

The trace of the product of two (square) matrices can be computed in O(n^2) without computing the O(n^3) matrix product, see e.g. https://en.wikipedia.org/wiki/Trace_(linear_algebra)#Trace_of_a_product.
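In Julia that identity is a one-liner. A quick sketch:

using LinearAlgebra
A, B = randn(4, 4), randn(4, 4)
# tr(A * B) = Σᵢⱼ A[i, j] * B[j, i]: an O(n²) reduction, no O(n³) product needed.
tr(A * B) ≈ sum(A .* transpose(B))   # true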

Furthermore, do we think that Julia is clever enough to reorder this expression: r' * invG * ∂G∂θᵢ * invG * r? In any case, it's unnecessary to compute it as it's implemented there. Do tmp = invG * r once, then compute tmp' * ∂G∂θᵢ * tmp in one go - I'm sure there's a function for that.
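There is: LinearAlgebra's three-argument dot computes x' * A * y without forming intermediates (available since Julia 1.4). A sketch with random stand-ins for G, ∂G∂θᵢ, and r:

using LinearAlgebra
A = randn(4, 4); G = Symmetric(A' * A + I)   # random SPD stand-in for G(θ)
∂G∂θᵢ = Symmetric(randn(4, 4))               # random symmetric stand-in
r = randn(4)
tmp = G \ r                     # one solve instead of forming invG
quad = dot(tmp, ∂G∂θᵢ, tmp)     # tmp' * ∂G∂θᵢ * tmp in one pass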

    θ::AbstractVecOrMat,
) where {T}
    r = _randn(rng, T, size(metric)...)
    G⁻¹ = inv(metric.map(metric.G(θ)))


It's very inefficient and numerically unstable to compute a direct inverse. Worse, you're not taking advantage of the fact that G is PSD. Instead, you should compute the Cholesky decomposition first.

Let G = U'U, where U is the upper triangular factor from the Cholesky decomposition.

You're computing the Cholesky of G⁻¹, which is (U⁻¹)(U⁻¹)'.

You then do an ldiv to compute r = (U'⁻¹)⁻¹ * r, but that's just U' * r.

Instead, you can just do lmul!(chol.L, r), where chol is the Cholesky of G directly.
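Putting that together, the momentum draw would become (a sketch that combines the suggestion above with the PR's existing signature):

function rand_momentum(
    rng::Union{AbstractRNG,AbstractVector{<:AbstractRNG}},
    metric::DenseRiemannianMetric{T},
    kinetic,
    θ::AbstractVecOrMat,
) where {T}
    r = _randn(rng, T, size(metric)...)
    chol = cholesky(Symmetric(metric.map(metric.G(θ))))  # factor G, not inv(G)
    lmul!(chol.L, r)  # r ← L * r, so r ~ N(0, G) since G = L * L'
    return r
end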


# Negative kinetic energy
#! Eq (13) of Girolami & Calderhead (2011)
function neg_energy(


Is this correct? Eq (13) of Girolami & Calderhead (2011) also contains the log-likelihood term, -L(θ). Why is this not included here?

Even if this is correct, we should clarify the naming conventions, as they're quite hard to follow.
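For reference, my reading of the Hamiltonian in Girolami & Calderhead (2011) (equation numbering may differ between the preprint and the published version):

H(\theta, p) = -\mathcal{L}(\theta) + \frac{1}{2}\log\left\{(2\pi)^{D}\,|G(\theta)|\right\} + \frac{1}{2}\, p^{\top} G(\theta)^{-1} p

so a neg_energy without the -\mathcal{L}(\theta) term would cover only the log-determinant and kinetic parts.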

kinetic,
#! Eq (14) of Girolami & Calderhead (2011)
function ∂H∂r(
h::Hamiltonian{<:DenseRiemannianMetric,<:GaussianKinetic},


AbstractRiemannianMetric?

This logic is not unique to the dense metric.
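Something like the following, perhaps (a sketch only: the loosened dispatch is the reviewer's suggestion, and the body is my guess at Eq (14), G(θ)⁻¹ r, using the PR's metric fields):

# Dispatch on the abstract supertype so any Riemannian metric reuses this method.
function ∂H∂r(
    h::Hamiltonian{<:AbstractRiemannianMetric,<:GaussianKinetic},
    θ::AbstractVecOrMat,
    r::AbstractVecOrMat,
)
    # Eq (14) of Girolami & Calderhead (2011): ∂H/∂r = G(θ)⁻¹ r.
    return h.metric.map(h.metric.G(θ)) \ r
end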

@THargreaves

I made this review weeks ago, but I'm not sure it was actually showing up for others until I marked it as "complete".

@nsiccha
Contributor

nsiccha commented Dec 4, 2025

I made this review weeks ago, but I'm not sure it was actually showing up for others until I marked it as "complete".

No haha, I hadn't seen this before 😅

@nsiccha
Contributor

nsiccha commented Dec 4, 2025

I've got another, maybe silly, conceptual question.

If we're using the softabs map, then the Riemannian metric (right?) G is not the Fisher information metric, and it's also not the Hessian (which, after all, is not guaranteed to always be SPD and is thus not a valid metric), but G(θ) = softabs(Hessian(θ)).

Shouldn't the G field in the DenseRiemannianMetric type then already include the map? And shouldn't ∂G∂θ then compute the derivatives with respect to softabs(Hessian(θ))?

Edit: maybe all that modulo some inverses? I'm really not familiar with this algorithm.

@THargreaves

Shouldn't the G field in the DenseRiemannianMetric type then already include the map? And shouldn't ∂G∂θ then compute the derivatives with respect to softabs(Hessian(θ))?

That would make sense to me. All computations in the RHMC integrator are wrt G = softabs(H). H is never used directly.

This is the point I was alluding to when talking about separating the logic for the DenseRiemannianMetric in particular (softabs/eigen stuff) from the AbstractRiemannianMetric stuff.
