Make the constructor of LogDensityFunction implicit #1156
Conversation
Codecov Report

```
@@            Coverage Diff             @@
##             main    #1156      +/-   ##
==========================================
- Coverage   81.58%   78.93%    -2.66%
==========================================
  Files          40       40
  Lines        3845     3836        -9
==========================================
- Hits         3137     3028      -109
- Misses        708      808      +100
```
Benchmark Report (Computer Information and Benchmark Results collapsed)
What are you trying to accomplish with this?
Hi! I would like to do
Yes, but why are you trying to do that? Any meaningful change to the varinfo will require the prep to be recalculated anyway. In the next version of DynamicPPL, LogDensityFunction won't even carry a varinfo with it.
Okay, to be more explicit, I am trying to implement subsampling, which requires replacing the
LogDensityFunction is pretty much just a thing that takes a vector and returns a float; anything beyond that is best treated as an implementation detail that shouldn't be relied on. If you need to manipulate a varinfo, I would suggest that you create your own struct that contains the model + varinfo, and then manipulate that varinfo as desired. There might be other alternatives too; I don't mind taking a look at what you're trying to do and seeing how it can be done, but I don't think that setting varinfo in a LDF is the right way.
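For illustration, a minimal sketch of that wrapper idea might look like this (SubsampledTarget and set_varinfo are hypothetical names; the DynamicPPL calls mirror the ones used later in this thread):

```julia
using DynamicPPL

# Hypothetical user-side wrapper: it owns the model and varinfo, so the varinfo
# can be swapped freely without touching LogDensityFunction internals.
struct SubsampledTarget{M,V}
    model::M
    varinfo::V
end

# Replace the varinfo on the wrapper rather than on a LogDensityFunction.
set_varinfo(t::SubsampledTarget, new_vi) = SubsampledTarget(t.model, new_vi)

# Evaluate the log density at a parameter vector x; the combination of
# log-density terms follows the one used elsewhere in this thread.
function (t::SubsampledTarget)(x::AbstractVector)
    vi = DynamicPPL.unflatten(t.varinfo, x)
    _, vi = DynamicPPL.evaluate!!(t.model, vi)
    return getlogprior(vi) + getloglikelihood(vi) - getlogjac(vi)
end
```

An AD prep could then be built against this callable and rebuilt whenever the wrapped varinfo changes in a way that matters.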
Basically we have to implement the
Do you have a branch / PR that does this for DynamicPPL? I am almost 100% certain that modifying the varinfo will give you lots of headaches. Please don't do that. LDF takes an argument specifying the function to evaluate, so you could define

```julia
struct ScaledLogJoint{F<:Real}
    scale::F
end

(s::ScaledLogJoint)(vi::AbstractVarInfo) = getlogprior(vi) + (s.scale * getloglikelihood(vi)) - getlogjac(vi)
```

and then do

```julia
updated_model = # model conditioned on new subset of data...
ldf = LogDensityFunction(updated_model, ScaledLogJoint(scale); adtype=adtype)
```

But note that you are providing a different function to differentiate, so you can't reuse the same prep. That's what I meant when I said that any meaningful change to the varinfo will require the prep to be updated anyway. If you edited the varinfo without updating the prep, I assume that you would potentially get correctness errors.
You can find my partial attempt here.
Huh, I vaguely remember @gdalle mentioning that
I mean, what you're doing is basically this:

```julia
using DifferentiationInterface
using ForwardDiff, ReverseDiff, Mooncake

adtypes = [
    AutoForwardDiff(),
    AutoReverseDiff(),
    AutoReverseDiff(; compile=true),
    AutoMooncake()
]

for adtype in adtypes
    @info "AD type: ", adtype
    x = [1.0, 2.0]
    f(x) = sum(x)
    prep = prepare_gradient(f, adtype, x)
    # redefine f after preparation, mimicking the function changing under an existing prep
    f(x) = 1.5 * sum(x)
    @info "actual: ", gradient(f, prep, adtype, x)
    @info "expected: ", ForwardDiff.gradient(f, x)
    println()
end
```
No @Red-Portal, auxiliary variables are among the arguments in the docstring you mentioned, so they are not allowed to change type or size. The "operators" page of the DI docs gives more details on preparation and its semantics, but happy to clarify further if necessary.
Thank you for the clarification. I think we can ensure the size doesn't change. The values will definitely have to change, though. According to the docstring, that should be fine, correct? But I recall we had this discussion before. What I am trying to do is very standard for any statistical application involving minibatching. Surely, there must be a way to use
@penelopeysm I think what we're trying to do is more like:

```julia
using DifferentiationInterface
using LinearAlgebra
using ForwardDiff, ReverseDiff, Mooncake

adtypes = [
    AutoForwardDiff(),
    AutoReverseDiff(),
    AutoReverseDiff(; compile=true),
    AutoMooncake()
]

for adtype in adtypes
    @info "AD type: ", adtype
    x = [1.0, 2.0]
    y = randn(2)
    f(x) = dot(x, y)
    prep = prepare_gradient(f, adtype, x)
    g = gradient(f, prep, adtype, x)
    # mutate the captured y in place, then reuse the same prep
    y[:] = randn(2)
    g_new = gradient(f, prep, adtype, x)
    @info "not the same: ", norm(g - g_new) > 1e-10
end
```

```
[ Info: ("AD type: ", AutoForwardDiff())
[ Info: ("expected true: ", true)
[ Info: ("AD type: ", AutoReverseDiff())
[ Info: ("expected true: ", true)
[ Info: ("AD type: ", AutoReverseDiff(compile=true))
[ Info: ("expected true: ", false)
[ Info: ("AD type: ", AutoMooncake())
[ Info: ("expected true: ", true)
```
Guillaume is the authority on this, but I feel like that just proves my point. You're relying on the fact that it happens to work for some backends when that isn't a guarantee that's provided by DI. In any case, as far as this PR is concerned, I'm not willing to change the implementation of LDF. If you really want to go down this route, you should duplicate the code as necessary. This is especially so because LDF's implementation is very different in 0.39 and so your code will immediately break if you are relying on its internal structure. I would very strongly suggest that you change the approach to do something like this instead:

```julia
using DifferentiationInterface
using ForwardDiff, ReverseDiff, Mooncake
using LinearAlgebra: dot

adtypes = [
    AutoForwardDiff(),
    AutoReverseDiff(),
    AutoReverseDiff(; compile=true),
    AutoMooncake()
]

for adtype in adtypes
    @info "AD type: ", adtype
    x = [1.0, 2.0]
    y = randn(2)
    prep = prepare_gradient(dot, adtype, x, Constant(y))
    # calculate with the old prep
    y = randn(2)
    grad = gradient(dot, prep, adtype, x, Constant(y))
    @info "actual: ", grad
    # calculate without the prep
    @info "expected: ", gradient(dot, adtype, x, Constant(y))
    println()
end
```

```
[ Info: ("AD type: ", AutoForwardDiff())
[ Info: ("actual: ", [-0.10652485041993587, -0.3575589022616598])
[ Info: ("expected: ", [-0.10652485041993587, -0.3575589022616598])
[ Info: ("AD type: ", AutoReverseDiff())
[ Info: ("actual: ", [0.11226749448821424, -0.0700900922876089])
[ Info: ("expected: ", [0.11226749448821424, -0.0700900922876089])
[ Info: ("AD type: ", AutoReverseDiff(compile=true))
[ Info: ("actual: ", [-0.6767470277992602, 1.6638846489301709])
[ Info: ("expected: ", [-0.6767470277992602, 1.6638846489301709])
[ Info: ("AD type: ", AutoMooncake())
[ Info: ("actual: ", [2.437615526525269, -1.0598882129119007])
[ Info: ("expected: ", [2.437615526525269, -1.0598882129119007])
```

And for DynamicPPL, it's not all that difficult to adapt to this scheme:

```julia
function evaluate_with_new_data(x, model, varinfo, loglike_scale, new_data)
    new_model = decondition(model, @varname(datapoints)) | (; datapoints = new_data)
    vi = DynamicPPL.unflatten(varinfo, x)
    _, vi = DynamicPPL.evaluate!!(new_model, vi)
    return getlogprior(vi) + loglike_scale * getloglikelihood(vi) - getlogjac(vi)
end
```

(I definitely recommend using condition / decondition because

Then you can build a single prep for that function and reuse it to your heart's content as long as the size and type of
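To spell out that last step, a rough sketch of building one prep and reusing it across minibatches (assuming evaluate_with_new_data as defined above, and that adtype, x, model, varinfo, loglike_scale and the batches already exist with fixed size and element type):

```julia
using DifferentiationInterface

# One-off preparation against a representative minibatch.
prep = prepare_gradient(
    evaluate_with_new_data, adtype, x,
    Constant(model), Constant(varinfo), Constant(loglike_scale), Constant(first_batch),
)

# Reuse the same prep for any later minibatch of the same size and element type.
grad = gradient(
    evaluate_with_new_data, prep, adtype, x,
    Constant(model), Constant(varinfo), Constant(loglike_scale), Constant(next_batch),
)
```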
Okay, so what you're suggesting here is to have a downstream customized version of [...]
Given the previous agreement with Hong that subsampling-related features should be implemented and tested in [...]
By the way, @gdalle could you confirm that there is no valid way to use
Yeah, I think that makes sense. You could stick that
@Red-Portal the right way to change the behavior of a function is to encapsulate the changing part inside a
This PR makes the default constructor of LogDensityFunction implicit. This is necessary so that a modified copy of a LogDensityFunction can be constructed (for example, via Accessors.@set) without DI.prepare being called each time.
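As a toy illustration of the motivation (the names below are made up and this is not DynamicPPL's actual code): by default, Accessors.@set rebuilds a struct by passing all of its field values to the type's constructor, so with an implicit (default) constructor the stored fields are simply copied over, whereas an explicit all-field constructor that ran the preparation would re-prepare on every @set.

```julia
using Accessors

# Made-up stand-in for the real struct: `prep` is expensive to compute.
struct ToyLDF{M,P}
    model::M
    prep::P
    # no explicit inner constructor, so the implicit one just stores the fields
end

expensive_prepare(model) = (println("preparing for $model"); :some_prep)

# Convenience outer constructor that runs the expensive preparation once.
ToyLDF(model) = ToyLDF(model, expensive_prepare(model))

ldf  = ToyLDF(:model_a)            # prints "preparing for model_a"
ldf2 = @set ldf.model = :model_b   # calls the implicit two-field constructor; no re-preparation
```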