Merged
Changes from 9 commits
Commits
53 commits
3506f4f
improve example in the README
Red-Portal Aug 20, 2025
ba99404
update basic example
Red-Portal Aug 20, 2025
f9ab239
add Stan models example
Red-Portal Aug 20, 2025
e39e0d6
run formatter, add missing dependency
Red-Portal Aug 20, 2025
d0c13f3
add compat for DataFrames
Red-Portal Aug 20, 2025
7a3dbbd
apply formatter to README
Red-Portal Aug 20, 2025
7ef6152
fix missing dependency
Red-Portal Aug 20, 2025
51b495b
add missing `LogDensityProblemsAD` dependency
Red-Portal Aug 20, 2025
3039008
add missing JSON dependency
Red-Portal Aug 20, 2025
70745a3
fix typo in basic tutorials
Red-Portal Aug 21, 2025
f940e2e
fix typo in docs section name
Red-Portal Aug 21, 2025
f4d8fb6
fix Stan tutorial
Red-Portal Aug 21, 2025
b9a7fb3
run formatter
Red-Portal Aug 21, 2025
b7fbc5f
tune the figures in docs
Red-Portal Aug 21, 2025
fb03469
Merge branch 'add_basic_tutorials' of github.com:TuringLang/AdvancedV…
Red-Portal Aug 21, 2025
df84f9a
update doc plot scale for stan example
Red-Portal Aug 21, 2025
e801817
update doc tweak plot in basic example
Red-Portal Aug 21, 2025
8fb3288
update docs tweak plots
Red-Portal Aug 21, 2025
1baf210
add normalizing flows dependency
Red-Portal Aug 22, 2025
49d3ffe
Merge branch 'main' of github.com:TuringLang/AdvancedVI.jl into add_b…
Red-Portal Aug 22, 2025
39fad54
add normalizing flow example
Red-Portal Aug 22, 2025
7963017
runf formatter
Red-Portal Aug 22, 2025
0d204d3
fix formatting a bit
Red-Portal Aug 22, 2025
5899b9d
fix maybe improve numerical stability
Red-Portal Aug 22, 2025
090acbc
minor fix
Red-Portal Aug 23, 2025
7cadff8
fix typo in flow tutorial
Red-Portal Aug 25, 2025
0455006
Merge branch 'main' of github.com:TuringLang/AdvancedVI.jl into add_b…
Red-Portal Aug 27, 2025
b88b463
update docs use MacKay's approximation for logistic predictive
Red-Portal Aug 27, 2025
d56d406
fix error in docs basic example
Red-Portal Aug 27, 2025
0ac49da
fix docs prior hyperparameter in basic example
Red-Portal Aug 27, 2025
50cd819
add doubly stochastic VI
Red-Portal Aug 27, 2025
140f3af
run formatter
Red-Portal Aug 27, 2025
36745f6
fix typos
Red-Portal Aug 27, 2025
8198abc
fix formatting
Red-Portal Aug 27, 2025
f479ddb
add missing dependency
Red-Portal Aug 27, 2025
8ca9d54
fix docs more stable initialization for stan example
Red-Portal Aug 27, 2025
c445d0d
fix docs try using julia lts for building docs
Red-Portal Aug 27, 2025
fa2985a
add additional details for bijectors in basic example
Red-Portal Aug 27, 2025
14b75bc
update tweak plot for flow example in docs
Red-Portal Aug 27, 2025
bda40c3
fix tweak plots in docs
Red-Portal Aug 27, 2025
6fb8af4
fix add missing `nothing`s in flow example in docs
Red-Portal Aug 27, 2025
adaefc0
fix docs examples to use a more stable initialization
Red-Portal Aug 28, 2025
c8de52e
update docs example for subsampling
Red-Portal Aug 28, 2025
8e873cd
fix use of flows in docs example
Red-Portal Aug 28, 2025
c50086e
run formatter
Red-Portal Aug 28, 2025
cf829f2
add labels to plotlines in subsampling docs
Red-Portal Aug 28, 2025
c6cbc69
fix tweak plot in basic example docs
Red-Portal Aug 28, 2025
22953fe
update docs example on subsampling
Red-Portal Aug 28, 2025
b151635
fix docs example flow realnvp interface
Red-Portal Aug 28, 2025
8b7b545
fix flows config in flow doc example
Red-Portal Aug 28, 2025
f9b87d1
add missing imports for `mean` in docs example
Red-Portal Aug 28, 2025
3494495
update docs example on subsampling
Red-Portal Aug 28, 2025
a6bee5c
fix init in examples to be principled s.t. P(-2 < x < 2) > 0.998
Red-Portal Aug 28, 2025
67 changes: 38 additions & 29 deletions README.md
@@ -19,7 +19,7 @@ For a dataset $(X, y)$ with the design matrix $X \in \mathbb{R}^{n \times d}$ an

$$
\begin{aligned}
\sigma &\sim \text{Student-t}_{3}(0, 1) \\
\sigma &\sim \text{LogNormal}(0, 3) \\
\beta &\sim \text{Normal}\left(0_d, \sigma \mathrm{I}_d\right) \\
y &\sim \mathrm{BernoulliLogit}\left(X \beta\right)
\end{aligned}
$$

@@ -28,7 +28,7 @@
The `LogDensityProblem` corresponding to this model can be constructed as

```julia
import LogDensityProblems
using LogDensityProblems: LogDensityProblems
using Distributions
using FillArrays

@@ -43,10 +43,10 @@ function LogDensityProblems.logdensity(model::LogReg, θ)
β, σ = θ[1:size(X, 2)], θ[end]

logprior_β = logpdf(MvNormal(Zeros(d), σ*I), β)
logprior_σ = logpdf(truncated(TDist(3.0); lower=0), σ)
logprior_σ = logpdf(LogNormal(0, 3), σ)

logit = X*β
loglike_y = sum(@. logpdf(BernoulliLogit(logit), y))
loglike_y = mapreduce((li, yi) -> logpdf(BernoulliLogit(li), yi), +, logit, y)
return loglike_y + logprior_β + logprior_σ
end

@@ -56,23 +56,23 @@ end

function LogDensityProblems.capabilities(::Type{<:LogReg})
return LogDensityProblems.LogDensityOrder{0}()
end
end;
```

Since the support of `σ` is constrained to be positive and most VI algorithms assume an unconstrained Euclidean support, we need to use a *bijector* to transform `θ`.
Since the support of `σ` is constrained to be positive and most VI algorithms assume an unconstrained Euclidean support, we need to use a *bijector* to transform `θ`.
We will use [`Bijectors`](https://github.com/TuringLang/Bijectors.jl) for this purpose.
This corresponds to the automatic differentiation variational inference (ADVI) formulation[^KTRGB2017].
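
In equation form, the role of the bijector is the usual change of variables (generic notation, not `Bijectors` internals): if $q$ lives on the unconstrained space and $b$ maps the constrained support onto it, the transformed density satisfies

$$
\log q_{\mathrm{trans}}(\theta) = \log q\big(b(\theta)\big) + \log \left|\det J_{b}(\theta)\right|,
$$

and it is this Jacobian term that is applied automatically when a `Bijectors.TransformedDistribution` is passed to `optimize`.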

```julia
import Bijectors
using Bijectors: Bijectors

function Bijectors.bijector(model::LogReg)
d = size(model.X, 2)
return Bijectors.Stacked(
Bijectors.bijector.([MvNormal(Zeros(d), 1.0), truncated(TDist(3.0); lower=0)]),
Bijectors.bijector.([MvNormal(Zeros(d), 1.0), LogNormal(0, 3)]),
[1:d, (d + 1):(d + 1)],
)
end
end;
```

A simpler approach would be to use [`Turing`](https://github.com/TuringLang/Turing.jl), where a `Turing.Model` can automatically be converted into a `LogDensityProblem` and a corresponding `bijector` is automatically generated.
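
As a rough sketch of that route (hypothetical code, not part of this PR — it assumes Turing's `@model` macro and the Distributions names it re-exports):

```julia
using Turing
using FillArrays: Zeros
using LinearAlgebra: I, dot

# Hypothetical Turing version of the same logistic regression model;
# this PR instead defines it manually through LogDensityProblems.
@model function logreg(X, y)
    d = size(X, 2)
    σ ~ LogNormal(0, 3)              # prior on the scale
    β ~ MvNormal(Zeros(d), σ * I)    # prior on the coefficients
    for i in eachindex(y)
        y[i] ~ BernoulliLogit(dot(view(X, i, :), β))
    end
end
```

A model built this way would then stand in for the hand-written `LogReg` and `Bijectors.bijector` definitions above.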
@@ -82,63 +82,72 @@ This can be automatically downloaded using [`OpenML`](https://github.com/JuliaAI
The sonar dataset corresponds to the dataset id 40.

```julia
import OpenML
import DataFrames
using OpenML: OpenML
using DataFrames: DataFrames
data = Array(DataFrames.DataFrame(OpenML.load(40)))
X = Matrix{Float64}(data[:, 1:(end - 1)])
y = Vector{Bool}(data[:, end] .== "Mine")
y = Vector{Bool}(data[:, end] .== "Mine");
```

Let's apply some basic pre-processing and add an intercept column:

```julia
using Statistics: mean, std

X = (X .- mean(X; dims=1)) ./ std(X; dims=1)
X = hcat(X, ones(size(X, 1)))
X = hcat(X, ones(size(X, 1)));
```

The model can now be instantiated as follows:

```julia
model = LogReg(X, y)
model = LogReg(X, y);
```

For the VI algorithm, we will use the following:
For the VI algorithm, we will use `KLMinRepGradDescent`:

```julia
using ADTypes, ReverseDiff
using AdvancedVI

alg = KLMinRepGradDescent(ADTypes.AutoReverseDiff())
alg = KLMinRepGradDescent(ADTypes.AutoReverseDiff());
```

This algorithm minimizes the exclusive/reverse KL divergence via stochastic gradient descent in the (Euclidean) space of the parameters of the variational approximation with the reparametrization gradient[^TL2014][^RMW2014][^KW2014].
This approach is also commonly referred to as automatic differentiation VI, black-box VI, or stochastic gradient VI.
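
In generic notation (a textbook summary, not the package's exact estimator), the reparametrization gradient rewrites the ELBO gradient as an expectation over a fixed base distribution:

$$
\nabla_{\lambda}\, \mathbb{E}_{q_{\lambda}}\big[\log \pi(\theta) - \log q_{\lambda}(\theta)\big]
= \mathbb{E}_{\epsilon \sim \mathcal{N}(0,\mathrm{I})}\Big[\nabla_{\lambda}\big(\log \pi(T_{\lambda}(\epsilon)) - \log q_{\lambda}(T_{\lambda}(\epsilon))\big)\Big],
$$

where, for a Gaussian family, $T_{\lambda}(\epsilon) = \mu + L\epsilon$ with $\lambda = (\mu, L)$. A Monte Carlo estimate of the right-hand side is differentiable in $\lambda$ and can be fed directly to SGD.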

This `KLMinRepGradDescent`, in particular, assumes that the target `LogDensityProblem` has gradients.
For this, it is straightforward to use `LogDensityProblemsAD`:
`KLMinRepGradDescent`, in particular, assumes that the target `LogDensityProblem` is differentiable.
If the `LogDensityProblem` has a differentiation [capability](https://www.tamaspapp.eu/LogDensityProblems.jl/dev/#LogDensityProblems.capabilities) of at least first-order, we can take advantage of this.
For this example, we will use `LogDensityProblemsAD` to equip our problem with a first-order capability:

```julia
import DifferentiationInterface
import LogDensityProblemsAD
using DifferentiationInterface: DifferentiationInterface
using LogDensityProblemsAD: LogDensityProblemsAD

model_ad = LogDensityProblemsAD.ADgradient(ADTypes.AutoReverseDiff(), model)
model_ad = LogDensityProblemsAD.ADgradient(ADTypes.AutoReverseDiff(), model);
```

For the variational family, we will consider a `MeanFieldGaussian` approximation:

```julia
using LinearAlgebra

d = LogDensityProblems.dimension(model_ad)
q = MeanFieldGaussian(zeros(d), Diagonal(ones(d)))
q = FullRankGaussian(zeros(d), LowerTriangular(Matrix{Float64}(I, d, d)))
q = MeanFieldGaussian(zeros(d), Diagonal(ones(d)));
```

The bijector can now be applied to `q` to match the support of the target problem.

```julia
b = Bijectors.bijector(model)
binv = Bijectors.inverse(b)
q_transformed = Bijectors.TransformedDistribution(q, binv)
q_transformed = Bijectors.TransformedDistribution(q, binv);
```

We can now run VI:

```julia
max_iter = 10^3
q_avg, info, _ = AdvancedVI.optimize(
alg,
max_iter,
model_ad,
q_transformed;
)
q, info, _ = AdvancedVI.optimize(alg, max_iter, model_ad, q_transformed);
```

For more examples and details, please refer to the documentation.
12 changes: 12 additions & 0 deletions docs/Project.toml
@@ -2,29 +2,41 @@
ADTypes = "47edcb42-4c32-4615-8424-f2b9edc5f35b"
AdvancedVI = "b5ca4192-6429-45e5-a2d9-87aec30a685c"
Bijectors = "76274a88-744f-5084-9051-94815aaf08c4"
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
DifferentiationInterface = "a0c0ee7d-e4b9-4e03-894e-1c5f64a51d63"
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
FillArrays = "1a297f60-69ca-5386-bcde-b61e274b549b"
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
JSON = "682c06a0-de6a-54ab-a142-c8b1cf79cde6"
LogDensityProblems = "6fdf6af0-433a-55f7-b3ed-c6c6e0b8df7c"
LogDensityProblemsAD = "996a588d-648d-4e1f-a8f0-a84b347e47b1"
OpenML = "8b6db2d4-7670-4922-a472-f9537c81ab66"
Optimisers = "3bd65402-5787-11e9-1adc-39752487f4e2"
Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
QuasiMonteCarlo = "8a4e6c94-4038-4cdc-81c3-7e6ffdb2a71b"
ReverseDiff = "37e2e3b7-166d-5795-8a7a-e32c996b4267"
StanLogDensityProblems = "a545de4d-8dba-46db-9d34-4e41d3f07807"
StatsFuns = "4c63d2b9-4356-54db-8cca-17b64c39e42c"

[compat]
ADTypes = "1"
AdvancedVI = "0.5, 0.4"
Bijectors = "0.13.6, 0.14, 0.15"
DataFrames = "1"
DifferentiationInterface = "0.7"
Distributions = "0.25"
Documenter = "1"
FillArrays = "1"
ForwardDiff = "0.10, 1"
JSON = "0.21"
LogDensityProblems = "2.1.1"
LogDensityProblemsAD = "1"
OpenML = "0.3"
Optimisers = "0.3, 0.4"
Plots = "1"
QuasiMonteCarlo = "0.3"
ReverseDiff = "1"
StanLogDensityProblems = "0.1"
StatsFuns = "1"
julia = "1.10, 1.11.2"
3 changes: 2 additions & 1 deletion docs/make.jl
@@ -17,7 +17,8 @@ makedocs(;
"AdvancedVI" => "index.md",
"General Usage" => "general.md",
"Tutorials" => [
"tutorials/basic.md",
"Basic Example" => "tutorials/basic.md",
"Ussage with Stan" => "tutorials/stan.md",
],
"Algorithms" => [
"KLMinRepGradDescent" => "paramspacesgd/klminrepgraddescent.md",
3 changes: 1 addition & 2 deletions docs/src/paramspacesgd/repgradelbo.md
@@ -82,7 +82,7 @@ q_transformed = Bijectors.TransformedDistribution(q, binv)
```

By passing `q_transformed` to `optimize`, the Jacobian adjustment for the bijector `b` is automatically applied.
(See [Examples](@ref examples) for a fully working example.)
(See the [Basic Example](@ref basic) for a fully working example.)

[^KTRGB2017]: Kucukelbir, A., Tran, D., Ranganath, R., Gelman, A., & Blei, D. M. (2017). Automatic differentiation variational inference. *Journal of Machine Learning Research*.
[^DLTBV2017]: Dillon, J. V., Langmore, I., Tran, D., Brevdo, E., Vasudevan, S., Moore, D., ... & Saurous, R. A. (2017). Tensorflow distributions. arXiv.
@@ -177,7 +177,6 @@ function Bijectors.bijector(model::NormalLogNormal)
end
```

Let us come back to the example in [Examples](@ref examples), where a `LogDensityProblem` is given as `model`.
In this example, the true posterior is contained within the variational family.
This setting is known as "perfect variational family specification."
In this case, the `RepGradELBO` estimator with `StickingTheLandingEntropy` is the only estimator known to converge exponentially fast ("linear convergence") to the true solution.