Commit 03338d6

add docs for ProjectScale
1 parent 238128e

File tree: 4 files changed, +27 −4 lines changed

docs/src/elbo/repgradelbo.md

Lines changed: 4 additions & 3 deletions
@@ -219,7 +219,7 @@ _, _, stats_cfe, _ = AdvancedVI.optimize(
     max_iter;
     show_progress = false,
     adtype = AutoForwardDiff(),
-    optimizer = Optimisers.Adam(3e-3),
+    optimizer = ProjectScale(Optimisers.Adam(3e-3)),
     callback = callback,
 );
@@ -230,7 +230,7 @@ _, _, stats_stl, _ = AdvancedVI.optimize(
     max_iter;
     show_progress = false,
     adtype = AutoForwardDiff(),
-    optimizer = Optimisers.Adam(3e-3),
+    optimizer = ProjectScale(Optimisers.Adam(3e-3)),
     callback = callback,
 );
@@ -265,6 +265,7 @@ Furthermore, in a lot of cases, a low-accuracy solution may be sufficient.
 
 [^RWD2017]: Roeder, G., Wu, Y., & Duvenaud, D. K. (2017). Sticking the landing: Simple, lower-variance gradient estimators for variational inference. Advances in Neural Information Processing Systems, 30.
 [^KMG2024]: Kim, K., Ma, Y., & Gardner, J. (2024). Linear Convergence of Black-Box Variational Inference: Should We Stick the Landing?. In International Conference on Artificial Intelligence and Statistics (pp. 235-243). PMLR.
+
 ## Advanced Usage
 
 There are two major ways to customize the behavior of `RepGradELBO`
@@ -317,7 +318,7 @@ _, _, stats_qmc, _ = AdvancedVI.optimize(
     max_iter;
     show_progress = false,
     adtype = AutoForwardDiff(),
-    optimizer = Optimisers.Adam(3e-3),
+    optimizer = ProjectScale(Optimisers.Adam(3e-3)),
     callback = callback,
 );

docs/src/examples.md

Lines changed: 4 additions & 1 deletion
@@ -118,11 +118,14 @@ q_avg_trans, q_trans, stats, _ = AdvancedVI.optimize(
     n_max_iter;
     show_progress=false,
     adtype=AutoForwardDiff(),
-    optimizer=Optimisers.Adam(1e-3),
+    optimizer=ProjectScale(Optimisers.Adam(1e-3)),
 );
 nothing
 ```
 
+`ProjectScale` is a wrapper around an optimization rule that keeps the variational approximation within a stable region of the variational family.
+For more information, see [this section](@ref projectscale).
+
 `q_avg_trans` is the final output of the optimization procedure.
 If a parameter averaging strategy is used through the keyword argument `averager`, `q_avg_trans` is the output of the averaging strategy, while `q_trans` is the last iterate.

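For illustration, a minimal sketch of the change this hunk makes: the base rule is wrapped with `ProjectScale` before being handed to `AdvancedVI.optimize`. The setup around the call (`n_max_iter`, the target problem, the initial approximation) is assumed from the surrounding example.

```julia
# Sketch of the wrapped rule introduced by this commit: ProjectScale adds a
# projection step on top of a base Optimisers.jl rule.
using Optimisers
using AdvancedVI

rule      = Optimisers.Adam(1e-3)  # base first-order update rule
optimizer = ProjectScale(rule)     # same updates, plus the scale projection

# The wrapped rule is then passed via the `optimizer` keyword, as in the diff:
#   AdvancedVI.optimize(...; optimizer=optimizer, ...)
```
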
docs/src/families.md

Lines changed: 10 additions & 0 deletions
@@ -56,6 +56,16 @@ FullRankGaussian
 MeanFieldGaussian
 ```
 
+### [Scale Projection Operator](@id projectscale)
+For the location-scale family, optimization is often stable only when the smallest eigenvalue of the scale matrix is strictly positive[^D2020].
+To ensure this, we provide the following wrapper around an optimization rule:
+
+```@docs
+ProjectScale
+```
+
+[^D2020]: Domke, J. (2020). Provable smoothness guarantees for black-box variational inference. In *International Conference on Machine Learning*.
+
 ### Gaussian Variational Families
 
 ```julia

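To make the projection concrete, the following is a conceptual sketch for the mean-field (diagonal scale) case. `project_scale_diagonal` is a hypothetical helper written for illustration; it is not the package's actual implementation.

```julia
# Conceptual sketch: for a diagonal scale matrix, the eigenvalues are the
# diagonal entries, so the projection amounts to clamping each entry from
# below by scale_eps. This keeps the smallest eigenvalue strictly positive.
using LinearAlgebra

function project_scale_diagonal(L::Diagonal, scale_eps::Real)
    return Diagonal(max.(diag(L), scale_eps))
end

L = Diagonal([0.5, 1e-9, 2.0])
project_scale_diagonal(L, 1e-5)  # Diagonal([0.5, 1.0e-5, 2.0])
```
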
src/families/location_scale.jl

Lines changed: 9 additions & 0 deletions
@@ -140,6 +140,15 @@ function MeanFieldGaussian(μ::AbstractVector{T}, L::Diagonal{T}) where {T<:Real
     return MvLocationScale(μ, L, Normal{T}(zero(T), one(T)))
 end
 
+"""
+    ProjectScale(rule, scale_eps)
+
+Compose an optimization `rule` with a projection, where the projection ensures that a `LocationScale` or `LocationScaleLowRank` has a scale with eigenvalues larger than `scale_eps`.
+
+# Arguments
+- `rule::Optimisers.AbstractRule`: Optimization rule to compose with the projection.
+- `scale_eps::Real`: Lower bound on the eigenvalues of the scale matrix enforced by the projection.
+"""
 struct ProjectScale{Rule<:Optimisers.AbstractRule,F<:Real} <: Optimisers.AbstractRule
     rule::Rule
     scale_eps::F

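A usage sketch based on the docstring above. The explicit `scale_eps` of `1e-5` is an illustrative value; the one-argument form used elsewhere in these docs suggests a default is provided when it is omitted.

```julia
# Compose Adam with the projection, bounding the eigenvalues of the scale
# matrix from below by 1e-5 (illustrative, not a recommended setting).
using Optimisers
using AdvancedVI

opt = ProjectScale(Optimisers.Adam(1e-3), 1e-5)
opt.rule       # the wrapped Optimisers.Adam rule
opt.scale_eps  # 1.0e-5
```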