
Conversation

@devmotion
Member

This PR fixes TuringLang/DynamicPPL.jl#104 and transfers the corresponding tests to Turing. The main problem is that the formula on which I based the computation of the log evidence in #1237 is only valid if resampling is performed in every time step, or rather if the weights are reset to 1/N before every reweighting step. A reference for the more general formula (which was used before) is eq. (14) in "Sequential Monte Carlo Samplers" by P. Del Moral, A. Doucet, and A. Jasra. The fix is particularly important since by default we don't perform resampling in every step but only in an adaptive way, based on the estimated ESS.
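For concreteness, here is a minimal sketch of such an ESS-based resampling criterion; the `should_resample` helper and its `threshold` keyword are hypothetical, for illustration only, not Turing's actual implementation:

```julia
# Sketch of ESS-based adaptive resampling: resample only when the estimated
# effective sample size drops below `threshold * N`.
function should_resample(logw::AbstractVector{<:Real}; threshold = 0.5)
    m = maximum(logw)
    w = exp.(logw .- m)        # shift by the maximum for numerical stability
    W = w ./ sum(w)            # normalized weights
    ess = inv(sum(abs2, W))    # ESS estimate: 1 / ∑ᵢ Wᵢ²
    return ess < threshold * length(W)
end
```

With `threshold = 0.5` this resamples once the ESS drops below half the number of particles, a common choice for adaptive resampling.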

More formally, we save only the unnormalized logarithmic weights and accumulate them until the next resampling is performed, at which point they are reset to 0. Let $\mathrm{logw}_k^i$ denote the unnormalized logarithmic weight of the $i$th particle in the $k$th step. In the notation of Del Moral, Doucet, and Jasra we have $\mathrm{logw}_k^i = \mathrm{logw}_{k-1}^i + \log(w_k^i)$, and hence by induction normalizing $\mathrm{logw}_k^i$ yields

$$
\frac{\exp(\mathrm{logw}_k^i)}{\sum_{j=1}^N \exp(\mathrm{logw}_k^j)}
= \frac{w_k^i \exp(\mathrm{logw}_{k-1}^i)}{\sum_{j=1}^N w_k^j \exp(\mathrm{logw}_{k-1}^j)}
= \frac{w_k^i W_{k-1}^i}{\sum_{j=1}^N w_k^j W_{k-1}^j}
= W_k^i,
$$

i.e., the normalized weights in algorithm 3.1.1, as desired. Hence from eq. (14) we can compute the increase of the log evidence as

$$
\begin{aligned}
\log Z_k - \log Z_{k-1}
&= \log\biggl(\frac{Z_k}{Z_{k-1}}\biggr)
= \log\biggl(\sum_{i=1}^N W_{k-1}^i w_k^i\biggr) \\
&= \log\biggl(\sum_{i=1}^N \exp(\mathrm{logw}_{k-1}^i)\, w_k^i\biggr) - \log\biggl(\sum_{i=1}^N \exp(\mathrm{logw}_{k-1}^i)\biggr) \\
&= \log\biggl(\sum_{i=1}^N \exp\bigl(\mathrm{logw}_{k-1}^i + \log(w_k^i)\bigr)\biggr) - \log\biggl(\sum_{i=1}^N \exp(\mathrm{logw}_{k-1}^i)\biggr) \\
&= \log\biggl(\sum_{i=1}^N \exp(\mathrm{logw}_k^i)\biggr) - \log\biggl(\sum_{i=1}^N \exp(\mathrm{logw}_{k-1}^i)\biggr).
\end{aligned}
$$

Thus only if the $\mathrm{logw}_{k-1}^i$ are all 0 (such as in the initial step and directly after resampling) do we obtain the formula

$$
\log Z_k - \log Z_{k-1} = \log\biggl(\sum_{i=1}^N \exp(\mathrm{logw}_k^i)\biggr) - \log N
$$

from the reference on which the linked PR was based.
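A minimal sketch of this bookkeeping, assuming a plain vector `logws` of accumulated unnormalized logarithmic weights; the helper names `logsumexp` and `reweight!` are illustrative, not the actual `ParticleContainer` API:

```julia
# Numerically stable computation of log(∑ᵢ exp(logwᵢ)).
logsumexp(logw) = (m = maximum(logw); m + log(sum(exp, logw .- m)))

# One reweighting step: `logws` holds the accumulated unnormalized
# logarithmic weights, which are reset to 0 whenever resampling is performed.
function reweight!(logws, logevidence, increments; resample = false)
    old = logsumexp(logws)          # log ∑ᵢ exp(logw_{k-1}^i)
    logws .+= increments            # logw_k^i = logw_{k-1}^i + log(w_k^i)
    logevidence += logsumexp(logws) - old
    resample && fill!(logws, 0)     # reset weights to 1/N, i.e. logw = 0
    return logevidence
end
```

If `logws` is all zeros (in the initial step or directly after resampling), `logsumexp(logws)` reduces to `log(length(logws))`, i.e. `log(N)`, recovering the special case above.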

As a final remark, I'm a bit unsatisfied with the function name `logZ` since, as shown above, it does not compute $\log Z$ but rather the logarithm of the normalization factor of the current unnormalized weights.

@codecov

codecov bot commented May 6, 2020

Codecov Report

Merging #1266 into master will increase coverage by 0.24%.
The diff coverage is 94.44%.


@@            Coverage Diff             @@
##           master    #1266      +/-   ##
==========================================
+ Coverage   66.84%   67.08%   +0.24%     
==========================================
  Files          25       25              
  Lines        1327     1343      +16     
==========================================
+ Hits          887      901      +14     
- Misses        440      442       +2     
| Impacted Files | Coverage Δ |
| --- | --- |
| src/core/Core.jl | 100.00% <ø> (ø) |
| src/core/container.jl | 91.58% <93.02%> (-1.60%) ⬇️ |
| src/inference/AdvancedSMC.jl | 98.55% <100.00%> (+0.67%) ⬆️ |

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 16fd11c...c9d3eff.

@devmotion
Member Author

The test error on Windows is still the HMC error that #1264 is supposed to fix.

@yebai (Member) left a comment

Thanks @devmotion for fixing this - it is a very subtle issue, and I'm glad that we've now got it right.

else
    # Increase the unnormalized logarithmic weights, accounting for the variables
    # of other samplers.
    increase_logweight!(pc, i, score + getlogp(p.vi))
Member

As a side note, `getlogp(p.vi)` will always return 0, since the `assume` and `observe` functions for particle samplers do not modify `vi.logp` by default. This doesn't affect correctness, but it is worth paying attention to.


Member Author

That's not completely true (or maybe I misunderstand you): in this line, `getlogp` can actually return nonzero values due to

`acclogp!(vi, logpdf_with_trans(dist, r, istrans(vi, vn)))`

However, since we call `resetlogp!` in one of the following lines, this won't show up in the saved transitions.
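To illustrate the accumulate-then-reset behaviour described here, a toy sketch; the `ToyVarInfo` struct and `toy_*` helpers are stand-ins that only mimic the semantics, not DynamicPPL's actual `VarInfo` API:

```julia
# Toy stand-in for the log-density bookkeeping (illustrative only; not
# DynamicPPL's actual VarInfo implementation).
mutable struct ToyVarInfo
    logp::Float64
end

toy_getlogp(vi::ToyVarInfo) = vi.logp
toy_acclogp!(vi::ToyVarInfo, logp) = (vi.logp += logp; vi)
toy_resetlogp!(vi::ToyVarInfo) = (vi.logp = 0.0; vi)

vi = ToyVarInfo(0.0)
toy_acclogp!(vi, -1.3)      # e.g. log density accumulated during `assume`
toy_getlogp(vi)             # -1.3: nonzero between accumulation and reset
toy_resetlogp!(vi)          # reset during the particle sweep
toy_getlogp(vi)             # 0.0: this is what ends up in saved transitions
```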

Member

I must have missed that line, thanks for the pointer!

params = tonamedtuple(particle.vi)

# This is pretty useless since we reset the log probability continuously in the
# particle sweep.
Member

Thanks for the note; also see my comment above about the `assume` and `observe` functions.

@yebai yebai merged commit 1218317 into TuringLang:master May 6, 2020
phipsgabler pushed a commit to phipsgabler/Turing.jl that referenced this pull request May 12, 2020
* Fix particle filters with adaptive resampling and add documentation

Fixes TuringLang/DynamicPPL.jl#104

* Fix and extend tests of `ParticleContainer`

* Move logevidence tests from DynamicPPL

* Add more convenient constructors for Particle Gibbs

* Relax type annotations

* Check for approximate equality only

* Add docstring and reference


Development

Successfully merging this pull request may close these issues.

SMC test fails with Turing master
