Add a page on predictive distributions #658
Merged · +155 −0 · 11 commits

Commits (all by penelopeysm):

- e051ca4 Add a page on predictive distributions
- 3fc1674 Text
- 2a45427 More explanation
- 2afb11e a word
- 55c11f1 Merge branch 'main' into py/predict
- 5f85e62 Merge branch 'main' into py/predict
- 48cdd89 Changes from review
- 80d9954 add notes
- 6b19aca Merge branch 'main' into py/predict
- ef2c774 Merge branch 'main' into py/predict
- 4b1eba1 Fix data plurality

---
title: Predictive Distributions
engine: julia
---

```{julia}
#| echo: false
#| output: false
using Pkg;
Pkg.instantiate();
```

Standard MCMC sampling methods return samples of the parameters of the model.
However, it is often also useful to generate new data points using the model, given a distribution of the parameters.
Turing.jl allows you to do this using the `predict` function, together with its conditioning syntax.

Consider the following simple model, where we observe some normally distributed data `X` and want to learn about its mean `m`.

```{julia}
using Turing

@model function f(N)
    m ~ Normal()
    X ~ filldist(Normal(m), N)
end
```

Notice first that we have not specified `X` as an argument to the model.
This lets us use Turing's conditioning syntax to choose whether or not to provide observed data for `X`.

::: {.callout-note}
If you do want to specify `X` as an argument to the model, then to mark it as unobserved you have to instantiate the model again with `X = missing` or `X = fill(missing, N)`.
Whether you use `missing` or `fill(missing, N)` depends on whether `X` is treated as a single distribution (e.g. with `filldist` or `product_distribution`) or as multiple independent distributions (e.g. with `.~` or a for loop over `eachindex(X)`).
This is rather finicky, so we recommend the approach used on this page: conditioning and deconditioning `X` as a whole works regardless of how `X` is defined in the model.
A sketch of the argument-based alternative follows just below this note.
:::

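For concreteness, here is a minimal sketch of that argument-based alternative (the model name `g` is purely illustrative). Because `X ~ filldist(...)` treats `X` as a single multivariate draw, passing `X = missing` marks the whole variable as unobserved:

```{julia}
# Hypothetical argument-based version of the model above.
@model function g(X, N)
    m ~ Normal()
    # X is a single multivariate distribution here, so `X = missing`
    # (rather than `fill(missing, N)`) marks it as unobserved.
    X ~ filldist(Normal(m), N)
end

unobserved_model = g(missing, 5)
rand(unobserved_model)  # samples both m and X
```
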
```{julia}
# Generate some synthetic data
N = 5
true_m = 3.0
X = rand(Normal(true_m), N)

# Instantiate the model with observed data
model = f(N) | (; X = X)

# Sample from the posterior
chain = sample(model, NUTS(), 1_000; progress=false)
mean(chain[:m])
```

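As an aside, the `|` operator used above is shorthand for DynamicPPL's `condition` function, so the following sketch builds an equivalent conditioned model (the binding `model_conditioned` is just an illustrative name):

```{julia}
# Equivalent to `f(N) | (; X = X)`.
model_conditioned = condition(f(N), (; X = X))
```
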
## Posterior predictive distribution

`chain[:m]` now contains samples from the posterior distribution of `m`.
If we use these samples of the parameters to generate new data points, we obtain the *posterior predictive distribution*.
Formally, this is defined as

$$
p(\tilde{x} \mid \mathbf{X}) = \int p(\tilde{x} \mid \theta) \, p(\theta \mid \mathbf{X}) \, \mathrm{d}\theta,
$$

where $\tilde{x}$ are the new data which you wish to draw, $\theta$ are the model parameters, and $\mathbf{X}$ are the observed data.
Here, $p(\tilde{x} \mid \theta)$ is the distribution of the new data given the parameters, which is specified in the Turing.jl model (the `X ~ ...` line), and $p(\theta \mid \mathbf{X})$ is the posterior distribution, as represented by the Markov chain.

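Since the chain already contains draws of $\theta$ (here, of `m`), this integral can be approximated by plain Monte Carlo: for each posterior draw of `m`, sample one new dataset from the corresponding likelihood. Here is a minimal by-hand sketch of that idea (`manual_predictive` is just an illustrative name); this is essentially what `predict` automates below:

```{julia}
# One new dataset of length N per posterior draw of m: together these
# draws approximate the posterior predictive distribution.
manual_predictive = [rand(Normal(m_i), N) for m_i in vec(chain[:m])]
length(manual_predictive)
```
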
To obtain samples of $\tilde{x}$, we first need to remove the observed data from the model (i.e. 'decondition' it).
This means that when the model is evaluated, it will sample a new value for `X`.
If you don't decondition the model, then `X` will remain fixed to the observed data, and no new samples will be generated.

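You can see this directly: drawing from the still-conditioned `model` object from above yields only `m`, because `X` is fixed to the observations (a quick sanity check):

```{julia}
# Only m appears in the draw; the conditioned X is not resampled.
rand(model)
```
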
```{julia}
predictive_model = decondition(model)
```

::: {.callout-tip}
## Selective deconditioning

If you only want to decondition a single variable `X`, you can use `decondition(model, @varname(X))`.
:::

To demonstrate how this deconditioned model can generate new data, we can fix the value of `m` to its posterior mean and evaluate the model:

```{julia}
predictive_model_with_mean_m = predictive_model | (; m = mean(chain[:m]))
rand(predictive_model_with_mean_m)
```

This gives us a single sample of `X` given the mean value of `m`.
Of course, to take our Bayesian uncertainty into account, we want to use the full posterior distribution of `m`, not just its mean.
To do so, we use `predict`, which _effectively_ does the same as the above, but for every sample in the chain:

```{julia}
predictive_samples = predict(predictive_model, chain)
```

::: {.callout-tip}
## Reproducibility

Like many other Julia functions that involve randomness, `predict` takes an optional `rng` as its first argument.
This controls the generation of the new `X` samples and makes your results reproducible.
:::

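For example, a small sketch using a seeded generator from Julia's standard library (the seed value is arbitrary):

```{julia}
using Random: Xoshiro

# Passing a seeded RNG as the first argument makes the predictions reproducible.
predictive_samples_reproducible = predict(Xoshiro(468), predictive_model, chain)
```
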
::: {.callout-note}
`predict` itself returns a `Chains` object, which contains only the newly predicted variables.
If you also want to retain the original parameters, you can use `predict(rng, predictive_model, chain; include_all=true)`.
:::

We can visualise the predictive distribution by combining all the samples and making a density plot:

```{julia}
using StatsPlots: density, density!, vline!

predicted_X = vcat([predictive_samples[Symbol("X[$i]")] for i in 1:N]...)
density(predicted_X, label="Posterior predictive")
```

Depending on your data, you may naturally want to create different visualisations.
For example, if `X` contained time-series data, you could plot each prediction individually as a line against time, as sketched below.

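Here is a rough sketch of that idea, treating the index `1:N` as a stand-in for time (the variable names are purely illustrative):

```{julia}
using StatsPlots: plot

# Each column of `trajectories` is one predicted dataset;
# plot them all as faint overlapping lines against the index.
n_draws = size(predictive_samples, 1)
trajectories = [predictive_samples[Symbol("X[$i]")][j] for i in 1:N, j in 1:n_draws]
plot(1:N, trajectories; alpha=0.1, color=:grey, legend=false)
```
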
## Prior predictive distribution

Alternatively, if we use the prior distribution of the parameters, $p(\theta)$, we obtain the *prior predictive distribution*:

$$
p(\tilde{x}) = \int p(\tilde{x} \mid \theta) \, p(\theta) \, \mathrm{d}\theta.
$$

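The by-hand analogue is ancestral sampling: draw `m` from its prior, then draw `X` given that `m` (again, `manual_prior_predictive` is just an illustrative name):

```{julia}
# Draw m ~ Normal(), then a dataset X ~ Normal(m) of length N; repeat 1_000 times.
manual_prior_predictive = [rand(Normal(rand(Normal())), N) for _ in 1:1_000]
```
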
In an exactly analogous fashion to the above, you could sample from the prior of the conditioned model and _then_ pass those samples to `predict`:

```{julia}
prior_params = sample(model, Prior(), 1_000; progress=false)
prior_predictive_samples = predict(predictive_model, prior_params)
```

In fact, there is a simpler way: you can sample directly from the deconditioned model using Turing's `Prior` sampler.
This generates, in a single call, prior samples for both the parameters and the new data.

```{julia}
prior_predictive_samples = sample(predictive_model, Prior(), 1_000; progress=false)
```

We can visualise the prior predictive distribution in the same way as before.
Let's compare the two predictive distributions:

```{julia}
prior_predicted_X = vcat([prior_predictive_samples[Symbol("X[$i]")] for i in 1:N]...)
density(prior_predicted_X, label="Prior predictive")
density!(predicted_X, label="Posterior predictive")
vline!([true_m], label="True mean", linestyle=:dash, color=:black)
```

We can see here that the prior predictive distribution is:

1. Wider than the posterior predictive distribution; and
2. Centred on the prior mean of `m` (which is 0), rather than on the posterior mean (which is close to the true mean of 3).

Both of these features arise because the posterior predictive distribution has been informed by the observed data.

Review discussion:

> What happens if we don't decondition?

If you don't decondition, then it'll just be a plain old observation, so it won't be sampled again.

I added a line to make that super clear!