Commit 56a3b78

Add documentation to README (#43)
1 parent a08d1f8 commit 56a3b78

1 file changed

+178
-0
lines changed

README.md

Lines changed: 178 additions & 0 deletions
@@ -1,2 +1,180 @@
# AbstractMCMC.jl

Abstract types and interfaces for Markov chain Monte Carlo methods.

[![Build Status](https://travis-ci.com/TuringLang/AbstractMCMC.jl.svg?branch=master)](https://travis-ci.com/TuringLang/AbstractMCMC.jl)
[![Codecov](https://codecov.io/gh/TuringLang/AbstractMCMC.jl/branch/master/graph/badge.svg)](https://codecov.io/gh/TuringLang/AbstractMCMC.jl)
[![Coveralls](https://coveralls.io/repos/github/TuringLang/AbstractMCMC.jl/badge.svg?branch=master)](https://coveralls.io/github/TuringLang/AbstractMCMC.jl?branch=master)

## Overview

AbstractMCMC defines an interface for sampling and combining Markov chains.
It comes with a default sampling algorithm that supports progress bars,
parallel sampling (multithreaded and multicore), and user-provided callbacks
out of the box. Typically, developers only have to define the sampling step
of their inference method in an iterator-like fashion to make use of this
functionality. Additionally, the package defines an iterator and a transducer
for sampling Markov chains based on the interface.

## User-facing API

The user-facing sampling API consists of
```julia
StatsBase.sample(
    [rng::Random.AbstractRNG,]
    model::AbstractMCMC.AbstractModel,
    sampler::AbstractMCMC.AbstractSampler,
    nsamples[;
    kwargs...]
)
```
and
```julia
StatsBase.sample(
    [rng::Random.AbstractRNG,]
    model::AbstractMCMC.AbstractModel,
    sampler::AbstractMCMC.AbstractSampler,
    parallel::AbstractMCMC.AbstractMCMCParallel,
    nsamples::Integer,
    nchains::Integer[;
    kwargs...]
)
```
for regular and parallel sampling, respectively. In regular sampling, users may
provide a function
```julia
isdone(rng, model, sampler, samples, iteration; kwargs...)
```
that returns `true` when sampling should end, and `false` otherwise, instead of
a fixed number of samples `nsamples`. AbstractMCMC defines the abstract types
`AbstractMCMC.AbstractModel`, `AbstractMCMC.AbstractSampler`, and
`AbstractMCMC.AbstractMCMCParallel` for models, samplers, and parallel sampling
algorithms, respectively. Two algorithms, `MCMCThreads` and `MCMCDistributed`,
are provided for parallel sampling with multiple threads and multiple processes,
respectively.

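
As a rough illustration of such a stopping criterion, the sketch below defines an `isdone` function with the argument order shown above. The rule itself (a naive standard-error threshold) and the keyword names `tol`, `min_iters`, and `max_iters` are assumptions made for this example, not part of AbstractMCMC; the function is called directly with placeholder arguments so that it runs without a model or sampler.

```julia
using Statistics: std

# Hypothetical stopping rule: stop once the naive standard error of the mean
# drops below `tol`, but never before `min_iters` or after `max_iters`.
# The estimate ignores autocorrelation, so it is illustrative only.
function isdone(rng, model, sampler, samples, iteration;
                tol = 0.05, min_iters = 10, max_iters = 10_000, kwargs...)
    iteration < min_iters && return false
    iteration >= max_iters && return true
    return std(samples) / sqrt(length(samples)) < tol
end

# Direct calls with placeholders for rng, model, and sampler.
few = [0.0, 1.0, 0.0, 1.0]
many = repeat([0.0, 1.0], 500)
done_early = isdone(nothing, nothing, nothing, few, 4)       # false: below `min_iters`
done_late = isdone(nothing, nothing, nothing, many, 1_000)   # true: standard error is about 0.016
```

As described above, such a function would be passed to `StatsBase.sample` in place of `nsamples`. The exact argument list may depend on the package version, so the docstring is worth checking.
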

The function
```julia
AbstractMCMC.steps([rng::Random.AbstractRNG, ]model::AbstractModel, sampler::AbstractSampler[; kwargs...])
```
returns an iterator that yields samples indefinitely, without a predefined
stopping condition. Similarly,
```julia
AbstractMCMC.Sample([rng::Random.AbstractRNG, ]model::AbstractModel, sampler::AbstractSampler[; kwargs...])
```
returns a transducer that yields samples indefinitely.


Common keyword arguments for regular and parallel sampling (not supported by the iterator and transducer)
are:
- `progress` (default: `true`): toggles progress logging
- `chain_type` (default: `Any`): determines the type of the returned chain
- `callback` (default: `nothing`): if `callback !== nothing`, then
  `callback(rng, model, sampler, sample, iteration)` is called after every sampling step,
  where `sample` is the most recent sample of the Markov chain and `iteration` is the current iteration

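
A callback with the signature listed above could look like the following sketch. The helper `make_recorder` and its `every` keyword are hypothetical names introduced for this example; the loop at the end imitates the sampler by calling the callback by hand with placeholder arguments.

```julia
# Hypothetical callback factory: returns a closure that stores every `every`-th
# sample in `trace`, using the documented argument order.
function make_recorder(trace::Vector; every::Integer = 10)
    return function (rng, model, sampler, sample, iteration)
        iteration % every == 0 && push!(trace, sample)
        return nothing
    end
end

trace = Float64[]
callback = make_recorder(trace; every = 2)

# Imitate four sampling steps (placeholders stand in for rng, model, sampler).
for (iteration, sample) in enumerate([0.1, 0.2, 0.3, 0.4])
    callback(nothing, nothing, nothing, sample, iteration)
end
trace  # [0.2, 0.4]
```

In real use the closure would be passed as `callback = callback` to `StatsBase.sample`; the positional arguments it receives may differ between package versions.
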

Additionally, AbstractMCMC defines the abstract type `AbstractChains` for Markov chains and the
method `AbstractMCMC.chainscat(::AbstractChains...)` for concatenating multiple chains
(defaults to `cat(::AbstractChains...; dims = 3)`).

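
To make the default concrete, the sketch below applies `cat(...; dims = 3)` to plain arrays. Plain arrays are not `AbstractChains`, and the iterations × parameters × chains layout is an assumption made for this example; the point is only to show what concatenation along the third dimension does.

```julia
# Two toy "chains" with 100 iterations and 2 parameters each, stored as
# iterations × parameters × chains arrays (assumed layout).
chain1 = zeros(100, 2, 1)
chain2 = ones(100, 2, 1)

# Concatenating along the third dimension stacks the chains side by side.
combined = cat(chain1, chain2; dims = 3)
size(combined)  # (100, 2, 2)
```
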

Note that AbstractMCMC exports only `MCMCThreads` and `MCMCDistributed` (and in
particular not `StatsBase.sample`).

## Developer documentation: Default implementation

AbstractMCMC provides a default implementation of the user-facing interface described
above. You can ignore it entirely and define your own implementation of the
interface. However, as described below, in most use cases the default implementation
gives you support for parallel sampling, progress logging, callbacks, iterators,
and transducers for free once you define the sampling step of your inference algorithm,
drastically reducing the amount of code you have to write. In general, the docstrings
of the functions described below might be helpful if you intend to make use of the default
implementations.

### Basic structure

The simplified structure for regular sampling (the actual implementation contains
some additional error checks and support for progress logging and callbacks) is
```julia
function StatsBase.sample(
    rng::Random.AbstractRNG,
    model::AbstractMCMC.AbstractModel,
    sampler::AbstractMCMC.AbstractSampler,
    nsamples::Integer;
    chain_type::Type = Any,
    kwargs...
)
    # Obtain the initial sample and state.
    sample, state = AbstractMCMC.step(rng, model, sampler; kwargs...)

    # Create the container and save the initial sample.
    samples = AbstractMCMC.samples(sample, model, sampler, nsamples; kwargs...)
    samples = AbstractMCMC.save!!(samples, sample, 1, model, sampler, nsamples; kwargs...)

    # Step through the sampler.
    for i in 2:nsamples
        # Obtain the next sample and state.
        sample, state = AbstractMCMC.step(rng, model, sampler, state; kwargs...)

        # Save the sample.
        samples = AbstractMCMC.save!!(samples, sample, i, model, sampler, nsamples; kwargs...)
    end

    # Convert the collected samples to the requested chain type.
    return AbstractMCMC.bundle_samples(samples, model, sampler, state, chain_type; kwargs...)
end
```
All other default implementations make use of the same structure and in particular
call the same methods.

### Sampling step

The only method for which no default implementation is provided (and hence which
downstream packages *have* to implement) is `AbstractMCMC.step`,
which defines the sampling step of the inference method. In the initial step it is
called as
```julia
AbstractMCMC.step(rng, model, sampler; kwargs...)
```
whereas in all subsequent steps it is called as
```julia
AbstractMCMC.step(rng, model, sampler, state; kwargs...)
```
where `state` denotes the current state of the sampling algorithm. It should return
a 2-tuple consisting of the next sample and the updated state of the sampling algorithm.
Hence `AbstractMCMC.step` can be viewed as an extended version of
[`Base.iterate`](https://docs.julialang.org/en/v1/base/collections/#lib-collections-iteration-1)
with additional positional and keyword arguments.

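
The two-method pattern can be sketched with a toy random-walk Metropolis sampler for a standard normal target. The types `StdNormalModel` and `RandomWalkMH` and the helper `logdensity` are hypothetical names defined only for this example, and the sketch assumes that AbstractMCMC and StatsBase are installed; it is not taken from the package.

```julia
using AbstractMCMC
using Random
using StatsBase

# Hypothetical model and sampler types, defined only for this example.
struct StdNormalModel <: AbstractMCMC.AbstractModel end

struct RandomWalkMH <: AbstractMCMC.AbstractSampler
    stepsize::Float64
end

# Unnormalized log density of the target (standard normal).
logdensity(::StdNormalModel, x::Real) = -x^2 / 2

# Initial step: called without a state. The sample doubles as the state here.
function AbstractMCMC.step(
    rng::Random.AbstractRNG, model::StdNormalModel, sampler::RandomWalkMH; kwargs...
)
    x = randn(rng)
    return x, x
end

# Subsequent steps: propose a move, then accept or reject it.
function AbstractMCMC.step(
    rng::Random.AbstractRNG, model::StdNormalModel, sampler::RandomWalkMH, state::Float64;
    kwargs...
)
    proposal = state + sampler.stepsize * randn(rng)
    logratio = logdensity(model, proposal) - logdensity(model, state)
    x = log(rand(rng)) < logratio ? proposal : state
    return x, x
end

rng = Random.MersenneTwister(1)
chain = StatsBase.sample(rng, StdNormalModel(), RandomWalkMH(0.5), 1_000; progress = false)

# The same two methods also drive the iterator; take a few samples from it.
stepper = AbstractMCMC.steps(rng, StdNormalModel(), RandomWalkMH(0.5))
first_five = collect(Iterators.take(stepper, 5))
```

With the default `chain_type`, `chain` is expected to be the plain collection of samples. Using the sample itself as the state is a simplification; a real sampler would typically carry extra information such as cached log densities or adaptation parameters.
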
### Collecting samples (does not apply to the iterator and transducer)

After the initial sample is obtained, the default implementations for regular and parallel sampling
(not for the iterator and the transducer, where it is not needed) create a container for all
samples (the initial one and all subsequent samples) using `AbstractMCMC.samples`. By default,
`AbstractMCMC.samples` just returns a concretely typed `Vector` with the initial sample as its single
entry. If the total number of samples is fixed, we use `sizehint!` to suggest that the container
reserve capacity for all samples to improve performance.

In each step, the sample is saved in the container by `AbstractMCMC.save!!`. The notation `!!`
follows the convention of the package [BangBang.jl](https://github.com/JuliaFolds/BangBang.jl),
which is used in the default implementation of `AbstractMCMC.save!!`. It indicates that the
sample is pushed to the container, but a "widening" fallback is used if the container type
cannot hold the sample. Therefore `AbstractMCMC.save!!` *always has* to return the container.

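
The `!!` convention can be seen in isolation with BangBang's `push!!`. This sketch assumes that BangBang.jl is installed and that `push!!` is the relevant operation, as the word "pushed" above suggests; it does not call `AbstractMCMC.save!!` itself.

```julia
using BangBang: push!!

samples = [1, 2]              # Vector{Int}

# Compatible element: the vector is mutated in place and returned.
same = push!!(samples, 3)
same === samples              # true

# Incompatible element: a new, wider vector is returned instead,
# and the original vector is left untouched.
wider = push!!(samples, 4.5)
eltype(wider)                 # Float64
```

Because the returned container may or may not be the original one, callers always rebind the result, which is why the loop in the basic structure writes `samples = AbstractMCMC.save!!(...)`.
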

For most use cases the default implementations of `AbstractMCMC.samples` and `AbstractMCMC.save!!`
should work out of the box and hence need not be overloaded in downstream code. Please have
a look at the docstrings of `AbstractMCMC.samples` and `AbstractMCMC.save!!` if you intend
to overload these functions.

### Creating chains (does not apply to the iterator and transducer)

At the end of the sampling procedure for regular and parallel sampling (not for the iterator
and the transducer) we transform the collection of samples to the desired output type by
calling
```julia
AbstractMCMC.bundle_samples(samples, model, sampler, state, chain_type; kwargs...)
```
where `samples` is the collection of samples, `state` is the final state of the sampler,
and `chain_type` is the desired return type. The default implementation in AbstractMCMC
just returns the collection `samples`.

The default implementation should be fine in most use cases, but downstream packages
could, e.g., save the final state of the sampler as well if they overload `AbstractMCMC.bundle_samples`.
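
An overload that keeps the final state could be sketched as follows. `ToyModel`, `ToySampler`, and `ToyChain` are hypothetical types introduced for this example, and the method is called directly with placeholder values so that no `step` implementation is needed; the sketch assumes AbstractMCMC is installed.

```julia
using AbstractMCMC

# Hypothetical types, defined only for this example.
struct ToyModel <: AbstractMCMC.AbstractModel end
struct ToySampler <: AbstractMCMC.AbstractSampler end

# A chain type that stores the final sampler state next to the samples.
struct ToyChain{S,T}
    samples::Vector{S}
    final_state::T
end

# Selected when `ToyChain` is requested as the chain type for these model and sampler types.
function AbstractMCMC.bundle_samples(
    samples::Vector, ::ToyModel, ::ToySampler, state, ::Type{ToyChain}; kwargs...
)
    return ToyChain(samples, state)
end

# Direct call with placeholder samples and state.
chain = AbstractMCMC.bundle_samples([0.1, 0.2], ToyModel(), ToySampler(), 0.2, ToyChain)
```

Given `step` methods for these types, passing `chain_type = ToyChain` to `StatsBase.sample` would be expected to dispatch to this method at the end of sampling.
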
