Conversation

@trappmartin
Member

@trappmartin trappmartin commented Oct 22, 2018

This is a work-in-progress PR integrating the existing code of #370 and #374 for random partitions.

TODO:

Changes to code base:

  • Added BNP priors

cc: @mlomeli1 , @emilemathieu

@mlomeli1

> This is a work-in-progress PR integrating the existing code of #370 and #374 for random partitions.
>
> Do not merge!
>
> TODO:
>
>   • Add tests for distributions.
>   • Add missing code.
>   • Modify Turing.Chain to work with missing values.
>   • Add a DP-MM example.
>
> cc: @mlomeli1 , @emilemathieu

That's great, @trappmartin. If you want to have a look at https://github.com/mlomeli1/SMC-MPhilproject/tree/master/MFM_for_Turing , there is some code for SMC for DPMs as well. I believe you have access to this private repo. I also have my PhD Matlab code for the Q-class; let me know if you would like access to that repo, if it's useful :)

@emilemathieu
Collaborator

Looking forward to that :)
What's still missing, then? I'm glad to answer questions if that helps.
Cheers,
Emile

@trappmartin
Member Author

Thanks, @emilemathieu and @mlomeli1!
There are still a few things breaking, but I'll keep you up to date.

I changed the implementation of BNP priors by separating the representation from the stochastic process. For example, a Pitman-Yor process can now be constructed as follows:

a = 0.5
θ = 0.1
t = 2

# stick-breaking representation
d = StickBreakingProcess(PitmanYorProcess(a, θ, t))

# size-biased sampling representation
surplus = 2.0
d = SizeBiasedSamplingProcess(PitmanYorProcess(a, θ, t), surplus)

# CRP representation
cluster_counts = [2, 1]
d = ChineseRestaurantProcess(PitmanYorProcess(a, θ, t), cluster_counts)

Let me know what you think of the new interface. I hope it's easier to use and allows us to have a more flexible interface for BNP priors.
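To make the CRP representation concrete: under a Pitman-Yor process with discount a and concentration θ, the predictive probability of joining an existing cluster k with count n_k is proportional to n_k − a, and that of opening a new cluster is proportional to θ + aK. Here is a plain-Python sketch of those weights (illustrative only, not the Turing implementation; the function name is made up):

```python
def pitman_yor_crp_weights(a, theta, counts):
    """Predictive (CRP) weights of a Pitman-Yor process with
    discount a and concentration theta, given per-cluster counts."""
    K = len(counts)                       # number of occupied clusters
    n = sum(counts)                       # customers seated so far
    Z = n + theta                         # normalising constant
    existing = [(n_k - a) / Z for n_k in counts]
    new = (theta + a * K) / Z             # weight of opening a new cluster
    return existing + [new]

# With a = 0.5, θ = 0.1 and cluster_counts = [2, 1] as above:
# weights ∝ [1.5, 0.5, 1.1], normalised by 3.1.
```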

Cheers,
Martin

@emilemathieu
Collaborator

I believe such an interface is way better!
The only remaining question is how such processes are represented internally, and thus how inference (e.g. SMC) can be performed.

@trappmartin
Member Author

Thanks. I think it should not have much of an influence on the sampling process. I'll know soon. :D

@trappmartin trappmartin self-assigned this Dec 7, 2018
@trappmartin
Member Author

Chinese Restaurant Process Example using current implementation:

@model infiniteMM(y; H = Normal(mean(y), std(y) * 2), rpm = DirichletProcess(0.1)) = begin

    # Latent assignments.
    N = length(y)
    z = tzeros(Int, N)

    # Cluster counts.
    cluster_counts = tzeros(Int, N)

    # Cluster locations.
    x = tzeros(Float64, N)

    for i in 1:N

        # Draw assignments using a CRP.
        z[i] ~ ChineseRestaurantProcess(rpm, cluster_counts)
        if cluster_counts[z[i]] == 0
            # Cluster is new, therefore, draw new location.
            x[z[i]] ~ H
        end
        cluster_counts[z[i]] += 1

        # Draw observation.
        y[i] ~ Normal(x[z[i]], 0.5)
    end
    return z
end
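For intuition, the same generative story can be simulated outside Turing. This is a plain-Python sketch of a DP(α) Chinese restaurant process, where an occupied table k has weight n_k and a new table has weight α (hypothetical helper, not part of this PR):

```python
import random

def simulate_crp(alpha, N, seed=0):
    """Simulate N table assignments from a DP(alpha) Chinese
    restaurant process."""
    rng = random.Random(seed)
    counts = []                           # customers per table
    z = []                                # table assignment of each customer
    for _ in range(N):
        weights = counts + [alpha]        # occupied tables, then a new table
        u = rng.random() * sum(weights)
        k, acc = 0, 0.0
        for k, w in enumerate(weights):
            acc += w
            if u < acc:
                break
        if k == len(counts):              # a new table was opened
            counts.append(0)
        counts[k] += 1
        z.append(k)
    return z, counts
```

The first customer always opens table 0, and every table index that appears in `z` has a positive count, mirroring the `cluster_counts` bookkeeping in the model above.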

@emilemathieu
Collaborator

Nice! Do you think that interface easily extends to the NIGP and such?

@trappmartin
Member Author

Yes, I'm pretty certain this should be possible. The focus of this PR is only on the DP and PYP, but it totally makes sense to extend the code after merging.

@trappmartin
Member Author

Oops, I think I broke something. o.O

@trappmartin trappmartin changed the title [WIP] BNP priors for random partitions BNP priors for random partitions Dec 10, 2018
@emilemathieu
Collaborator

Changing data = vcat(rand(Normal(0, 0.5), 10), rand(Normal(8, 0.5), 10)) to data = vcat(rand(Normal(0, 0.5), 10), rand(Normal(1, 0.5), 10)) yields an error.
I was pushing the clusters closer together to try to get different values for the particles, since they are all equal...

@trappmartin
Member Author

trappmartin commented Mar 11, 2019

I added the test for the stick-breaking representation. However, I'm a bit unsure whether my implementation is correct or necessary, or whether we can use the one by @emilemathieu.

Here is my version of a truncated stick-breaking in Turing.

@model sbimm(y, rpm, trunc) = begin
    # Base distribution (mu_0, sigma_0 and sigma_1 are assumed to be in scope).
    H = Normal(mu_0, sigma_0)

    # Latent assignments.
    N = length(y)
    z = tzeros(Int, N)

    # Slice variables.
    u = tzeros(Float64, N)

    # Truncated collection of stick pieces and weights.
    v = tzeros(Float64, trunc)
    w = tzeros(Float64, trunc)
    K = 0

    # Cluster locations.
    x = tzeros(Float64, trunc)

    for i in 1:N

        # Draw a slice ∈ [0,1].
        u[i] ~ Beta(1, 1)

        # Instantiate new clusters until the sticks cover the slice.
        while (sum(w) < u[i]) && (K < trunc)
            K += 1
            v[K] ~ StickBreakingProcess(rpm)
            x[K] ~ H
            w[K] = v[K] * prod(1 .- v[1:(K-1)])
        end

        # Find the truncation point.
        K_ = findfirst(u[i] .< cumsum(w))

        # Sample assignments.
        w_ = w[1:K_] / sum(w[1:K_])
        z[i] ~ Categorical(w_)

        # Draw observation.
        y[i] ~ Normal(x[z[i]], sigma_1)
    end
end
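For reference, the stick-breaking construction drawn from by StickBreakingProcess can be sketched in a few lines of plain Python: for a Pitman-Yor(a, θ) process the stick proportions are v_k ~ Beta(1 − a, θ + k·a), with a = 0 recovering the Dirichlet process (illustrative sketch under those assumptions, not the Turing implementation):

```python
import random

def stick_breaking_weights(a, theta, trunc, seed=0):
    """Truncated stick-breaking for a Pitman-Yor(a, theta) process:
    v_k ~ Beta(1 - a, theta + k * a), w_k = v_k * prod_{j<k} (1 - v_j)."""
    rng = random.Random(seed)
    weights, remaining = [], 1.0
    for k in range(1, trunc + 1):
        v = rng.betavariate(1.0 - a, theta + k * a)
        weights.append(v * remaining)     # w_k
        remaining *= 1.0 - v              # unallocated stick length
    return weights, remaining             # remaining plays the role of the surplus
```

The truncated weights plus the surplus always sum to one, which is why the model above renormalises w[1:K_] before sampling the assignments.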

@emilemathieu and @yebai what are your thoughts?

@emilemathieu
Collaborator

Hi Martin! This seems to be a valid implementation of a stick-breaking process :)
PS: it is inefficient, though, as we argue in our workshop paper.

@trappmartin
Member Author

@cpfiffer the tests of this PR seem to be broken due to some bug in displaying MCMCChains. Can you have a look?

@yebai
Member

yebai commented Mar 11, 2019

> @cpfiffer the tests of this PR seem to be broken due to some bug in displaying MCMCChains. Can you have a look?

It's solved on the master branch; you need to rebase master into this PR.

@trappmartin
Member Author

trappmartin commented Mar 11, 2019

@emilemathieu I made some minor adjustments to your code for the SBS. Could you let me know whether this is correct for simulation-based sampling? I'm still not too familiar with the SBS and should probably read the paper again once I find the time. :)

Based on: https://github.com/TuringLang/Turing.jl/blob/project-bnp/test/rpm.jl/imm.jl

Thanks!

@model sbsimm(y, rpm) = begin
    # Base distribution (mu_0, sigma_0 and sigma_1 are assumed to be in scope).
    H = Normal(mu_0, sigma_0)

    # Latent assignments.
    N = length(y)
    z = tzeros(Int, N)

    # Cluster locations and jump sizes.
    x = tzeros(Float64, N)
    J = tzeros(Float64, N)

    k = 0
    surplus = 1.0

    for i in 1:N
        # Assign to an existing cluster (weights J[1:k]) or a new one (surplus).
        ps = vcat(J[1:k], surplus)
        z[i] ~ Categorical(ps)
        if z[i] > k
            k += 1
            J[k] ~ SizeBiasedSamplingProcess(rpm, surplus)
            x[k] ~ H
            surplus -= J[k]
        end
        y[i] ~ Normal(x[z[i]], sigma_1)
    end
end
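For intuition, the size-biased construction repeatedly carves a Beta fraction off the remaining surplus; for a DP(θ) those fractions are Beta(1, θ), since the DP's stick-breaking weights are invariant under size-biased reordering. A plain-Python sketch under that assumption (not the Turing implementation):

```python
import random

def size_biased_jumps(theta, n, seed=0):
    """Draw the first n size-biased jumps of a DP(theta): each jump
    takes a Beta(1, theta) fraction of the remaining surplus,
    mirroring SizeBiasedSamplingProcess(rpm, surplus) above."""
    rng = random.Random(seed)
    jumps, surplus = [], 1.0
    for _ in range(n):
        v = rng.betavariate(1.0, theta)
        jumps.append(v * surplus)
        surplus -= jumps[-1]
    return jumps, surplus
```

As in the model, the instantiated jumps and the surplus always sum to one, so vcat(J[1:k], surplus) is a valid Categorical parameter.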

@trappmartin
Member Author

@yebai once the SBS sampling example is correct, this PR is ready for review and merging.

@trappmartin
Member Author

This PR is ready to be merged from my side.

end

# Find truncation point
K_ = findfirst(u[i] .< cumsum(w))
Member

@yebai yebai Mar 19, 2019


This seems like a non-typical slice sampler for DPs. Do you have a reference for this slice sampling representation?

Member Author


You are right. After looking at it again, it seems rather odd and is probably not quite correct. I can change it to the retrospective sampler by Papaspiliopoulos and Roberts, which seems straightforward to implement in Turing. Or do you have a preference for another one, e.g. Walker et al.?

Member

@yebai yebai Mar 20, 2019


I don't have a preference; the method by Papaspiliopoulos and Roberts sounds good to me. Or we can simply implement a basic recursive stick-breaking if it's only for testing purposes. We can leave the task of advanced implementations till later, perhaps for a BNP tutorial?

Member Author


Sounds good to me. I already started a BNP tutorial and will put the Papaspiliopoulos and Roberts code in there. By basic recursive stick-breaking, you mean a truncated implementation with a fixed truncation point, right?

Member


I mean Alg 1 from the following paper:

http://www.robots.ox.ac.uk/~twgr/assets/pdf/bloemreddy2017rpm.pdf

This version doesn't involve any truncation, through the use of a random coin-flip-based termination criterion. I still need to read the original paper to better understand why this is equivalent to standard stick-breaking, but my guess is that the expectation of the process converges to the standard stick-breaking process.

Member Author


I see. Yes, it looks like it does. I'll read the paper more carefully again as I forgot about the recursive coin-flipping.

Member Author

@trappmartin trappmartin Mar 20, 2019


Algorithm 1 (coin-flipping based) is complicated to implement in Turing because of the variable-name issue we have, i.e. the coin will not be resampled recursively because of:

if ~haskey(vi, vn)
    r = rand(dist)
    push!(vi, vn, r, dist, spl.alg.gid)
    spl.info[:cache_updated] = CACHERESET # sanity flag mask for getidcs and getranges
elseif is_flagged(vi, vn, "del")
    unset_flag!(vi, vn, "del")
    r = rand(dist)
    vi[vn] = vectorize(dist, r)
    setgid!(vi, spl.alg.gid, vn)
    setorder!(vi, vn, vi.num_produce)
else
    updategid!(vi, vn, spl)
    r = vi[vn]
end

I'll keep the truncated stick-breaking test for now.

Member


I see, perhaps move this into a separate issue?

This should be fixed soon - see related discussion #720 (review).

@yebai
Member

yebai commented Mar 20, 2019

It's probably good to have a dedicated folder for customised distributions in Turing, e.g. src/distrs/. If so, we can consider placing the main BNP module in src/distrs/RandomMeasures.jl.

@yebai
Member

yebai commented Mar 20, 2019

Excellent work! Ready to merge, except for one minor filename issue (see above).

@yebai yebai merged commit 8f6aee6 into master Mar 20, 2019
@yebai yebai deleted the ref/BNP branch March 20, 2019 17:12