Is init_theta keyword working? #1588

@bratslavia

I've been trying to start my samplers in "better" regions of the parameter space, since it's a bit slow to converge otherwise on certain parameters. But from what I can tell, whenever I pass init_theta as per the docs, whatever values I have there are just ignored.

MWE:

using Turing
using LinearAlgebra  # provides diagm

# Model definition.
@model demo(x) = begin
    u ~ MvNormal(zeros(2), ones(2))
    x ~ MvNormal(u, ones(2))
end

a = ones(2)
start_vals = [-10, -10]
m1 = sample(demo(a), MH(diagm([0.001, 0.001])), MCMCThreads(), 10, 4, init_theta=start_vals)
m2 = sample(demo(a), HMC(0.0001, 1), MCMCThreads(), 10, 4, init_theta=start_vals)

Array(m1)
Array(m2)

Here, I start the sampler in a far-away place. The MH proposals in m1 are tiny, so the chain should take quite a while to get away from the region around (-10, -10). It doesn't, so apparently the starting values are just being ignored. I wasn't sure if it was just an MH issue, so I tried HMC as well in the above example, and it also appears to ignore the supplied starting values.

After digging around some more, I tried changing the keyword to init_params instead of init_theta. This works for the MH code above, but not for HMC (oddly enough, since the HMC sampling code is where I found the init_params keyword in the first place). I also tried NUTS, and didn't see any evidence that init_params was being respected there, either (although it's much harder to tell with an algorithm like NUTS, which can move away from starting values so quickly).

So, it's not clear to me if this is just a case of the docs being in need of an update, or if there's something else wrong.

And in any case, if init_theta isn't a valid keyword anymore, perhaps supplying such a keyword should give an error or something? I just spent several days trying to debug code that I thought had some subtle identification error that was making my sampler "blow up" right at the start of sampling, when really it was just sample silently ignoring starting values set by a deprecated(?) keyword.
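
To illustrate the kind of check I have in mind, here is a minimal sketch in plain Julia. It is not Turing's actual code: `SUPPORTED_KEYWORDS` and `check_keywords` are hypothetical names, and the list of accepted keywords is made up for the example. The idea is just to reject any keyword that isn't on a known list instead of silently dropping it.

```julia
# Hypothetical whitelist; the real set of keywords accepted by `sample` may differ.
const SUPPORTED_KEYWORDS = (:init_params, :progress)

# Throw an ArgumentError for any keyword that is not on the whitelist.
function check_keywords(; kwargs...)
    for name in keys(kwargs)
        if !(name in SUPPORTED_KEYWORDS)
            throw(ArgumentError("unsupported keyword argument: $name"))
        end
    end
    return nothing
end

check_keywords(init_params=[-10.0, -10.0])   # accepted, returns nothing
# check_keywords(init_theta=[-10.0, -10.0])  # would throw an ArgumentError
```

A deprecation warning that points to the new keyword name would work just as well; the point is only that a misspelled or outdated keyword shouldn't pass unnoticed.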

FWIW, being able to catch stuff like this easily is one reason I strongly favor recording the initial values of samplers as part of the chain itself (per the discussion in #1282), something I always do if writing samplers by hand.
