
Conversation

@cpfiffer
Member

This PR adds MLE and MAP functions to generate point estimates. I have not yet written tests, so this is still a WIP.

Usage:

using Turing, Distributions, Random

@model function gdemo(x)
    s ~ InverseGamma(15,3)
    m ~ Normal(0, 1)

    for i in eachindex(x)
        x[i] ~ Normal(m, s)
    end

    return m, s
end

Random.seed!(100)
data = rand(Normal(0, 2), 10)
model = gdemo(data)

# Make point estimates
mle_gdemo = MLE(model)
map_gdemo = MAP(model)

MLE results:

2×3 Named Array{Float64,2}
A ╲ B │ parameter    std_err      tstat
──────┼────────────────────────────────
s     │   2.08011   0.465139    4.47202
m     │  0.596312   0.657789    0.90654

MAP results:

2×3 Named Array{Float64,2}
A ╲ B │ parameter    std_err      tstat
──────┼────────────────────────────────
s     │    1.3783    0.20177    6.83104
m     │  0.501096    0.40024    1.25199

@mohdibntarek
Contributor

I suppose that with many data points, m converges to 0 under MLE?

@cpfiffer
Member Author

Yeah, the MLE estimate here is basically the mean.
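For a Normal likelihood this has a closed form, which is why the MLE of m is just the sample mean. A minimal sketch with hypothetical toy data, using only the Statistics stdlib:

```julia
using Statistics

# For x[i] ~ Normal(m, s), the MLE of m is the sample mean and the MLE of s
# is the biased (divide-by-n) sample standard deviation, so with more draws
# from Normal(0, 2) the m estimate converges to 0.
x = [0.5, -0.3, 1.2, -0.8, 0.1]        # hypothetical toy data
m_hat = mean(x)                         # MLE of the location
s_hat = sqrt(mean((x .- m_hat) .^ 2))   # MLE of the scale (divide by n, not n - 1)
```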

@devmotion
Member

Honestly, I had no clue what you meant by standard errors and tstat before I checked the code. I would have expected it to just return the mode, either as a named tuple or a VarInfo object, possibly together with the other results from the optimizer. Maybe one could have a separate function for these diagnostics?

If mode_estimation "just" performs the optimization, an alternative approach could also be to just provide the cost function and initial value to the user, similar to https://github.com/baggepinnen/FluxOptTools.jl, so users can call Optim (or possibly other optimization packages) however they like without Turing having to worry about it.

@cpfiffer
Member Author

Honestly, I had no clue what you meant by standard errors and tstat before I checked the code. I would have expected it to just return the mode, either as a named tuple or a VarInfo object, possibly together with the other results from the optimizer. Maybe one could have a separate function for these diagnostics?

No, this is almost completely counter to how these analyses are supposed to work. Using a MLE/MAP estimate with no information on the dispersion of that estimator is next to useless, akin to running OLS without standard errors or using just the median or whatever from a full posterior.

I think if you are the kind of person who works in the MLE world, it's an absolute requirement to have statistics computed from the information matrix. Plus, you can use it to run hypothesis tests.

Granted, perhaps this should be separated out, such that you do something like

est = MLE(model)
info_matrix = information_matrix(est)

Or something equivalent.

But I would prefer to hew to how OLS works, where the estimator covariance matrix is returned simultaneously with the point estimates.
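For illustration, here is a rough sketch of where the std_err and tstat columns come from: invert the observed information matrix (the Hessian of the negative log-likelihood at the mode) and take square roots of its diagonal. The helpers `nll` and `fd_hessian` are hypothetical, not Turing API, and the Hessian here is a crude finite-difference approximation:

```julia
using LinearAlgebra, Random, Statistics

# Negative log-likelihood of Normal(m, s) with θ = [m, s],
# up to an additive constant.
nll(θ, x) = length(x) * log(θ[2]) + sum((x .- θ[1]) .^ 2) / (2 * θ[2]^2)

# Central finite-difference Hessian (hypothetical helper; in practice one
# would use automatic differentiation).
function fd_hessian(f, θ; h=1e-4)
    k = length(θ)
    H = zeros(k, k)
    for i in 1:k, j in 1:k
        ei = zeros(k); ei[i] = h
        ej = zeros(k); ej[j] = h
        H[i, j] = (f(θ + ei + ej) - f(θ + ei - ej) -
                   f(θ - ei + ej) + f(θ - ei - ej)) / (4h^2)
    end
    H
end

Random.seed!(100)
x = 2 .* randn(100)                     # draws from Normal(0, 2)
m_hat = mean(x)
s_hat = sqrt(mean((x .- m_hat) .^ 2))
θ̂ = [m_hat, s_hat]                      # closed-form MLE for the Normal model

H = fd_hessian(θ -> nll(θ, x), θ̂)      # observed information at the mode
se = sqrt.(diag(inv(H)))                # standard errors
tstat = θ̂ ./ se                         # t-statistics, as in the tables above
```

For this model the analytic standard errors are s/√n for m and s/√(2n) for s, which the numerical version should reproduce closely.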

@cpfiffer
Member Author

Alright, tests are passing at the moment. Further comments or things you don't think have been addressed yet?

Contributor

@mohdibntarek mohdibntarek left a comment


I just left one small comment. The named tuple initialization can be handled in a separate PR, I think. The same needs to be done for sampling as well. Really great work @cpfiffer! I am really happy we finally have MAP and MLE out of the box.

@cpfiffer
Member Author

Aight, I think it's finally good to go. 🎆

Comment on lines +61 to +64
@init @require Optim="429524aa-4258-5aef-a3af-852621145aeb" @eval begin
include("modes/ModeEstimation.jl")
export MAP, MLE, optimize
end
Member


Just noticed: users will have problems if they load Optim but not NamedArrays. We should add a nested @require block that checks for NamedArrays if we want to keep it optional (which I guess we want). Otherwise one could think about returning a named tuple by default and just providing some optional way of converting it to a NamedArray.
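A minimal sketch of the nested @require being described, using Requires.jl's macros. The NamedArrays UUID below is an assumption and should be verified against the package's Project.toml:

```julia
# Only load the mode-estimation code once BOTH Optim and NamedArrays are present.
# UUIDs are assumptions; double-check them against each package's Project.toml.
@init @require Optim="429524aa-4258-5aef-a3af-852621145aeb" begin
    @require NamedArrays="86f7a689-2022-50b4-a561-43c23ac3c673" @eval begin
        include("modes/ModeEstimation.jl")
        export MAP, MLE, optimize
    end
end
```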

@mohdibntarek
Contributor

So if NamedArrays is an optional dependency, why does it not have a require block?

@cpfiffer
Member Author

I think I'd actually prefer to make NamedArrays a full dependency, just for the moment. I think all this stuff should probably be spun off into a different package so the deps can be managed more specifically.

@torfjelde
Member

Also, the VI part would benefit from using NamedArrays because right now it's difficult for the user to figure out what the different elements in the vector refer to (you can use bijector(model; symbol_to_ranges = Val(true)), but that's a "hack" I added just so there's some way of doing it). The reason I didn't make a proper solution was to reduce deps. So making NamedArrays an explicit dep is also useful for other parts of Turing :)
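For reference, a small sketch of what NamedArrays buys here, assuming NamedArrays.jl is installed; the values are copied from the MLE table above:

```julia
using NamedArrays

# Label rows by parameter and columns by statistic so results are
# self-describing, unlike a bare Matrix where positions carry the meaning.
vals = [2.08011  0.465139  4.47202;
        0.596312 0.657789  0.90654]
est = NamedArray(vals, (["s", "m"], ["parameter", "std_err", "tstat"]), ("A", "B"))

est["m", "parameter"]   # look up by name instead of numeric index
```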

@cpfiffer cpfiffer merged commit 76b5b7d into master May 23, 2020
@delete-merged-branch delete-merged-branch bot deleted the csp/modes branch May 23, 2020 16:01