Skip to content

Additional Methods for RegressionModel #300

@Nosferican

Description

@Nosferican

I would like to Feature Request the following methods:

StatsBase.model_matrix(obj::RegressionModel)
StatsBase.coefnames(obj::RegressionModel)
StatsBase.wald_test(obj::RegressionModel;
    restrictions::Vector{Symbol} = setdiff(StatsBase.coefnames(obj), ["(Intercept)"]),
    Vcov::AbstractMatrix = vcov(obj),
    rdf::Integer = StatsBase.dof_residual(obj))
  • model_matrix would return a the design matrix as a Matrix{Float64}.

    This would allow various packages to access the field with a standard API instead of hunting down the structure of the structs.

  • coefnames would return the names of the coefficients as Vector{Symbol}.

    It mirrors DataFrames.coefnames(obj::DataFrames.ModelFrame)

  • wald_test would return the Wald statistic for those restrictions as a Dict{Symbol,Any} with keys: Statistic (Value of test), p_value (p-value based on the F-Distribution with number of restrictions and residual degrees of freedom), and possibly some way to display the restrictions.

    It would allow for a standard and automated default method rather than have each package define it (redundant code). Currently the most useful syntax is to especially which coefficients are jointly significant (different from zero) with the default being all, but the intercept if included. However, for being comprehensive a different syntax would have to be worked in order to allow for any valid linear combination restrictions, (e.g., coef(obj)[1] == coef(obj)[2] and coef(obj)[3] == 2 * coef(obj)[4]). The rdf keyword parameter allows for degrees of freedom adjustments commonly used with cluster-robust variance-covariance estimators.

For example,

output = Dict{Symbol,Float64}()
varlist = coefnames(obj)
β = StatsBase.coef(obj)
R = diagm(map(var -> var ∈ restrictions, varlist))
R = R[find(mapslices(col -> col[max(1, findfirst(col))] .== 1,
        RowEchelon.rref(round.(R, 12)), 1)),:]
k = size(R, 1)
Bread = R * β
Meat = inv(R * Vcov * R')
output[:Statistic] = (Bread' * Meat * Bread) / k
F_Dist = Distributions.FDist(k, rdf)
output[:p_value] = Distributions.ccdf(F_Dist, output[:Statistic])
output

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions