@MalteKurz MalteKurz commented May 20, 2022

Description

PLR

  • Nuisance estimation for IV type score: In this PR the nuisance estimation for the IV-type score in the PLR model is adapted to be in line with the DML paper Chernozhukov et al. (2018).
    • Results for the default score='partialling out' (Equation (4.4) in Chernozhukov et al. (2018)) are not affected by the changes in this PR. However, the nuisance parameter is renamed from ml_g to ml_l (analogously, predictions g_hat have been renamed to l_hat, etc.) to be better in line with Chernozhukov et al. (2018). To make the transition to the new naming smooth, deprecation warnings have been added (see below for an overview of the API changes and examples of the deprecation warnings).
    • For score='IV-type' (Equation (4.3) in Chernozhukov et al. (2018)), the implementation now follows the approach described on pp. C31-C33 in Chernozhukov et al. (2018): an initial estimate for theta_0 is obtained via the 'partialling out' score, and an estimate for g_0(X) is then obtained by regressing Y - theta_0 * D on X. Therefore, an additional learner (not needed to evaluate the score itself) has to be provided: the nuisance function l_0(X) (needed for the preliminary theta_0 estimate) is estimated with learner ml_l and g_0(X) with learner ml_g. To make the transition to the new API (additional learner) smooth, deprecation warnings have been added (see below for an overview of the API changes and examples of the deprecation warnings). In particular, if ml_l is specified but ml_g is not, then ml_g = ml_l$clone() is used and a warning is issued.
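The two-step nuisance estimation described for the IV-type score can be sketched in base R. This is only an illustration and not the package's implementation: it assumes simulated data, uses lm() in place of mlr3 learners, and skips cross-fitting entirely. All variable names are made up for the example.

```r
# Illustration only: simulated partially linear model, no cross-fitting,
# lm() as a stand-in for the learners ml_l, ml_m and ml_g.
set.seed(1)
n = 2000
x = rnorm(n)
d = 0.5 * x + rnorm(n)       # m_0(X) = 0.5 * X
y = 1 * d + 2 * x + rnorm(n) # theta_0 = 1, g_0(X) = 2 * X

# Step 1: fit l_0(X) = E[Y|X] (role of ml_l) and m_0(X) = E[D|X] (role of ml_m)
l_hat = fitted(lm(y ~ x))
m_hat = fitted(lm(d ~ x))
v_hat = d - m_hat

# Step 2: preliminary theta from the 'partialling out' score
theta_init = sum(v_hat * (y - l_hat)) / sum(v_hat * v_hat)

# Step 3: estimate g_0(X) by regressing Y - theta_init * D on X (role of ml_g)
u = y - theta_init * d
g_hat = fitted(lm(u ~ x))

# Step 4: final estimate from the 'IV-type' score
theta_iv = sum(v_hat * (y - g_hat)) / sum(v_hat * d)
```

With linear fits evaluated in-sample, as here, the two estimates coincide (Frisch-Waugh-Lovell); with flexible learners and cross-fitting they generally differ.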

PLIV

  • In this PR a new score function for the PLIV model is implemented:
    • Results for the default score='partialling out' (Equation (4.8) in Chernozhukov et al. (2018)) are not affected by the changes in this PR. However, the nuisance parameter is renamed from ml_g to ml_l (analogously, predictions g_hat to l_hat, etc.) to be better in line with Chernozhukov et al. (2018). To make the transition to the new naming smooth, deprecation warnings have been added (see below for examples).
    • A new score='IV-type' (Equation (4.7) in Chernozhukov et al. (2018)) is now available for the PLIV model. The estimation of the nuisance parts follows the approach described on p. C33 in Chernozhukov et al. (2018). This means that an initial estimate for theta_0 is obtained via the 'partialling out' score. Then an estimate for g_0(X) is obtained by regressing Y - theta_0 * D on X. Therefore, two additional learners (not needed to evaluate the score) need to be provided, i.e., the nuisance functions l_0(X) and r_0(X) (needed for the preliminary theta_0 estimate) are estimated with learner ml_l and ml_r. g_0(X) is estimated with learner ml_g.
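For reference, the two PLIV scores mentioned above can be written out as follows. This is a sketch in the notation of Chernozhukov et al. (2018) and should be checked against Equations (4.7) and (4.8) there; here W = (Y, D, X, Z), l_0(X) = E[Y|X], r_0(X) = E[D|X] and m_0(X) = E[Z|X].

```latex
% 'IV-type' score, nuisance functions \eta = (g, m)
\psi(W; \theta, \eta) = \bigl(Y - D\theta - g(X)\bigr)\bigl(Z - m(X)\bigr)

% 'partialling out' score, nuisance functions \eta = (\ell, m, r)
\psi(W; \theta, \eta) = \bigl(Y - \ell(X) - \theta\,(D - r(X))\bigr)\bigl(Z - m(X)\bigr)
```

The IV-type score itself only involves g and m, which is why ml_l and ml_r are needed solely for the preliminary theta_0 estimate.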

API changes

PLR

  • API changed from DoubleMLPLR$new(obj_dml_data, ml_g, ml_m [, ...]) to DoubleMLPLR$new(obj_dml_data, ml_l, ml_m, ml_g [, ...]).
    • For score='partialling out' ml_l & ml_m are needed.
    • For score='IV-type' ml_l, ml_m & ml_g are needed.
    • For function()s as score ml_l & ml_m are mandatory and ml_g optional.
  • If a function() is provided as score, it must be of the form function(y, d, l_hat, m_hat, g_hat, smpls) (previously function(y, d, g_hat, m_hat, smpls)).
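A user-supplied score in the new form could look like the following sketch. It assumes the package's convention that the function returns list(psi_a, psi_b) for a score that is linear in theta, psi = psi_a * theta + psi_b; the toy vectors and the solve_theta helper are hypothetical and only serve to check the arithmetic.

```r
# 'partialling out' score in the new signature (g_hat is unused here)
score_partialling_out = function(y, d, l_hat, m_hat, g_hat, smpls) {
  v_hat = d - m_hat
  list(psi_a = -v_hat * v_hat, psi_b = v_hat * (y - l_hat))
}

# 'IV-type' score in the new signature (l_hat is unused here)
score_iv_type = function(y, d, l_hat, m_hat, g_hat, smpls) {
  v_hat = d - m_hat
  list(psi_a = -v_hat * d, psi_b = v_hat * (y - g_hat))
}

# Hypothetical helper: solve mean(psi_a) * theta + mean(psi_b) = 0
solve_theta = function(psis) -mean(psis$psi_b) / mean(psis$psi_a)

# Toy data with y = 2 + 1 * d + noise pattern, so theta = 1
y = c(1, 2, 3, 4)
d = c(0, 1, 0, 1)
l_hat = rep(2.5, 4) # mean of y
m_hat = rep(0.5, 4) # mean of d
g_hat = rep(2, 4)   # mean of y - 1 * d

theta_po = solve_theta(score_partialling_out(y, d, l_hat, m_hat, NULL, NULL))
theta_iv = solve_theta(score_iv_type(y, d, l_hat, m_hat, g_hat, NULL))

# Possible use with the package (not run here):
# DoubleMLPLR$new(plr_data, ml_l, ml_m, ml_g, score = score_iv_type)
```

Both toy estimates equal 1 on this data.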

PLIV

  • API changed from DoubleMLPLIV$new(obj_dml_data, ml_g, ml_m, ml_r [, ...]) to DoubleMLPLIV$new(obj_dml_data, ml_l, ml_m, ml_r, ml_g [, ...]).
    • For score='partialling out' ml_l, ml_m & ml_r are needed.
    • For score='IV-type' ml_l, ml_m, ml_r & ml_g are needed.
    • For function()s as score ml_l, ml_m & ml_r are mandatory and ml_g optional.
  • If a function() is provided as score, it must be of the form function(y, z, d, l_hat, m_hat, r_hat, g_hat, smpls) (previously function(y, z, d, g_hat, m_hat, r_hat, smpls)).
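For PLIV, a score function in the new form might be sketched as below, again assuming the list(psi_a, psi_b) return convention with psi = psi_a * theta + psi_b. The function name and toy vectors are hypothetical.

```r
# 'IV-type' score for PLIV in the new signature (l_hat and r_hat are unused here)
pliv_score_iv_type = function(y, z, d, l_hat, m_hat, r_hat, g_hat, smpls) {
  w_hat = z - m_hat
  list(psi_a = -w_hat * d, psi_b = w_hat * (y - g_hat))
}

# Toy data with y = 1 + 2 * d, so theta = 2
z = c(0, 1, 0, 1)
d = c(0, 1, 1, 2)
y = 1 + 2 * d
m_hat = rep(0.5, 4) # mean of z
g_hat = rep(1, 4)   # y - 2 * d

psis = pliv_score_iv_type(y, z, d, NULL, m_hat, NULL, g_hat, NULL)
theta_pliv = -mean(psis$psi_b) / mean(psis$psi_a)

# Possible use with the package (not run here):
# DoubleMLPLIV$new(pliv_data, ml_l, ml_m, ml_r, ml_g, score = pliv_score_iv_type)
```

On this toy data the estimate is 2.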

Deprecation warnings for the API changes for DoubleMLPLR and DoubleMLPLIV

  • Initialization code for the following code examples:
library(DoubleML)
library(mlr3)
library(mlr3learners)
library(data.table)
set.seed(2)
ml_l = lrn("regr.ranger", num.trees = 10, max.depth = 2)
ml_m = ml_l$clone()
ml_r = ml_l$clone()
ml_g = ml_l$clone()
plr_data = make_plr_CCDDHNR2018(n_obs=500)
pliv_data = make_pliv_CHS2015(n_obs=500)
  • For PLR & PLIV with score='partialling out' and if the learners are provided as positional arguments, nothing changed.
dml_plr_obj = DoubleMLPLR$new(plr_data, ml_l, ml_m, score='partialling out')
dml_pliv_obj = DoubleMLPLIV$new(pliv_data, ml_l, ml_m, ml_r, score='partialling out')

--> Note, however, that if arguments other than the learners have also been provided as positional arguments, the changed API causes exceptions, because the additional learner was added as the fourth (PLR) / fifth (PLIV) argument.
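The positional shift can be illustrated with two stand-in functions. These are hypothetical and not DoubleML functions: only the argument order mimics the old and new PLR constructors, and n_folds is used purely as an example of a further argument.

```r
# Hypothetical stand-ins for the old and new constructor signatures
old_init = function(data, ml_g, ml_m, n_folds = 5) {
  list(n_folds = n_folds)
}
new_init = function(data, ml_l, ml_m, ml_g = NULL, n_folds = 5) {
  list(ml_g = ml_g, n_folds = n_folds)
}

# Old-style call with a fourth positional argument
old_call = old_init("data", "learner_g", "learner_m", 3)

# The same positional call against the new signature: 3 now lands in ml_g
new_call = new_init("data", "learner_l", "learner_m", 3)

# Passing non-learner arguments by name avoids the shift
safe_call = new_init("data", "learner_l", "learner_m", n_folds = 3)
```

In the real classes the misplaced value fails the learner check, which is where the exception comes from; naming all non-learner arguments sidesteps it.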

  • For PLR with score='partialling out' and keyword arguments ml_g and ml_m (old API naming), the learner provided for ml_g is used for ml_l and a warning is issued.
dml_plr_obj = DoubleMLPLR$new(plr_data, ml_g=ml_g, ml_m=ml_m, score='partialling out')
Warning message:
The argument ml_g was renamed to ml_l. Please adapt the argument name accordingly. ml_g is redirected to ml_l.
The redirection will be removed in a future version. 
  • For PLR with score='IV-type' and keyword arguments ml_g and ml_m (old API naming), the learner provided for ml_g is also used for ml_l and a warning is issued. (Note that it is first redirected to ml_l and then cloned to ml_g.)
dml_plr_obj = DoubleMLPLR$new(plr_data, ml_g=ml_g, ml_m=ml_m, score='IV-type')
Warning messages:
1: The argument ml_g was renamed to ml_l. Please adapt the argument name accordingly. ml_g is redirected to ml_l.
The redirection will be removed in a future version. 
2: For score = 'IV-type', learners ml_l and ml_g should be specified. Set ml_g = ml_l$clone().
  • For PLR with score='IV-type' and only two learners as positional arguments, a clone of the learner provided for ml_l is used for ml_g and a warning is issued.
dml_plr_obj = DoubleMLPLR$new(plr_data, ml_l, ml_m, score='IV-type')
Warning message:
For score = 'IV-type', learners ml_l and ml_g should be specified. Set ml_g = ml_l$clone(). 
  • For PLR & PLIV with score='partialling out', the methods set_ml_nuisance_params and tune redirect ml_g to ml_l.
dml_plr_obj = DoubleMLPLR$new(plr_data, ml_l, ml_m, score='partialling out')
dml_plr_obj$set_ml_nuisance_params('ml_g', 'd', list(num.trees = 10, max.depth = 2))
Warning message:
Learner ml_g was renamed to ml_l. Please adapt the argument learner accordingly. The provided parameters are set for ml_l. The redirection will be removed in a future version.

Miscellaneous

PR Checklist

  • The title of the pull request summarizes the changes made.
  • The PR contains a detailed description of all changes and additions.
  • The code passes R CMD check and all (unit) tests (see our contributing guidelines for details).
  • Enhancements or new features are equipped with unit tests.
  • The changes adhere to the "mlr-style" standards (see our contributing guidelines for details).

suggest test for functional initializer for IV-type score

@PhilippBach PhilippBach left a comment

Hi @MalteKurz ,

thanks for the PR fixing the IV-type score for the PLR and implementing it for PLIV (case with 1 instrument and partialling out X).

I only have minor comments and a suggestion for an additional test (5dfe618) covering the functional initialization of the PLIV partial X with score = "IV-type".

Feel free to integrate this or to drop it as you like. The other changes only concern the exception handling in case users provide ml_g with score = "partialling out" (no strong opinion on this) and a minor change to the format of the documentation.

Overall this looks good and is ready to be merged (subject to anything you'd like to change).

@MalteKurz

@PhilippBach Thanks for the review. I adapted the code accordingly. Additionally, I also adapted the corresponding Python PR such that the newly introduced warning is also present there. I also checked your suggested new unit test: I opened a corresponding PR in order to integrate it into this PR and added a comment with a suggestion / extension, see #162.

PhilippBach and others added 3 commits June 10, 2022 21:09
test for initializer with IV-type score PLIV
Unit test for functional initializer for PLIV

@PhilippBach PhilippBach left a comment

I think this looks good now. Thanks @MalteKurz for incorporating these additional changes

@MalteKurz MalteKurz merged commit fd2dce8 into master Jun 14, 2022
@MalteKurz MalteKurz deleted the m-pliv-iv-type branch June 15, 2022 07:31
PhilippBach added a commit that referenced this pull request Feb 13, 2023
drop mtry parameter for bonus example, adjust naming of learner according to #161
@PhilippBach PhilippBach mentioned this pull request Feb 13, 2023