LPLR model #365
base: main
Conversation
Added test set-up
```python
        return self

    def set_sample_splitting(self):
```
Check warning: Code scanning / CodeQL
Signature mismatch in overriding method (Warning)
SampleSplittingMixin.set_sample_splitting

Copilot Autofix (AI), 8 days ago:
To fix the issue, update the signature of set_sample_splitting in DoubleMLLPLR to match that of the base-class method, SampleSplittingMixin.set_sample_splitting. The exact base signature is not shown here, but it takes at least two arguments (likely self and sample_splitting or similar), so the most robust fix is to accept the same positional arguments plus any optional or keyword arguments the base method supports; when the exact signature is uncertain, accepting *args, **kwargs covers any additional arguments expected.
Implement this change in doubleml/plm/lplr.py at the definition of set_sample_splitting. The new method should keep the old behavior of raising NotImplementedError, but with the widened signature.
```diff
@@ -519,7 +519,7 @@
         return self
 
-    def set_sample_splitting(self):
+    def set_sample_splitting(self, *args, **kwargs):
         raise NotImplementedError("set_sample_splitting is not implemented for DoubleMLLPLR.")
 
     def _compute_score(self, psi_elements, coef):
```
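Why call compatibility matters here: code that invokes set_sample_splitting through the base-class interface passes the base method's arguments, so an override with a narrower signature raises a TypeError before the intended NotImplementedError is ever reached. A minimal sketch, with an assumed and deliberately simplified base signature (not the actual SampleSplittingMixin API):

```python
class SampleSplittingMixin:
    def set_sample_splitting(self, all_smpls):  # assumed, simplified signature
        self._smpls = all_smpls
        return self


class DoubleMLLPLR(SampleSplittingMixin):
    def set_sample_splitting(self, *args, **kwargs):
        # Accepting *args/**kwargs keeps the override call-compatible with
        # the base method while still signalling the feature is unsupported.
        raise NotImplementedError("set_sample_splitting is not implemented for DoubleMLLPLR.")


# Generic caller code now fails with the intended error instead of a TypeError:
try:
    DoubleMLLPLR().set_sample_splitting([[([0], [1])]])
except NotImplementedError as err:
    print(err)
```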
```python
            all_t_hat.append(t_hat)
            all_m_hat.append(m_hat)

        thetas[i_rep], ses[i_rep] = solve_score(M_hat_list, t_hat_list, m_hat_list)
```
Check warning: Code scanning / CodeQL
Use of the return value of a procedure (Warning, test)
solve_score

Copilot Autofix (AI), 8 days ago:
To fix the error, ensure that the code does not use the return value of a function that returns nothing (i.e., a procedure), as this leads to confusing or broken behaviour and runtime errors. There are two options:
- Implement solve_score to return two values as expected, or
- remove any expectation of return values, depending on whether such output is intended.

The line thetas[i_rep], ses[i_rep] = solve_score(M_hat_list, t_hat_list, m_hat_list) shows that solve_score is expected to return two values (likely a theta and a standard error). The best solution is therefore to implement a placeholder solve_score that returns two dummy values (for example, 0.0, 0.0) or, if possible, appropriate calculated results. Since fit_selection relies on these values for further calculation, returning two floats from solve_score is necessary for functionality.
All code changes are limited to the shown code block, so only edit the definition of solve_score in doubleml/plm/tests/_utils_lplr_manual.py.
```diff
@@ -75,7 +75,11 @@
     return res
 
 
 def solve_score(M_hat, t_hat, m_hat):
-    pass
+    """
+    Placeholder implementation.
+    Returns dummy values for theta and se.
+    """
+    return 0.0, 0.0
 
 
 def fit_nuisance_selection(
     y,
```
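For context, a non-placeholder solve_score would solve the estimated score equation for theta and return the matching standard error. The sketch below assumes the DoubleML convention of a score that is linear in theta, psi = psi_a * theta + psi_b; the concrete psi_a and psi_b used here are hypothetical and not the actual LPLR score components:

```python
import numpy as np


def solve_score(M_hat, t_hat, m_hat):
    # Hypothetical score components; the actual LPLR construction may differ.
    M_hat, t_hat, m_hat = (np.asarray(a, dtype=float) for a in (M_hat, t_hat, m_hat))
    psi_a = -((t_hat - m_hat) ** 2)
    psi_b = (t_hat - m_hat) * M_hat

    # Solve the empirical score equation E[psi_a] * theta + E[psi_b] = 0.
    theta = -np.mean(psi_b) / np.mean(psi_a)

    # Sandwich-type standard error: Var(psi) / (n * J^2) with J = E[psi_a].
    psi = psi_a * theta + psi_b
    J = np.mean(psi_a)
    se = np.sqrt(np.mean(psi**2) / (len(psi) * J**2))
    return theta, se
```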
```python
    m_params=None,
):
    # TODO: complete for lplr
    n_obs = len(y)
```
Check notice: Code scanning / CodeQL
Unused local variable (Note, test)

Copilot Autofix (AI), 8 days ago:
A genuinely unused local variable can simply be removed. In this case, n_obs = len(y) can be deleted from line 96 in fit_nuisance_selection, as it does not affect the function's logic, outputs, or structure. No further code changes or imports are necessary, since nothing else in the shown scope references n_obs.
```diff
@@ -93,7 +93,6 @@
     m_params=None,
 ):
     # TODO: complete for lplr
-    n_obs = len(y)
     ml_M = clone(learner_M)
     ml_t = clone(learner_t)
     ml_m = clone(learner_m)
```
```python
):
    # TODO: complete for lplr
    n_obs = len(y)
    ml_M = clone(learner_M)
```
Check notice: Code scanning / CodeQL
Unused local variable (Note, test)

Copilot Autofix (AI), 8 days ago:
To fix the issue of the unused local variable ml_M in the fit_nuisance_selection function, simply remove the assignment ml_M = clone(learner_M) on line 97. This will clean up the code by eliminating the unused variable without altering any necessary functionality. There are no side effects in the right-hand side (clone(learner_M)) that need to be preserved, since the result is not used elsewhere. Only the assignment to ml_M should be deleted; leave the corresponding assignments for ml_t and ml_m unless they are also unused (this fix only concerns ml_M as requested).
```diff
@@ -94,7 +94,6 @@
 ):
     # TODO: complete for lplr
     n_obs = len(y)
-    ml_M = clone(learner_M)
     ml_t = clone(learner_t)
     ml_m = clone(learner_m)
```
```python
    # TODO: complete for lplr
    n_obs = len(y)
    ml_M = clone(learner_M)
    ml_t = clone(learner_t)
```
Check notice: Code scanning / CodeQL
Unused local variable (Note, test)

Copilot Autofix (AI), 8 days ago:
To resolve this issue, the best course of action is to remove the assignment to ml_t entirely from fit_nuisance_selection. This is safe because the right-hand side, clone(learner_t), has no side effects under scikit-learn's standard behavior; it simply returns a new estimator object. No further code changes are needed elsewhere in the function, as ml_t is not referenced. Confine the removal to the relevant line in fit_nuisance_selection to preserve existing functionality.
```diff
@@ -95,7 +95,6 @@
     # TODO: complete for lplr
     n_obs = len(y)
     ml_M = clone(learner_M)
-    ml_t = clone(learner_t)
     ml_m = clone(learner_m)
```
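All three unused-variable notices in fit_nuisance_selection share one cause: the body is a stub ("TODO: complete for lplr"), so the cloned learners are assigned but never fit. A sketch of how the completed function might use them, with cross-fitted predictions per sample split; the nuisance targets chosen here are guesses, not the actual LPLR definitions:

```python
from sklearn.base import clone


def fit_nuisance_selection(y, x, d, learner_M, learner_t, learner_m, smpls):
    ml_M, ml_t, ml_m = clone(learner_M), clone(learner_t), clone(learner_m)
    all_M_hat, all_t_hat, all_m_hat = [], [], []
    for train_index, test_index in smpls:
        # Hypothetical targets; the actual LPLR nuisance regressions may differ.
        all_M_hat.append(ml_M.fit(x[train_index], y[train_index]).predict(x[test_index]))
        all_t_hat.append(ml_t.fit(x[train_index], d[train_index]).predict(x[test_index]))
        all_m_hat.append(ml_m.fit(x[train_index], d[train_index]).predict(x[test_index]))
    return all_M_hat, all_t_hat, all_m_hat
```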
```python
    dx = np.column_stack((d, x))

    t_tune_res = tune_grid_search(d, x, ml_t, smpls, param_grid_t, n_folds_tune)

    M_best_params = [xx.best_params_ for xx in M_tune_res]
```
Check notice: Code scanning / CodeQL
Unused local variable (Note, test)

Copilot Autofix (AI), 8 days ago:
To fix this problem without affecting existing functionality, simply remove the unused assignment M_best_params = [xx.best_params_ for xx in M_tune_res]. A right-hand side with side effects would need to be kept, but here it is a simple list comprehension extracting attributes, so none are expected. Deleting this line does not impact later code within the shown snippet, so simply delete it from doubleml/plm/tests/_utils_lplr_manual.py.
```diff
@@ -210,7 +210,6 @@
 
     t_tune_res = tune_grid_search(d, x, ml_t, smpls, param_grid_t, n_folds_tune)
 
-    M_best_params = [xx.best_params_ for xx in M_tune_res]
     t_best_params = [xx.best_params_ for xx in t_tune_res]
     m_best_params = [xx.best_params_ for xx in m_tune_res]
```
```python
    t_tune_res = tune_grid_search(d, x, ml_t, smpls, param_grid_t, n_folds_tune)

    M_best_params = [xx.best_params_ for xx in M_tune_res]
    t_best_params = [xx.best_params_ for xx in t_tune_res]
```
Check notice: Code scanning / CodeQL
Unused local variable (Note, test)

Copilot Autofix (AI), 8 days ago:
To fix the problem, we should remove the assignment to the unused variable t_best_params on line 214. Because its right-hand side (the list comprehension [xx.best_params_ for xx in t_tune_res]) is not kept for a side effect, it is safe to remove the entire line. No other changes are required, as this will not affect the existing functionality of the method. Be careful to only remove this specific line, and not any unrelated code.
```diff
@@ -211,7 +211,6 @@
     t_tune_res = tune_grid_search(d, x, ml_t, smpls, param_grid_t, n_folds_tune)
 
     M_best_params = [xx.best_params_ for xx in M_tune_res]
-    t_best_params = [xx.best_params_ for xx in t_tune_res]
     m_best_params = [xx.best_params_ for xx in m_tune_res]
```
```python
    t_tune_res = tune_grid_search(t_targets, x, ml_t, smpls, param_grid_t, n_folds_tune)

    M_best_params = [xx.best_params_ for xx in M_tune_res]
    t_best_params = [xx.best_params_ for xx in t_tune_res]
    m_best_params = [xx.best_params_ for xx in m_tune_res]
```
Check notice: Code scanning / CodeQL
Unused local variable (Note, test)

Copilot Autofix (AI), 8 days ago:
To fix this issue, we should remove the assignment to m_best_params since it serves no purpose in the current function. Removing it will make the code cleaner and eliminate the static analysis warning. The modification involves simply deleting line 215 (m_best_params = [xx.best_params_ for xx in m_tune_res]) in the tune_nuisance function in doubleml/plm/tests/_utils_lplr_manual.py without impacting any other aspect of program functionality.
```diff
@@ -212,6 +212,5 @@
 
     M_best_params = [xx.best_params_ for xx in M_tune_res]
     t_best_params = [xx.best_params_ for xx in t_tune_res]
-    m_best_params = [xx.best_params_ for xx in m_tune_res]
 
     t_tune_res = tune_grid_search(t_targets, x, ml_t, smpls, param_grid_t, n_folds_tune)
```
```python
    t_best_params = [xx.best_params_ for xx in t_tune_res]
    m_best_params = [xx.best_params_ for xx in m_tune_res]

    t_tune_res = tune_grid_search(t_targets, x, ml_t, smpls, param_grid_t, n_folds_tune)
```
Check notice: Code scanning / CodeQL
Unused local variable (Note, test)

Copilot Autofix (AI), 8 days ago:
The best fix is to remove line 217, t_tune_res = tune_grid_search(t_targets, x, ml_t, smpls, param_grid_t, n_folds_tune), which eliminates the unused assignment. This is safe since the right-hand-side call has no required side effects: the earlier, similar code calls the same function and uses the result, while here the result is unused. Remove only the assignment, not any functional use elsewhere.
No additional imports, definitions, or method changes are needed; only this line in doubleml/plm/tests/_utils_lplr_manual.py needs to be deleted.
```diff
@@ -214,4 +214,3 @@
     t_best_params = [xx.best_params_ for xx in t_tune_res]
     m_best_params = [xx.best_params_ for xx in m_tune_res]
 
-    t_tune_res = tune_grid_search(t_targets, x, ml_t, smpls, param_grid_t, n_folds_tune)
```
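The four tuning-related notices point the same way: the *_best_params lists (and the duplicated t_tune_res call) are dead code only because the stubbed helper never returns them. An alternative to deleting them, sketched here on the assumption that the helper should return the tuned per-fold parameters the way the other manual test utilities do (the argument order and the target passed to each tune_grid_search call are guesses):

```python
def tune_nuisance(y, x, d, ml_M, ml_t, ml_m, smpls, n_folds_tune,
                  param_grid_M, param_grid_t, param_grid_m):
    # Grid-search each nuisance learner on the cross-fitting folds;
    # tune_grid_search is the existing helper used elsewhere in this module.
    M_tune_res = tune_grid_search(y, x, ml_M, smpls, param_grid_M, n_folds_tune)
    t_tune_res = tune_grid_search(d, x, ml_t, smpls, param_grid_t, n_folds_tune)
    m_tune_res = tune_grid_search(d, x, ml_m, smpls, param_grid_m, n_folds_tune)

    # Return per-fold best hyperparameters so callers can refit with them.
    M_best_params = [xx.best_params_ for xx in M_tune_res]
    t_best_params = [xx.best_params_ for xx in t_tune_res]
    m_best_params = [xx.best_params_ for xx in m_tune_res]
    return M_best_params, t_best_params, m_best_params
```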
```python
        return self

    def set_sample_splitting(self):
```
Addition of the logistic partially linear (LPLR) model. Opened this PR to check merging and for review of the changes. Unit tests are in progress.
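For reviewers, a hypothetical usage sketch of the new class, assuming the constructor mirrors the ml_M / ml_t / ml_m learner names from the PR's test utilities; the data generation and the constructor signature are illustrative only, not the final API:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LogisticRegression
import doubleml as dml

# Synthetic data with a binary outcome, matching the logistic setting.
rng = np.random.default_rng(42)
n, p = 500, 5
x = rng.normal(size=(n, p))
d = x[:, 0] + rng.normal(size=n)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.5 * d + x[:, 1]))))
dml_data = dml.DoubleMLData.from_arrays(x, y, d)

# Hypothetical constructor; the final DoubleMLLPLR API may differ.
lplr = dml.DoubleMLLPLR(
    dml_data,
    ml_M=LogisticRegression(),
    ml_t=RandomForestRegressor(),
    ml_m=RandomForestRegressor(),
)
lplr.fit()
print(lplr.summary)
```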