@jack-fs (Contributor) commented Mar 15, 2023

Apologies for the large pull request.

This contains:

  • a starting point for sphinx documentation workflow
  • a full first pass at updated documentation for estimators.py
  • partially updated documentation for evaluators.py

Many type hints are updated or added, since this was relatively cheap to do while I was going through the codebase.
I do not expect these hints to fully pass mypy, but they should at least aid any future attempt to do so.

@jack-fs marked this pull request as ready for review March 16, 2023 02:02

@dsteinberg (Contributor) left a comment

LGTM

I like the TODOs you are putting into the code.

I like the typing, but I'm also okay if some things are not typed yet because they are hard to type - perhaps some re-design could help in those situations?

RegressionStatisticalResults
Linear coefficient statistics for this estimator.
"""
# TODO should we not check that dof_, t_, p_ are fitted as well?

yep!
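
One way to act on this TODO is to check every fitted statistic before building the results object. The following is a minimal sketch, assuming the attribute names `dof_`, `t_` and `p_` from the TODO; the helper `check_fitted_attrs` and the `ToyRegressor` class are hypothetical stand-ins, not code from this PR. scikit-learn's own `sklearn.utils.validation.check_is_fitted` accepts an `attributes` argument and could likely serve the same purpose in the real code.

```python
from typing import Iterable


def check_fitted_attrs(estimator: object, attrs: Iterable[str]) -> None:
    """Raise if any of the named fitted attributes is missing (hypothetical helper)."""
    missing = [a for a in attrs if not hasattr(estimator, a)]
    if missing:
        raise RuntimeError(
            f"{type(estimator).__name__} is not fitted; missing: {', '.join(missing)}"
        )


class ToyRegressor:
    """Stand-in estimator that only sets the statistics named in the TODO."""

    def fit(self) -> "ToyRegressor":
        # Placeholder values; a real estimator would compute these from data.
        self.dof_, self.t_, self.p_ = 10, [2.1], [0.06]
        return self

    def model_statistics(self) -> tuple:
        # Check every attribute the result depends on, not just one of them.
        check_fitted_attrs(self, ("dof_", "t_", "p_"))
        return self.dof_, self.t_, self.p_
```

Calling `ToyRegressor().model_statistics()` before `fit()` raises with a message naming all three missing attributes, instead of failing later with a less helpful `AttributeError`.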

class RegressionStatisticalResults(NamedTuple):
"""Statistical results object for linear regressors.
TODO should this be private? Only used internally

yes probably, especially if this results object is never returned or accessed by a user

This is a method of :class:`~sklearn.base.BaseEstimator`.
TODO make deep argument functional.
I believe this implementation could be replaced with the class's default implementation

so that it's possible to update each component of a nested object.
TODO I believe this could be replaced with the base class's implementation

Agreed - it seems like this is no longer necessary with newer versions of scikit-learn
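
For context, the `get_params`/`set_params` methods inherited from `sklearn.base.BaseEstimator` discover parameters from the `__init__` signature and address nested objects through the `<component>__<parameter>` naming convention. The block below is a simplified, stdlib-only imitation of that behaviour, meant to illustrate why a hand-written override may be redundant. `ParamsMixin`, `Inner` and `Outer` are hypothetical names, the sketch omits the validation of unknown parameter names, and it is not scikit-learn's actual code.

```python
import inspect
from typing import Any, Dict


class ParamsMixin:
    """Simplified imitation of the default parameter handling (illustrative only)."""

    def get_params(self, deep: bool = True) -> Dict[str, Any]:
        # Parameter names are read off the constructor signature.
        signature = inspect.signature(type(self).__init__)
        names = [n for n in signature.parameters if n != "self"]
        params: Dict[str, Any] = {}
        for name in names:
            value = getattr(self, name)
            params[name] = value
            # With deep=True, nested objects contribute "<name>__<sub>" entries.
            if deep and hasattr(value, "get_params"):
                for sub, sub_value in value.get_params(deep=True).items():
                    params[f"{name}__{sub}"] = sub_value
        return params

    def set_params(self, **params: Any) -> "ParamsMixin":
        for key, value in params.items():
            name, _, sub = key.partition("__")
            if sub:
                # Delegate "<name>__<sub>" to the nested component.
                getattr(self, name).set_params(**{sub: value})
            else:
                setattr(self, key, value)
        return self


class Inner(ParamsMixin):
    def __init__(self, alpha: float = 1.0):
        self.alpha = alpha


class Outer(ParamsMixin):
    def __init__(self, inner: Inner, scale: bool = True):
        self.inner = inner
        self.scale = scale


model = Outer(Inner(alpha=0.5))
model.set_params(inner__alpha=2.0, scale=False)
print(model.get_params(deep=True))
```

Neither `Inner` nor `Outer` defines its own `get_params` or `set_params`, yet both the `deep` argument and nested updates work, which is the situation the TODOs are pointing at.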

# an sklearn Scorer takes an estimator, X and optional y, and returns a scalar score
# https://scikit-learn.org/stable/modules/generated/sklearn.metrics.make_scorer.html#sklearn.metrics.make_scorer
Scorer = Callable[[Estimator, Optional[npt.ArrayLike]], Score]
ScoreEvaluation = Dict[str, List[Score]]

10/10 for effort - though I think we've lost this battle in python...
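
If a more precise scorer type were ever wanted, one option is a `typing.Protocol` with a `__call__` method, which can express the optional `y` that a plain `Callable[...]` alias cannot. Note that the code comment above describes three arguments (estimator, `X`, optional `y`), while the quoted alias lists two. This is a sketch under assumptions: `ScorerLike` and `constant_scorer` are hypothetical names, and `Any` stands in for the `Estimator` and `npt.ArrayLike` types used in the PR.

```python
from typing import Any, Optional, Protocol


class ScorerLike(Protocol):
    """Structural type for a callable taking (estimator, X, y=None) -> float."""

    def __call__(self, estimator: Any, X: Any, y: Optional[Any] = None) -> float:
        ...


def constant_scorer(estimator: Any, X: Any, y: Optional[Any] = None) -> float:
    # Toy scorer: ignores its inputs and returns a fixed score.
    return 1.0


# A type checker such as mypy accepts this assignment because the
# function's signature matches the protocol's __call__.
scorer: ScorerLike = constant_scorer
```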


scores = defaultdict(list)
if self.groupby is not None:
# TODO: this makes X/y implicitly a dataframe. Is this intended?

Yeah, this seems like a code smell to me - good catch. I think we should find a proper fix for this... I suspect this happens throughout the code. It may be fixed by assuming everything is a DataFrame? When would we not use a DataFrame?
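
Until that design question is settled, one low-cost option is to make the implicit requirement explicit at the point where grouping is used. This is a minimal sketch, assuming only that the grouped path needs an object with a callable `groupby` attribute; `require_groupable` is a hypothetical helper, not part of this codebase.

```python
from typing import Any


def require_groupable(data: Any, name: str = "X") -> Any:
    """Fail early, with a clear message, if `data` cannot be grouped."""
    if not callable(getattr(data, "groupby", None)):
        raise TypeError(
            f"{name} must support .groupby() (for example a pandas DataFrame) "
            f"when grouping is requested; got {type(data).__name__}"
        )
    return data
```

Calling this inside the `if self.groupby is not None:` branch would turn a confusing failure on array input into an immediate `TypeError`, without yet committing to DataFrames everywhere.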

@dsteinberg merged commit 574c66e into main Apr 6, 2023
@jack-fs mentioned this pull request May 9, 2023