@dimitris-athanasiou
Contributor

…st two

Data frame analytics classification currently supports only 2 classes for the
dependent variable. We were checking that the field's cardinality is not higher
than 2, but we should also check that it is not lower than 2, as otherwise the
process fails.
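
For illustration, the two-sided check described above could look roughly like the sketch below. This is plain Java with hypothetical names (`DependentVariableCardinality`, `distinctCount`, `validate`); it is not the code in this PR, and the in-memory distinct count only stands in for however the field's cardinality is actually obtained.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not the Elasticsearch implementation.
final class DependentVariableCardinality {

    // Classification supports exactly 2 classes, so both bounds are 2 (assumed constants).
    static final long MIN_CLASSES = 2;
    static final long MAX_CLASSES = 2;

    private DependentVariableCardinality() {}

    /** Counts distinct non-null values; stands in for a real cardinality lookup. */
    static long distinctCount(List<String> values) {
        Set<String> distinct = new HashSet<>();
        for (String value : values) {
            if (value != null) {
                distinct.add(value);
            }
        }
        return distinct.size();
    }

    /** Rejects a cardinality outside [MIN_CLASSES, MAX_CLASSES]. */
    static void validate(String field, long cardinality) {
        if (cardinality < MIN_CLASSES) {
            // The lower-bound check is the part that was previously missing.
            throw new IllegalArgumentException(String.format(
                "Field [%s] must have at least %d distinct values but there were %d",
                field, MIN_CLASSES, cardinality));
        }
        if (cardinality > MAX_CLASSES) {
            throw new IllegalArgumentException(String.format(
                "Field [%s] must have at most %d distinct values but there were %d",
                field, MAX_CLASSES, cardinality));
        }
    }
}
```

Running the check before the analytics process starts means a dependent variable with a single class is rejected with a clear message instead of failing later in the process.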

@elasticmachine
Collaborator

Pinging @elastic/ml-core (:ml)

Contributor

@przemekwitek left a comment

LGTM

Just a few minor comments

Contributor

s/limits/constraints?

Contributor

s/Limits/Constraints?

Contributor

There is a "Matchers.empty()" matcher that could be used here.

Contributor

There is a "Matchers.empty()" matcher that could be used here.

Contributor Author

Note that this is now necessary because we check the field cardinality before we call startAnalytics, which is what refreshes the dest index.

@dimitris-athanasiou
Contributor Author

@przemekwitek I have addressed all your points and also fixed a bug regarding refreshing of the dest index, which was caught by the tests.

Contributor

Should this be reverted?

Contributor Author

Ah, yes of course.

@przemekwitek
Contributor

> @przemekwitek I have addressed all your points and also fixed a bug regarding refreshing of the dest index, which was caught by the tests.

Please see my comment in ClassificationIT. Otherwise the PR is good to go.

@dimitris-athanasiou force-pushed the validate-classification-dep-var-cardinality-at-least-two branch from e51393e to eab85d5 on January 22, 2020 12:03
@dimitris-athanasiou merged commit a6fa577 into elastic:master on Jan 22, 2020
@dimitris-athanasiou deleted the validate-classification-dep-var-cardinality-at-least-two branch on January 22, 2020 13:47
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Jan 22, 2020
…t lea… (elastic#51232)

Backport of elastic#51232
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Jan 22, 2020
…t lea… (elastic#51232)

Backport of elastic#51232
dimitris-athanasiou added a commit that referenced this pull request Jan 22, 2020
…t lea… (#51232) (#51309)

Backport of #51232
dimitris-athanasiou added a commit that referenced this pull request Jan 22, 2020
…t lea… (#51232) (#51310)

Backport of #51232