[ML] Implement new rules design (#31110) #31294
Pinging @elastic/ml-core
retest this please
b2caefb to 0464a84
retest this please
0464a84 to 2de5d71
@droberts195 It turns out there was a BWC issue. I had to add a method that reads the old format because, if an older node has rules, we need to read them off the stream to pave the way for the rest. We do not have that problem on
droberts195
left a comment
Originally we said that we'd rely on nobody having used the rules functionality prior to 6.4, as it was undocumented. I think with this assumption the new method wouldn't be necessary, as the list of old rules would always be empty.
Anyway, since you've created a method to read the old rule formats we might as well stick with it for robustness.
Should it be 6.2.0 - 6.3.99? Otherwise this will go wrong when the 6.x branch is for 6.5 and the "old" cluster is running 6.4.
The job that this test PUTs doesn't have any explicit rules in it. Should it?
I added a comment explaining why.
Rules allow users to supply a detector with domain knowledge that can improve the quality of the results. The model detects statistically anomalous results but it has no knowledge of the meaning of the values being modelled.

For example, a detector that performs a population analysis over IP addresses could benefit from a list of IP addresses that the user knows to be safe. Then anomalous results for those IP addresses will not be created and will not affect the quantiles either.

Another example would be a detector looking for anomalies in the median value of CPU utilization. A user might want to inform the detector that any results where the actual value is less than 5 are not interesting.

This commit introduces a `custom_rules` field to the `Detector`. A detector may have multiple rules which are combined with `or`.

A rule has 3 fields: `actions`, `scope` and `conditions`.

Actions is a list of what should happen when the rule applies. The current options include `skip_result` and `skip_model_update`. The default value for `actions` is the `skip_result` action.

Scope is optional and allows for applying filters on any of the partition/over/by fields. When not defined the rule applies to all series. The `filter_id` needs to be specified to match the id of the filter to be used. Optionally, the `filter_type` can be specified as either `include` (default) or `exclude`. When set to `include` the rule applies to entities that are in the filter. When set to `exclude` the rule only applies to entities not in the filter.

There may be zero or more conditions. A condition requires `applies_to`, `operator` and `value` to be specified. The `applies_to` value can be either `actual`, `typical` or `diff_from_typical` and it specifies the numerical value to which the condition applies. The `operator` (`lt`, `lte`, `gt`, `gte`) and `value` complete the definition. Conditions are combined with `and` and allow specifying numerical conditions for when a rule applies.

A rule must either have a scope or one or more conditions. Finally, a rule with scope and conditions applies when all of them apply.
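
The semantics described above can be illustrated with a small Python sketch. This is not the Elasticsearch implementation (which is in Java); the class names `Condition`, `FilterRef`, `Rule` and the helper `triggered_actions` are hypothetical. It also assumes that the scope is keyed by field name, that filters are plain sets of strings, and that "combined with `or`" means the actions of every applying rule are collected.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Comparison operators named in the description.
OPERATORS = {
    "lt": lambda a, b: a < b,
    "lte": lambda a, b: a <= b,
    "gt": lambda a, b: a > b,
    "gte": lambda a, b: a >= b,
}


@dataclass
class Condition:
    applies_to: str  # "actual", "typical" or "diff_from_typical"
    operator: str    # "lt", "lte", "gt" or "gte"
    value: float

    def matches(self, values: Dict[str, float]) -> bool:
        return OPERATORS[self.operator](values[self.applies_to], self.value)


@dataclass
class FilterRef:
    filter_id: str
    filter_type: str = "include"  # "include" (default) or "exclude"


@dataclass
class Rule:
    actions: List[str] = field(default_factory=lambda: ["skip_result"])
    # Assumption: scope maps a partition/over/by field name to a filter reference.
    scope: Dict[str, FilterRef] = field(default_factory=dict)
    conditions: List[Condition] = field(default_factory=list)

    def __post_init__(self) -> None:
        # A rule must have a scope or one or more conditions.
        if not self.scope and not self.conditions:
            raise ValueError("a rule needs a scope or at least one condition")

    def applies(self, entity: Dict[str, str], values: Dict[str, float],
                filters: Dict[str, Set[str]]) -> bool:
        # Every scoped field must pass its include/exclude filter check.
        for field_name, ref in self.scope.items():
            in_filter = entity.get(field_name) in filters[ref.filter_id]
            if in_filter != (ref.filter_type == "include"):
                return False
        # Conditions are combined with "and"; no conditions means no constraint.
        return all(c.matches(values) for c in self.conditions)


def triggered_actions(rules: List[Rule], entity: Dict[str, str],
                      values: Dict[str, float],
                      filters: Dict[str, Set[str]]) -> Set[str]:
    # Rules are combined with "or": collect actions from every rule that applies.
    actions: Set[str] = set()
    for rule in rules:
        if rule.applies(entity, values, filters):
            actions.update(rule.actions)
    return actions


# The two examples from the description: a safe IP list and low CPU values.
filters = {"safe_ips": {"10.0.0.1", "10.0.0.2"}}
rules = [
    Rule(actions=["skip_result", "skip_model_update"],
         scope={"ip": FilterRef("safe_ips")}),
    Rule(conditions=[Condition("actual", "lt", 5.0)]),
]

print(sorted(triggered_actions(
    rules, {"ip": "10.0.0.1"}, {"actual": 50.0, "typical": 10.0}, filters)))
```

In this sketch the first rule fires for any entity whose `ip` is in the `safe_ips` filter, while the second fires for any series whose actual value is below 5 and relies on the default `skip_result` action.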
e6a8f6b to 7cb4891
* 6.x:
  Upgrade to Lucene-7.4.0-snapshot-518d303506 (#31360)
  [ML] Implement new rules design (#31110) (#31294)
  Remove RestGetAllAliasesAction (#31308)
  CCS: don't proxy requests for already connected node (#31273)
  Rankeval: Fold template test project into main module (#31203)
  [Docs] Remove reference to repository-s3 plugin creating an S3 bucket (#31359)
  More detailed tracing when writing metadata (#31319)
  Add details section for dcg ranking metric (#31177)
Backport of #31110