From 5933e198b413a73e2c046802a84718ad7a321a98 Mon Sep 17 00:00:00 2001
From: lcawl
Date: Fri, 13 Dec 2019 10:13:34 -0800
Subject: [PATCH 1/4] [DOCS] Drafts removal of results resource details

---
 .../apis/resultsresource.asciidoc | 58 -------------------
 1 file changed, 58 deletions(-)

diff --git a/docs/reference/ml/anomaly-detection/apis/resultsresource.asciidoc b/docs/reference/ml/anomaly-detection/apis/resultsresource.asciidoc
index b35100c24e6ae..afe5926742ee1 100644
--- a/docs/reference/ml/anomaly-detection/apis/resultsresource.asciidoc
+++ b/docs/reference/ml/anomaly-detection/apis/resultsresource.asciidoc
@@ -3,49 +3,6 @@
 [[ml-results-resource]]
 === Results resources

-Several different result types are created for each job. You can query anomaly
-results for _buckets_, _influencers_, and _records_ by using the results API.
-Summarized bucket results over multiple jobs can be queried as well; those
-results are called _overall buckets_.
-
-Results are written for each `bucket_span`. The timestamp for the results is the
-start of the bucket time interval.
-
-The results include scores, which are calculated for each anomaly result type and
-each bucket interval. These scores are aggregated in order to reduce noise, and
-normalized in order to identify and rank the most mathematically significant
-anomalies.
-
-Bucket results provide the top level, overall view of the job and are ideal for
-alerts. For example, the bucket results might indicate that at 16:05 the system
-was unusual. This information is a summary of all the anomalies, pinpointing
-when they occurred.
-
-Influencer results show which entities were anomalous and when. For example,
-the influencer results might indicate that at 16:05 `user_name: Bob` was unusual.
-This information is a summary of all the anomalies for each entity, so there
-can be a lot of these results. Once you have identified a notable bucket time,
-you can look to see which entities were significant.
-
-Record results provide details about what the individual anomaly was, when it
-occurred and which entity was involved. For example, the record results might
-indicate that at 16:05 Bob sent 837262434 bytes, when the typical value was
-1067 bytes. Once you have identified a bucket time and perhaps a significant
-entity too, you can drill through to the record results in order to investigate
-the anomalous behavior.
-
-Categorization results contain the definitions of _categories_ that have been
-identified. These are only applicable for jobs that are configured to analyze
-unstructured log data using categorization. These results do not contain a
-timestamp or any calculated scores. For more information, see
-{stack-ov}/ml-configuring-categories.html[Categorizing log messages].
-
-* <>
-* <>
-* <>
-* <>
-* <>
-
 NOTE: All of these resources and properties are informational; you cannot
 change their values.

@@ -53,21 +10,6 @@ change their values.
 [[ml-results-buckets]]
 ==== Buckets

-Bucket results provide the top level, overall view of the job and are best for
-alerting.
-
-Each bucket has an `anomaly_score`, which is a statistically aggregated and
-normalized view of the combined anomalousness of all the record results within
-each bucket.
-
-One bucket result is written for each `bucket_span` for each job, even if it is
-not considered to be anomalous. If the bucket is not anomalous, it has an
-`anomaly_score` of zero.
-
-When you identify an anomalous bucket, you can investigate further by expanding
-the bucket resource to show the records as nested objects. Alternatively, you
-can access the records resource directly and filter by the date range.
-
 A bucket resource has the following properties:

 `anomaly_score`::

From ab31f6bd790a0cfc3c8bb145c10322082a45008c Mon Sep 17 00:00:00 2001
From: lcawl
Date: Fri, 13 Dec 2019 10:52:01 -0800
Subject: [PATCH 2/4] [DOCS] Updates get buckets API

---
 .../apis/get-bucket.asciidoc | 94 ++++++++++++++++++-
 .../apis/resultsresource.asciidoc | 45 +--------
 2 files changed, 91 insertions(+), 48 deletions(-)

diff --git a/docs/reference/ml/anomaly-detection/apis/get-bucket.asciidoc b/docs/reference/ml/anomaly-detection/apis/get-bucket.asciidoc
index 027de1385e83f..555fbe4ae2e8b 100644
--- a/docs/reference/ml/anomaly-detection/apis/get-bucket.asciidoc
+++ b/docs/reference/ml/anomaly-detection/apis/get-bucket.asciidoc
@@ -79,11 +79,97 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]
 [[ml-get-bucket-results]]
 ==== {api-response-body-title}

-The API returns the following information:
+The API returns an array of bucket objects, which have the following properties:

-`buckets`::
-  (array) An array of bucket objects. For more information, see
-  <>.
+
+`anomaly_score`::
+(number) The maximum anomaly score, between 0-100, for any of the bucket
+influencers. This is an overall, rate-limited score for the job. All the anomaly
+records in the bucket contribute to this score. This value might be updated as
+new data is analyzed.
+
+`bucket_influencers`::
+(array) An array of bucket influencer objects, which have the following
+properties:
+
+`bucket_influencers`.`anomaly_score`::
+(number) A normalized score between 0-100, which is calculated for each bucket
+influencer. This score might be updated as newer data is analyzed.
+
+`bucket_influencers`.`bucket_span`::
+(number) The length of the bucket in seconds. This value matches the `bucket_span`
+that is specified in the job.
+
+`bucket_influencers`.`initial_anomaly_score`::
+(number) The score between 0-100 for each bucket influencer. This score is the
+initial value that was calculated at the time the bucket was processed.
+
+`bucket_influencers`.`influencer_field_name`::
+(string) The field name of the influencer.
+////
+TBD: Doesn't appear anymore in 8.0?
+`bucket_influencers`.`influencer_field_value`::
+  (string) The field value of the influencer. For example `192.168.88.2` or
+  `Bob`.
+////
+`bucket_influencers`.`is_interim`::
+(boolean) If `true`, this is an interim result. In other words, the bucket
+influencer results are calculated based on partial input data.
+
+`bucket_influencers`.`job_id`::
+(string)
+include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]
+
+`bucket_influencers`.`probability`::
+(number) The probability that the bucket has this behavior, in the range 0 to 1.
+This value can be held to a high precision of over 300 decimal places, so the
+`anomaly_score` is provided as a human-readable and friendly interpretation of
+this.
+
+`bucket_influencers`.`raw_anomaly_score`::
+(number) Internal.
+
+`bucket_influencers`.`result_type`::
+(string) Internal. This value is always set to `bucket_influencer`.
+
+`bucket_influencers`.`timestamp`::
+(date) The start time of the bucket for which these results were calculated.
+
+`bucket_span`::
+(number) The length of the bucket in seconds. This value matches the
+`bucket_span` that is specified in the job.
+
+`event_count`::
+(number) The number of input data records processed in this bucket.
+
+`initial_anomaly_score`::
+(number) The maximum `anomaly_score` for any of the bucket influencers. This is
+the initial value that was calculated at the time the bucket was processed.
+
+`is_interim`::
+(boolean) If `true`, this is an interim result. In other words, the bucket
+results are calculated based on partial input data.
+
+`job_id`::
+(string)
+include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]
+
+`processing_time_ms`::
+(number) The amount of time, in milliseconds, that it took to analyze the
+bucket contents and calculate results.
+
+`result_type`::
+(string) Internal. This value is always set to `bucket`.
+
+`timestamp`::
+(date) The start time of the bucket. This timestamp uniquely identifies the
+bucket.
++
+--
+NOTE: Events that occur exactly at the timestamp of the bucket are included in
+the results for the bucket.
+
+--

 [[ml-get-bucket-example]]
 ==== {api-examples-title}
diff --git a/docs/reference/ml/anomaly-detection/apis/resultsresource.asciidoc b/docs/reference/ml/anomaly-detection/apis/resultsresource.asciidoc
index afe5926742ee1..8bae0411ca7c5 100644
--- a/docs/reference/ml/anomaly-detection/apis/resultsresource.asciidoc
+++ b/docs/reference/ml/anomaly-detection/apis/resultsresource.asciidoc
@@ -10,50 +10,7 @@ change their values.
 [[ml-results-buckets]]
 ==== Buckets

-A bucket resource has the following properties:
-
-`anomaly_score`::
-  (number) The maximum anomaly score, between 0-100, for any of the bucket
-  influencers. This is an overall, rate-limited score for the job. All the
-  anomaly records in the bucket contribute to this score. This value might be
-  updated as new data is analyzed.
-
-`bucket_influencers`::
-  (array) An array of bucket influencer objects.
-  For more information, see <>.
-
-`bucket_span`::
-  (number) The length of the bucket in seconds.
-  This value matches the `bucket_span` that is specified in the job.
-
-`event_count`::
-  (number) The number of input data records processed in this bucket.
-
-`initial_anomaly_score`::
-  (number) The maximum `anomaly_score` for any of the bucket influencers.
-  This is the initial value that was calculated at the time the bucket was
-  processed.
-
-`is_interim`::
-  (boolean) If true, this is an interim result. In other words, the bucket
-  results are calculated based on partial input data.
-
-`job_id`::
-  (string) The unique identifier for the job that these results belong to.
-
-`processing_time_ms`::
-  (number) The amount of time, in milliseconds, that it took to analyze the
-  bucket contents and calculate results.
-
-`result_type`::
-  (string) Internal. This value is always set to `bucket`.
-
-`timestamp`::
-  (date) The start time of the bucket. This timestamp uniquely identifies the
-  bucket. +
-
-NOTE: Events that occur exactly at the timestamp of the bucket are included in
-the results for the bucket.
+See <>.


 [float]

From d64b545da085e3035f8a268ad1fb4e266c49283f Mon Sep 17 00:00:00 2001
From: lcawl
Date: Mon, 16 Dec 2019 15:55:41 -0800
Subject: [PATCH 3/4] [DOCS] Removes results resource definitions

---
 .../apis/get-bucket.asciidoc | 48 ++-
 .../apis/get-category.asciidoc | 56 ++-
 .../apis/get-influencer.asciidoc | 111 +++--
 .../apis/get-overall-buckets.asciidoc | 52 ++-
 .../apis/get-record.asciidoc | 204 ++++++++--
 .../apis/resultsresource.asciidoc | 378 ------------------
 docs/reference/ml/ml-shared.asciidoc | 23 ++
 docs/reference/redirects.asciidoc | 17 +
 docs/reference/rest-api/defs.asciidoc | 2 -
 9 files changed, 385 insertions(+), 506 deletions(-)
 delete mode 100644 docs/reference/ml/anomaly-detection/apis/resultsresource.asciidoc

diff --git a/docs/reference/ml/anomaly-detection/apis/get-bucket.asciidoc b/docs/reference/ml/anomaly-detection/apis/get-bucket.asciidoc
index 555fbe4ae2e8b..c1439ebfea95f 100644
--- a/docs/reference/ml/anomaly-detection/apis/get-bucket.asciidoc
+++ b/docs/reference/ml/anomaly-detection/apis/get-bucket.asciidoc
@@ -40,41 +40,41 @@ bucket.
 include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]

 ``::
-  (Optional, string) The timestamp of a single bucket result. If you do not
-  specify this parameter, the API returns information about all buckets.
+(Optional, string) The timestamp of a single bucket result. If you do not
+specify this parameter, the API returns information about all buckets.

 [[ml-get-bucket-request-body]]
 ==== {api-request-body-title}

 `anomaly_score`::
-  (Optional, double) Returns buckets with anomaly scores greater or equal than
-  this value.
+(Optional, double) Returns buckets with anomaly scores greater or equal than
+this value.

 `desc`::
-  (Optional, boolean) If true, the buckets are sorted in descending order.
+(Optional, boolean) If true, the buckets are sorted in descending order.

 `end`::
-  (Optional, string) Returns buckets with timestamps earlier than this time.
+(Optional, string) Returns buckets with timestamps earlier than this time.

 `exclude_interim`::
-  (Optional, boolean) If true, the output excludes interim results. By default,
-  interim results are included.
+(Optional, boolean)
+include::{docdir}/ml/ml-shared.asciidoc[tag=exclude-interim-results]

 `expand`::
-  (Optional, boolean) If true, the output includes anomaly records.
+(Optional, boolean) If true, the output includes anomaly records.

 `page`::
 `from`:::
-  (Optional, integer) Skips the specified number of buckets.
+(Optional, integer) Skips the specified number of buckets.
 `size`:::
-  (Optional, integer) Specifies the maximum number of buckets to obtain.
+(Optional, integer) Specifies the maximum number of buckets to obtain.

 `sort`::
-  (Optional, string) Specifies the sort field for the requested buckets. By
-  default, the buckets are sorted by the `timestamp` field.
+(Optional, string) Specifies the sort field for the requested buckets. By
+default, the buckets are sorted by the `timestamp` field.

 `start`::
-  (Optional, string) Returns buckets with timestamps after this time.
+(Optional, string) Returns buckets with timestamps after this time.

 [[ml-get-bucket-results]]
 ==== {api-response-body-title}
@@ -106,15 +106,13 @@ initial value that was calculated at the time the bucket was processed.

 `bucket_influencers`.`influencer_field_name`::
 (string) The field name of the influencer.
-////
-TBD: Doesn't appear anymore in 8.0?
+
 `bucket_influencers`.`influencer_field_value`::
-  (string) The field value of the influencer. For example `192.168.88.2` or
-  `Bob`.
-////
+(string) The field value of the influencer.
+
 `bucket_influencers`.`is_interim`::
-(boolean) If `true`, this is an interim result. In other words, the bucket
-influencer results are calculated based on partial input data.
+(boolean)
+include::{docdir}/ml/ml-shared.asciidoc[tag=is-interim]

 `bucket_influencers`.`job_id`::
 (string)
@@ -136,8 +134,8 @@ this.
 (date) The start time of the bucket for which these results were calculated.

 `bucket_span`::
-(number) The length of the bucket in seconds. This value matches the
-`bucket_span` that is specified in the job.
+(number)
+include::{docdir}/ml/ml-shared.asciidoc[tag=bucket-span-results]

 `event_count`::
 (number) The number of input data records processed in this bucket.
@@ -147,8 +145,8 @@ this.
 the initial value that was calculated at the time the bucket was processed.

 `is_interim`::
-(boolean) If `true`, this is an interim result. In other words, the bucket
-results are calculated based on partial input data.
+(boolean)
+include::{docdir}/ml/ml-shared.asciidoc[tag=is-interim]

 `job_id`::
 (string)
diff --git a/docs/reference/ml/anomaly-detection/apis/get-category.asciidoc b/docs/reference/ml/anomaly-detection/apis/get-category.asciidoc
index 3280b79534f50..441a20f99793f 100644
--- a/docs/reference/ml/anomaly-detection/apis/get-category.asciidoc
+++ b/docs/reference/ml/anomaly-detection/apis/get-category.asciidoc
@@ -28,7 +28,15 @@ privileges. See <> and
 [[ml-get-category-desc]]
 ==== {api-description-title}

-For more information about categories, see
+When `categorization_field_name` is specified in the job configuration, it is
+possible to view the definitions of the resulting categories. A category
+definition describes the common terms matched and contains examples of matched
+values.
+
+The anomaly results from a categorization analysis are available as bucket,
+influencer, and record results. For example, the results might indicate that
+at 16:45 there was an unusual count of log message category 11. You can then
+examine the description and examples of that category. For more information, see
 {stack-ov}/ml-configuring-categories.html[Categorizing log messages].

 [[ml-get-category-path-parms]]
@@ -39,34 +47,55 @@ For more information about categories, see
 include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]

 ``::
-  (Optional, long) Identifier for the category. If you do not specify this
-  parameter, the API returns information about all categories in the
-  {anomaly-job}.
+(Optional, long) Identifier for the category. If you do not specify this
+parameter, the API returns information about all categories in the {anomaly-job}.

 [[ml-get-category-request-body]]
 ==== {api-request-body-title}

 `page`::
 `from`:::
-  (Optional, integer) Skips the specified number of categories.
+(Optional, integer) Skips the specified number of categories.
 `size`:::
-  (Optional, integer) Specifies the maximum number of categories to obtain.
+(Optional, integer) Specifies the maximum number of categories to obtain.

 [[ml-get-category-results]]
 ==== {api-response-body-title}

-The API returns the following information:
+The API returns an array of category objects, which have the following properties:

-`categories`::
-  (array) An array of category objects. For more information, see
-  <>.
+`category_id`::
+(unsigned integer) A unique identifier for the category.
+
+`examples`::
+(array) A list of examples of actual values that matched the category.
+
+`grok_pattern`::
+experimental[] (string) A Grok pattern that could be used in {ls} or an ingest
+pipeline to extract fields from messages that match the category. This field is experimental and may be changed or removed in a future release. The Grok
+patterns that are found are not optimal, but are often a good starting point for
+manual tweaking.
+
+`job_id`::
+(string)
+include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]
+
+`max_matching_length`::
+(unsigned integer) The maximum length of the fields that matched the category.
+The value is increased by 10% to enable matching for similar fields that have
+not been analyzed.
+
+`regex`::
+(string) A regular expression that is used to search for values that match the
+category.
+
+`terms`::
+(string) A space separated list of the common tokens that are matched in values
+of the category.

 [[ml-get-category-example]]
 ==== {api-examples-title}

-The following example gets information about one category for the
-`esxi_log` job:
-
 [source,console]
 --------------------------------------------------
 GET _ml/anomaly_detectors/esxi_log/results/categories
@@ -78,7 +107,6 @@ GET _ml/anomaly_detectors/esxi_log/results/categories
 --------------------------------------------------
 // TEST[skip:todo]

-In this example, the API returns the following information:
 [source,js]
 ----
 {
diff --git a/docs/reference/ml/anomaly-detection/apis/get-influencer.asciidoc b/docs/reference/ml/anomaly-detection/apis/get-influencer.asciidoc
index 2165d8ef9f7f9..e2727a04dc07c 100644
--- a/docs/reference/ml/anomaly-detection/apis/get-influencer.asciidoc
+++ b/docs/reference/ml/anomaly-detection/apis/get-influencer.asciidoc
@@ -23,6 +23,13 @@ need `read` index privilege on the index that stores the results. The
 privileges. See <> and
 <>.

+[[ml-get-influencer-desc]]
+==== {api-description-title}
+
+Influencers are the entities that have contributed to, or are to blame for,
+the anomalies. Influencer results are available only if an
+`influencer_field_name` is specified in the job configuration.
+
 [[ml-get-influencer-path-parms]]
 ==== {api-path-parms-title}

@@ -34,75 +41,119 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]
 ==== {api-request-body-title}

 `desc`::
-  (Optional, boolean) If true, the results are sorted in descending order.
+(Optional, boolean)
+include::{docdir}/ml/ml-shared.asciidoc[tag=desc-results]

 `end`::
-  (Optional, string) Returns influencers with timestamps earlier than this time.
+(Optional, string) Returns influencers with timestamps earlier than this time.

 `exclude_interim`::
-  (Optional, boolean) If true, the output excludes interim results. By default,
-  interim results are included.
+(Optional, boolean)
+include::{docdir}/ml/ml-shared.asciidoc[tag=exclude-interim-results]

 `influencer_score`::
-  (Optional, double) Returns influencers with anomaly scores greater than or
-  equal to this value.
+(Optional, double) Returns influencers with anomaly scores greater than or
+equal to this value.

 `page`::
 `from`:::
-  (Optional, integer) Skips the specified number of influencers.
+(Optional, integer) Skips the specified number of influencers.
 `size`:::
-  (Optional, integer) Specifies the maximum number of influencers to obtain.
+(Optional, integer) Specifies the maximum number of influencers to obtain.

 `sort`::
-  (Optional, string) Specifies the sort field for the requested influencers. By
-  default, the influencers are sorted by the `influencer_score` value.
+(Optional, string) Specifies the sort field for the requested influencers. By
+default, the influencers are sorted by the `influencer_score` value.

 `start`::
-  (Optional, string) Returns influencers with timestamps after this time.
+(Optional, string) Returns influencers with timestamps after this time.

 [[ml-get-influencer-results]]
 ==== {api-response-body-title}

-The API returns the following information:
+The API returns an array of influencer objects, which have the following
+properties:
+
+`bucket_span`::
+(number)
+include::{docdir}/ml/ml-shared.asciidoc[tag=bucket-span-results]
+
+`influencer_score`::
+(number) A normalized score between 0-100, which is based on the probability of
+the influencer in this bucket aggregated across detectors. Unlike
+`initial_influencer_score`, this value will be updated by a re-normalization
+process as new data is analyzed.
+
+`influencer_field_name`::
+(string) The field name of the influencer.
+
+`influencer_field_value`::
+(string) The entity that influenced, contributed to, or was to blame for the
+anomaly.
+
+`initial_influencer_score`::
+(number) A normalized score between 0-100, which is based on the probability of
+the influencer aggregated across detectors. This is the initial value that was
+calculated at the time the bucket was processed.

-`influencers`::
-  (array) An array of influencer objects.
-  For more information, see <>.
+`is_interim`::
+(boolean)
+include::{docdir}/ml/ml-shared.asciidoc[tag=is-interim]
+
+`job_id`::
+(string)
+include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]
+
+`probability`::
+(number) The probability that the influencer has this behavior, in the range 0
+to 1. For example, 0.0000109783. This value can be held to a high precision of
+over 300 decimal places, so the `influencer_score` is provided as a
+human-readable and friendly interpretation of this.
+
+`result_type`::
+(string) Internal. This value is always set to `influencer`.
+
+`timestamp`::
+(date)
+include::{docdir}/ml/ml-shared.asciidoc[tag=timestamp-results]
+
+NOTE: Additional influencer properties are added, depending on the fields being
+analyzed. For example, if it's analyzing `user_name` as an influencer, then a
+field `user_name` is added to the result document. This information enables you to
+filter the anomaly results more easily.

 [[ml-get-influencer-example]]
 ==== {api-examples-title}

-The following example gets influencer information for the `it_ops_new_kpi` job:
-
 [source,console]
 --------------------------------------------------
-GET _ml/anomaly_detectors/it_ops_new_kpi/results/influencers
+GET _ml/anomaly_detectors/high_sum_total_sales/results/influencers
 {
   "sort": "influencer_score",
   "desc": true
 }
 --------------------------------------------------
-// TEST[skip:todo]
+// TEST[skip:Kibana sample data]

 In this example, the API returns the following information, sorted based on the
 influencer score in descending order:
 [source,js]
 ----
 {
-  "count": 28,
+  "count": 189,
   "influencers": [
     {
-      "job_id": "it_ops_new_kpi",
+      "job_id": "high_sum_total_sales",
       "result_type": "influencer",
-      "influencer_field_name": "kpi_indicator",
-      "influencer_field_value": "online_purchases",
-      "kpi_indicator": "online_purchases",
-      "influencer_score": 94.1386,
-      "initial_influencer_score": 94.1386,
-      "probability": 0.000111612,
-      "bucket_span": 600,
-      "is_interim": false,
-      "timestamp": 1454943600000
+      "influencer_field_name": "customer_full_name.keyword",
+      "influencer_field_value": "Wagdi Shaw",
+      "customer_full_name.keyword" : "Wagdi Shaw",
+      "influencer_score": 99.02493,
+      "initial_influencer_score" : 94.67233079580171,
+      "probability" : 1.4784807245686567E-10,
+      "bucket_span" : 3600,
+      "is_interim" : false,
+      "timestamp" : 1574661600000
     },
   ...
   ]
diff --git a/docs/reference/ml/anomaly-detection/apis/get-overall-buckets.asciidoc b/docs/reference/ml/anomaly-detection/apis/get-overall-buckets.asciidoc
index 62acd7902b9ff..256336f0d85bc 100644
--- a/docs/reference/ml/anomaly-detection/apis/get-overall-buckets.asciidoc
+++ b/docs/reference/ml/anomaly-detection/apis/get-overall-buckets.asciidoc
@@ -64,38 +64,56 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection-wildcard-li
 include::{docdir}/ml/ml-shared.asciidoc[tag=allow-no-jobs]

 `bucket_span`::
-  (Optional, string) The span of the overall buckets. Must be greater or equal
-  to the largest bucket span of the specified {anomaly-jobs}, which is the
-  default value.
+(Optional, string) The span of the overall buckets. Must be greater or equal to
+the largest bucket span of the specified {anomaly-jobs}, which is the default
+value.

 `end`::
-  (Optional, string) Returns overall buckets with timestamps earlier than this
-  time.
+(Optional, string) Returns overall buckets with timestamps earlier than this
+time.

 `exclude_interim`::
-  (Optional, boolean) If `true`, the output excludes interim overall buckets.
-  Overall buckets are interim if any of the job buckets within the overall
-  bucket interval are interim. By default, interim results are included.
+(Optional, boolean) If `true`, the output excludes interim overall buckets.
+Overall buckets are interim if any of the job buckets within the overall bucket
+interval are interim. By default, interim results are included.

 `overall_score`::
-  (Optional, double) Returns overall buckets with overall scores greater or
-  equal than this value.
+(Optional, double) Returns overall buckets with overall scores greater or equal
+than this value.

 `start`::
-  (Optional, string) Returns overall buckets with timestamps after this time.
+(Optional, string) Returns overall buckets with timestamps after this time.

 `top_n`::
-  (Optional, integer) The number of top {anomaly-job} bucket scores to be used
-  in the `overall_score` calculation. The default value is `1`.
+(Optional, integer) The number of top {anomaly-job} bucket scores to be used in
+the `overall_score` calculation. The default value is `1`.

 [[ml-get-overall-buckets-results]]
 ==== {api-response-body-title}

-The API returns the following information:
+The API returns an array of overall bucket objects, which have the following
+properties:

-`overall_buckets`::
-  (array) An array of overall bucket objects. For more information, see
-  <>.
+`bucket_span`::
+(number) The length of the bucket in seconds. Matches the `bucket_span`
+of the job with the longest one.
+
+`is_interim`::
+(boolean)
+include::{docdir}/ml/ml-shared.asciidoc[tag=is-interim]
+
+`jobs`::
+(array) An array of objects that contain the `max_anomaly_score` per `job_id`.
+
+`overall_score`::
+(number) The `top_n` average of the max bucket `anomaly_score` per job.
+
+`result_type`::
+(string) Internal. This is always set to `overall_bucket`.
+
+`timestamp`::
+(date)
+include::{docdir}/ml/ml-shared.asciidoc[tag=timestamp-results]

 [[ml-get-overall-buckets-example]]
 ==== {api-examples-title}
diff --git a/docs/reference/ml/anomaly-detection/apis/get-record.asciidoc b/docs/reference/ml/anomaly-detection/apis/get-record.asciidoc
index b5bbb15580e19..33b8804078b80 100644
--- a/docs/reference/ml/anomaly-detection/apis/get-record.asciidoc
+++ b/docs/reference/ml/anomaly-detection/apis/get-record.asciidoc
@@ -22,6 +22,22 @@ need `read` index privilege on the index that stores the results. The
 `machine_learning_admin` and `machine_learning_user` roles provide these
 privileges. See <> and <>.

+[[ml-get-record-desc]]
+==== {api-description-title}
+
+Records contain the detailed analytical results. They describe the anomalous
+activity that has been identified in the input data based on the detector
+configuration.
+
+There can be many anomaly records depending on the characteristics and size of
+the input data. In practice, there are often too many to be able to manually
+process them. The {ml-features} therefore perform a sophisticated aggregation of
+the anomaly records into buckets.
+
+The number of record results depends on the number of anomalies found in each
+bucket, which relates to the number of time series being modeled and the number
+of detectors.
+
 [[ml-get-record-path-parms]]
 ==== {api-path-parms-title}

@@ -33,83 +49,191 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]
 ==== {api-request-body-title}

 `desc`::
-  (Optional, boolean) If true, the results are sorted in descending order.
+(Optional, boolean)
+include::{docdir}/ml/ml-shared.asciidoc[tag=desc-results]

 `end`::
-  (Optional, string) Returns records with timestamps earlier than this time.
+(Optional, string) Returns records with timestamps earlier than this time.

 `exclude_interim`::
-  (Optional, boolean) If true, the output excludes interim results. By default,
-  interim results are included.
+(Optional, boolean)
+include::{docdir}/ml/ml-shared.asciidoc[tag=exclude-interim-results]

 `page`::
 `from`:::
-  (Optional, integer) Skips the specified number of records.
+(Optional, integer) Skips the specified number of records.
 `size`:::
-  (Optional, integer) Specifies the maximum number of records to obtain.
+(Optional, integer) Specifies the maximum number of records to obtain.

 `record_score`::
-  (Optional, double) Returns records with anomaly scores greater or equal than
-  this value.
+(Optional, double) Returns records with anomaly scores greater or equal than
+this value.

 `sort`::
-  (Optional, string) Specifies the sort field for the requested records. By
-  default, the records are sorted by the `anomaly_score` value.
+(Optional, string) Specifies the sort field for the requested records. By
+default, the records are sorted by the `anomaly_score` value.

 `start`::
-  (Optional, string) Returns records with timestamps after this time.
+(Optional, string) Returns records with timestamps after this time.

 [[ml-get-record-results]]
 ==== {api-response-body-title}

-The API returns the following information:
+The API returns an array of record objects, which have the following properties:
+
+`actual`::
+(array) The actual value for the bucket.
+
+`bucket_span`::
+(number)
+include::{docdir}/ml/ml-shared.asciidoc[tag=bucket-span-results]
+
+`by_field_name`::
+(string)
+include::{docdir}/ml/ml-shared.asciidoc[tag=by-field-name]
+
+`by_field_value`::
+(string) The value of `by_field_name`.
+
+`causes`::
+(array) For population analysis, an over field must be specified in the detector.
+This property contains an array of anomaly records that are the causes for the
+anomaly that has been identified for the over field. If no over fields exist,
+this field is not present. This sub-resource contains the most anomalous records
+for the `over_field_name`. For scalability reasons, a maximum of the 10 most
+significant causes of the anomaly are returned. As part of the core analytical modeling, these low-level anomaly records are aggregated for their parent over
+field record. The causes resource contains similar elements to the record
+resource, namely `actual`, `typical`, `geo_results.actual_point`,
+`geo_results.typical_point`, `*_field_name` and `*_field_value`. Probability and
+scores are not applicable to causes.
+
+`detector_index`::
+(number) A unique identifier for the detector.
+
+`field_name`::
+(string) Certain functions require a field to operate on, for example, `sum()`.
+For those functions, this value is the name of the field to be analyzed.
+
+`function`::
+(string) The function in which the anomaly occurs, as specified in the detector
+configuration. For example, `max`.
+
+`function_description`::
+(string) The description of the function in which the anomaly occurs, as
+specified in the detector configuration.
+
+`geo_results.actual_point`::
+(string) The actual value for the bucket formatted as a `geo_point`. If the
+detector function is `lat_long`, this is a comma delimited string of the
+latitude and longitude.
+
+`geo_results.typical_point`::
+(string) The typical value for the bucket formatted as a `geo_point`. If the
+detector function is `lat_long`, this is a comma delimited string of the
+latitude and longitude.
+
+`influencers`::
+(array) If `influencers` was specified in the detector configuration, this array
+contains influencers that contributed to or were to blame for an anomaly.
+
+`initial_record_score`::
+(number) A normalized score between 0-100, which is based on the probability of
+the anomalousness of this record. This is the initial value that was calculated
+at the time the bucket was processed.
+
+`is_interim`::
+(boolean)
+include::{docdir}/ml/ml-shared.asciidoc[tag=is-interim]
+
+`job_id`::
+(string)
+include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection]
+
+`over_field_name`::
+(string)
+include::{docdir}/ml/ml-shared.asciidoc[tag=over-field-name]
+
+`over_field_value`::
+(string) The value of `over_field_name`.
+
+`partition_field_name`::
+(string)
+include::{docdir}/ml/ml-shared.asciidoc[tag=partition-field-name]
+
+`partition_field_value`::
+(string) The value of `partition_field_name`.
+
+`probability`::
+(number) The probability of the individual anomaly occurring, in the range
+0 to 1. For example, 0.0000772031. This value can be held to a high precision
+of over 300 decimal places, so the `record_score` is provided as a
+human-readable and friendly interpretation of this.
+
+`multi_bucket_impact`::
+(number) An indication of how strongly an anomaly is multi bucket or single
+bucket. The value is on a scale of `-5.0` to `+5.0` where `-5.0` means the
+anomaly is purely single bucket and `+5.0` means the anomaly is purely multi
+bucket.
+ +`record_score`:: +(number) A normalized score between 0-100, which is based on the probability of +the anomalousness of this record. Unlike `initial_record_score`, this value will +be updated by a re-normalization process as new data is analyzed. + +`result_type`:: +(string) Internal. This is always set to `record`. + +`timestamp`:: +(date) +include::{docdir}/ml/ml-shared.asciidoc[tag=timestamp-results] + +`typical`:: +(array) The typical value for the bucket, according to analytical modeling. + +NOTE: Additional record properties are added, depending on the fields being +analyzed. For example, if it's analyzing `hostname` as a _by field_, then a field +`hostname` is added to the result document. This information enables you to +filter the anomaly results more easily. -`records`:: - (array) An array of record objects. For more information, see - <>. [[ml-get-record-example]] ==== {api-examples-title} -The following example gets record information for the `it-ops-kpi` job: - [source,console] -------------------------------------------------- -GET _ml/anomaly_detectors/it-ops-kpi/results/records +GET _ml/anomaly_detectors/low_request_rate/results/records { "sort": "record_score", "desc": true, "start": "1454944100000" } -------------------------------------------------- -// TEST[skip:todo] +// TEST[skip:Kibana sample data] -In this example, the API returns twelve results for the specified -time constraints: [source,js] ---- { - "count": 12, - "records": [ + "count" : 4, + "records" : [ { - "job_id": "it-ops-kpi", - "result_type": "record", - "probability": 0.00000332668, - "record_score": 72.9929, - "initial_record_score": 65.7923, - "bucket_span": 300, - "detector_index": 0, - "is_interim": false, - "timestamp": 1454944200000, - "function": "low_sum", - "function_description": "sum", - "typical": [ - 1806.48 - ], - "actual": [ - 288 + "job_id" : "low_request_rate", + "result_type" : "record", + "probability" : 1.3882308899968812E-4, + "multi_bucket_impact" : 
-5.0, + "record_score" : 94.98554565630553, + "initial_record_score" : 94.98554565630553, + "bucket_span" : 3600, + "detector_index" : 0, + "is_interim" : false, + "timestamp" : 1577793600000, + "function" : "low_count", + "function_description" : "count", + "typical" : [ + 28.254208230188834 ], - "field_name": "events_per_min" + "actual" : [ + 0.0 + ] }, ... ] diff --git a/docs/reference/ml/anomaly-detection/apis/resultsresource.asciidoc b/docs/reference/ml/anomaly-detection/apis/resultsresource.asciidoc deleted file mode 100644 index 8bae0411ca7c5..0000000000000 --- a/docs/reference/ml/anomaly-detection/apis/resultsresource.asciidoc +++ /dev/null @@ -1,378 +0,0 @@ -[role="xpack"] -[testenv="platinum"] -[[ml-results-resource]] -=== Results resources - -NOTE: All of these resources and properties are informational; you cannot -change their values. - -[float] -[[ml-results-buckets]] -==== Buckets - -See <>. - - -[float] -[[ml-results-bucket-influencers]] -==== Bucket Influencers - -Bucket influencer results are available as nested objects contained within -bucket results. These results are an aggregation for each type of influencer. -For example, if both `client_ip` and `user_name` were specified as influencers, -then you would be able to determine when the `client_ip` or `user_name` values -were collectively anomalous. - -There is a built-in bucket influencer called `bucket_time` which is always -available. This bucket influencer is the aggregation of all records in the -bucket; it is not just limited to a type of influencer. - -NOTE: A bucket influencer is a type of influencer. For example, `client_ip` or -`user_name` can be bucket influencers, whereas `192.168.88.2` and `Bob` are -influencers. - -An bucket influencer object has the following properties: - -`anomaly_score`:: - (number) A normalized score between 0-100, which is calculated for each bucket - influencer. This score might be updated as newer data is analyzed. 
- -`bucket_span`:: - (number) The length of the bucket in seconds. This value matches the `bucket_span` - that is specified in the job. - -`initial_anomaly_score`:: - (number) The score between 0-100 for each bucket influencer. This score is - the initial value that was calculated at the time the bucket was processed. - -`influencer_field_name`:: - (string) The field name of the influencer. For example `client_ip` or - `user_name`. - -`influencer_field_value`:: - (string) The field value of the influencer. For example `192.168.88.2` or - `Bob`. - -`is_interim`:: - (boolean) If true, this is an interim result. In other words, the bucket - influencer results are calculated based on partial input data. - -`job_id`:: - (string) The unique identifier for the job that these results belong to. - -`probability`:: - (number) The probability that the bucket has this behavior, in the range 0 - to 1. For example, 0.0000109783. This value can be held to a high precision - of over 300 decimal places, so the `anomaly_score` is provided as a - human-readable and friendly interpretation of this. - -`raw_anomaly_score`:: - (number) Internal. - -`result_type`:: - (string) Internal. This value is always set to `bucket_influencer`. - -`timestamp`:: - (date) The start time of the bucket for which these results were calculated. - -[float] -[[ml-results-influencers]] -==== Influencers - -Influencers are the entities that have contributed to, or are to blame for, -the anomalies. Influencer results are available only if an -`influencer_field_name` is specified in the job configuration. - -Influencers are given an `influencer_score`, which is calculated based on the -anomalies that have occurred in each bucket interval. For jobs with more than -one detector, this gives a powerful view of the most anomalous entities. 
- -For example, if you are analyzing unusual bytes sent and unusual domains -visited and you specified `user_name` as the influencer, then an -`influencer_score` for each anomalous user name is written per bucket. For -example, if `user_name: Bob` had an `influencer_score` greater than 75, then -`Bob` would be considered very anomalous during this time interval in one or -both of those areas (unusual bytes sent or unusual domains visited). - -One influencer result is written per bucket for each influencer that is -considered anomalous. - -When you identify an influencer with a high score, you can investigate further -by accessing the records resource for that bucket and enumerating the anomaly -records that contain the influencer. - -An influencer object has the following properties: - -`bucket_span`:: - (number) The length of the bucket in seconds. This value matches the `bucket_span` - that is specified in the job. - -`influencer_score`:: - (number) A normalized score between 0-100, which is based on the probability - of the influencer in this bucket aggregated across detectors. Unlike - `initial_influencer_score`, this value will be updated by a re-normalization - process as new data is analyzed. - -`initial_influencer_score`:: - (number) A normalized score between 0-100, which is based on the probability - of the influencer aggregated across detectors. This is the initial value that - was calculated at the time the bucket was processed. - -`influencer_field_name`:: - (string) The field name of the influencer. - -`influencer_field_value`:: - (string) The entity that influenced, contributed to, or was to blame for the - anomaly. - -`is_interim`:: - (boolean) If true, this is an interim result. In other words, the influencer - results are calculated based on partial input data. - -`job_id`:: - (string) The unique identifier for the job that these results belong to. 
- -`probability`:: - (number) The probability that the influencer has this behavior, in the range - 0 to 1. For example, 0.0000109783. This value can be held to a high precision - of over 300 decimal places, so the `influencer_score` is provided as a - human-readable and friendly interpretation of this. -// For example, 0.03 means 3%. This value is held to a high precision of over -//300 decimal places. In scientific notation, a value of 3.24E-300 is highly -//unlikely and therefore highly anomalous. - -`result_type`:: - (string) Internal. This value is always set to `influencer`. - -`timestamp`:: - (date) The start time of the bucket for which these results were calculated. - -NOTE: Additional influencer properties are added, depending on the fields being -analyzed. For example, if it's analyzing `user_name` as an influencer, then a -field `user_name` is added to the result document. This information enables you to -filter the anomaly results more easily. - - -[float] -[[ml-results-records]] -==== Records - -Records contain the detailed analytical results. They describe the anomalous -activity that has been identified in the input data based on the detector -configuration. - -For example, if you are looking for unusually large data transfers, an anomaly -record can identify the source IP address, the destination, the time window -during which it occurred, the expected and actual size of the transfer, and the -probability of this occurrence. - -There can be many anomaly records depending on the characteristics and size of -the input data. In practice, there are often too many to be able to manually -process them. The {ml-features} therefore perform a sophisticated -aggregation of the anomaly records into buckets. - -The number of record results depends on the number of anomalies found in each -bucket, which relates to the number of time series being modeled and the number of -detectors. 
- -A record object has the following properties: - -`actual`:: - (array) The actual value for the bucket. - -`bucket_span`:: - (number) The length of the bucket in seconds. - This value matches the `bucket_span` that is specified in the job. - -`by_field_name`:: - (string) The name of the analyzed field. This value is present only if - it is specified in the detector. For example, `client_ip`. - -`by_field_value`:: - (string) The value of `by_field_name`. This value is present only if - it is specified in the detector. For example, `192.168.66.2`. - -`causes`:: - (array) For population analysis, an over field must be specified in the - detector. This property contains an array of anomaly records that are the - causes for the anomaly that has been identified for the over field. If no - over fields exist, this field is not present. This sub-resource contains - the most anomalous records for the `over_field_name`. For scalability reasons, - a maximum of the 10 most significant causes of the anomaly are returned. As - part of the core analytical modeling, these low-level anomaly records are - aggregated for their parent over field record. The causes resource contains - similar elements to the record resource, namely `actual`, `typical`, - `geo_results.actual_point`, `geo_results.typical_point`, - `*_field_name` and `*_field_value`. - Probability and scores are not applicable to causes. - -`detector_index`:: - (number) A unique identifier for the detector. - -`field_name`:: - (string) Certain functions require a field to operate on, for example, `sum()`. - For those functions, this value is the name of the field to be analyzed. - -`function`:: - (string) The function in which the anomaly occurs, as specified in the - detector configuration. For example, `max`. - -`function_description`:: - (string) The description of the function in which the anomaly occurs, as - specified in the detector configuration. 
- -`influencers`:: - (array) If `influencers` was specified in the detector configuration, then - this array contains influencers that contributed to or were to blame for an - anomaly. - -`initial_record_score`:: - (number) A normalized score between 0-100, which is based on the - probability of the anomalousness of this record. This is the initial value - that was calculated at the time the bucket was processed. - -`is_interim`:: - (boolean) If true, this is an interim result. In other words, the anomaly - record is calculated based on partial input data. - -`job_id`:: - (string) The unique identifier for the job that these results belong to. - -`over_field_name`:: - (string) The name of the over field that was used in the analysis. This value - is present only if it was specified in the detector. Over fields are used - in population analysis. For example, `user`. - -`over_field_value`:: - (string) The value of `over_field_name`. This value is present only if it - was specified in the detector. For example, `Bob`. - -`partition_field_name`:: - (string) The name of the partition field that was used in the analysis. This - value is present only if it was specified in the detector. For example, - `region`. - -`partition_field_value`:: - (string) The value of `partition_field_name`. This value is present only if - it was specified in the detector. For example, `us-east-1`. - -`probability`:: - (number) The probability of the individual anomaly occurring, in the range - 0 to 1. For example, 0.0000772031. This value can be held to a high precision - of over 300 decimal places, so the `record_score` is provided as a - human-readable and friendly interpretation of this. -//In scientific notation, a value of 3.24E-300 is highly unlikely and therefore -//highly anomalous. - -`multi_bucket_impact`:: - (number) an indication of how strongly an anomaly is multi bucket or single bucket. 
- The value is on a scale of -5 to +5 where -5 means the anomaly is purely single - bucket and +5 means the anomaly is purely multi bucket. - -`record_score`:: - (number) A normalized score between 0-100, which is based on the probability - of the anomalousness of this record. Unlike `initial_record_score`, this - value will be updated by a re-normalization process as new data is analyzed. - -`result_type`:: - (string) Internal. This is always set to `record`. - -`timestamp`:: - (date) The start time of the bucket for which these results were calculated. - -`typical`:: - (array) The typical value for the bucket, according to analytical modeling. - -`geo_results.actual_point`:: - (string) The actual value for the bucket formatted as a `geo_point`. - If the detector function is `lat_long`, this is a comma delimited string - of the latitude and longitude. - -`geo_results.typical_point`:: - (string) The typical value for the bucket formatted as a `geo_point`. - If the detector function is `lat_long`, this is a comma delimited string - of the latitude and longitude. - -NOTE: Additional record properties are added, depending on the fields being -analyzed. For example, if it's analyzing `hostname` as a _by field_, then a field -`hostname` is added to the result document. This information enables you to -filter the anomaly results more easily. - - -[float] -[[ml-results-categories]] -==== Categories - -When `categorization_field_name` is specified in the job configuration, it is -possible to view the definitions of the resulting categories. A category -definition describes the common terms matched and contains examples of matched -values. - -The anomaly results from a categorization analysis are available as bucket, -influencer, and record results. For example, the results might indicate that -at 16:45 there was an unusual count of log message category 11. You can then -examine the description and examples of that category. 
- -A category resource has the following properties: - -`category_id`:: - (unsigned integer) A unique identifier for the category. - -`examples`:: - (array) A list of examples of actual values that matched the category. - -`grok_pattern`:: - experimental[] (string) A Grok pattern that could be used in Logstash or an - Ingest Pipeline to extract fields from messages that match the category. This - field is experimental and may be changed or removed in a future release. The - Grok patterns that are found are not optimal, but are often a good starting - point for manual tweaking. - -`job_id`:: - (string) The unique identifier for the job that these results belong to. - -`max_matching_length`:: - (unsigned integer) The maximum length of the fields that matched the category. - The value is increased by 10% to enable matching for similar fields that have - not been analyzed. - -`regex`:: - (string) A regular expression that is used to search for values that match the - category. - -`terms`:: - (string) A space separated list of the common tokens that are matched in - values of the category. - -[float] -[[ml-results-overall-buckets]] -==== Overall Buckets - -Overall buckets provide a summary of bucket results over multiple jobs. -Their `bucket_span` equals the longest `bucket_span` of the jobs in question. -The `overall_score` is the `top_n` average of the max `anomaly_score` per job -within the overall bucket time interval. -This means that you can fine-tune the `overall_score` so that it is more -or less sensitive to the number of jobs that detect an anomaly at the same time. - -An overall bucket resource has the following properties: - -`timestamp`:: - (date) The start time of the overall bucket. - -`bucket_span`:: - (number) The length of the bucket in seconds. Matches the `bucket_span` - of the job with the longest one. - -`overall_score`:: - (number) The `top_n` average of the max bucket `anomaly_score` per job. 
- -`jobs`:: - (array) An array of objects that contain the `max_anomaly_score` per `job_id`. - -`is_interim`:: - (boolean) If true, this is an interim result. In other words, the anomaly - record is calculated based on partial input data. - -`result_type`:: - (string) Internal. This is always set to `overall_bucket`. diff --git a/docs/reference/ml/ml-shared.asciidoc b/docs/reference/ml/ml-shared.asciidoc index 5b292f24ef515..ca5b87aca8083 100644 --- a/docs/reference/ml/ml-shared.asciidoc +++ b/docs/reference/ml/ml-shared.asciidoc @@ -185,6 +185,11 @@ The size of the interval that the analysis is aggregated into, typically between see <>. end::bucket-span[] +tag::bucket-span-results[] +The length of the bucket in seconds. This value matches the `bucket_span` +that is specified in the job. +end::bucket-span-results[] + tag::by-field-name[] The field used to split the data. In particular, this property is used for analyzing the splits with respect to their own history. It is used for finding @@ -520,6 +525,10 @@ that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. end::dependent-variable[] +tag::desc-results[] +If true, the results are sorted in descending order. +end::desc-results[] + tag::description-dfa[] A description of the job. end::description-dfa[] @@ -618,6 +627,11 @@ working with both over and by fields, then you can set `exclude_frequent` to `all` for both fields, or to `by` or `over` for those specific fields. end::exclude-frequent[] +tag::exclude-interim-results[] +If `true`, the output excludes interim results. By default, interim results are +included. +end::exclude-interim-results[] + tag::feature-bag-fraction[] Defines the fraction of features that will be used when selecting a random bag for each candidate split. @@ -707,6 +721,11 @@ is available as part of the input data. 
When you use multiple detectors, the use of influencers is recommended as it aggregates results for each influencer entity. end::influencers[] +tag::is-interim[] +If `true`, this is an interim result. In other words, the results are calculated +based on partial input data. +end::is-interim[] + tag::job-id-anomaly-detection[] Identifier for the {anomaly-job}. end::job-id-anomaly-detection[] @@ -1114,6 +1133,10 @@ The time span that each search will be querying. This setting is only applicable when the mode is set to `manual`. For example: `3h`. end::time-span[] +tag::timestamp-results[] +The start time of the bucket for which these results were calculated. +end::timestamp-results[] + tag::tokenizer[] The name or definition of the <> to use after character filters are applied. This property is compulsory if diff --git a/docs/reference/redirects.asciidoc b/docs/reference/redirects.asciidoc index 32a30a326bfde..57c7bf93092a7 100644 --- a/docs/reference/redirects.asciidoc +++ b/docs/reference/redirects.asciidoc @@ -1095,3 +1095,20 @@ See <> and <>. This page was deleted. See <>, <>, <>, <>. + +[role="exclude",id="ml-results-resource"] +=== Results resources + +This page was deleted. +[[ml-results-buckets]] +See <>, +[[ml-results-bucket-influencers]] +<>, +[[ml-results-influencers]] +<>, +[[ml-results-records]] +<>, +[[ml-results-categories]] +<>, and +[[ml-results-overall-buckets]] +<>. 
diff --git a/docs/reference/rest-api/defs.asciidoc b/docs/reference/rest-api/defs.asciidoc index 137af880c0a9a..85ffbae5925fd 100644 --- a/docs/reference/rest-api/defs.asciidoc +++ b/docs/reference/rest-api/defs.asciidoc @@ -7,10 +7,8 @@ These resource definitions are used in APIs related to {ml-features} and * <> -* <> * <> include::{es-repo-dir}/ml/df-analytics/apis/analysisobjects.asciidoc[] include::{xes-repo-dir}/rest-api/security/role-mapping-resources.asciidoc[] -include::{es-repo-dir}/ml/anomaly-detection/apis/resultsresource.asciidoc[] From ab5bfe9f863125ca71ff4541aa62e6f200442887 Mon Sep 17 00:00:00 2001 From: lcawl Date: Tue, 17 Dec 2019 13:13:04 -0800 Subject: [PATCH 4/4] [DOCS] Fixes formatting issues --- .../apis/get-bucket.asciidoc | 27 +++++++++---------- .../apis/get-category.asciidoc | 4 +-- .../apis/get-overall-buckets.asciidoc | 2 ++ .../apis/get-record.asciidoc | 4 +-- 4 files changed, 19 insertions(+), 18 deletions(-) diff --git a/docs/reference/ml/anomaly-detection/apis/get-bucket.asciidoc b/docs/reference/ml/anomaly-detection/apis/get-bucket.asciidoc index c1439ebfea95f..f06632bcbd54f 100644 --- a/docs/reference/ml/anomaly-detection/apis/get-bucket.asciidoc +++ b/docs/reference/ml/anomaly-detection/apis/get-bucket.asciidoc @@ -64,9 +64,9 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=exclude-interim-results] (Optional, boolean) If true, the output includes anomaly records. `page`:: -`from`::: +`page`.`from`::: (Optional, integer) Skips the specified number of buckets. -`size`::: +`page`.`size`::: (Optional, integer) Specifies the maximum number of buckets to obtain. `sort`:: @@ -81,7 +81,6 @@ default, the buckets are sorted by the `timestamp` field. The API returns an array of bucket objects, which have the following properties: - `anomaly_score`:: (number) The maximum anomaly score, between 0-100, for any of the bucket influencers. This is an overall, rate-limited score for the job. All the anomaly @@ -92,45 +91,45 @@ new data is analyzed. 
(array) An array of bucket influencer objects, which have the following properties: -`bucket_influencers`.`anomaly_score`:: +`bucket_influencers`.`anomaly_score`::: (number) A normalized score between 0-100, which is calculated for each bucket influencer. This score might be updated as newer data is analyzed. -`bucket_influencers`.`bucket_span`:: +`bucket_influencers`.`bucket_span`::: (number) The length of the bucket in seconds. This value matches the `bucket_span` that is specified in the job. -`bucket_influencers`.`initial_anomaly_score`:: +`bucket_influencers`.`initial_anomaly_score`::: (number) The score between 0-100 for each bucket influencer. This score is the initial value that was calculated at the time the bucket was processed. -`bucket_influencers`.`influencer_field_name`:: +`bucket_influencers`.`influencer_field_name`::: (string) The field name of the influencer. -`bucket_influencers`.`influencer_field_value`:: +`bucket_influencers`.`influencer_field_value`::: (string) The field value of the influencer. -`bucket_influencers`.`is_interim`:: +`bucket_influencers`.`is_interim`::: (boolean) include::{docdir}/ml/ml-shared.asciidoc[tag=is-interim] -`bucket_influencers`.`job_id`:: +`bucket_influencers`.`job_id`::: (string) include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection] -`bucket_influencers`.`probability`:: +`bucket_influencers`.`probability`::: (number) The probability that the bucket has this behavior, in the range 0 to 1. This value can be held to a high precision of over 300 decimal places, so the `anomaly_score` is provided as a human-readable and friendly interpretation of this. -`bucket_influencers`.`raw_anomaly_score`:: +`bucket_influencers`.`raw_anomaly_score`::: (number) Internal. -`bucket_influencers`.`result_type`:: +`bucket_influencers`.`result_type`::: (string) Internal. This value is always set to `bucket_influencer`. 
-`bucket_influencers`.`timestamp`:: +`bucket_influencers`.`timestamp`::: (date) The start time of the bucket for which these results were calculated. `bucket_span`:: diff --git a/docs/reference/ml/anomaly-detection/apis/get-category.asciidoc b/docs/reference/ml/anomaly-detection/apis/get-category.asciidoc index 441a20f99793f..92beaa0360b5e 100644 --- a/docs/reference/ml/anomaly-detection/apis/get-category.asciidoc +++ b/docs/reference/ml/anomaly-detection/apis/get-category.asciidoc @@ -54,9 +54,9 @@ parameter, the API returns information about all categories in the {anomaly-job} ==== {api-request-body-title} `page`:: -`from`::: +`page`.`from`::: (Optional, integer) Skips the specified number of categories. -`size`::: +`page`.`size`::: (Optional, integer) Specifies the maximum number of categories to obtain. [[ml-get-category-results]] diff --git a/docs/reference/ml/anomaly-detection/apis/get-overall-buckets.asciidoc b/docs/reference/ml/anomaly-detection/apis/get-overall-buckets.asciidoc index 256336f0d85bc..3c0df917e4b3c 100644 --- a/docs/reference/ml/anomaly-detection/apis/get-overall-buckets.asciidoc +++ b/docs/reference/ml/anomaly-detection/apis/get-overall-buckets.asciidoc @@ -54,6 +54,8 @@ a span equal to the jobs' largest bucket span. 
[[ml-get-overall-buckets-path-parms]] ==== {api-path-parms-title} +``:: +(Required, string) include::{docdir}/ml/ml-shared.asciidoc[tag=job-id-anomaly-detection-wildcard-list] [[ml-get-overall-buckets-request-body]] diff --git a/docs/reference/ml/anomaly-detection/apis/get-record.asciidoc b/docs/reference/ml/anomaly-detection/apis/get-record.asciidoc index 33b8804078b80..7e56fc757cb79 100644 --- a/docs/reference/ml/anomaly-detection/apis/get-record.asciidoc +++ b/docs/reference/ml/anomaly-detection/apis/get-record.asciidoc @@ -60,9 +60,9 @@ include::{docdir}/ml/ml-shared.asciidoc[tag=desc-results] include::{docdir}/ml/ml-shared.asciidoc[tag=exclude-interim-results] `page`:: -`from`::: +`page`.`from`::: (Optional, integer) Skips the specified number of records. -`size`::: +`page`.`size`::: (Optional, integer) Specifies the maximum number of records to obtain. `record_score`::