Commit 2e820fc

[DOCS] Clarifies terminology in Performing population analysis page. (#74237)
1 parent 31e1b1a commit 2e820fc

File tree

1 file changed (+13, -13 lines)

docs/reference/ml/anomaly-detection/ml-configuring-populations.asciidoc

Lines changed: 13 additions & 13 deletions
@@ -7,18 +7,18 @@ Entities or events in your data can be considered anomalous when:
 * Their behavior changes over time, relative to their own previous behavior, or
 * Their behavior is different than other entities in a specified population.
 
-The latter method of detecting outliers is known as _population analysis_. The
-{ml} analytics build a profile of what a "typical" user, machine, or other entity
-does over a specified time period and then identify when one is behaving
+The latter method of detecting anomalies is known as _population analysis_. The
+{ml} analytics build a profile of what a "typical" user, machine, or other
+entity does over a specified time period and then identify when one is behaving
 abnormally compared to the population.
 
 This type of analysis is most useful when the behavior of the population as a
-whole is mostly homogeneous and you want to identify outliers. In general,
-population analysis is not useful when members of the population inherently
-have vastly different behavior. You can, however, segment your data into groups
-that behave similarly and run these as separate jobs. For example, you can use a
-query filter in the {dfeed} to segment your data or you can use the
-`partition_field_name` to split the analysis for the different groups.
+whole is mostly homogeneous and you want to identify unusual behavior. In
+general, population analysis is not useful when members of the population
+inherently have vastly different behavior. You can, however, segment your data
+into groups that behave similarly and run these as separate jobs. For example,
+you can use a query filter in the {dfeed} to segment your data or you can use
+the `partition_field_name` to split the analysis for the different groups.
 
 Population analysis scales well and has a lower resource footprint than
 individual analysis of each series. For example, you can analyze populations
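The segmentation described above (splitting a population job with `partition_field_name` so that heterogeneous groups are analyzed separately) can be sketched as a job configuration. A minimal Python illustration follows; the job body shape mirrors the anomaly detection job API, but the field names, bucket span, and values are invented for illustration, not taken from the commit:

```python
import json

# Hypothetical population job configuration. "bytes_sent", "client_ip",
# and "department" are invented example fields.
job = {
    "analysis_config": {
        "bucket_span": "15m",
        "detectors": [
            {
                # Analyze each client relative to the population of clients...
                "function": "mean",
                "field_name": "bytes_sent",
                "over_field_name": "client_ip",
                # ...but split the analysis per group that behaves similarly,
                # so groups with inherently different behavior do not skew
                # each other's population baseline.
                "partition_field_name": "department",
            }
        ],
    },
    "data_description": {"time_field": "@timestamp"},
}

print(json.dumps(job["analysis_config"]["detectors"][0], indent=2))
```

The alternative mentioned in the text, a query filter in the {dfeed}, achieves a similar split by running one job per filtered subset instead of one partitioned job.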
@@ -52,8 +52,8 @@ PUT _ml/anomaly_detectors/population
 ----------------------------------
 // TEST[skip:needs-licence]
 
-<1> This `over_field_name` property indicates that the metrics for each client (
-as identified by their IP address) are analyzed relative to other clients
+<1> This `over_field_name` property indicates that the metrics for each client
+(as identified by their IP address) are analyzed relative to other clients
 in each bucket.
 
 If your data is stored in {es}, you can use the population job wizard in {kib}
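The behavior the callout describes (each client compared against all other clients within the same bucket) can be illustrated with a toy calculation. This sketch does not use any Elasticsearch API; the events, the median-based cutoff, and the field names are invented, and the real {ml} analytics build a probabilistic population model rather than a fixed threshold:

```python
from collections import defaultdict
from statistics import median

# Toy events: (bucket, client_ip, bytes). Values are invented for illustration.
events = [
    ("09:00", "10.0.0.1", 100), ("09:00", "10.0.0.2", 110), ("09:00", "10.0.0.3", 10000),
    ("09:15", "10.0.0.1", 120), ("09:15", "10.0.0.2", 95),  ("09:15", "10.0.0.3", 130),
]

# Group the metric per client within each bucket.
buckets = defaultdict(dict)
for bucket, client, value in events:
    buckets[bucket][client] = value

# over_field_name-style comparison: flag a client whose metric is far from
# the population of clients in the same bucket (crude 3x-median cutoff here).
anomalies = []
for bucket, per_client in buckets.items():
    typical = median(per_client.values())
    for client, value in per_client.items():
        if value > 3 * typical:
            anomalies.append((bucket, client))

print(anomalies)  # [('09:00', '10.0.0.3')]
```

Only the 09:00 bucket contains a client far outside its population; the same client's 09:15 value is unremarkable relative to its peers, so it is not flagged.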
@@ -73,8 +73,8 @@ image::images/ml-population-results.png["Population analysis results in the Anom
 
 As in this case, the results are often quite sparse. There might be just a few
 data points for the selected time period. Population analysis is particularly
-useful when you have many entities and the data for specific entitles is sporadic
-or sparse.
+useful when you have many entities and the data for specific entitles is
+sporadic or sparse.
 
 If you click on a section in the timeline or swim lanes, you can see more
 details about the anomalies:
