@@ -7,18 +7,18 @@ Entities or events in your data can be considered anomalous when:
 * Their behavior changes over time, relative to their own previous behavior, or
 * Their behavior is different than other entities in a specified population.
 
-The latter method of detecting outliers is known as _population analysis_. The
-{ml} analytics build a profile of what a "typical" user, machine, or other entity
-does over a specified time period and then identify when one is behaving
+The latter method of detecting anomalies is known as _population analysis_. The
+{ml} analytics build a profile of what a "typical" user, machine, or other
+entity does over a specified time period and then identify when one is behaving
 abnormally compared to the population.
 
 This type of analysis is most useful when the behavior of the population as a
-whole is mostly homogeneous and you want to identify outliers. In general,
-population analysis is not useful when members of the population inherently
-have vastly different behavior. You can, however, segment your data into groups
-that behave similarly and run these as separate jobs. For example, you can use a
-query filter in the {dfeed} to segment your data or you can use the
-`partition_field_name` to split the analysis for the different groups.
+whole is mostly homogeneous and you want to identify unusual behavior. In
+general, population analysis is not useful when members of the population
+inherently have vastly different behavior. You can, however, segment your data
+into groups that behave similarly and run these as separate jobs. For example,
+you can use a query filter in the {dfeed} to segment your data or you can use
+the `partition_field_name` to split the analysis for the different groups.
 
 Population analysis scales well and has a lower resource footprint than
 individual analysis of each series. For example, you can analyze populations
@@ -52,8 +52,8 @@ PUT _ml/anomaly_detectors/population
 ----------------------------------
 // TEST[skip:needs-licence]
 
-<1> This `over_field_name` property indicates that the metrics for each client (
- as identified by their IP address) are analyzed relative to other clients
+<1> This `over_field_name` property indicates that the metrics for each client
+ (as identified by their IP address) are analyzed relative to other clients
  in each bucket.
 
 If your data is stored in {es}, you can use the population job wizard in {kib}
@@ -73,8 +73,8 @@ image::images/ml-population-results.png["Population analysis results in the Anom
7373
 
 As in this case, the results are often quite sparse. There might be just a few
 data points for the selected time period. Population analysis is particularly
-useful when you have many entities and the data for specific entities is sporadic
-or sparse.
+useful when you have many entities and the data for specific entities is
+sporadic or sparse.
 
 If you click on a section in the timeline or swim lanes, you can see more
 details about the anomalies: