Commit 0bd5f07

committed
[DOCS] Fix classification score details in example (#1368)
1 parent e1fb07d commit 0bd5f07

1 file changed, +14 -17 lines changed

docs/en/stack/ml/df-analytics/flightdata-classification.asciidoc

Lines changed: 14 additions & 17 deletions
@@ -316,14 +316,14 @@ or testing data set. You can filter the table and the confusion matrix such that
 they contain only testing or training data. You can also enable histogram charts
 to get a better understanding of the distribution of values in your data.
 
-If you examine this destination index more closely in the *Discover* app in
-{kib} or use the standard {es} search command, you can see that the analysis
-predicts the probability of all possible classes for the dependent variable (in
-a `top_classes` object). In this case, there are two classes: `true` and
-`false`. The most probable class is the prediction, which is what's shown in the
-{classification} results table. If you want to understand how sure the model is
-about the prediction, however, you might want to examine the class probability
-values. A higher number means that the model is more confident.
+If you want to understand how certain the model is about each prediction, you
+can examine its probability and score (`ml.prediction_probability` and
+`ml.prediction_score`). The higher these values are, the more confident the
+model is that the data point belongs to the named class. If you examine the
+destination index more closely in the *Discover* app in {kib} or use the
+standard {es} search command, you can see that the analysis predicts the
+probability of all possible classes for the dependent variable. The
+`top_classes` object contains the predicted classes with the highest scores.
 
 .API example
 [%collapsible]
@@ -334,7 +334,6 @@ GET df-flight-delayed/_search
 --------------------------------------------------
 // TEST[skip:TBD]
 
-
 The snippet below shows a part of a document with the annotated results:
 
 [source,console-result]
@@ -372,14 +371,12 @@ The snippet below shows a part of a document with the annotated results:
 }
 ----
 <1> An array of values specifying the probability of the prediction and the
-`class_score` for each class.
-
-The `top_classes` object contains the predicted classes with the highest
-scores. The `class_probability` is a value between 0 and 1. The higher the
-number, the more confident the model is that the data point belongs to the named
-class. In the example above, `false` has a `class_probability` of 0.91 while
-`true` has only 0.08, so the prediction will be `false`. The `class_score` is a
-function of the probability.
+score for each class.
+
+The class with the highest score is the prediction. In this example, `false` has
+a `class_score` of 0.37 while `true` has only 0.08, so the prediction will be
+`false`. For more details about these values, see
+<<dfa-classification-interpret>>.
 
 ////
 It is chosen so that the decision to assign the
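
Note on the changed text: the rule it states (the class with the highest score is the prediction) can be illustrated with a minimal Python sketch. This is not code from the commit or from {es}. The helper name `predicted_class` is hypothetical, the `class_name` key is an assumption about how each `top_classes` entry names its class, and the sample values only echo the numbers quoted in the diff (0.91, 0.37, 0.08); how they pair up per class here is illustrative.

```python
from typing import Any, Dict, List


def predicted_class(top_classes: List[Dict[str, Any]]) -> Any:
    """Return the class name of the entry with the highest class_score.

    Hypothetical helper: it mirrors the rule quoted in the diff, not any
    Elasticsearch client API.
    """
    if not top_classes:
        raise ValueError("top_classes is empty")
    best = max(top_classes, key=lambda entry: entry["class_score"])
    # "class_name" is an assumed key for the class label of each entry.
    return best["class_name"]


# Illustrative entries; the numbers echo those quoted in the diff text.
example_top_classes = [
    {"class_name": "false", "class_probability": 0.91, "class_score": 0.37},
    {"class_name": "true", "class_probability": 0.08, "class_score": 0.08},
]

print(predicted_class(example_top_classes))
```

The sketch compares `class_score` rather than `class_probability` because the revised wording ties the prediction to the score; with these sample values both orderings agree.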
