[ML] Documentation for Data Frame Analytics high-level REST client #42288
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| -- | ||
| :api: delete-data-frame-analytics | ||
| :request: DeleteDataFrameAnalyticsRequest | ||
| :response: AcknowledgedResponse | ||
| -- | ||
| [id="{upid}-{api}"] | ||
| === Delete Data Frame Analytics API | ||
|
|
||
| The Delete Data Frame Analytics API is used to delete an existing {dataframe-analytics-config}. | ||
| The API accepts a +{request}+ object and returns a +{response}+. | ||
|
|
||
| [id="{upid}-{api}-request"] | ||
| ==== Delete Data Frame Analytics Request | ||
|
|
||
| A +{request}+ object requires a {dataframe-analytics-config} id. | ||
|
|
||
| ["source","java",subs="attributes,callouts,macros"] | ||
| --------------------------------------------------- | ||
| include-tagged::{doc-tests-file}[{api}-request] | ||
| --------------------------------------------------- | ||
| <1> Constructing a new request referencing an existing {dataframe-analytics-config} | ||
|
|
||
| include::../execution.asciidoc[] | ||
|
|
||
| [id="{upid}-{api}-response"] | ||
| ==== Response | ||
|
|
||
| The returned +{response}+ object acknowledges the {dataframe-analytics-config} deletion. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,45 @@ | ||
| -- | ||
| :api: evaluate-data-frame | ||
| :request: EvaluateDataFrameRequest | ||
| :response: EvaluateDataFrameResponse | ||
| -- | ||
| [id="{upid}-{api}"] | ||
| === Evaluate Data Frame API | ||
|
|
||
| The Evaluate Data Frame API is used to evaluate an ML algorithm that ran on a {dataframe}. | ||
| The API accepts an +{request}+ object and returns an +{response}+. | ||
|
|
||
| [id="{upid}-{api}-request"] | ||
| ==== Evaluate Data Frame Request | ||
|
|
||
| ["source","java",subs="attributes,callouts,macros"] | ||
| -------------------------------------------------- | ||
| include-tagged::{doc-tests-file}[{api}-request] | ||
| -------------------------------------------------- | ||
| <1> Constructing a new evaluation request | ||
| <2> Reference to an existing index | ||
| <3> Kind of evaluation to perform | ||
| <4> Name of the field in the index. Its value denotes the actual (i.e. ground truth) label for an example. Must be either `true` or `false`. | ||
| <5> Name of the field in the index. Its value denotes the probability (as per some ML algorithm) of the example being classified as positive | ||
| <6> The remaining parameters are the metrics to be calculated based on the two fields described above. | ||
| <7> https://en.wikipedia.org/wiki/Precision_and_recall[Precision] calculated at thresholds: 0.4, 0.5 and 0.6 | ||
| <8> https://en.wikipedia.org/wiki/Precision_and_recall[Recall] calculated at thresholds: 0.5 and 0.7 | ||
| <9> https://en.wikipedia.org/wiki/Confusion_matrix[Confusion matrix] calculated at threshold 0.5 | ||
| <10> https://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_the_curve[AUC ROC] calculated and the curve points returned | ||
|
|
||
| include::../execution.asciidoc[] | ||
|
|
||
| [id="{upid}-{api}-response"] | ||
| ==== Response | ||
|
|
||
| The returned +{response}+ contains the requested evaluation metrics. | ||
|
|
||
| ["source","java",subs="attributes,callouts,macros"] | ||
| -------------------------------------------------- | ||
| include-tagged::{doc-tests-file}[{api}-response] | ||
| -------------------------------------------------- | ||
| <1> Fetching all the calculated metrics results | ||
| <2> Fetching precision metric by name | ||
| <3> Fetching precision at a given (0.4) threshold | ||
| <4> Fetching confusion matrix metric by name | ||
| <5> Fetching confusion matrix at a given (0.5) threshold |
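The threshold-based metrics named in the Evaluate Data Frame callouts (precision, recall and confusion matrix at a given threshold) can be illustrated with a small standalone sketch. It does not use the Elasticsearch client: the class `BinaryEvaluationSketch` and its methods are hypothetical names introduced only for illustration. Two conventions are assumptions of this sketch rather than documented behavior of the API: a probability greater than or equal to the threshold counts as a positive prediction, and a metric with a zero denominator is reported as 0.0.

```java
/**
 * Illustrative sketch only; this is NOT part of the Elasticsearch client.
 * It shows what precision, recall and a confusion matrix "at a threshold"
 * mean for a boolean ground-truth field and a predicted-probability field.
 */
final class BinaryEvaluationSketch {

    /** Counts of true/false positives and negatives at one threshold. */
    static final class ConfusionMatrix {
        final long tp;
        final long fp;
        final long tn;
        final long fn;

        ConfusionMatrix(long tp, long fp, long tn, long fn) {
            this.tp = tp;
            this.fp = fp;
            this.tn = tn;
            this.fn = fn;
        }

        /** Fraction of predicted positives that are actual positives. */
        double precision() {
            // Assumption: report 0.0 when nothing was predicted positive.
            return tp + fp == 0 ? 0.0 : (double) tp / (tp + fp);
        }

        /** Fraction of actual positives that were predicted positive. */
        double recall() {
            // Assumption: report 0.0 when there are no actual positives.
            return tp + fn == 0 ? 0.0 : (double) tp / (tp + fn);
        }
    }

    /** Builds the confusion matrix for the given threshold. */
    static ConfusionMatrix at(boolean[] actual, double[] probability, double threshold) {
        if (actual.length != probability.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        long tp = 0, fp = 0, tn = 0, fn = 0;
        for (int i = 0; i < actual.length; i++) {
            // Assumption: probability >= threshold is treated as a positive prediction.
            boolean predictedPositive = probability[i] >= threshold;
            if (predictedPositive && actual[i]) {
                tp++;
            } else if (predictedPositive) {
                fp++;
            } else if (actual[i]) {
                fn++;
            } else {
                tn++;
            }
        }
        return new ConfusionMatrix(tp, fp, tn, fn);
    }

    public static void main(String[] args) {
        boolean[] actual = {true, true, false, false, true};
        double[] probability = {0.9, 0.45, 0.55, 0.1, 0.6};
        for (double threshold : new double[] {0.4, 0.5, 0.7}) {
            ConfusionMatrix cm = at(actual, probability, threshold);
            System.out.println("threshold=" + threshold
                + " tp=" + cm.tp + " fp=" + cm.fp + " tn=" + cm.tn + " fn=" + cm.fn
                + " precision=" + cm.precision() + " recall=" + cm.recall());
        }
    }
}
```

Raising the threshold generally trades recall for precision, which is presumably why the request lets each metric be calculated at several thresholds.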
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,34 @@ | ||
| -- | ||
| :api: get-data-frame-analytics-stats | ||
| :request: GetDataFrameAnalyticsStatsRequest | ||
| :response: GetDataFrameAnalyticsStatsResponse | ||
| -- | ||
| [id="{upid}-{api}"] | ||
| === Get Data Frame Analytics Stats API | ||
|
|
||
| The Get Data Frame Analytics Stats API is used to read the operational statistics of one or more {dataframe-analytics-config}s. | ||
| The API accepts a +{request}+ object and returns a +{response}+. | ||
|
|
||
| [id="{upid}-{api}-request"] | ||
| ==== Get Data Frame Analytics Stats Request | ||
|
|
||
| A +{request}+ requires either a {dataframe-analytics-config} id, a comma-separated list of ids or | ||
| the special wildcard `_all` to get the statistics for all {dataframe-analytics-config}s. | ||
|
|
||
| ["source","java",subs="attributes,callouts,macros"] | ||
| -------------------------------------------------- | ||
| include-tagged::{doc-tests-file}[{api}-request] | ||
| -------------------------------------------------- | ||
| <1> Constructing a new GET Stats request referencing an existing {dataframe-analytics-config} | ||
|
|
||
| include::../execution.asciidoc[] | ||
|
|
||
| [id="{upid}-{api}-response"] | ||
| ==== Response | ||
|
|
||
| The returned +{response}+ contains the requested {dataframe-analytics-config} statistics. | ||
|
|
||
| ["source","java",subs="attributes,callouts,macros"] | ||
| -------------------------------------------------- | ||
| include-tagged::{doc-tests-file}[{api}-response] | ||
| -------------------------------------------------- |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,34 @@ | ||
| -- | ||
| :api: get-data-frame-analytics | ||
| :request: GetDataFrameAnalyticsRequest | ||
| :response: GetDataFrameAnalyticsResponse | ||
| -- | ||
| [id="{upid}-{api}"] | ||
| === Get Data Frame Analytics API | ||
|
|
||
| The Get Data Frame Analytics API is used to get one or more {dataframe-analytics-config}s. | ||
| The API accepts a +{request}+ object and returns a +{response}+. | ||
|
|
||
| [id="{upid}-{api}-request"] | ||
| ==== Get Data Frame Analytics Request | ||
|
|
||
| A +{request}+ requires either a {dataframe-analytics-config} id, a comma-separated list of ids or | ||
| the special wildcard `_all` to get all {dataframe-analytics-config}s. | ||
|
|
||
| ["source","java",subs="attributes,callouts,macros"] | ||
| -------------------------------------------------- | ||
| include-tagged::{doc-tests-file}[{api}-request] | ||
| -------------------------------------------------- | ||
| <1> Constructing a new GET request referencing an existing {dataframe-analytics-config} | ||
|
|
||
| include::../execution.asciidoc[] | ||
|
|
||
| [id="{upid}-{api}-response"] | ||
| ==== Response | ||
|
|
||
| The returned +{response}+ contains the requested {dataframe-analytics-config}s. | ||
|
|
||
| ["source","java",subs="attributes,callouts,macros"] | ||
| -------------------------------------------------- | ||
| include-tagged::{doc-tests-file}[{api}-response] | ||
| -------------------------------------------------- |
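The three accepted forms of the id argument (a single id, a comma-separated list of ids, or the `_all` wildcard) can all be reduced to one string expression. The following is a minimal sketch of that idea, not the client's implementation: `IdExpressionSketch` and `idExpression` are hypothetical names, and mapping an empty id list to `_all` is an assumption made here for illustration.

```java
/**
 * Illustrative sketch only; this is NOT part of the Elasticsearch client.
 * It collapses zero or more config ids into a single id expression.
 */
final class IdExpressionSketch {

    /** The special wildcard mentioned in the documentation above. */
    static final String ALL = "_all";

    static String idExpression(String... ids) {
        // Assumption: no ids means "everything", expressed as the _all wildcard.
        if (ids == null || ids.length == 0) {
            return ALL;
        }
        for (String id : ids) {
            if (id == null || id.trim().isEmpty()) {
                throw new IllegalArgumentException("ids must not be null or blank");
            }
        }
        // One id yields the id itself; several ids yield a comma-separated list.
        return String.join(",", ids);
    }

    public static void main(String[] args) {
        System.out.println(idExpression());
        System.out.println(idExpression("my-analytics-config"));
        System.out.println(idExpression("config-1", "config-2"));
    }
}
```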
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,115 @@ | ||
| -- | ||
| :api: put-data-frame-analytics | ||
| :request: PutDataFrameAnalyticsRequest | ||
| :response: PutDataFrameAnalyticsResponse | ||
| -- | ||
| [id="{upid}-{api}"] | ||
| === Put Data Frame Analytics API | ||
|
|
||
| The Put Data Frame Analytics API is used to create a new {dataframe-analytics-config}. | ||
| The API accepts a +{request}+ object and returns a +{response}+. | ||
|
|
||
| [id="{upid}-{api}-request"] | ||
| ==== Put Data Frame Analytics Request | ||
|
|
||
| A +{request}+ requires the following argument: | ||
|
|
||
| ["source","java",subs="attributes,callouts,macros"] | ||
| -------------------------------------------------- | ||
| include-tagged::{doc-tests-file}[{api}-request] | ||
| -------------------------------------------------- | ||
| <1> The configuration of the {dataframe-job} to create | ||
|
|
||
| [id="{upid}-{api}-config"] | ||
| ==== Data Frame Analytics Configuration | ||
|
|
||
| The `DataFrameAnalyticsConfig` object contains all the details about the {dataframe-job} | ||
| configuration and contains the following arguments: | ||
|
|
||
| ["source","java",subs="attributes,callouts,macros"] | ||
| -------------------------------------------------- | ||
| include-tagged::{doc-tests-file}[{api}-config] | ||
| -------------------------------------------------- | ||
| <1> The {dataframe-analytics-config} id | ||
| <2> The source index and query from which to gather data | ||
| <3> The destination index | ||
| <4> The analysis to be performed | ||
| <5> The fields to be included in / excluded from the analysis | ||
| <6> The memory limit for the model created as part of the analysis process | ||
|
|
||
| [id="{upid}-{api}-query-config"] | ||
|
|
||
| ==== SourceConfig | ||
|
|
||
| The index and the query from which to collect data. | ||
|
|
||
| ["source","java",subs="attributes,callouts,macros"] | ||
| -------------------------------------------------- | ||
| include-tagged::{doc-tests-file}[{api}-source-config] | ||
| -------------------------------------------------- | ||
| <1> Constructing a new DataFrameAnalyticsSource | ||
| <2> The source index | ||
| <3> The query from which to gather the data. If query is not set, a `match_all` query is used by default. | ||
|
|
||
| ===== QueryConfig | ||
|
|
||
| The query with which to select data from the source. | ||
|
|
||
| ["source","java",subs="attributes,callouts,macros"] | ||
| -------------------------------------------------- | ||
| include-tagged::{doc-tests-file}[{api}-query-config] | ||
| -------------------------------------------------- | ||
|
|
||
| ==== DestinationConfig | ||
|
|
||
| The index to which data should be written by the {dataframe-job}. | ||
|
|
||
| ["source","java",subs="attributes,callouts,macros"] | ||
| -------------------------------------------------- | ||
| include-tagged::{doc-tests-file}[{api}-dest-config] | ||
| -------------------------------------------------- | ||
| <1> Constructing a new DataFrameAnalyticsDest | ||
| <2> The destination index | ||
|
|
||
| ==== Analysis | ||
|
|
||
| The analysis to be performed. | ||
| Currently, only one analysis is supported: +OutlierDetection+. | ||
|
|
||
| +OutlierDetection+ analysis can be created in one of two ways: | ||
|
|
||
| ["source","java",subs="attributes,callouts,macros"] | ||
| -------------------------------------------------- | ||
| include-tagged::{doc-tests-file}[{api}-analysis-default] | ||
| -------------------------------------------------- | ||
| <1> Constructing a new OutlierDetection object with default strategy to determine outliers | ||
|
|
||
| or | ||
| ["source","java",subs="attributes,callouts,macros"] | ||
| -------------------------------------------------- | ||
| include-tagged::{doc-tests-file}[{api}-analysis-customized] | ||
| -------------------------------------------------- | ||
| <1> Constructing a new OutlierDetection object | ||
| <2> The method used to perform the analysis | ||
| <3> Number of neighbors taken into account during analysis | ||
|
|
||
| ==== Analyzed fields | ||
|
|
||
| A `FetchSourceContext` object containing the fields to be included in / excluded from the analysis. | ||
|
|
||
| ["source","java",subs="attributes,callouts,macros"] | ||
| -------------------------------------------------- | ||
| include-tagged::{doc-tests-file}[{api}-analyzed-fields] | ||
| -------------------------------------------------- | ||
|
|
||
|
||
| include::../execution.asciidoc[] | ||
|
|
||
| [id="{upid}-{api}-response"] | ||
| ==== Response | ||
|
|
||
| The returned +{response}+ contains the newly created {dataframe-analytics-config}. | ||
|
|
||
| ["source","java",subs="attributes,callouts,macros"] | ||
| -------------------------------------------------- | ||
| include-tagged::{doc-tests-file}[{api}-response] | ||
| -------------------------------------------------- | ||
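The Analysis section of the Put Data Frame Analytics documentation mentions a method used to perform the analysis and a number of neighbors taken into account. To give a feel for what a neighbor-based outlier score can look like, here is a minimal standalone sketch that scores each point by its Euclidean distance to its k-th nearest neighbor. It is an illustration under stated assumptions, not the algorithm Elasticsearch runs: the class `KthNeighborOutlierSketch`, its methods, and the brute-force approach are all introduced here for illustration.

```java
import java.util.Arrays;

/**
 * Illustrative sketch only; this is NOT Elasticsearch's outlier detection.
 * Each point is scored by the distance to its k-th nearest neighbor:
 * the larger the distance, the more isolated (outlying) the point.
 */
final class KthNeighborOutlierSketch {

    /** Returns one score per point: the distance to its k-th nearest neighbor. */
    static double[] distanceToKthNeighbor(double[][] points, int k) {
        int n = points.length;
        if (k < 1 || k >= n) {
            throw new IllegalArgumentException("k must be in [1, number of points - 1]");
        }
        double[] scores = new double[n];
        for (int i = 0; i < n; i++) {
            // Brute force: collect distances from point i to every other point.
            double[] distances = new double[n - 1];
            int next = 0;
            for (int j = 0; j < n; j++) {
                if (j != i) {
                    distances[next++] = euclidean(points[i], points[j]);
                }
            }
            Arrays.sort(distances);
            scores[i] = distances[k - 1];
        }
        return scores;
    }

    /** Plain Euclidean distance between two points of equal dimension. */
    static double euclidean(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("points must have the same dimension");
        }
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        // Four points on a unit square plus one point far away from them.
        double[][] points = {{0, 0}, {0, 1}, {1, 0}, {1, 1}, {10, 10}};
        System.out.println(Arrays.toString(distanceToKthNeighbor(points, 2)));
    }
}
```

In this toy data set the far-away point receives a much larger score than the four clustered points, which is the behavior a neighbor count parameter is meant to tune: a larger k makes the score less sensitive to small, tight groups of outliers.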
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| -- | ||
| :api: start-data-frame-analytics | ||
| :request: StartDataFrameAnalyticsRequest | ||
| :response: AcknowledgedResponse | ||
| -- | ||
| [id="{upid}-{api}"] | ||
| === Start Data Frame Analytics API | ||
|
|
||
| The Start Data Frame Analytics API is used to start an existing {dataframe-analytics-config}. | ||
| It accepts a +{request}+ object and responds with a +{response}+ object. | ||
|
|
||
| [id="{upid}-{api}-request"] | ||
| ==== Start Data Frame Analytics Request | ||
|
|
||
| A +{request}+ object requires a {dataframe-analytics-config} id. | ||
|
|
||
| ["source","java",subs="attributes,callouts,macros"] | ||
| --------------------------------------------------- | ||
| include-tagged::{doc-tests-file}[{api}-request] | ||
| --------------------------------------------------- | ||
| <1> Constructing a new start request referencing an existing {dataframe-analytics-config} | ||
|
|
||
| include::../execution.asciidoc[] | ||
|
|
||
| [id="{upid}-{api}-response"] | ||
| ==== Response | ||
|
|
||
| The returned +{response}+ object acknowledges that the {dataframe-job} has started. | ||
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| -- | ||
| :api: stop-data-frame-analytics | ||
| :request: StopDataFrameAnalyticsRequest | ||
| :response: StopDataFrameAnalyticsResponse | ||
| -- | ||
| [id="{upid}-{api}"] | ||
| === Stop Data Frame Analytics API | ||
|
|
||
| The Stop Data Frame Analytics API is used to stop a running {dataframe-analytics-config}. | ||
| It accepts a +{request}+ object and responds with a +{response}+ object. | ||
|
|
||
| [id="{upid}-{api}-request"] | ||
| ==== Stop Data Frame Analytics Request | ||
|
|
||
| A +{request}+ object requires a {dataframe-analytics-config} id. | ||
|
|
||
| ["source","java",subs="attributes,callouts,macros"] | ||
| --------------------------------------------------- | ||
| include-tagged::{doc-tests-file}[{api}-request] | ||
| --------------------------------------------------- | ||
| <1> Constructing a new stop request referencing an existing {dataframe-analytics-config} | ||
|
|
||
| include::../execution.asciidoc[] | ||
|
|
||
| [id="{upid}-{api}-response"] | ||
| ==== Response | ||
|
|
||
| The returned +{response}+ object acknowledges that the {dataframe-job} has stopped. |
Please see my other comment.