At present the ML UI has functionality to calculate a rough estimate of the model memory requirement for certain types of anomaly detection jobs. However, it doesn't cover all detector functions and doesn't cover population jobs.
The ML API in Elasticsearch should provide an endpoint that encapsulates the various formulas, can be extended to cover all possible configurations, and can be kept up to date when model sizes change.
The inputs to this endpoint will be:
- An `analysis_config`, in the same format as would be provided to the create job endpoint - documented in https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-put-job.html#ml-put-job-path-parms
- Overall cardinalities for the `by`, `over` and `partition` fields
- Max bucket cardinalities for `influencer` fields that are not also `by`, `over` or `partition` fields
An example of the proposed request format is:
```
POST _ml/anomaly_detectors/_estimate_model_memory
{
  "analysis_config": {
    "bucket_span": "10m",
    "detectors": [
      {
        "function": "sum",
        "field_name": "bytes",
        "partition_field_name": "src_ip"
      }
    ],
    "influencers": [ "src_ip", "dest_ip" ]
  },
  "overall_cardinality": {
    "src_ip": 567483
  },
  "max_bucket_cardinality": {
    "dest_ip": 7456
  }
}
```
An example of the proposed response format is:
```
{
  "model_memory_estimate": "836mb"
}
```