
Conversation

@benwtrent
Member

This adds the _stats endpoint for inference.

In the future, it might make sense to add cache stats or something to this.

Right now it only provides ingest stats and the total number of pipelines that reference the trained models. I decided against displaying the individual node stats, as that seemed too granular, and that information is already accessible via a _nodes/stats/ingest call.

This _stats endpoint focuses on the ingest pipelines that contain inference processors.

I think future work around whether a model is "loaded" or "cached" might be useful as well.
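
As a rough illustration of what the pipeline count measures, here is a minimal Java sketch (not the implementation in this PR) that counts pipelines whose processor list contains an inference processor referencing a given model ID. The map-based pipeline representation and the "inference"/"model_id" keys are assumptions for the example.

```java
import java.util.List;
import java.util.Map;

// Illustrative only: counts pipelines whose processor list contains an
// inference processor referencing the given model ID. Pipeline
// configurations are represented here as plain maps; the real PR works
// against Elasticsearch's ingest metadata classes.
public final class PipelineCountSketch {

    @SuppressWarnings("unchecked")
    static long countPipelinesReferencingModel(String modelId,
                                               Map<String, Map<String, Object>> pipelineConfigs) {
        return pipelineConfigs.values().stream()
            .filter(config -> {
                List<Map<String, Object>> processors =
                    (List<Map<String, Object>>) config.getOrDefault("processors", List.of());
                return processors.stream().anyMatch(processor -> {
                    Map<String, Object> inference = (Map<String, Object>) processor.get("inference");
                    return inference != null && modelId.equals(inference.get("model_id"));
                });
            })
            .count();
    }
}
```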

@elasticmachine
Collaborator

Pinging @elastic/ml-core (:ml)

Member Author


@jakelandis I added equals and hashCode methods here, as I needed them for testing and for hashing the objects.

Adding them seemed like a no-brainer to me, but I don't want to change this code without somebody who works on ingest giving their blessing.
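
For context, here is a minimal sketch of the equals/hashCode pattern being described, on a hypothetical stats-like value class; the class name and fields are illustrative, not the exact classes touched in this PR.

```java
import java.util.Objects;

// Hypothetical value class illustrating the equals/hashCode pattern:
// compare every field in equals and feed the same fields to Objects.hash.
final class ExamplePipelineStat {
    private final String pipelineId;
    private final long count;
    private final long failedCount;

    ExamplePipelineStat(String pipelineId, long count, long failedCount) {
        this.pipelineId = Objects.requireNonNull(pipelineId);
        this.count = count;
        this.failedCount = failedCount;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        ExamplePipelineStat other = (ExamplePipelineStat) o;
        return count == other.count
            && failedCount == other.failedCount
            && pipelineId.equals(other.pipelineId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(pipelineId, count, failedCount);
    }
}
```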

Contributor


this part of the change LGTM (didn't review the other parts)

@benwtrent force-pushed the feature/ml-inference-add-_stats-endpoint branch from 9fdf346 to 91ff7b5 on October 24, 2019 20:31

@hendrikmuhs left a comment


LGTM, added some minor suggestions

builder.field(PIPELINE_COUNT.getPreferredName(), pipelineCount);
if (pipelineCount > 0) {
    // Ingest stats is a fragment
    builder.value(ingestStats);


I think it's better to call ingestStats.toXContent(builder, params);
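
A sketch of how that suggestion could look inside the toXContent method; the field names and package paths assume the 7.x code layout and are illustrative rather than the exact code in this PR.

```java
import java.io.IOException;

import org.elasticsearch.common.xcontent.ToXContentObject;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.ingest.IngestStats;

// Sketch of the reviewer's suggestion: since IngestStats serializes itself as
// an XContent fragment, delegate to its toXContent method instead of passing
// the whole object to builder.value(...).
class TrainedModelStatsSketch implements ToXContentObject {
    private final String modelId;
    private final IngestStats ingestStats;
    private final int pipelineCount;

    TrainedModelStatsSketch(String modelId, IngestStats ingestStats, int pipelineCount) {
        this.modelId = modelId;
        this.ingestStats = ingestStats;
        this.pipelineCount = pipelineCount;
    }

    @Override
    public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
        builder.startObject();
        builder.field("model_id", modelId);
        builder.field("pipeline_count", pipelineCount);
        if (pipelineCount > 0) {
            // IngestStats is a ToXContentFragment: it writes its own fields
            // into the current object rather than opening a new one.
            ingestStats.toXContent(builder, params);
        }
        builder.endObject();
        return builder;
    }
}
```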

public TrainedModelStats(String modelId, IngestStats ingestStats, int pipelineCount) {
    this.modelId = Objects.requireNonNull(modelId);
    this.ingestStats = ingestStats == null ? EMPTY_INGEST_STATS : ingestStats;
    this.pipelineCount = pipelineCount;


nit: paranoia, as we use VInt which is strictly >0, maybe assert > 0

public TrainedModelStats(StreamInput in) throws IOException {
    modelId = in.readString();
    ingestStats = new IngestStats(in);
    pipelineCount = in.readVInt();


What about swapping pipelineCount and ingestStats and only reading/writing the stats object if pipelineCount > 0?
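
A sketch of that suggestion for the wire format: write pipelineCount before ingestStats and skip the stats object entirely when there are no pipelines. The class shape and the empty-stats placeholder are assumptions for illustration, not the code that was merged.

```java
import java.io.IOException;
import java.util.Collections;

import org.elasticsearch.common.io.stream.StreamInput;
import org.elasticsearch.common.io.stream.StreamOutput;
import org.elasticsearch.common.io.stream.Writeable;
import org.elasticsearch.ingest.IngestStats;

// Sketch: pipelineCount is serialized first, and IngestStats is only put on
// the wire (and read back) when the count is greater than zero.
class TrainedModelStatsWireSketch implements Writeable {

    private static final IngestStats EMPTY_INGEST_STATS = new IngestStats(
        new IngestStats.Stats(0, 0, 0, 0), Collections.emptyList(), Collections.emptyMap());

    private final String modelId;
    private final int pipelineCount;
    private final IngestStats ingestStats;

    TrainedModelStatsWireSketch(StreamInput in) throws IOException {
        modelId = in.readString();
        pipelineCount = in.readVInt();
        // Only deserialize the stats object when something was written for it.
        ingestStats = pipelineCount > 0 ? new IngestStats(in) : EMPTY_INGEST_STATS;
    }

    @Override
    public void writeTo(StreamOutput out) throws IOException {
        out.writeString(modelId);
        out.writeVInt(pipelineCount);
        if (pipelineCount > 0) {
            ingestStats.writeTo(out);
        }
    }
}
```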

Collections.emptyList(),
Collections.emptyMap());

public TrainedModelStats(String modelId, IngestStats ingestStats, int pipelineCount) {


not sure: pipelineCount is int, modelCount below is long - should we use the same type for consistency?

Member Author


pipelineCount is an int because it is the size of the pipeline stats array, and Java returns an int from a size() call.

As for modelCount, it is the total number of models matched by the provided IDs. If _all is provided, then the total number of models is returned.

Additionally, the underlying QueryPage object assumes that the count value is a long.

.size(request.getPageParams().getSize());
}
sourceBuilder.trackTotalHits(true)
// we only care about the item IDS, there is no need to load large model definitions.


nit: "item IDS" -> "item id's" ??

@benwtrent benwtrent merged commit c2946d1 into elastic:feature/ml-inference Oct 29, 2019
benwtrent added a commit that referenced this pull request Nov 18, 2019
* [ML][Inference] adds lazy model loader and inference (#47410)

This adds a couple of things:

- A model loader service that is accessible via transport calls. This service will load in models and cache them. They will stay loaded until a processor no longer references them
- A Model class and its first sub-class LocalModel. Used to cache model information and run inference.
- Transport action and handler for requests to infer against a local model
Related Feature PRs: 
* [ML][Inference] Adjust inference configuration option API (#47812)

* [ML][Inference] adds logistic_regression output aggregator (#48075)

* [ML][Inference] Adding read/del trained models (#47882)

* [ML][Inference] Adding inference ingest processor (#47859)

* [ML][Inference] fixing classification inference for ensemble (#48463)

* [ML][Inference] Adding model memory estimations (#48323)

* [ML][Inference] adding more options to inference processor (#48545)

* [ML][Inference] handle string values better in feature extraction (#48584)

* [ML][Inference] Adding _stats endpoint for inference (#48492)

* [ML][Inference] add inference processors and trained models to usage (#47869)

* [ML][Inference] add new flag for optionally including model definition (#48718)

* [ML][Inference] adding license checks (#49056)

* [ML][Inference] Adding memory and compute estimates to inference (#48955)
benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Nov 18, 2019
* [ML][Inference] adds lazy model loader and inference (elastic#47410)
@benwtrent benwtrent deleted the feature/ml-inference-add-_stats-endpoint branch November 18, 2019 17:12
benwtrent added a commit that referenced this pull request Nov 18, 2019
* [ML] ML Model Inference Ingest Processor (#49052)

* fixing version of indexed docs for model inference