[ML] Start gathering and storing inference stats #53429
Conversation
Pinging @elastic/ml-core (:ml)
```java
ActionListener.wrap(
    r -> this.loadModel(modelId, r),
    e -> {
        logger.error("[{}] failed to get previous model stats", modelId);
```
We might actually want to fail completely if the unwrapped error is anything other than a ResourceNotFound. If .ml-stats-* exists but has unallocated primary shards, we may want to bail.
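To make the suggestion concrete, here is a minimal, self-contained sketch. The exception class and `unwrapCause` helper below are simplified stand-ins for Elasticsearch's `ResourceNotFoundException` and `ExceptionsHelper.unwrapCause`, not the PR's actual code: the idea is to treat a missing stats document as "no stats yet" and propagate every other failure (such as unallocated primaries on `.ml-stats-*`).

```java
public class StatsErrorHandling {
    // Stand-in for org.elasticsearch.ResourceNotFoundException.
    static class ResourceNotFoundException extends RuntimeException {}

    // Stand-in for ExceptionsHelper.unwrapCause: peel one wrapper if present.
    static Throwable unwrapCause(Throwable t) {
        return t.getCause() != null ? t.getCause() : t;
    }

    // Missing stats are expected for a model that has never run inference;
    // anything else should fail the load rather than silently start from zero.
    static boolean shouldProceedWithEmptyStats(Exception e) {
        return unwrapCause(e) instanceof ResourceNotFoundException;
    }

    public static void main(String[] args) {
        System.out.println(shouldProceedWithEmptyStats(new ResourceNotFoundException()));
        System.out.println(shouldProceedWithEmptyStats(new IllegalStateException("primary shard not active")));
    }
}
```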
davidkyle left a comment
Everything looks good, but I'd like you to reconsider how valuable the time-spent metric is.
If only a few documents are inferred, the time spent in nanos will be rounded down to 0 millis.
The overhead of measuring the time is tiny compared to the cost of the modelling function, but there is still overhead in calling `System.nanoTime()`.
My main concern is that we give a false impression of accuracy when we know there is rounding error and loss. How valuable will the user find the metric?
Let's open up the discussion. My feeling is that it is not necessary, and it is easy to put the timing code back in a later version if required but difficult to remove.
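The rounding loss described above is easy to demonstrate with plain `java.util.concurrent.TimeUnit`. The 250-microsecond per-call figure below is an illustrative assumption, not a measured inference cost:

```java
import java.util.concurrent.TimeUnit;

public class TimingLoss {
    public static void main(String[] args) {
        // Suppose a single inference call takes ~250 microseconds.
        long inferenceNanos = 250_000L;

        // Converting one call's duration to milliseconds truncates to zero.
        System.out.println(TimeUnit.NANOSECONDS.toMillis(inferenceNanos)); // 0

        // If each call is truncated before accumulating, 4000 calls still sum to 0 ms.
        long accumulatedMillis = 0;
        for (int i = 0; i < 4000; i++) {
            accumulatedMillis += TimeUnit.NANOSECONDS.toMillis(inferenceNanos);
        }
        System.out.println(accumulatedMillis); // 0

        // Accumulating in nanos and converting once preserves the total: 1000 ms.
        long accumulatedNanos = 4000L * inferenceNanos;
        System.out.println(TimeUnit.NANOSECONDS.toMillis(accumulatedNanos)); // 1000
    }
}
```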
I included it as it is a statistic gathered by ingest. But I do agree, the overhead and loss of accuracy here might not make it worth it. All the other use cases of inference (ingest and search) would include timing stats themselves. So, information on how long a model takes to infer can be, well... inferred from the time used in search/ingest. I have no problem removing timing stats right now.

@elasticmachine update branch

…nwtrent/elasticsearch into feature/ml-inference-stats-collection

@elasticmachine update branch

@elasticmachine update branch
davidkyle left a comment
LGTM
The persist interval can probably be bumped by a couple of seconds.
davidkyle left a comment
LGTM
This PR enables stats on inference to be gathered and stored in the `.ml-stats-*` indices. Each node + model_id pair will have its own running stats document, and these will later be summed together when returning `_stats` to the user. `.ml-stats-*` is ILM managed (when possible), so at any point the underlying index could change. This means that a stats document that is read and then later updated may actually become a new doc in a new index. This complicates matters: keeping a running knowledge of seq_no and primary_term is almost impossible, because we don't know the latest index name. We should also strive for throughput, as this code sits in the middle of an ingest pipeline (or even a query).
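The "summed together when returning `_stats`" step can be sketched as follows. This is a minimal, self-contained illustration under assumed names: the `InferenceStats` record and its fields are hypothetical, not the PR's actual mapping.

```java
import java.util.List;

public class StatsSum {
    // Hypothetical per-node running stats document; field names are illustrative.
    record InferenceStats(long inferenceCount, long failureCount, long missingFieldsCount) {
        // Field-wise addition of two stats documents.
        InferenceStats add(InferenceStats other) {
            return new InferenceStats(
                inferenceCount + other.inferenceCount,
                failureCount + other.failureCount,
                missingFieldsCount + other.missingFieldsCount);
        }
    }

    // _stats combines the per-node documents for one model_id into a single view.
    static InferenceStats sum(List<InferenceStats> perNodeDocs) {
        return perNodeDocs.stream().reduce(new InferenceStats(0, 0, 0), InferenceStats::add);
    }

    public static void main(String[] args) {
        List<InferenceStats> docs = List.of(
            new InferenceStats(100, 2, 5),   // doc from node A
            new InferenceStats(250, 0, 1));  // doc from node B
        System.out.println(sum(docs).inferenceCount()); // 350
    }
}
```

Summing at read time is what lets each node keep an independent running document, avoiding cross-node coordination on the hot inference path.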