[ML] inference performance optimizations and refactor #57674
Conversation
Pinging @elastic/ml-core (:ml)
Some numbers from JMH. These were gathered by running inference against a larger tree ensemble model. Each measurement is a single inference call against data with various fields missing and various expected results.

Before optimizations:
After optimizations:
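For context, a minimal JMH harness for this kind of measurement could look like the sketch below. This is illustrative only: the `Model` interface and the stand-in lambda are hypothetical placeholders, not classes from this PR; the real runs used a large trained tree ensemble.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class InferenceBenchmark {

    // Hypothetical stand-in for a loaded inference model.
    interface Model {
        double infer(Map<String, Object> fields);
    }

    private Model model;
    private Map<String, Object> doc;

    @Setup
    public void setup() {
        // Trivial stand-in; a real run would load the trained ensemble here.
        model = fields -> fields.containsKey("a") ? 1.0 : 0.0;
        doc = new HashMap<>();
        doc.put("a", 0.25);
        // Some fields deliberately left missing, as in the PR's measurements.
    }

    @Benchmark
    public double singleInference() {
        // One measurement == one inference call against the document.
        return model.infer(doc);
    }
}
```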
run elasticsearch-ci/bwc |
droberts195 left a comment
LGTM
This is a major refactor of the underlying inference logic. The main refactor is that we now separate the model configuration from the inference interfaces. This has the following benefits:

- We can store extra things with the model that are not necessary for inference (e.g. tree node split information gain).
- We can optimize inference separately from model serialization and storage.
- The user is oblivious to the optimizations (other than seeing the benefits).

A major part of this commit is removing all inference-related methods from the trained model configurations (ensemble, tree, etc.) and moving them to a new class. This new class satisfies a new interface that is ONLY for inference (see the first sketch below).

The optimizations applied currently are:

- Feature maps are flattened once (see the second sketch below).
- Feature extraction only happens once, at the highest level (improves inference + feature importance throughput).
- Only storing what we need for inference + feature importance on heap.
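A sketch of the shape of this separation is below. All names here are illustrative, not the actual classes introduced by this PR: the point is only that the stored configuration carries extras (like information gain) that the inference-only representation deliberately drops.

```java
public class ConfigVsInference {

    // Inference-only contract: implementations hold just the state needed
    // to produce predictions, nothing used solely for storage or display.
    interface InferenceModel {
        double infer(double[] features);
    }

    // The stored configuration can carry extras such as split information
    // gain, which the inference path never reads.
    static class TreeConfig {
        final int splitFeature;
        final double threshold;
        final double informationGain; // kept with the stored model only
        final double leftValue;
        final double rightValue;

        TreeConfig(int splitFeature, double threshold, double informationGain,
                   double leftValue, double rightValue) {
            this.splitFeature = splitFeature;
            this.threshold = threshold;
            this.informationGain = informationGain;
            this.leftValue = leftValue;
            this.rightValue = rightValue;
        }

        // Build the lean, heap-friendly representation once at load time;
        // informationGain is deliberately not captured.
        InferenceModel toInferenceModel() {
            final int f = splitFeature;
            final double t = threshold, l = leftValue, r = rightValue;
            return features -> features[f] < t ? l : r;
        }
    }

    public static void main(String[] args) {
        TreeConfig config = new TreeConfig(0, 0.5, 0.12, -1.0, 1.0);
        InferenceModel model = config.toInferenceModel();
        System.out.println(model.infer(new double[] {0.7})); // prints 1.0
    }
}
```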
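The feature-map flattening can be pictured as in the minimal sketch below (again with hypothetical names): field names are resolved to array slots once per request, so every tree in the ensemble indexes by position instead of repeating map lookups.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FeatureFlattening {

    // Sketch: resolve the document's field map into a flat double[] once,
    // up front, instead of once per tree.
    static double[] flatten(Map<String, Object> fields, List<String> featureNames) {
        double[] features = new double[featureNames.size()];
        for (int i = 0; i < features.length; i++) {
            Object value = fields.get(featureNames.get(i));
            // Missing fields become NaN so trees can apply their own
            // missing-value handling downstream.
            features[i] = value instanceof Number ? ((Number) value).doubleValue() : Double.NaN;
        }
        return features;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("a", 1.5); // "b" is deliberately missing
        double[] flat = flatten(doc, List.of("a", "b"));
        System.out.println(flat[0] + " " + flat[1]); // prints 1.5 NaN
    }
}
```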