Conversation

@benwtrent
Member

This is a major refactor of the underlying inference logic.

The main change is that we now separate the model configuration from
the inference interfaces.

This has the following benefits:

  • we can store extra things with the model that are not
    necessary for inference (e.g. tree node split information gain)
  • we can optimize inference separately from model serialization and storage
  • the user is oblivious to the optimizations (other than seeing the benefits)

A major part of this commit is removing all inference-related methods from the
trained model configurations (ensemble, tree, etc.) and moving them to a new class.

This new class satisfies a new interface that is ONLY for inference.
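
To make the shape of that split concrete, here is a minimal sketch in the spirit of the change; every name in it is hypothetical, not the PR's actual classes:

```java
import java.util.Map;

// Inference-only contract: callers depend on this and never see
// serialization details or training metadata.
interface InferenceModel {
    double infer(Map<String, Double> features);
}

// Stored/serialized configuration: free to keep training-time extras
// (e.g. split information gain) because it no longer serves inference.
final class TreeConfig {
    double splitGain;    // kept for analysis; unused at inference time
    String featureName;  // toy single-split tree for illustration
    double threshold;
    double leftValue;
    double rightValue;

    // Compile a lean, inference-only model from the stored config.
    InferenceModel compile() {
        final String feature = featureName;
        final double split = threshold, left = leftValue, right = rightValue;
        // A missing feature yields NaN, and NaN < split is false,
        // so missing values fall to the right branch in this toy.
        return features -> features.getOrDefault(feature, Double.NaN) < split ? left : right;
    }
}
```

Because callers only ever hold an `InferenceModel`, the compiled representation can change freely without touching the storage format.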

The optimizations applied currently are:

  • feature maps are flattened once (see the sketch after this list)
  • feature extraction happens only once, at the highest level
    (improves inference + feature importance throughput)
  • only what is needed for inference + feature importance is kept on heap
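
The first bullet deserves a concrete illustration. Below is a minimal sketch of one-time flattening, assuming documents arrive as nested maps; the class and method names are mine, not the PR's:

```java
import java.util.HashMap;
import java.util.Map;

final class FeatureFlattener {

    // Flatten a nested document into dotted keys exactly once, so every
    // model in an ensemble reads the same flat map instead of re-walking
    // the nested structure on each inference call.
    static Map<String, Object> flatten(Map<String, Object> doc) {
        Map<String, Object> flat = new HashMap<>();
        flattenInto("", doc, flat);
        return flat;
    }

    @SuppressWarnings("unchecked")
    private static void flattenInto(String prefix, Map<String, Object> node, Map<String, Object> out) {
        for (Map.Entry<String, Object> entry : node.entrySet()) {
            String key = prefix.isEmpty() ? entry.getKey() : prefix + "." + entry.getKey();
            if (entry.getValue() instanceof Map) {
                flattenInto(key, (Map<String, Object>) entry.getValue(), out);
            } else {
                out.put(key, entry.getValue());
            }
        }
    }
}
```

With this, `{"page": {"response_time": 100}}` becomes `{"page.response_time": 100}` once per document, rather than once per tree.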

@elasticmachine
Collaborator

Pinging @elastic/ml-core (:ml)

@benwtrent
Member Author

Some numbers from JMH. These were gathered by running inference against a larger tree ensemble model. Each measurement is a single inference call against data with various fields missing and various expected results.
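
For context, a JMH harness along these lines could produce such a table. This is a sketch only: the `Fixtures` loaders are hypothetical, and it reuses the `InferenceModel` sketch from the PR description above.

```java
import java.util.Map;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@BenchmarkMode(Mode.AverageTime)   // "avgt" mode in the tables below
@OutputTimeUnit(TimeUnit.NANOSECONDS) // ns/op in the tables below
@State(Scope.Benchmark)
public class InferenceBenchmark {

    private InferenceModel model;           // large tree ensemble under test
    private Map<String, Double> doc1105142; // fixture document, some fields missing

    @Setup
    public void setup() {
        model = Fixtures.loadLargeEnsemble();          // hypothetical loader
        doc1105142 = Fixtures.loadDocument("1105142"); // hypothetical fixture
    }

    // One @Benchmark method per fixture document, matching the table rows.
    @Benchmark
    public double infer_1105142() {
        return model.infer(doc1105142);
    }
}
```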

Before optimizations:

Benchmark      Mode  Cnt        Score        Error  Units
infer_1105142  avgt   60  4676118.771 ± 408004.627  ns/op  // ~4ms
infer_1163199  avgt   60  6572394.355 ± 448022.365  ns/op  // ~6ms
infer_1274687  avgt   60  5221690.355 ± 298235.541  ns/op  // ~5ms
infer_1286491  avgt   60  6057884.209 ± 503322.898  ns/op  // ~6ms
infer_1329864  avgt   60  3704593.746 ± 209046.926  ns/op  // ~3ms
infer_1345914  avgt   60  3526360.753 ± 338985.848  ns/op  // ~3ms
infer_147975   avgt   60  2478510.531 ±  37657.708  ns/op  // ~2ms
infer_182200   avgt   60  2577532.404 ± 114494.789  ns/op  // ~2ms
infer_424703   avgt   60  6766206.021 ± 131101.240  ns/op  // ~6ms
infer_672407   avgt   60  8523977.602 ± 323823.581  ns/op  // ~8ms

After optimizations:

Benchmark      Mode  Cnt       Score      Error  Units
infer_1105142  avgt   60  105574.327 ± 4315.656  ns/op
infer_1163199  avgt   60  117342.726 ± 3857.156  ns/op
infer_1274687  avgt   60  113382.842 ± 2490.623  ns/op
infer_1286491  avgt   60  110763.581 ±  842.259  ns/op
infer_1329864  avgt   60  108473.284 ± 4547.068  ns/op
infer_1345914  avgt   60  104551.520 ±  336.243  ns/op
infer_147975   avgt   60   87700.209 ± 2342.228  ns/op
infer_182200   avgt   60  101667.732 ± 4628.543  ns/op
infer_424703   avgt   60  103756.438 ± 2358.109  ns/op
infer_672407   avgt   60  105891.101 ± 6329.288  ns/op

Per-call latency drops from milliseconds to roughly 0.1ms, a 25x-80x speedup depending on the document.

@benwtrent
Member Author

run elasticsearch-ci/bwc

@droberts195 left a comment

LGTM

@benwtrent benwtrent merged commit 78fcd05 into elastic:master Jun 5, 2020
@benwtrent benwtrent deleted the feature/ml-inference-performance-improvements branch June 5, 2020 17:03
benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Jun 5, 2020

benwtrent added a commit that referenced this pull request Jun 5, 2020