
Conversation


@droberts195 droberts195 commented Jul 28, 2022

This change will facilitate a performance improvement on the C++
side. The request ID and cache hit indicator are the parts that
need to be changed when the C++ process responds to an inference
request. Having them at the top level means we do not need to
parse and manipulate the original response - we can simply cache
the inner object of the response and add the outer fields around
it when serializing it.

Replaces #88485
Companion to elastic/ml-cpp#2376

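As a hedged illustration of the layout the description argues for (the field names `request_id`, `cache_hit`, and `time_ms` are assumptions for this sketch, not the actual wire format), the cached inner object sits under a single key while the per-request fields live at the top level:

```json
{
  "request_id": "abc",
  "cache_hit": true,
  "time_ms": 7,
  "result": {
    "...": "cached inner inference result, stored and reused verbatim"
  }
}
```

On a cache hit only the outer fields change per request, so the inner object can be serialized straight from the cache without re-parsing it.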
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@elasticsearchmachine elasticsearchmachine added the Team:ML Meta label for the ML team label Jul 28, 2022
@droberts195
Author

Ah, the time taken should move up too, as that is calculated differently for cache hits.


errorCount++;

logger.trace(() -> format("[%s] Parsed error with id [%s]", deploymentId, errorResult.requestId()));
Member
Seeing how processResult is synchronized and processErrorResult isn't, errorCount can be woefully incorrect due to race conditions.

That can be fixed in a different commit. This particular change looks good.
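One possible fix for the race the reviewer describes, sketched with illustrative names: an `AtomicLong` makes the increment a single atomic read-modify-write, so `processErrorResult` does not need to be synchronized just to count correctly. `ResultCounters` and its methods are hypothetical, not the actual Elasticsearch code.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: replace a plain `errorCount++` (lossy when called
// from unsynchronized code paths) with an AtomicLong.
class ResultCounters {
    private final AtomicLong errorCount = new AtomicLong();

    void onError() {
        errorCount.incrementAndGet(); // atomic read-modify-write
    }

    long errors() {
        return errorCount.get();
    }
}

public class Demo {
    public static void main(String[] args) throws InterruptedException {
        ResultCounters counters = new ResultCounters();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) {
                    counters.onError();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        // With a plain long and `errorCount++`, updates could be lost here;
        // AtomicLong guarantees the total is exactly 40000.
        System.out.println(counters.errors());
    }
}
```

The alternative of marking `processErrorResult` synchronized would also work, at the cost of contending on the same lock as `processResult`.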

Author

Since it's quite a simple change, I addressed it in 1bdf3a0

droberts195 pushed a commit to droberts195/elasticsearch that referenced this pull request Aug 3, 2022
These tests will fail if elastic/ml-cpp#2376 is merged with
them unmuted. elastic#88901 will follow up with the Java
side changes.
droberts195 pushed a commit that referenced this pull request Aug 3, 2022
These tests will fail if elastic/ml-cpp#2376 is merged with
them unmuted. #88901 will follow up with the Java
side changes.
droberts195 pushed a commit to elastic/ml-cpp that referenced this pull request Aug 3, 2022
…2376)

Previously the inference cache stored complete results, including
a request ID and the time taken. This was inefficient, as it meant
the original response had to be parsed and modified before being
sent back to the Java side.

This PR changes the cache to store just the inner portion of the
inference result. Then the outer layer is added per request after
retrieving from the cache.

Additionally, the result writing functions are moved into a class
of their own, which means they can be unit tested.

Companion to elastic/elasticsearch#88901
@droberts195
Author

@elasticmachine update branch

@droberts195 droberts195 merged commit 8c21d03 into elastic:main Aug 3, 2022
@droberts195 droberts195 deleted the refactor_pytorch_results branch August 3, 2022 16:09

Labels

:ml Machine learning >non-issue Team:ML Meta label for the ML team v8.5.0
