Skip to content

Conversation

@tvernum
Copy link
Contributor

@tvernum tvernum commented Sep 15, 2021

Detect a StringMatcher of "*" and optimize it as a predicate of

s -> true

This is about 80% faster than using an automaton.
The speed improvement is very minor on a single execution but can
add up to milliseconds if used in a tight loop - e.g. testing
against every index in a large cluster

Detect a StringMatcher of "*" and optimize it as a predicate of

   s -> true

This is about 80% faster than using an automaton.
This speed improvement is very minor on a single execution but can
add up to milliseconds is used in a tight loop - e.g. testing
against every indices in a large cluster
@tvernum tvernum added >enhancement :Security/Security Security issues without another label v8.0.0 v7.16.0 labels Sep 15, 2021
@tvernum tvernum requested a review from ywangd September 15, 2021 05:10
@elasticmachine elasticmachine added the Team:Security Meta label for security team label Sep 15, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-security (Team:Security)

Copy link
Member

@ywangd ywangd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

It would be great if you could address two non-critical comments.

Comment on lines 121 to 123
if (nonExactMatch.contains("*")) {
return new StringMatcher(description, s -> true);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are trying to squeeze every bit for performance, I wonder whether it could help a tiny bit if we perform this check in the include(String pattern) method. It already checks for wildcard and should be easy to tweak it to flag for seeing a match-all pattern.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my original implementation, but it's really not necessary.
We're not looking to squeeze that level of performance in the Builder, just in the StringMatcher itself.
Testing every pattern that is included to see whether it's equal to * and then setting a flag isn't really going to make a difference, and it means carrying more state in the builder.

@tvernum
Copy link
Contributor Author

tvernum commented Sep 21, 2021

@elasticmachine update branch

@tvernum
Copy link
Contributor Author

tvernum commented Sep 21, 2021

@elasticmachine run elasticsearch-ci/part-2 please

Test failed due to: #78035

@tvernum tvernum merged commit 1a688b2 into elastic:master Sep 21, 2021
tvernum added a commit to tvernum/elasticsearch that referenced this pull request Sep 21, 2021
Detect a StringMatcher of "*" and optimize it as a predicate of

    s -> true

This is about 80% faster than using an automaton.
This speed improvement is very minor on a single execution but can
add up to milliseconds is used in a tight loop - e.g. testing
against every indices in a large cluster
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
7.x

elasticsearchmachine pushed a commit that referenced this pull request Sep 22, 2021
* Optimize StringMatcher for match-all patterns (#77738)

Detect a StringMatcher of "*" and optimize it as a predicate of

    s -> true

This is about 80% faster than using an automaton.
This speed improvement is very minor on a single execution but can
add up to milliseconds is used in a tight loop - e.g. testing
against every indices in a large cluster

* Fix List references for backport

Co-authored-by: Elastic Machine <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>enhancement :Security/Security Security issues without another label Team:Security Meta label for security team v7.16.0 v8.0.0-beta1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants