@BurntSushi

This fixes a bug in how prefilters were applied for multi-regexes compiled with "all" semantics. This corresponds to the regex crate's `RegexSet` API, but only its `is_match` routine.

See the comment on the regression test added in this PR for an explanation of what happened. Basically, it came down to incorrectly using Aho-Corasick's "standard" match semantics, which don't necessarily report leftmost matches. Since the regex crate is really all about leftmost matching, this can cause a search to skip over parts of the haystack and thus miss matches.
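
To illustrate the difference between "standard" and leftmost semantics, here is a minimal std-only sketch. It is not the code from this PR and does not use the aho-corasick crate; `Match`, `find_standard` and `find_leftmost_first` are hypothetical names, and the naive scanning loops only mimic the order in which an automaton would report matches.

```rust
/// A match: index of the pattern that matched plus byte offsets.
/// (Hypothetical type for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Match {
    pattern: usize,
    start: usize,
    end: usize,
}

/// "Standard" semantics, sketched naively: scan left to right and report
/// a match as soon as one is seen, i.e. the match with the smallest end
/// offset. Its start offset is not necessarily the leftmost one.
fn find_standard(patterns: &[&str], haystack: &str) -> Option<Match> {
    let h = haystack.as_bytes();
    for end in 1..=h.len() {
        for (i, p) in patterns.iter().enumerate() {
            let p = p.as_bytes();
            if !p.is_empty() && h[..end].ends_with(p) {
                return Some(Match { pattern: i, start: end - p.len(), end });
            }
        }
    }
    None
}

/// Leftmost-first semantics, sketched naively: report the match with the
/// smallest start offset, breaking ties by pattern order.
fn find_leftmost_first(patterns: &[&str], haystack: &str) -> Option<Match> {
    let h = haystack.as_bytes();
    for start in 0..h.len() {
        for (i, p) in patterns.iter().enumerate() {
            let p = p.as_bytes();
            if !p.is_empty() && h[start..].starts_with(p) {
                return Some(Match { pattern: i, start, end: start + p.len() });
            }
        }
    }
    None
}
```

With the patterns `["b", "abcd"]` and the haystack `"abcd"`, `find_standard` reports `b` at `1..2`, while `find_leftmost_first` reports `abcd` at `0..4`. A caller that assumed the first result was the leftmost candidate would begin at offset 1 and never see the match starting at offset 0.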

Fixes #1070

The main reason we used mips before was to get test coverage on a big
endian target. Now that mips no longer seems to work[1], I wanted to
add at least one other big endian target. From the tier 2 supported
platforms[2], the only big endian targets I could find were powerpc and
s390x. So we just add both here.

[1]: rust-lang/compiler-team#648
[2]: https://doc.rust-lang.org/nightly/rustc/platform-support.html#tier-2-with-host-tools

Linked issue that merging this pull request may close: RegexSet and Regex give different results for the same pattern in 1.9