Skip to content

Conversation

@jpountz
Copy link
Contributor

@jpountz jpountz commented Feb 9, 2017

This is an in-progress PR as I have a failure in MatchQueryIT that looks related to recent changes around graph token streams. If someone could help me dig it, I would appreciate.

@clintongormley clintongormley added the :Core/Infra/Core Core issues without another label label Feb 9, 2017
assertHitCount(searchResponse, 5L);
assertSearchHits(searchResponse, "1", "2", "3", "7", "8");
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the failing tests, this is due to a change in Lucene 6.5.0. In 6.4 Queries with multi word synonyms create a GraphQuery with all possible paths. So if you have "foo, foo bar" and a query like "foo boo" it will produce two boolean queries packed in a GraphQuery:
foo boo OR foo bar boo
Since we have all paths we can apply the cutoff frequency to each but in 6.5 the query parser will now create a query:
(+foo OR (+foo +bar)) AND boo
So the multi synonyms is now treated as a single term. This is better in terms of performance but it breaks some tests because the logic has changed. We cannot apply cutoff frequency on multi terms synonyms anymore (they are treated as a single term). Same problem with minimum_should_match.
IMO we should just rewrite this test and remove the multi terms synonym.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However I'll leave it for a follow-up PR if you don't mind.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I can take over when you merged this PR. Thanks

@@ -0,0 +1,101 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this deserves a separate PR, WDYT ? We need to document how this works and make sure that we add all the warnings regarding the restrictions of using this filter in conjunction with others.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do that. I added it because the test that checks whether all analysis components are exposed failed otherwise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I created #23104 to track the inclusion of this new filter. We can add the documentation and tests in a follow-up

Copy link
Contributor

@jimczi jimczi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jpountz jpountz removed the WIP label Feb 10, 2017
@jpountz jpountz merged commit 709cc9b into elastic:master Feb 10, 2017
@jpountz jpountz deleted the upgrade/lucene-6.5.0 branch February 10, 2017 14:08
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Feb 10, 2017
* master:
  Fix alias HEAD requests
  Upgrade to lucene-6.5.0-snapshot-f919485. (elastic#23087)
  Add BulkProcessor methods with XContentType parameter (elastic#23078)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants