@jtibshirani
Contributor

This PR integrates the support for ANN with filtering that was added in
Lucene 9.1. It adds a new filter section to the _knn_search endpoint, which
accepts a query (in the Elasticsearch query DSL). The value can be either a
single query or a list of queries, matching the syntax we use for defining
filter clauses in a bool query.
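
As a rough illustration of the "single query or list of queries" rule described above, here is a minimal sketch in plain Java. It is not the parsing code from this PR: the class name `KnnFilterValues`, the method `normalize`, and the use of `Map` to stand in for a parsed query object are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of Elasticsearch): normalizes a parsed filter value. */
final class KnnFilterValues {
    private KnnFilterValues() {}

    /**
     * Accepts the parsed value of the filter section: absent (null), a single
     * query object (modelled here as a Map), or a list of query objects.
     * Always returns a list, so callers handle one shape only.
     */
    static List<Map<String, Object>> normalize(Object filterValue) {
        if (filterValue == null) {
            // No filter provided: no restrictions, every document may match.
            return List.of();
        }
        if (filterValue instanceof Map) {
            return List.of(asQuery(filterValue));
        }
        if (filterValue instanceof List) {
            List<Map<String, Object>> queries = new ArrayList<>();
            for (Object element : (List<?>) filterValue) {
                queries.add(asQuery(element));
            }
            return Collections.unmodifiableList(queries);
        }
        throw new IllegalArgumentException("filter must be a query object or a list of query objects");
    }

    @SuppressWarnings("unchecked")
    private static Map<String, Object> asQuery(Object value) {
        if (value instanceof Map) {
            return (Map<String, Object>) value;
        }
        throw new IllegalArgumentException("each filter entry must be a query object");
    }
}
```

Normalizing to a list up front keeps the rest of the code free of special cases, which is the same convenience the bool query's filter clause syntax offers.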

Closes #81788.

@jtibshirani added the >enhancement, :Search/Search (Search-related issues that do not fall into other categories), and v8.2.0 labels Mar 7, 2022
@elasticmachine added the Team:Search (Meta label for search team) label Mar 7, 2022
@elasticmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@elasticsearchmachine
Collaborator

Hi @jtibshirani, I've created a changelog YAML for you.

@mayya-sharipova self-requested a review March 8, 2022 18:05
Contributor

@mayya-sharipova left a comment

@jtibshirani Thank you, great work! Overall LGTM, I've left some small comments.

this.fieldName = fieldName;
this.queryVector = queryVector;
this.numCands = numCands;
this.filterQueries = List.of();
Contributor

Should we make the parameter filterQueries also final and ensure filterQueries is never null?

Contributor Author

We actually allow it to be set in the filterQueries(...) method. Since this builder is only meant to be used internally to Elasticsearch, I just chose the setter methods that made our code the cleanest.

Contributor

@mayya-sharipova Mar 10, 2022

I am wondering: since it is a public class and the filterQueries methods are public, can't this class be used by the Java HLRC client? If we don't want to make filterQueries final, maybe just ensure that they are not null when we set them? WDYT?

Thanks for the other new changes, they LGTM as well.

Contributor Author

I don't think this can be used from the Java HLRC. Since it lives in an xpack module, it's not available by default -- for example, for pinned queries we had to add an extra query builder just for the HLRC (#45779).

I ended up reworking this to just add the filters instead of replacing them. I also added a note to the query builder warning that it's an internal class. Thanks for the ideas.
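
For readers following the thread, the outcome discussed here (a final filter list that is only ever added to, with null inputs rejected) might look roughly like the sketch below. It is illustrative only and is not the class from this PR: `KnnQueryBuilderSketch`, `FilterQuery`, and the method names are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

/** Hypothetical stand-in for whatever query type the real builder holds. */
interface FilterQuery {}

/** Illustrative sketch only; an internal-style builder, not a public client API. */
final class KnnQueryBuilderSketch {
    private final String fieldName;
    private final float[] queryVector;
    private final int numCands;
    // Final and never null: starts empty, and callers can only add to it.
    private final List<FilterQuery> filterQueries = new ArrayList<>();

    KnnQueryBuilderSketch(String fieldName, float[] queryVector, int numCands) {
        this.fieldName = Objects.requireNonNull(fieldName, "fieldName must not be null");
        this.queryVector = Objects.requireNonNull(queryVector, "queryVector must not be null");
        this.numCands = numCands;
    }

    /** Adds filters to the existing ones instead of replacing them. */
    KnnQueryBuilderSketch addFilterQueries(List<? extends FilterQuery> queries) {
        Objects.requireNonNull(queries, "filter queries must not be null");
        // Validate every element first so a rejected call adds nothing.
        for (FilterQuery query : queries) {
            Objects.requireNonNull(query, "filter query must not be null");
        }
        filterQueries.addAll(queries);
        return this;
    }

    /** Read-only view, so the list can only change through addFilterQueries. */
    List<FilterQuery> filterQueries() {
        return Collections.unmodifiableList(filterQueries);
    }

    String fieldName() { return fieldName; }

    float[] queryVector() { return queryVector; }

    int numCands() { return numCands; }
}
```

Making the list final and append-only addresses both review points at once: the field can never be null, and repeated calls accumulate filters rather than silently discarding earlier ones.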

can match. The kNN search will return the top `k` documents that also match
this filter. The value can be a single query or a list of queries. If `filter`
is not provided, all documents are allowed to match.
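
To make the semantics in the doc text above concrete, the sketch below uses an exact brute-force search rather than the approximate search Lucene performs, so it shows only the intended result set, not how the PR computes it. The class `FilteredKnnSketch`, the `IntPredicate` filter, and the choice of squared Euclidean distance are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

/** Illustrative brute-force version of "top k documents that also match the filter". */
final class FilteredKnnSketch {
    private FilteredKnnSketch() {}

    /**
     * Returns the ids (indices into vectors) of the k documents nearest to the
     * query among those accepted by the filter. A null filter allows all documents.
     */
    static List<Integer> topK(float[][] vectors, float[] query, int k, IntPredicate filter) {
        List<Integer> candidates = new ArrayList<>();
        for (int doc = 0; doc < vectors.length; doc++) {
            if (filter == null || filter.test(doc)) {
                candidates.add(doc);
            }
        }
        // Nearest first; the sort is stable, so ties keep ascending doc id order.
        candidates.sort(Comparator.comparingDouble((Integer doc) -> squaredDistance(vectors[doc], query)));
        return new ArrayList<>(candidates.subList(0, Math.min(k, candidates.size())));
    }

    static double squaredDistance(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vector dimensions differ");
        }
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }
}
```

Note that the filter is applied before the top k are chosen. Filtering after selecting the k nearest could return fewer than k hits even when enough matching documents exist.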

Contributor

It would be nice to have some json example as with filter as well.

Contributor Author

Good point, I'll actually add an Examples section at the end of these docs.

@jtibshirani
Contributor Author

@elasticmachine run elasticsearch-ci/part-1

@jtibshirani
Contributor Author

@elasticmachine run elasticsearch-ci/part-2

@jtibshirani merged commit 15708d5 into elastic:master Mar 10, 2022
@jtibshirani deleted the knn-filter branch March 10, 2022 23:53
@coreation

Awesome job! Do I understand correctly from the tag in this PR that it will be available in Elasticsearch 8.2?

@jtibshirani
Contributor Author

@coreation yes, that's the plan!

@coreation

Awesome! Not to press, but merely as information, is there an estimated timeframe for when 8.2 might be released?

@jtibshirani
Contributor Author

We don't usually give out release dates (even estimates), but I can say that we're hard at work on 8.2 now.

@coreation

Ok, thanks for the info @jtibshirani. I didn't want to press; it was only something I was curious about. Looking forward to the release!

jtibshirani added a commit that referenced this pull request Apr 14, 2022
We implemented this in #84734 but forgot to update these docs.
@jtibshirani added the :Search Relevance/Vectors (Vector search) label and removed the :Search/Search (Search-related issues that do not fall into other categories) label Jul 21, 2022
Successfully merging this pull request may close these issues.

Support ANN with filtering

5 participants