1 | 1 | [[query-dsl-multi-term-rewrite]] |
2 | | -== Multi Term Query Rewrite |
3 | | - |
4 | | -Multi term queries, like |
5 | | -<<query-dsl-wildcard-query,wildcard>> and |
6 | | -<<query-dsl-prefix-query,prefix>> are called |
7 | | -multi term queries and end up going through a process of rewrite. This |
8 | | -also happens on the |
9 | | -<<query-dsl-query-string-query,query_string>>. |
10 | | -All of those queries allow to control how they will get rewritten using |
11 | | -the `rewrite` parameter: |
12 | | - |
13 | | -* `constant_score` (default): A rewrite method that performs like |
14 | | -`constant_score_boolean` when there are few matching terms and otherwise |
15 | | -visits all matching terms in sequence and marks documents for that term. |
16 | | -Matching documents are assigned a constant score equal to the query's |
17 | | -boost. |
18 | | -* `scoring_boolean`: A rewrite method that first translates each term |
19 | | -into a should clause in a boolean query, and keeps the scores as |
20 | | -computed by the query. Note that typically such scores are meaningless |
21 | | -to the user, and require non-trivial CPU to compute, so it's almost |
22 | | -always better to use `constant_score`. This rewrite method will hit |
23 | | -too many clauses failure if it exceeds the boolean query limit (defaults |
24 | | -to `1024`). |
25 | | -* `constant_score_boolean`: Similar to `scoring_boolean` except scores |
26 | | -are not computed. Instead, each matching document receives a constant |
27 | | -score equal to the query's boost. This rewrite method will hit too many |
28 | | -clauses failure if it exceeds the boolean query limit (defaults to |
29 | | -`1024`). |
30 | | -* `top_terms_N`: A rewrite method that first translates each term into |
31 | | -should clause in boolean query, and keeps the scores as computed by the |
32 | | -query. This rewrite method only uses the top scoring terms so it will |
33 | | -not overflow boolean max clause count. The `N` controls the size of the |
34 | | -top scoring terms to use. |
35 | | -* `top_terms_boost_N`: A rewrite method that first translates each term |
36 | | -into should clause in boolean query, but the scores are only computed as |
37 | | -the boost. This rewrite method only uses the top scoring terms so it |
38 | | -will not overflow the boolean max clause count. The `N` controls the |
39 | | -size of the top scoring terms to use. |
40 | | -* `top_terms_blended_freqs_N`: A rewrite method that first translates each |
41 | | -term into should clause in boolean query, but all term queries compute scores |
42 | | -as if they had the same frequency. In practice the frequency which is used |
43 | | -is the maximum frequency of all matching terms. This rewrite method only uses |
44 | | -the top scoring terms so it will not overflow boolean max clause count. The |
45 | | -`N` controls the size of the top scoring terms to use. |
| 2 | +== `rewrite` Parameter |
| 3 | + |
| 4 | +WARNING: This parameter is for expert users only. Changing the value of |
| 5 | +this parameter can impact search performance and relevance. |
| 6 | + |
| 7 | +{es} uses https://lucene.apache.org/core/[Apache Lucene] internally to power |
| 8 | +indexing and searching. Lucene cannot execute the following queries in their |
| 9 | +original form: |
| 10 | + |
| 11 | +* <<query-dsl-fuzzy-query, `fuzzy`>> |
| 12 | +* <<query-dsl-prefix-query, `prefix`>> |
| 13 | +* <<query-dsl-query-string-query, `query_string`>> |
| 14 | +* <<query-dsl-regexp-query, `regexp`>> |
| 15 | +* <<query-dsl-wildcard-query, `wildcard`>> |
| 16 | + |
| 17 | +To execute them, Lucene changes these queries to a simpler form, such as a |
| 18 | +<<query-dsl-bool-query, `bool` query>> or a |
| 19 | +https://en.wikipedia.org/wiki/Bit_array[bit set]. |
| 20 | + |
| 21 | +The `rewrite` parameter determines: |
| 22 | + |
| 23 | +* How Lucene calculates the relevance scores for each matching document |
| 24 | +* Whether Lucene changes the original query to a `bool` |
| 25 | +query or bit set |
| 26 | +* If changed to a `bool` query, which `term` query clauses are included |
| 27 | + |
| 28 | +[float] |
| 29 | +[[rewrite-param-valid-values]] |
| 30 | +=== Valid values |
| 31 | + |
| 32 | +`constant_score` (Default):: |
| 33 | +Uses the `constant_score_boolean` method when there are few matching terms. Otherwise, |
| 34 | +this method finds all matching terms in sequence and returns matching documents |
| 35 | +using a bit set. |
| 36 | + |
| 37 | +`constant_score_boolean`:: |
| 38 | +Assigns each matching document a relevance score equal to the `boost` |
| 39 | +parameter. |
| 40 | ++ |
| 41 | +This method changes the original query to a <<query-dsl-bool-query, `bool` |
| 42 | +query>>. This `bool` query contains a `should` clause and |
| 43 | +<<query-dsl-term-query, `term` query>> for each matching term. |
| 44 | ++ |
| 45 | +This method can cause the final `bool` query to exceed the clause limit in the |
| 46 | +<<indices-query-bool-max-clause-count, `indices.query.bool.max_clause_count`>> |
| 47 | +setting. If the query exceeds this limit, {es} returns an error. |
| 48 | + |
| 49 | +`scoring_boolean`:: |
| 50 | +Calculates a relevance score for each matching document. |
| 51 | ++ |
| 52 | +This method changes the original query to a <<query-dsl-bool-query, `bool` |
| 53 | +query>>. This `bool` query contains a `should` clause and |
| 54 | +<<query-dsl-term-query, `term` query>> for each matching term. |
| 55 | ++ |
| 56 | +This method can cause the final `bool` query to exceed the clause limit in the |
| 57 | +<<indices-query-bool-max-clause-count, `indices.query.bool.max_clause_count`>> |
| 58 | +setting. If the query exceeds this limit, {es} returns an error. |
| 59 | + |
| 60 | +`top_terms_blended_freqs_N`:: |
| 61 | +Calculates a relevance score for each matching document as if all terms had the |
| 62 | +same frequency. This frequency is the maximum frequency of all matching terms. |
| 63 | ++ |
| 64 | +This method changes the original query to a <<query-dsl-bool-query, `bool` |
| 65 | +query>>. This `bool` query contains a `should` clause and |
| 66 | +<<query-dsl-term-query, `term` query>> for each matching term. |
| 67 | ++ |
| 68 | +The final `bool` query only includes `term` queries for the top `N` scoring |
| 69 | +terms. |
| 70 | ++ |
| 71 | +You can use this method to avoid exceeding the clause limit in the |
| 72 | +<<indices-query-bool-max-clause-count, `indices.query.bool.max_clause_count`>> |
| 73 | +setting. |
| 74 | + |
| 75 | +`top_terms_boost_N`:: |
| 76 | +Assigns each matching document a relevance score equal to the `boost` parameter. |
| 77 | ++ |
| 78 | +This method changes the original query to a <<query-dsl-bool-query, `bool` |
| 79 | +query>>. This `bool` query contains a `should` clause and |
| 80 | +<<query-dsl-term-query, `term` query>> for each matching term. |
| 81 | ++ |
| 82 | +The final `bool` query only includes `term` queries for the top `N` terms. |
| 83 | ++ |
| 84 | +You can use this method to avoid exceeding the clause limit in the |
| 85 | +<<indices-query-bool-max-clause-count, `indices.query.bool.max_clause_count`>> |
| 86 | +setting. |
| 87 | + |
| 88 | +`top_terms_N`:: |
| 89 | +Calculates a relevance score for each matching document. |
| 90 | ++ |
| 91 | +This method changes the original query to a <<query-dsl-bool-query, `bool` |
| 92 | +query>>. This `bool` query contains a `should` clause and |
| 93 | +<<query-dsl-term-query, `term` query>> for each matching term. |
| 94 | ++ |
| 95 | +The final `bool` query |
| 96 | +only includes `term` queries for the top `N` scoring terms. |
| 97 | ++ |
| 98 | +You can use this method to avoid exceeding the clause limit in the |
| 99 | +<<indices-query-bool-max-clause-count, `indices.query.bool.max_clause_count`>> |
| 100 | +setting. |
| 101 | + |
| 102 | +[float] |
| 103 | +[[rewrite-param-perf-considerations]] |
| 104 | +=== Performance considerations for the `rewrite` parameter |
| 105 | +For most uses, we recommend using the `constant_score`, |
| 106 | +`constant_score_boolean`, or `top_terms_boost_N` rewrite methods. |
| 107 | + |
| 108 | +Other methods calculate relevance scores. These score calculations are often |
| 109 | +expensive and do not improve query results. |
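
To illustrate how the `rewrite` parameter documented in this change might be set on a request, here is a minimal Python sketch. It only builds a request body as a dictionary and does not contact a cluster. The helper names `is_valid_rewrite` and `wildcard_query`, the field `user.id`, and the pattern `ki*y` are hypothetical and chosen for illustration; treating `N` as a run of decimal digits is an assumption based on the `top_terms_N` naming above.

```python
import json
import re

# Fixed rewrite values listed under "Valid values" in the new text.
FIXED_REWRITE_VALUES = {
    "constant_score",
    "constant_score_boolean",
    "scoring_boolean",
}

# Values that carry an integer suffix N, for example "top_terms_10".
# Assumption: N is written as plain decimal digits.
TOP_TERMS_PATTERN = re.compile(r"top_terms_(?:blended_freqs_|boost_)?\d+")


def is_valid_rewrite(value: str) -> bool:
    """Return True if value looks like one of the documented rewrite methods."""
    if value in FIXED_REWRITE_VALUES:
        return True
    return TOP_TERMS_PATTERN.fullmatch(value) is not None


def wildcard_query(field: str, pattern: str, rewrite: str = "constant_score") -> dict:
    """Build a search body for a wildcard query with an explicit rewrite method.

    Hypothetical helper: it only assembles the JSON structure.
    """
    if not is_valid_rewrite(rewrite):
        raise ValueError(f"unrecognized rewrite value: {rewrite!r}")
    return {
        "query": {
            "wildcard": {
                field: {
                    "value": pattern,
                    "rewrite": rewrite,
                }
            }
        }
    }


if __name__ == "__main__":
    # A method that skips score calculation and caps the clause count at N.
    body = wildcard_query("user.id", "ki*y", rewrite="top_terms_boost_10")
    print(json.dumps(body, indent=2))
```

The sketch defaults to `constant_score`, in line with the performance guidance above, and rejects a literal `top_terms_N` because `N` must be replaced by a number.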