Commit 04b7449

Deprecate CommonTermsQuery and cutoff_frequency (#42619)
* Deprecate CommonTermsQuery and cutoff_frequency

Since the max_score optimization landed in Elasticsearch 7, the CommonTermsQuery is redundant and slower. Moreover the cutoff_frequency parameter for MatchQuery and MultiMatchQuery is redundant.

Relates to #27096
1 parent 0ee8fed commit 04b7449

18 files changed, +145 -51 lines changed

docs/reference/query-dsl/common-terms-query.asciidoc

Lines changed: 7 additions & 0 deletions
@@ -1,6 +1,8 @@
 [[query-dsl-common-terms-query]]
 === Common Terms Query
 
+deprecated[7.3.0,"Use <<query-dsl-match-query>> instead, which skips blocks of documents efficiently, without any configuration, provided that the total number of hits is not tracked."]
+
 The `common` terms query is a modern alternative to stopwords which
 improves the precision and recall of search results (by taking stopwords
 into account), without sacrificing performance.
@@ -83,6 +85,7 @@ GET /_search
 }
 --------------------------------------------------
 // CONSOLE
+// TEST[warning:Deprecated field [common] used, replaced by [[match] query which can efficiently skip blocks of documents if the total number of hits is not tracked]]
 
 The number of terms which should match can be controlled with the
 <<query-dsl-minimum-should-match,`minimum_should_match`>>
@@ -108,6 +111,7 @@ GET /_search
 }
 --------------------------------------------------
 // CONSOLE
+// TEST[warning:Deprecated field [common] used, replaced by [[match] query which can efficiently skip blocks of documents if the total number of hits is not tracked]]
 
 which is roughly equivalent to:
 
@@ -154,6 +158,7 @@ GET /_search
 }
 --------------------------------------------------
 // CONSOLE
+// TEST[warning:Deprecated field [common] used, replaced by [[match] query which can efficiently skip blocks of documents if the total number of hits is not tracked]]
 
 which is roughly equivalent to:
 
@@ -209,6 +214,7 @@ GET /_search
 }
 --------------------------------------------------
 // CONSOLE
+// TEST[warning:Deprecated field [common] used, replaced by [[match] query which can efficiently skip blocks of documents if the total number of hits is not tracked]]
 
 which is roughly equivalent to:
 
@@ -270,6 +276,7 @@ GET /_search
 }
 --------------------------------------------------
 // CONSOLE
+// TEST[warning:Deprecated field [common] used, replaced by [[match] query which can efficiently skip blocks of documents if the total number of hits is not tracked]]
 
 which is roughly equivalent to:
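
Note on the migration this deprecation asks for: the new `deprecated[...]` text points users from the `common` query to a plain `match` query with total hit tracking disabled. The following is a minimal sketch of the two request bodies side by side, assuming the search body option `track_total_hits` is how hit tracking is turned off. The class `CommonToMatchMigrationSketch`, its methods, the field name `body` and the query text are all hypothetical and not part of this commit.

```java
import java.util.Locale;

/** Hypothetical helper (not part of Elasticsearch): builds search request bodies as JSON strings. */
final class CommonToMatchMigrationSketch {

    /** Minimal JSON string escaping for backslashes and quotes; enough for this sketch only. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Request body using the deprecated `common` query with an explicit cutoff_frequency. */
    static String deprecatedCommonBody(String field, String text, double cutoffFrequency) {
        return String.format(Locale.ROOT,
            "{\"query\":{\"common\":{\"%s\":{\"query\":\"%s\",\"cutoff_frequency\":%s}}}}",
            escape(field), escape(text), cutoffFrequency);
    }

    /** Suggested replacement: a plain `match` query, with total hit tracking switched off. */
    static String matchBody(String field, String text) {
        return String.format(Locale.ROOT,
            "{\"track_total_hits\":false,\"query\":{\"match\":{\"%s\":{\"query\":\"%s\"}}}}",
            escape(field), escape(text));
    }

    public static void main(String[] args) {
        // Field name and query text are illustrative assumptions.
        System.out.println(deprecatedCommonBody("body", "this is bonsai cool", 0.001));
        System.out.println(matchBody("body", "this is bonsai cool"));
    }
}
```

The second body carries no frequency configuration at all, which is the point of the deprecation message: block skipping is expected to happen on its own once the total hit count is not tracked.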

docs/reference/query-dsl/match-query.asciidoc

Lines changed: 3 additions & 0 deletions
@@ -122,6 +122,8 @@ GET /_search
 [[query-dsl-match-query-cutoff]]
 ===== Cutoff frequency
 
+deprecated[7.3.0,"This option can be omitted as the <<query-dsl-match-query>> can skip block of documents efficiently, without any configuration, provided that the total number of hits is not tracked."]
+
 The match query supports a `cutoff_frequency` that allows
 specifying an absolute or relative document frequency where high
 frequency terms are moved into an optional subquery and are only scored
@@ -158,6 +160,7 @@ GET /_search
 }
 --------------------------------------------------
 // CONSOLE
+// TEST[warning:Deprecated field [cutoff_frequency] used, replaced by [you can omit this option, the [match] query can skip block of documents efficiently if the total number of hits is not tracked]]
 
 IMPORTANT: The `cutoff_frequency` option operates on a per-shard-level. This means
 that when trying it out on test indexes with low document numbers you
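
Note on what the deprecated option did: the documentation above describes `cutoff_frequency` as an absolute or relative document frequency that separates high frequency terms from low frequency ones, evaluated per shard. The sketch below illustrates that classification only. It is a simplified assumption, not the Lucene or Elasticsearch implementation, and the class `CutoffFrequencySketch`, its methods, and the sample terms and counts are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simplified illustration of a cutoff frequency check; not the real implementation. */
final class CutoffFrequencySketch {

    /**
     * Returns true when a term counts as "high frequency" on one shard.
     * Assumption: cutoffs >= 1 are absolute document counts, cutoffs below 1
     * are a fraction of maxDoc (the number of documents in the shard).
     */
    static boolean isHighFrequency(int docFreq, int maxDoc, float cutoff) {
        double threshold = cutoff >= 1f ? cutoff : Math.ceil(cutoff * (double) maxDoc);
        return docFreq > threshold;
    }

    /** Collects the terms that stay in the required, low frequency group. */
    static List<String> lowFrequencyTerms(Map<String, Integer> docFreqs, int maxDoc, float cutoff) {
        List<String> low = new ArrayList<>();
        for (Map.Entry<String, Integer> e : docFreqs.entrySet()) {
            if (!isHighFrequency(e.getValue(), maxDoc, cutoff)) {
                low.add(e.getKey());
            }
        }
        return low;
    }

    public static void main(String[] args) {
        // Hypothetical per-shard document frequencies.
        Map<String, Integer> docFreqs = new LinkedHashMap<>();
        docFreqs.put("the", 950_000);
        docFreqs.put("quick", 1_200);
        docFreqs.put("bonsai", 40);

        // Relative cutoff against a large shard, then an absolute cutoff.
        System.out.println(lowFrequencyTerms(docFreqs, 1_000_000, 0.001f));
        System.out.println(lowFrequencyTerms(docFreqs, 1_000_000, 5_000f));
    }
}
```

Because the threshold depends on `maxDoc`, the same relative cutoff classifies terms very differently on a shard holding a handful of documents than on a large one.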

modules/analysis-common/src/test/resources/rest-api-spec/test/search.query/50_queries_with_synonyms.yml

Lines changed: 27 additions & 0 deletions
@@ -1,5 +1,8 @@
 ---
 "Test common terms query with stacked tokens":
+  - skip:
+      features: "warnings"
+
   - do:
       indices.create:
         index: test
@@ -47,6 +50,8 @@
         refresh: true
 
   - do:
+      warnings:
+        - 'Deprecated field [common] used, replaced by [[match] query which can efficiently skip blocks of documents if the total number of hits is not tracked]'
       search:
         rest_total_hits_as_int: true
         body:
@@ -62,6 +67,8 @@
   - match: { hits.hits.2._id: "3" }
 
   - do:
+      warnings:
+        - 'Deprecated field [common] used, replaced by [[match] query which can efficiently skip blocks of documents if the total number of hits is not tracked]'
       search:
         rest_total_hits_as_int: true
         body:
@@ -76,6 +83,8 @@
   - match: { hits.hits.1._id: "2" }
 
   - do:
+      warnings:
+        - 'Deprecated field [common] used, replaced by [[match] query which can efficiently skip blocks of documents if the total number of hits is not tracked]'
       search:
         rest_total_hits_as_int: true
         body:
@@ -90,6 +99,8 @@
   - match: { hits.hits.2._id: "3" }
 
   - do:
+      warnings:
+        - 'Deprecated field [common] used, replaced by [[match] query which can efficiently skip blocks of documents if the total number of hits is not tracked]'
       search:
         rest_total_hits_as_int: true
         body:
@@ -103,6 +114,8 @@
   - match: { hits.hits.0._id: "2" }
 
   - do:
+      warnings:
+        - 'Deprecated field [common] used, replaced by [[match] query which can efficiently skip blocks of documents if the total number of hits is not tracked]'
       search:
         rest_total_hits_as_int: true
         body:
@@ -118,6 +131,8 @@
   - match: { hits.hits.1._id: "1" }
 
   - do:
+      warnings:
+        - 'Deprecated field [common] used, replaced by [[match] query which can efficiently skip blocks of documents if the total number of hits is not tracked]'
       search:
         rest_total_hits_as_int: true
         body:
@@ -132,6 +147,8 @@
   - match: { hits.hits.0._id: "2" }
 
   - do:
+      warnings:
+        - 'Deprecated field [common] used, replaced by [[match] query which can efficiently skip blocks of documents if the total number of hits is not tracked]'
       search:
         rest_total_hits_as_int: true
         body:
@@ -144,6 +161,8 @@
   - match: { hits.hits.0._id: "2" }
 
   - do:
+      warnings:
+        - 'Deprecated field [common] used, replaced by [[match] query which can efficiently skip blocks of documents if the total number of hits is not tracked]'
       search:
         rest_total_hits_as_int: true
         body:
@@ -158,6 +177,8 @@
   - match: { hits.hits.2._id: "3" }
 
   - do:
+      warnings:
+        - 'Deprecated field [cutoff_frequency] used, replaced by [you can omit this option, the [match] query can skip block of documents efficiently if the total number of hits is not tracked]'
       search:
         rest_total_hits_as_int: true
         body:
@@ -172,6 +193,8 @@
   - match: { hits.hits.1._id: "2" }
 
   - do:
+      warnings:
+        - 'Deprecated field [cutoff_frequency] used, replaced by [you can omit this option, the [match] query can skip block of documents efficiently if the total number of hits is not tracked]'
       search:
         rest_total_hits_as_int: true
         body:
@@ -187,6 +210,8 @@
   - match: { hits.hits.2._id: "3" }
 
   - do:
+      warnings:
+        - 'Deprecated field [cutoff_frequency] used, replaced by [you can omit this option, the [match] query can skip block of documents efficiently if the total number of hits is not tracked]'
       search:
         rest_total_hits_as_int: true
         body:
@@ -201,6 +226,8 @@
   - match: { hits.hits.1._id: "2" }
 
   - do:
+      warnings:
+        - 'Deprecated field [cutoff_frequency] used, replaced by [you can omit this option, the [multi_match] query can skip block of documents efficiently if the total number of hits is not tracked]'
       search:
         rest_total_hits_as_int: true
         body:

server/src/main/java/org/apache/lucene/queries/BlendedTermQuery.java

Lines changed: 5 additions & 0 deletions
@@ -278,6 +278,11 @@ public int hashCode() {
         return Objects.hash(classHash(), Arrays.hashCode(equalsTerms()));
     }
 
+    /**
+     * @deprecated Since max_score optimization landed in 7.0, normal MultiMatchQuery
+     * will achieve the same result without any configuration.
+     */
+    @Deprecated
     public static BlendedTermQuery commonTermsBlendedQuery(Term[] terms, final float[] boosts, final float maxTermFrequency) {
         return new BlendedTermQuery(terms, boosts) {
             @Override

server/src/main/java/org/apache/lucene/queries/ExtendedCommonTermsQuery.java

Lines changed: 4 additions & 0 deletions
@@ -26,7 +26,11 @@
  * Extended version of {@link CommonTermsQuery} that allows to pass in a
  * {@code minimumNumberShouldMatch} specification that uses the actual num of high frequent terms
  * to calculate the minimum matching terms.
+ *
+ * @deprecated Since max_optimization optimization landed in 7.0, normal MatchQuery
+ * will achieve the same result without any configuration.
  */
+@Deprecated
 public class ExtendedCommonTermsQuery extends CommonTermsQuery {
 
     public ExtendedCommonTermsQuery(Occur highFreqOccur, Occur lowFreqOccur, float maxTermFrequency) {

server/src/main/java/org/elasticsearch/index/query/CommonTermsQueryBuilder.java

Lines changed: 11 additions & 0 deletions
@@ -47,9 +47,16 @@
  * and high-frequency terms are added to an optional boolean clause. The
  * optional clause is only executed if the required "low-frequency' clause
  * matches.
+ *
+ * @deprecated Since max_optimization optimization landed in 7.0, normal MatchQuery
+ * will achieve the same result without any configuration.
  */
+@Deprecated
 public class CommonTermsQueryBuilder extends AbstractQueryBuilder<CommonTermsQueryBuilder> {
 
+    public static final String COMMON_TERMS_QUERY_DEPRECATION_MSG = "[match] query which can efficiently " +
+        "skip blocks of documents if the total number of hits is not tracked";
+
     public static final String NAME = "common";
 
     public static final float DEFAULT_CUTOFF_FREQ = 0.01f;
@@ -85,7 +92,9 @@ public class CommonTermsQueryBuilder extends AbstractQueryBuilder<CommonTermsQue
 
     /**
      * Constructs a new common terms query.
+     * @deprecated See {@link CommonTermsQueryBuilder} for more details.
      */
+    @Deprecated
     public CommonTermsQueryBuilder(String fieldName, Object text) {
         if (Strings.isEmpty(fieldName)) {
             throw new IllegalArgumentException("field name is null or empty");
@@ -99,7 +108,9 @@ public CommonTermsQueryBuilder(String fieldName, Object text) {
 
     /**
      * Read from a stream.
+     * @deprecated See {@link CommonTermsQueryBuilder} for more details.
      */
+    @Deprecated
     public CommonTermsQueryBuilder(StreamInput in) throws IOException {
         super(in);
         fieldName = in.readString();

server/src/main/java/org/elasticsearch/index/query/MatchQueryBuilder.java

Lines changed: 14 additions & 1 deletion
@@ -42,8 +42,18 @@
  * result of the analysis.
  */
 public class MatchQueryBuilder extends AbstractQueryBuilder<MatchQueryBuilder> {
+
+    private static final String CUTOFF_FREQUENCY_DEPRECATION_MSG = "you can omit this option, " +
+        "the [match] query can skip block of documents efficiently if the total number of hits is not tracked";
+
     public static final ParseField ZERO_TERMS_QUERY_FIELD = new ParseField("zero_terms_query");
-    public static final ParseField CUTOFF_FREQUENCY_FIELD = new ParseField("cutoff_frequency");
+    /**
+     * @deprecated Since max_optimization optimization landed in 7.0, normal MatchQuery
+     * will achieve the same result without any configuration.
+     */
+    @Deprecated
+    public static final ParseField CUTOFF_FREQUENCY_FIELD =
+        new ParseField("cutoff_frequency").withAllDeprecated(CUTOFF_FREQUENCY_DEPRECATION_MSG);
     public static final ParseField LENIENT_FIELD = new ParseField("lenient");
     public static final ParseField FUZZY_TRANSPOSITIONS_FIELD = new ParseField("fuzzy_transpositions");
     public static final ParseField FUZZY_REWRITE_FIELD = new ParseField("fuzzy_rewrite");
@@ -235,7 +245,10 @@ public int maxExpansions() {
      * Set a cutoff value in [0..1] (or absolute number &gt;=1) representing the
      * maximum threshold of a terms document frequency to be considered a low
      * frequency term.
+     *
+     * @deprecated see {@link MatchQueryBuilder#CUTOFF_FREQUENCY_FIELD} for more details
      */
+    @Deprecated
     public MatchQueryBuilder cutoffFrequency(float cutoff) {
         this.cutoffFrequency = cutoff;
         return this;
return this;

server/src/main/java/org/elasticsearch/index/query/MultiMatchQueryBuilder.java

Lines changed: 14 additions & 1 deletion
@@ -50,6 +50,10 @@
  * Same as {@link MatchQueryBuilder} but supports multiple fields.
  */
 public class MultiMatchQueryBuilder extends AbstractQueryBuilder<MultiMatchQueryBuilder> {
+
+    private static final String CUTOFF_FREQUENCY_DEPRECATION_MSG = "you can omit this option, " +
+        "the [multi_match] query can skip block of documents efficiently if the total number of hits is not tracked";
+
     public static final String NAME = "multi_match";
 
     public static final MultiMatchQueryBuilder.Type DEFAULT_TYPE = MultiMatchQueryBuilder.Type.BEST_FIELDS;
@@ -63,7 +67,8 @@ public class MultiMatchQueryBuilder extends AbstractQueryBuilder<MultiMatchQuery
     private static final ParseField SLOP_FIELD = new ParseField("slop");
     private static final ParseField ZERO_TERMS_QUERY_FIELD = new ParseField("zero_terms_query");
     private static final ParseField LENIENT_FIELD = new ParseField("lenient");
-    private static final ParseField CUTOFF_FREQUENCY_FIELD = new ParseField("cutoff_frequency");
+    private static final ParseField CUTOFF_FREQUENCY_FIELD =
+        new ParseField("cutoff_frequency").withAllDeprecated(CUTOFF_FREQUENCY_DEPRECATION_MSG);
     private static final ParseField TIE_BREAKER_FIELD = new ParseField("tie_breaker");
     private static final ParseField FUZZY_REWRITE_FIELD = new ParseField("fuzzy_rewrite");
     private static final ParseField MINIMUM_SHOULD_MATCH_FIELD = new ParseField("minimum_should_match");
@@ -484,7 +489,11 @@ public boolean lenient() {
      * Set a cutoff value in [0..1] (or absolute number &gt;=1) representing the
      * maximum threshold of a terms document frequency to be considered a low
      * frequency term.
+     *
+     * @deprecated Since max_score optimization landed in 7.0, normal MultiMatchQuery
+     * will achieve the same result without any configuration.
      */
+    @Deprecated
     public MultiMatchQueryBuilder cutoffFrequency(float cutoff) {
         this.cutoffFrequency = cutoff;
         return this;
@@ -494,7 +503,11 @@ public MultiMatchQueryBuilder cutoffFrequency(float cutoff) {
      * Set a cutoff value in [0..1] (or absolute number &gt;=1) representing the
      * maximum threshold of a terms document frequency to be considered a low
      * frequency term.
+     *
+     * @deprecated Since max_score optimization landed in 7.0, normal MultiMatchQuery
+     * will achieve the same result without any configuration.
      */
+    @Deprecated
     public MultiMatchQueryBuilder cutoffFrequency(Float cutoff) {
         this.cutoffFrequency = cutoff;
         return this;

server/src/main/java/org/elasticsearch/index/query/QueryBuilders.java

Lines changed: 3 additions & 0 deletions
@@ -66,7 +66,10 @@ public static MatchQueryBuilder matchQuery(String name, Object text) {
      *
      * @param fieldName The field name.
      * @param text The query text (to be analyzed).
+     *
+     * @deprecated See {@link CommonTermsQueryBuilder}
      */
+    @Deprecated
     public static CommonTermsQueryBuilder commonTermsQuery(String fieldName, Object text) {
         return new CommonTermsQueryBuilder(fieldName, text);
     }

server/src/main/java/org/elasticsearch/index/search/MatchQuery.java

Lines changed: 4 additions & 0 deletions
@@ -194,6 +194,10 @@ public void setOccur(BooleanClause.Occur occur) {
         this.occur = occur;
     }
 
+    /**
+     * @deprecated See {@link MatchQueryBuilder#setCommonTermsCutoff(Float)} for more details
+     */
+    @Deprecated
     public void setCommonTermsCutoff(Float cutoff) {
         this.commonTermsCutoff = cutoff;
     }
