Commit 79d9b36
authored
Optimize sort on long field (#48804)
* Optimize sort on numeric long and date fields (#39770)
Optimize sort on numeric long and date fields, when
the system property `es.search.long_sort_optimized` is true.
* Skip optimization if the index has duplicate data (#43121)
Skip sort optimization if the index has 50% or more data
with the same value.
When index has a lot of docs with the same value, sort
optimization doesn't make sense, as DistanceFeatureQuery
will produce same scores for these docs, and Lucene
will use the second sort to tie-break. This could be slower
than usual sorting.
* Sort leaves on search according to the primary numeric sort field (#44021)
This change pre-sort the index reader leaves (segment) prior to search
when the primary sort is a numeric field eligible to the distance feature
optimization. It also adds a tie breaker on `_doc` to the rewritten sort
in order to bypass the fact that leaves will be collected in a random order.
I ran this patch on the http_logs benchmark and the results are very promising:
```
| 50th percentile latency | desc_sort_timestamp | 220.706 | 136544 | 136324 | ms |
| 90th percentile latency | desc_sort_timestamp | 244.847 | 162084 | 161839 | ms |
| 99th percentile latency | desc_sort_timestamp | 316.627 | 172005 | 171688 | ms |
| 100th percentile latency | desc_sort_timestamp | 335.306 | 173325 | 172989 | ms |
| 50th percentile service time | desc_sort_timestamp | 218.369 | 1968.11 | 1749.74 | ms |
| 90th percentile service time | desc_sort_timestamp | 244.182 | 2447.2 | 2203.02 | ms |
| 99th percentile service time | desc_sort_timestamp | 313.176 | 2950.85 | 2637.67 | ms |
| 100th percentile service time | desc_sort_timestamp | 332.924 | 2959.38 | 2626.45 | ms |
| error rate | desc_sort_timestamp | 0 | 0 | 0 | % |
| Min Throughput | asc_sort_timestamp | 0.801824 | 0.800855 | -0.00097 | ops/s |
| Median Throughput | asc_sort_timestamp | 0.802595 | 0.801104 | -0.00149 | ops/s |
| Max Throughput | asc_sort_timestamp | 0.803282 | 0.801351 | -0.00193 | ops/s |
| 50th percentile latency | asc_sort_timestamp | 220.761 | 824.098 | 603.336 | ms |
| 90th percentile latency | asc_sort_timestamp | 251.741 | 853.984 | 602.243 | ms |
| 99th percentile latency | asc_sort_timestamp | 368.761 | 893.943 | 525.182 | ms |
| 100th percentile latency | asc_sort_timestamp | 431.042 | 908.85 | 477.808 | ms |
| 50th percentile service time | asc_sort_timestamp | 218.547 | 820.757 | 602.211 | ms |
| 90th percentile service time | asc_sort_timestamp | 249.578 | 849.886 | 600.308 | ms |
| 99th percentile service time | asc_sort_timestamp | 366.317 | 888.894 | 522.577 | ms |
| 100th percentile service time | asc_sort_timestamp | 430.952 | 908.401 | 477.45 | ms |
| error rate | asc_sort_timestamp | 0 | 0 | 0 | % |
```
So roughly 10x faster for the descending sort and 2-3x faster in the ascending case. Note
that I indexed the http_logs with a single client in order to simulate real time-based indices
where document are indexed in their timestamp order.
Relates #37043
* Remove nested collector in docs response
As we don't use cancellableCollector anymore, it should be removed from
the expected docs response.
* Use collector manager for search when necessary (#45829)
When we optimize sort, we sort segments by their min/max value.
As a collector expects to have segments in order,
we can not use a single collector for sorted segments.
Thus for such a case, we use collectorManager,
where for every segment a dedicated collector will be created.
* Use shared TopFieldCollector manager
Use shared TopFieldCollector manager for sort optimization.
This collector manager is able to exchange minimum competitive
score between collectors
* Correct calculation of avg value to avoid overflow
* Optimize calculating if index has duplicate data1 parent 2ba00c8 commit 79d9b36
File tree
12 files changed
+767
-297
lines changed- buildSrc/src/main/groovy/org/elasticsearch/gradle
- docs/reference/search
- server/src
- main/java/org/elasticsearch/search
- internal
- profile/query
- query
- test/java/org/elasticsearch/search
- query
- sort
- test/framework/src/main/java/org/elasticsearch/test
12 files changed
+767
-297
lines changedLines changed: 3 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
720 | 720 | | |
721 | 721 | | |
722 | 722 | | |
| 723 | + | |
| 724 | + | |
| 725 | + | |
723 | 726 | | |
724 | 727 | | |
725 | 728 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
153 | 153 | | |
154 | 154 | | |
155 | 155 | | |
156 | | - | |
157 | | - | |
158 | | - | |
159 | | - | |
160 | | - | |
161 | | - | |
162 | | - | |
163 | | - | |
164 | | - | |
165 | | - | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
166 | 159 | | |
167 | 160 | | |
168 | 161 | | |
| |||
445 | 438 | | |
446 | 439 | | |
447 | 440 | | |
448 | | - | |
449 | | - | |
450 | | - | |
451 | | - | |
452 | | - | |
453 | | - | |
454 | | - | |
455 | | - | |
456 | | - | |
457 | | - | |
| 441 | + | |
| 442 | + | |
| 443 | + | |
458 | 444 | | |
459 | 445 | | |
460 | 446 | | |
| |||
657 | 643 | | |
658 | 644 | | |
659 | 645 | | |
660 | | - | |
661 | | - | |
662 | | - | |
| 646 | + | |
| 647 | + | |
| 648 | + | |
663 | 649 | | |
664 | 650 | | |
665 | | - | |
666 | | - | |
667 | | - | |
| 651 | + | |
| 652 | + | |
| 653 | + | |
668 | 654 | | |
669 | 655 | | |
670 | | - | |
671 | | - | |
672 | | - | |
673 | | - | |
674 | | - | |
675 | | - | |
676 | | - | |
677 | | - | |
678 | | - | |
679 | | - | |
680 | | - | |
681 | | - | |
682 | | - | |
683 | | - | |
684 | | - | |
| 656 | + | |
| 657 | + | |
| 658 | + | |
685 | 659 | | |
686 | 660 | | |
| 661 | + | |
| 662 | + | |
| 663 | + | |
| 664 | + | |
| 665 | + | |
687 | 666 | | |
688 | 667 | | |
689 | 668 | | |
| |||
Lines changed: 86 additions & 42 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
27 | 27 | | |
28 | 28 | | |
29 | 29 | | |
| 30 | + | |
30 | 31 | | |
31 | 32 | | |
32 | 33 | | |
| |||
35 | 36 | | |
36 | 37 | | |
37 | 38 | | |
| 39 | + | |
38 | 40 | | |
39 | 41 | | |
40 | 42 | | |
| 43 | + | |
| 44 | + | |
41 | 45 | | |
42 | 46 | | |
43 | 47 | | |
44 | 48 | | |
45 | 49 | | |
46 | 50 | | |
47 | 51 | | |
| 52 | + | |
| 53 | + | |
48 | 54 | | |
49 | 55 | | |
50 | 56 | | |
51 | 57 | | |
52 | 58 | | |
53 | 59 | | |
| 60 | + | |
54 | 61 | | |
55 | 62 | | |
| 63 | + | |
56 | 64 | | |
57 | 65 | | |
58 | 66 | | |
| |||
131 | 139 | | |
132 | 140 | | |
133 | 141 | | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
134 | 168 | | |
135 | 169 | | |
136 | | - | |
137 | | - | |
138 | | - | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
| 189 | + | |
| 190 | + | |
| 191 | + | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
| 198 | + | |
| 199 | + | |
| 200 | + | |
| 201 | + | |
| 202 | + | |
| 203 | + | |
| 204 | + | |
| 205 | + | |
| 206 | + | |
| 207 | + | |
| 208 | + | |
| 209 | + | |
| 210 | + | |
| 211 | + | |
| 212 | + | |
| 213 | + | |
| 214 | + | |
| 215 | + | |
| 216 | + | |
| 217 | + | |
139 | 218 | | |
| 219 | + | |
| 220 | + | |
| 221 | + | |
140 | 222 | | |
141 | 223 | | |
142 | 224 | | |
| |||
168 | 250 | | |
169 | 251 | | |
170 | 252 | | |
171 | | - | |
| 253 | + | |
172 | 254 | | |
173 | | - | |
174 | 255 | | |
175 | 256 | | |
176 | | - | |
177 | | - | |
178 | | - | |
179 | | - | |
180 | | - | |
181 | | - | |
182 | | - | |
183 | | - | |
184 | | - | |
185 | | - | |
186 | | - | |
187 | | - | |
188 | | - | |
189 | | - | |
190 | | - | |
191 | | - | |
192 | | - | |
193 | | - | |
194 | | - | |
195 | | - | |
196 | | - | |
197 | | - | |
198 | | - | |
199 | | - | |
200 | | - | |
201 | | - | |
202 | | - | |
203 | | - | |
204 | | - | |
205 | | - | |
206 | | - | |
207 | | - | |
208 | | - | |
209 | | - | |
210 | | - | |
211 | | - | |
212 | | - | |
213 | 257 | | |
214 | 258 | | |
215 | 259 | | |
| |||
Lines changed: 0 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
49 | 49 | | |
50 | 50 | | |
51 | 51 | | |
52 | | - | |
53 | | - | |
54 | 52 | | |
55 | 53 | | |
56 | 54 | | |
| |||
Lines changed: 0 additions & 53 deletions
This file was deleted.
Lines changed: 0 additions & 15 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
28 | 28 | | |
29 | 29 | | |
30 | 30 | | |
31 | | - | |
32 | 31 | | |
33 | 32 | | |
34 | 33 | | |
35 | 34 | | |
36 | 35 | | |
37 | 36 | | |
38 | | - | |
39 | 37 | | |
40 | | - | |
41 | 38 | | |
42 | 39 | | |
43 | 40 | | |
| |||
150 | 147 | | |
151 | 148 | | |
152 | 149 | | |
153 | | - | |
154 | | - | |
155 | | - | |
156 | | - | |
157 | | - | |
158 | | - | |
159 | | - | |
160 | | - | |
161 | | - | |
162 | | - | |
163 | | - | |
164 | | - | |
165 | 150 | | |
166 | 151 | | |
167 | 152 | | |
| |||
0 commit comments