1- === Search Phase
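2- During the initial _query phase_, ((("distributed search execution", "query phase"))) ((("query phase of distributed search"))) the search is broadcast to a copy of every shard in the index, whether primary or replica. Each shard executes the query locally and ((("priority queue"))) builds a _priority queue_ of matching documents.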
1+ === Query Phase
32
4- .Priority Queue
3+ During the initial _query phase_, the((("distributed search execution", "query phase")))((("query phase of distributed search"))) query is broadcast to a shard copy (a
4+ primary or replica shard) of every shard in the index. Each shard executes
5+ the search locally and ((("priority queue"))) builds a _priority queue_ of matching documents.
6+
7+ .Priority Queue
58****
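6- A _priority queue_ is just a list filtered down to the _top-n_ matching documents. The size of the priority queue depends on the pagination parameters `from` and `size`. For example, the following search request would require a priority queue big enough to hold 100 documents: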
9+
10+ A _priority queue_ is just a sorted list that holds the _top-n_ matching
11+ documents. The size of the priority queue depends on the pagination
12+ parameters `from` and `size`. For example, the following search request
13+ would require a priority queue big enough to hold 100 documents:
714
815[source,js]
916--------------------------------------------------
@@ -15,30 +22,52 @@ GET /_search
1522--------------------------------------------------
1623****
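
As a rough illustration of the sidebar above, the following Python sketch keeps only the best `from + size` entries out of a larger set of scored matches. It is not Elasticsearch's implementation; the function name `top_n`, the sample scores, and the doc IDs are hypothetical, chosen only to show how the queue size follows from the pagination parameters.

```python
import heapq


def top_n(scored_docs, from_, size):
    """Return the best `from_ + size` (score, doc_id) pairs, highest score first.

    Hypothetical helper: a stand-in for a bounded priority queue.
    """
    queue_size = from_ + size
    # heapq.nlargest returns at most queue_size items, sorted in descending order.
    return heapq.nlargest(queue_size, scored_docs)


# Hypothetical matches as (score, doc_id) pairs.
matches = [(0.3, "a"), (1.2, "b"), (0.9, "c"), (2.5, "d"), (0.1, "e")]

# from=1 and size=2 require a queue that can hold 3 documents.
queue = top_n(matches, from_=1, size=2)
print(queue)
```

The queue has to hold `from + size` entries rather than just `size`, because the skipped documents still have to be ranked before the requested page can be identified.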
1724
18- The query phase process is depicted in <<img-distrib-search>>.
25+ The query phase process is depicted in <<img-distrib-search>>.
1926
2027[[img-distrib-search]]
21- .Query phase of distributed s
22- .Query phase of distributed search
23- image::images/elas_0901.png["Query phase of distributed search"]
28+ .Query phase of distributed search
29+ image::images/elas_0901.png["Query phase of distributed search"]
2430
25- The query phase consists of the following steps:
31+ The query phase consists of the following three steps:
2632
27- 1. The client sends a `search` request to `Node 3`, which creates an empty priority queue of size `from + size`.
33+ 1. The client sends a `search` request to `Node 3`, which creates an empty
34+ priority queue of size `from + size`.
2835
29- 2. `Node 3` forwards the search request to a primary or replica copy of every shard in the index. Each shard executes the query locally and adds the results to a local priority queue of size `from + size`.
36+ 2. `Node 3` forwards the search request to a primary or replica copy of every
37+ shard in the index. Each shard executes the query locally and adds the
38+ results into a local sorted priority queue of size `from + size`.
3039
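31- 3. Each shard returns the doc IDs and sort values of all the docs in its priority queue to the coordinating node, `Node 3`, which merges these values into its own priority queue to produce a globally sorted list.
40+ 3. Each shard returns the doc IDs and sort values of all the docs in its
41+ priority queue to the coordinating node, `Node 3`, which merges these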
42+ values into its own priority queue to produce a globally sorted list of
43+ results.
3244
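33- When a search request reaches a node, that node becomes the coordinating node. ((("nodes", "coordinating node for search requests"))) It is the job of this node to broadcast the search request to all involved shards and to gather their responses into a globally sorted result set that it can return to the client.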
45+ When a search request is sent to a node, that node becomes the coordinating
46+ node.((("nodes", "coordinating node for search requests"))) It is the job of this node to broadcast the search request to all
47+ involved shards, and to gather their responses into a globally sorted result
48+ set that it can return to the client.
3449
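35- The first step is to broadcast the request to a copy of every shard in the index. Just like <<distrib-read,document `GET` requests>>, search requests can be handled by a primary shard or by any of its replicas. ((("shards", "handling search requests"))) This is how more replicas (when combined with more hardware) can increase search throughput. A coordinating node will round-robin through all shard copies on subsequent requests in order to spread the load.
50+ The first step is to broadcast the request to a shard copy of every shard in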
51+ the index. Just like <<distrib-read,document `GET` requests>>, search requests
52+ can be handled by a primary shard or by any of its replicas.((("shards", "handling search requests"))) This is how more
53+ replicas (when combined with more hardware) can increase search throughput.
54+ A coordinating node will round-robin through all shard copies on subsequent
55+ requests in order to spread the load.
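
The round-robin behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Elasticsearch selects shard copies internally; the class `ShardCopySelector` and the copy labels (`P0`, `R0`, and so on) are hypothetical.

```python
import itertools


class ShardCopySelector:
    """Hypothetical helper that cycles through the copies of each shard."""

    def __init__(self, copies_by_shard):
        # One independent, endless cycle per shard number.
        self._cycles = {
            shard: itertools.cycle(copies)
            for shard, copies in copies_by_shard.items()
        }

    def next_copies(self):
        """Pick one copy per shard for the next search request."""
        return {shard: next(cycle) for shard, cycle in self._cycles.items()}


# Two shards, each with a primary (P) and one replica (R).
selector = ShardCopySelector({0: ["P0", "R0"], 1: ["P1", "R1"]})
first_request = selector.next_copies()
second_request = selector.next_copies()
third_request = selector.next_copies()
print(first_request, second_request, third_request)
```

Each request touches exactly one copy of every shard, and consecutive requests alternate between copies, which is why adding replicas on additional hardware spreads the search load.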
3656
37- Each shard executes the query locally and builds a priority queue of length `from + size`; in other words, enough results to satisfy the global search request all by itself. It returns a lightweight list of results to the coordinating node, containing just the doc IDs and any values required for sorting, such as the `_score`.
57+ Each shard executes the query locally and builds a sorted priority queue of
58+ length `from + size`; in other words, enough results to satisfy the global
59+ search request all by itself. It returns a lightweight list of results to the
60+ coordinating node; the list contains just the doc IDs and any values required
61+ for sorting, such as the `_score`.
3862
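39- The coordinating node merges these shard-level results into its own sorted priority queue, which represents the globally sorted result set. Here the query phase ends.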
63+ The coordinating node merges these shard-level results into its own sorted
64+ priority queue, which represents the globally sorted result set. Here the query
65+ phase ends.
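
To make the merge step concrete, here is a minimal Python sketch of the query phase as described above: each shard produces its local top `from + size` entries, and the coordinating node merges them into one globally sorted list of the same length. The functions `shard_query` and `coordinate`, along with the scores and doc IDs, are hypothetical and only model the data flow; they are not Elasticsearch code.

```python
import heapq
import itertools


def shard_query(local_matches, from_, size):
    """Local step: return this shard's top `from_ + size` (score, doc_id) pairs."""
    return heapq.nlargest(from_ + size, local_matches)


def coordinate(shard_results, from_, size):
    """Coordinating step: merge the per-shard lists into one global top list."""
    # Every per-shard list is already sorted in descending order,
    # so a lazy descending merge is enough.
    merged = heapq.merge(*shard_results, reverse=True)
    return list(itertools.islice(merged, from_ + size))


# Hypothetical per-shard matches: only sort values and doc IDs travel to the coordinator.
shard_0 = [(0.9, "doc1"), (0.4, "doc2"), (0.2, "doc3")]
shard_1 = [(1.5, "doc4"), (0.7, "doc5")]

results = [shard_query(s, from_=1, size=2) for s in (shard_0, shard_1)]
global_top = coordinate(results, from_=1, size=2)
print(global_top)
```

Because any single shard might hold all of the best matches, each one has to return `from + size` entries; the coordinating node cannot know in advance which shard the requested page will come from.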
4066
4167[NOTE]
4268====
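43- An index can consist of one or more primary shards, ((("indices", "multi-index search"))) so a search request against a single index needs to consult multiple shards. A search against _multiple_ or _all_ indices works in the same way, except that more shards are involved.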
69+ An index can consist of one or more primary shards,((("indices", "multi-index search"))) so a search request
70+ against a single index needs to be able to combine the results from multiple
71+ shards. A search against _multiple_ or _all_ indices works in exactly the same
72+ way--there are just more shards involved.
4473====