@@ -136,7 +136,7 @@ POST _reindex
136136// TEST[setup:twitter]
137137
138138You can limit the documents by adding a type to the `source` or by adding a
139- query. This will only copy ++tweet++'s made by `kimchy` into `new_twitter`:
139+ query. This will only copy tweets made by `kimchy` into `new_twitter`:
140140
141141[source,js]
142142--------------------------------------------------
@@ -161,11 +161,13 @@ POST _reindex
161161
162162`index` and `type` in `source` can both be lists, allowing you to copy from
163163lots of sources in one request. This will copy documents from the `_doc` and
164- `post` types in the `twitter` and `blog` index. It'd include the `post` type in
165- the `twitter` index and the `_doc` type in the `blog` index. If you want to be
166- more specific you'll need to use the `query`. It also makes no effort to handle
167- ID collisions. The target index will remain valid but it's not easy to predict
168- which document will survive because the iteration order isn't well defined.
164+ `post` types in the `twitter` and `blog` indices. The copied documents would include the
165+ `post` type in the `twitter` index and the `_doc` type in the `blog` index. If you want to be
166+ more specific, use the `query` parameter.
167+
168+ The Reindex API makes no effort to handle ID collisions. If there are ID collisions, the target index
169+ will remain valid, but it's not easy to predict which document will survive because
170+ the iteration order isn't well defined.
169171
170172[source,js]
171173--------------------------------------------------
@@ -203,8 +205,8 @@ POST _reindex
203205// CONSOLE
204206// TEST[setup:twitter]
205207
206- If you want a particular set of documents from the twitter index you'll
207- need to sort. Sorting makes the scroll less efficient but in some contexts
208+ If you want a particular set of documents from the `twitter` index you'll
209+ need to use `sort`. Sorting makes the scroll less efficient but in some contexts
208210it's worth it. If possible, prefer a more selective query to `size` and `sort`.
209211This will copy 10000 documents from `twitter` into `new_twitter`:
210212
@@ -226,8 +228,8 @@ POST _reindex
226228// TEST[setup:twitter]
227229
228230The `source` section supports all the elements that are supported in a
229- <<search-request-body,search request>>. For instance only a subset of the
230- fields from the original documents can be reindexed using source filtering
231+ <<search-request-body,search request>>. For instance, only a subset of the
232+ fields from the original documents can be reindexed using `_source` filtering
231233as follows:
232234
233235[source,js]
@@ -286,10 +288,10 @@ Set `ctx.op = "delete"` if your script decides that the document must be
286288 deleted from the destination index. The deletion will be reported in the
287289 `deleted` counter in the <<docs-reindex-response-body, response body>>.
288290
289- Setting `ctx.op` to anything else is an error. Setting any
290- other field in `ctx` is an error .
291+ Setting `ctx.op` to anything else is an error, as is setting any
292+ other field in `ctx`.
291293
292- Think of the possibilities! Just be careful! With great power.... You can
294+ Think of the possibilities! Just be careful; you can
293295change:
294296
295297 * `_id`
@@ -299,7 +301,7 @@ change:
299301 * `_routing`
300302
301303Setting `_version` to `null` or clearing it from the `ctx` map is just like not
302- sending the version in an indexing request. It will cause that document to be
304+ sending the version in an indexing request; it will cause the document to be
303305overwritten in the target index regardless of the version on the target or the
304306version type you use in the `_reindex` request.
305307
@@ -310,11 +312,11 @@ preserved unless it's changed by the script. You can set `routing` on the
310312`keep`::
311313
312314Sets the routing on the bulk request sent for each match to the routing on
313- the match. The default.
315+ the match. This is the default value.
314316
315317`discard`::
316318
317- Sets the routing on the bulk request sent for each match to null.
319+ Sets the routing on the bulk request sent for each match to `null`.
318320
319321`=<some text>`::
320322
@@ -422,7 +424,7 @@ POST _reindex
422424
423425The `host` parameter must contain a scheme, host, and port (e.g.
424426`https://otherhost:9200`). The `username` and `password` parameters are
425- optional and when they are present reindex will connect to the remote
427+ optional, and when they are present `_reindex` will connect to the remote
426428Elasticsearch node using basic auth. Be sure to use `https` when using
427429basic auth or the password will be sent in plain text.
428430
@@ -446,7 +448,7 @@ NOTE: Reindexing from remote clusters does not support
446448
447449Reindexing from a remote server uses an on-heap buffer that defaults to a
448450maximum size of 100mb. If the remote index includes very large documents you'll
449- need to use a smaller batch size. The example below sets the batch size `10`
451+ need to use a smaller batch size. The example below sets the batch size to `10`
450452which is very, very small.
451453
452454[source,js]
@@ -477,8 +479,8 @@ POST _reindex
477479
478480It is also possible to set the socket read timeout on the remote connection
479481with the `socket_timeout` field and the connection timeout with the
480- `connect_timeout` field. Both default to thirty seconds. This example
481- sets the socket read timeout to one minute and the connection timeout to ten
482+ `connect_timeout` field. Both default to 30 seconds. This example
483+ sets the socket read timeout to one minute and the connection timeout to 10
482484seconds:
483485
484486[source,js]
@@ -533,14 +535,14 @@ for details. `timeout` controls how long each write request waits for unavailabl
533535shards to become available. Both work exactly how they work in the
534536<<docs-bulk,Bulk API>>. As `_reindex` uses scroll search, you can also specify
535537the `scroll` parameter to control how long it keeps the "search context" alive,
536- eg `?scroll=10m`, by default it's 5 minutes.
538+ e.g. `?scroll=10m`. The default value is 5 minutes.
537539
538540`requests_per_second` can be set to any positive decimal number (`1.4`, `6`,
539- `1000`, etc) and throttles rate at which reindex issues batches of index
541+ `1000`, etc.) and throttles the rate at which `_reindex` issues batches of index
540542operations by padding each batch with a wait time. The throttling can be
541543disabled by setting `requests_per_second` to `-1`.
542544
543- The throttling is done by waiting between batches so that scroll that reindex
545+ The throttling is done by waiting between batches so that the `scroll` which `_reindex`
544546uses internally can be given a timeout that takes into account the padding.
545547The padding time is the difference between the batch size divided by the
546548`requests_per_second` and the time spent writing. By default the batch size is
@@ -552,9 +554,9 @@ target_time = 1000 / 500 per second = 2 seconds
552554wait_time = target_time - write_time = 2 seconds - .5 seconds = 1.5 seconds
553555--------------------------------------------------
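The padding arithmetic above can be checked with a short sketch. The values are taken straight from the example: a batch size of 1000, `requests_per_second` of 500, and the example's assumed half-second write time:

```python
# Reproduce the example's throttling arithmetic:
# wait_time = batch_size / requests_per_second - write_time
batch_size = 1000            # scroll batch size from the example
requests_per_second = 500    # throttle setting from the example
write_time = 0.5             # assumed seconds spent writing the batch

target_time = batch_size / requests_per_second  # 2.0 seconds
wait_time = target_time - write_time            # 1.5 seconds
print(target_time, wait_time)
```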
554556
555- Since the batch is issued as a single `_bulk` request large batch sizes will
557+ Since the batch is issued as a single `_bulk` request, large batch sizes will
556558cause Elasticsearch to create many requests and then wait for a while before
557- starting the next set. This is "bursty" instead of "smooth". The default is `-1`.
559+ starting the next set. This is "bursty" instead of "smooth". The default value is `-1`.
558560
559561[float]
560562[[docs-reindex-response-body]]
@@ -606,12 +608,12 @@ The JSON response looks like this:
606608
607609`took`::
608610
609- The number of milliseconds from start to end of the whole operation.
611+ The total number of milliseconds the entire operation took.
610612
611613`timed_out`::
612614
613615This flag is set to `true` if any of the requests executed during the
614- reindex has timed out.
616+ reindex timed out.
615617
616618`total`::
617619
@@ -657,7 +659,7 @@ The number of requests per second effectively executed during the reindex.
657659
658660`throttled_until_millis`::
659661
660- This field should always be equal to zero in a delete by query response. It only
662+ This field should always be equal to zero in a `_reindex` response. It only
661663has meaning when using the <<docs-reindex-task-api, Task API>>, where it
662664indicates the next time (in milliseconds since epoch) a throttled request will be
663665executed again in order to conform to `requests_per_second`.
@@ -681,7 +683,7 @@ GET _tasks?detailed=true&actions=*reindex
681683--------------------------------------------------
682684// CONSOLE
683685
684- The responses looks like:
686+ The response looks like:
685687
686688[source,js]
687689--------------------------------------------------
@@ -726,9 +728,9 @@ The responses looks like:
726728// NOTCONSOLE
727729// We can't test tasks output
728730
729- <1> this object contains the actual status. It is just like the response json
730- with the important addition of the `total` field. `total` is the total number
731- of operations that the reindex expects to perform. You can estimate the
731+ <1> This object contains the actual status. It is identical to the response JSON
732+ except for the important addition of the `total` field. `total` is the total number
733+ of operations that `_reindex` expects to perform. You can estimate the
732734progress by adding the `updated`, `created`, and `deleted` fields. The request
733735will finish when their sum is equal to the `total` field.
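As a sketch, the progress estimate described above could be computed from the status object like so (the field names match the Task API response; the counts here are hypothetical, not from a real response):

```python
# Hypothetical snapshot of the "status" object returned by the Task API.
status = {"total": 6154, "updated": 3500, "created": 600, "deleted": 0}

# Progress is the sum of completed operations over the expected total;
# the request finishes when this sum reaches `total`.
done = status["updated"] + status["created"] + status["deleted"]
progress = done / status["total"]
print(f"{progress:.1%}")  # prints "66.6%"
```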
734736
@@ -743,7 +745,7 @@ GET /_tasks/taskId:1
743745
744746The advantage of this API is that it integrates with `wait_for_completion=false`
745747to transparently return the status of completed tasks. If the task is completed
746- and `wait_for_completion=false` was set on it them it'll come back with a
748+ and `wait_for_completion=false` was set, it will return a
747749`results` or an `error` field. The cost of this feature is the document that
748750`wait_for_completion=false` creates at `.tasks/task/${taskId}`. It is up to
749751you to delete that document.
@@ -761,10 +763,10 @@ POST _tasks/task_id:1/_cancel
761763--------------------------------------------------
762764// CONSOLE
763765
764- The `task_id` can be found using the tasks API above .
766+ The `task_id` can be found using the Tasks API.
765767
766- Cancelation should happen quickly but might take a few seconds. The task status
767- API above will continue to list the task until it is wakes to cancel itself.
768+ Cancelation should happen quickly but might take a few seconds. The Tasks
769+ API will continue to list the task until it wakes to cancel itself.
768770
769771
770772[float]
@@ -780,9 +782,9 @@ POST _reindex/task_id:1/_rethrottle?requests_per_second=-1
780782--------------------------------------------------
781783// CONSOLE
782784
783- The `task_id` can be found using the tasks API above.
785+ The `task_id` can be found using the Tasks API above.
784786
785- Just like when setting it on the `_reindex` API `requests_per_second`
787+ Just like when setting it on the Reindex API, `requests_per_second`
786788can be either `-1` to disable throttling or any decimal number
787789like `1.7` or `12` to throttle to that level. Rethrottling that speeds up the
787789query takes effect immediately but rethrottling that slows down the query will
@@ -806,7 +808,7 @@ POST test/_doc/1?refresh
806808--------------------------------------------------
807809// CONSOLE
808810
809- But you don't like the name `flag` and want to replace it with `tag`.
811+ but you don't like the name `flag` and want to replace it with `tag`.
810812`_reindex` can create the other index for you:
811813
812814[source,js]
@@ -836,7 +838,7 @@ GET test2/_doc/1
836838// CONSOLE
837839// TEST[continued]
838840
839- and it'll look like :
841+ which will return:
840842
841843[source,js]
842844--------------------------------------------------
@@ -854,8 +856,6 @@ and it'll look like:
854856--------------------------------------------------
855857// TESTRESPONSE
856858
857- Or you can search by `tag` or whatever you want.
858-
859859[float]
860860[[docs-reindex-slice]]
861861=== Slicing
@@ -902,7 +902,7 @@ POST _reindex
902902// CONSOLE
903903// TEST[setup:big_twitter]
904904
905- Which you can verify works with :
905+ You can verify this works with:
906906
907907[source,js]
908908----------------------------------------------------------------
@@ -912,7 +912,7 @@ POST new_twitter/_search?size=0&filter_path=hits.total
912912// CONSOLE
913913// TEST[continued]
914914
915- Which results in a sensible `total` like this one:
915+ which results in a sensible `total` like this one:
916916
917917[source,js]
918918----------------------------------------------------------------
@@ -928,7 +928,7 @@ Which results in a sensible `total` like this one:
928928[[docs-reindex-automatic-slice]]
929929==== Automatic slicing
930930
931- You can also let reindex automatically parallelize using <<sliced-scroll>> to
931+ You can also let `_reindex` automatically parallelize using <<sliced-scroll>> to
932932slice on `_uid`. Use `slices` to specify the number of slices to use:
933933
934934[source,js]
@@ -946,7 +946,7 @@ POST _reindex?slices=5&refresh
946946// CONSOLE
947947// TEST[setup:big_twitter]
948948
949- Which you also can verify works with :
949+ You can also verify this works with:
950950
951951[source,js]
952952----------------------------------------------------------------
@@ -955,7 +955,7 @@ POST new_twitter/_search?size=0&filter_path=hits.total
955955// CONSOLE
956956// TEST[continued]
957957
958- Which results in a sensible `total` like this one:
958+ which results in a sensible `total` like this one:
959959
960960[source,js]
961961----------------------------------------------------------------
@@ -979,7 +979,7 @@ section above, creating sub-requests which means it has some quirks:
979979sub-requests are "child" tasks of the task for the request with `slices`.
980980* Fetching the status of the task for the request with `slices` only contains
981981the status of completed slices.
982- * These sub-requests are individually addressable for things like cancellation
982+ * These sub-requests are individually addressable for things like cancelation
983983and rethrottling.
984984* Rethrottling the request with `slices` will rethrottle the unfinished
985985sub-request proportionally.
@@ -992,20 +992,20 @@ are distributed proportionally to each sub-request. Combine that with the point
992992above about distribution being uneven and you should conclude that using
993993`size` with `slices` might not result in exactly `size` documents being
994994`_reindex`ed.
995- * Each sub-requests gets a slightly different snapshot of the source index
995+ * Each sub-request gets a slightly different snapshot of the source index,
996996though these are all taken at approximately the same time.
997997
998998[float]
999999[[docs-reindex-picking-slices]]
10001000===== Picking the number of slices
10011001
10021002If slicing automatically, setting `slices` to `auto` will choose a reasonable
1003- number for most indices. If you're slicing manually or otherwise tuning
1003+ number for most indices. If slicing manually or otherwise tuning
10041004automatic slicing, use these guidelines.
10051005
10061006Query performance is most efficient when the number of `slices` is equal to the
1007- number of shards in the index. If that number is large, (for example,
1008- 500) choose a lower number as too many `slices` will hurt performance. Setting
1007+ number of shards in the index. If that number is large (e.g. 500),
1008+ choose a lower number as too many `slices` will hurt performance. Setting
10091009`slices` higher than the number of shards generally does not improve efficiency
10101010and adds overhead.
10111011
@@ -1018,10 +1018,10 @@ documents being reindexed and cluster resources.
10181018[float]
10191019=== Reindex daily indices
10201020
1021- You can use `_reindex` in combination with <<modules-scripting-painless, Painless>>
1022- to reindex daily indices to apply a new template to the existing documents.
1021+ You can use `_reindex` in combination with <<modules-scripting-painless, Painless>>
1022+ to reindex daily indices to apply a new template to the existing documents.
10231023
1024- Assuming you have indices consisting of documents as following :
1024+ Assuming you have indices consisting of documents as follows:
10251025
10261026[source,js]
10271027----------------------------------------------------------------
@@ -1032,12 +1032,12 @@ PUT metricbeat-2016.05.31/_doc/1?refresh
10321032----------------------------------------------------------------
10331033// CONSOLE
10341034
1035- The new template for the `metricbeat-*` indices is already loaded into Elasticsearch
1035+ The new template for the `metricbeat-*` indices is already loaded into Elasticsearch,
10361036but it applies only to the newly created indices. Painless can be used to reindex
10371037the existing documents and apply the new template.
10381038
10391039The script below extracts the date from the index name and creates a new index
1040- with `-1` appended. All data from `metricbeat-2016.05.31` will be reindex
1040+ with `-1` appended. All data from `metricbeat-2016.05.31` will be reindexed
10411041into `metricbeat-2016.05.31-1`.
10421042
10431043[source,js]
@@ -1059,7 +1059,7 @@ POST _reindex
10591059// CONSOLE
10601060// TEST[continued]
10611061
1062- All documents from the previous metricbeat indices now can be found in the `*-1` indices.
1062+ All documents from the previous metricbeat indices can now be found in the `*-1` indices.
10631063
10641064[source,js]
10651065----------------------------------------------------------------
@@ -1069,13 +1069,13 @@ GET metricbeat-2016.05.31-1/_doc/1
10691069// CONSOLE
10701070// TEST[continued]
10711071
1072- The previous method can also be used in combination with <<docs-reindex-change-name, change the name of a field>>
1073- to only load the existing data into the new index, but also rename fields if needed.
1072+ The previous method can also be used in conjunction with <<docs-reindex-change-name, change the name of a field>>
1073+ to load only the existing data into the new index and rename any fields if needed.
10741074
10751075[float]
10761076=== Extracting a random subset of an index
10771077
1078- Reindex can be used to extract a random subset of an index for testing:
1078+ `_reindex` can be used to extract a random subset of an index for testing:
10791079
10801080[source,js]
10811081----------------------------------------------------------------
@@ -1100,5 +1100,5 @@ POST _reindex
11001100// CONSOLE
11011101// TEST[setup:big_twitter]
11021102
1103- <1> Reindex defaults to sorting by `_doc` so `random_score` won't have any
1103+ <1> `_reindex` defaults to sorting by `_doc` so `random_score` will not have any
11041104effect unless you override the sort to `_score`.