Make reindexing managed by a persistent task #43382

Tim-Brooks · 2019-06-19T16:38:18Z

This is related to #42612. Currently the reindexing transport action
creates a task on the local coordinator node. Unfortunately this is not
resilient to coordinator node failures. This commit adds a new action
that creates a reindexing job as a persistent task.

elasticmachine · 2019-06-19T16:38:20Z

Pinging @elastic/es-distributed

Tim-Brooks · 2019-06-19T16:39:05Z

This PR still needs a few cleanups. I will assign reviewers and comment when it is ready.

Tim-Brooks · 2019-07-10T03:30:22Z

@henningandersen @ywelsch I have adjusted this PR to follow the change in direction of returning the task ID from the allocated persistent task. I have not yet adjusted it to work with rethrottling, but the other tests appear to be passing.

henningandersen

LGTM.

This approach looks good. Apart from minor comments, I think it is ready to merge to the feature branch. Feel free to defer some of them to follow-ups if that is easier.

henningandersen · 2019-07-11T12:00:55Z

modules/reindex/src/main/java/org/elasticsearch/index/reindex/RestReindexAction.java

+             * task can't totally validate until it starts but this is better than
+             * nothing.
+             */
+            ActionRequestValidationException validationException = internal.getReindexRequest().validate();


This can go outside the if now.

henningandersen · 2019-07-11T12:05:33Z

modules/reindex/src/main/java/org/elasticsearch/index/reindex/StartReindexJobAction.java

+        }
+    }
+
+    public static class Response extends AcknowledgedResponse {


It seems we either return acknowledged=true or an exception. Could this just be an ActionResponse then?

henningandersen · 2019-07-11T12:07:04Z

.../reindex/src/main/java/org/elasticsearch/index/reindex/TransportGetReindexJobTaskAction.java

+
+import java.util.List;
+
+public class TransportGetReindexJobTaskAction extends TransportTasksAction<ReindexTask, GetReindexJobTaskAction.Request,


This looks unused now?

henningandersen · 2019-07-11T12:08:38Z

modules/reindex/src/main/java/org/elasticsearch/index/reindex/TransportRethrottleAction.java

+            rethrottle(logger, clusterService.localNode().getId(), client, bulkByScrollTask, request.getRequestsPerSecond(), listener);
+        } else if (task instanceof ReindexTask) {
+            BulkByScrollTask childTask = null;
+            for (Task task1 : taskManager.getTasks().values()) {


Rename task1 to childCandidate?
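
A minimal sketch of what the rename buys in readability. It uses hypothetical stand-in types (`SimpleTask`, `findChild`) instead of the real Elasticsearch `Task` and `TaskManager` classes, and the matching rule (same parent id and a given type) is an assumption, since the loop body is not shown in the hunk above:

```java
import java.util.List;
import java.util.Optional;

public class ChildTaskLookup {
    /** Hypothetical stand-in for a running task; not the Elasticsearch Task class. */
    public static final class SimpleTask {
        public final long id;
        public final long parentId;
        public final String type;

        public SimpleTask(long id, long parentId, String type) {
            this.id = id;
            this.parentId = parentId;
            this.type = type;
        }
    }

    /**
     * Scans all known tasks for the first one that is a child of parentId
     * and has the requested type. The loop variable is named childCandidate
     * because each element is only possibly the child being looked for.
     */
    public static Optional<SimpleTask> findChild(List<SimpleTask> tasks, long parentId, String type) {
        for (SimpleTask childCandidate : tasks) {
            if (childCandidate.parentId == parentId && childCandidate.type.equals(type)) {
                return Optional.of(childCandidate);
            }
        }
        return Optional.empty();
    }
}
```

Returning an `Optional` rather than a nullable local is a choice made for this sketch only; the PR code assigns to a `childTask` variable initialised to `null`.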

henningandersen · 2019-07-11T12:28:12Z

server/src/main/java/org/elasticsearch/index/reindex/ReindexJobState.java

+        return jobException;
+    }
+
+    public TaskId getTaskId() {


Maybe rename to getEphemeralTaskId(), it could make call-site code a little easier to read (at least I mixed it up with the persistent task id).

henningandersen · 2019-07-11T12:37:01Z

server/src/main/java/org/elasticsearch/tasks/TaskInfo.java

        builder.field("running_time_in_nanos", runningTimeNanos);
        builder.field("cancellable", cancellable);
-        if (parentTaskId.isSet()) {
+        if (parentTaskId.isSet() && parentTaskId.getNodeId().equals("cluster") == false) {


Let us add a todo here, to ensure we revisit this. I think this means that cluster: parent ids no longer appear in the _tasks output, which could harm diagnostics/troubleshooting.
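
A rough illustration of the condition with the requested TODO attached. `ParentId` and `shouldRenderParent` are hypothetical stand-ins written for this sketch, not the real `TaskId` or `TaskInfo` code, and treating `-1` as the "unset" id is an assumption:

```java
public class ParentTaskFilter {
    /** Hypothetical stand-in for a task id: a node id plus a numeric id. */
    public static final class ParentId {
        public final String nodeId;
        public final long id;

        public ParentId(String nodeId, long id) {
            this.nodeId = nodeId;
            this.id = id;
        }

        /** Assumption for this sketch: an id of -1 means "no parent". */
        public boolean isSet() {
            return id != -1L;
        }
    }

    // TODO: revisit this filter. Parents whose node id is "cluster" are
    // hidden from the rendered output, which may hurt diagnostics.
    public static boolean shouldRenderParent(ParentId parent) {
        return parent.isSet() && parent.nodeId.equals("cluster") == false;
    }
}
```

With this rule, a parent on an ordinary node is rendered, while an unset parent or one with the `cluster` node id is omitted.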

Tim-Brooks · 2019-07-15T22:49:59Z

@ywelsch - This is probably ready for another look. I have addressed Henning's comments. The primary remaining issue is that, because one task submits another, there are races around the child tasks being available to a follow-up rethrottle action. For testing purposes I have worked around this with a spin loop in the rethrottle action that waits for all the tasks to be available.

I think the fix is to extract the BulkByScrollTask and reindex logic out to a different place that allows the entire setup to be synchronous with the persistent task starting (so a single ephemeral task). However, this work will continue to make this PR larger and extend the scope. I guess I just want to know if we should include that in this PR, or add that as a meta issue (it would probably be the next task I would work on, but it would be in a separate PR).
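
For reference, the kind of bounded wait described above could look roughly like the following. This is a generic sketch, not the code in this PR: `BoundedWait`, `waitUntil`, and the timeout and poll parameters are all hypothetical names chosen for illustration.

```java
import java.util.function.BooleanSupplier;

public class BoundedWait {
    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition became true, false on timeout
     * or if the waiting thread was interrupted.
     */
    public static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (condition.getAsBoolean() == false) {
            // Compare via subtraction so the check stays correct if nanoTime wraps.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt for callers and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

A caller would pass a condition such as "all expected child tasks are registered". Bounding the wait keeps a missing child task from blocking the rethrottle request forever, but it does not remove the underlying race, which is why making the setup synchronous is the proposed longer-term fix.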

ywelsch

I've left a few more minor comments and one item (rethrottle) that we probably need to discuss.

modules/reindex/src/main/java/org/elasticsearch/index/reindex/StartReindexJobAction.java

modules/reindex/src/main/java/org/elasticsearch/index/reindex/TransportReindexAction.java

modules/reindex/src/main/java/org/elasticsearch/index/reindex/TransportRethrottleAction.java

...es/reindex/src/main/java/org/elasticsearch/index/reindex/TransportStartReindexJobAction.java

server/src/main/java/org/elasticsearch/index/reindex/ReindexTask.java

Tim-Brooks · 2019-07-17T16:58:12Z

@ywelsch I made the changes.

ywelsch

LGTM

Tim-Brooks added 19 commits June 4, 2019 14:16

WIP

e3191f1

Work on task

c5cf831

Merge remote-tracking branch 'upstream/master' into persistent_reindex

d3f76d9

WIP

03c55d6

WIP

8a29bb7

WIP

7272fc0

Changes

15d6a45

Merge remote-tracking branch 'upstream/master' into persistent_reindex

a8fdc23

Changes

7dc09e3

Changes

addb374

Merge remote-tracking branch 'upstream/master' into persistent_reindex

6e856c4

WIP

7265547

Merge remote-tracking branch 'upstream/master' into persistent_reindex

e354574

Merge remote-tracking branch 'upstream/master' into persistent_reindex

2037fdc

Work on test

a8ffa25

REmove

4023500

Changes

22ba62b

Change

1fb3f57

Merge remote-tracking branch 'upstream/master' into persistent_reindex

ccb1a00

Tim-Brooks added the >enhancement, v8.0.0, and :Distributed Indexing/Reindex labels Jun 19, 2019

Tim-Brooks changed the title ~~Persistent reindex~~ Make reindexing managed by a persistent task Jun 19, 2019

Tim-Brooks changed the base branch from master to reindex_v2 June 19, 2019 16:38

Tim-Brooks added 4 commits June 19, 2019 11:03

License

02c3e59

Security changes

9ee47c9

Security fixes

0db2bae

Small cleanup

b394a10

ywelsch requested review from henningandersen and ywelsch July 10, 2019 08:10

Tim-Brooks added 5 commits July 10, 2019 16:03

Changes

99b906a

Fix

a3e7f1f

Rethrottle

1c26b82

Add validation

59c671f

Changes

ff67a35

henningandersen approved these changes Jul 11, 2019

Tim-Brooks added 8 commits July 11, 2019 14:16

Tests

1e3de1a

Fix test

9dc3c1a

Merge branch 'reindex_v2' into persistent_reindex

d42bd72

Change

41837ff

Dispatch

a847c4e

Wait

ce5ec9b

Merge branch 'reindex_v2' into persistent_reindex

6b59df0

Changes

68586d9

ywelsch suggested changes Jul 16, 2019

Tim-Brooks added 2 commits July 16, 2019 16:54

Review changes

f347160

Changes

9fd28bf

Tim-Brooks requested a review from ywelsch July 17, 2019 16:57

ywelsch approved these changes Jul 17, 2019

Tim-Brooks added 2 commits July 17, 2019 15:05

Merge branch 'reindex_v2' into persistent_reindex

555c8a0

Fix

b1beb76

Tim-Brooks merged commit 480c545 into elastic:reindex_v2 Jul 18, 2019

Tim-Brooks deleted the persistent_reindex branch April 30, 2020 18:24

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021


5 participants