Commit a99fc23
authored
Handle serialization exceptions during publication (#41781)
Today if an exception is thrown when serializing a cluster state during
publication then the master enters a poisoned state where it cannot publish any
more cluster states, but nor does it stand down as master, yielding repeated
exceptions of the following form:
```
failed to commit cluster state version [12345]
org.elasticsearch.cluster.coordination.FailedToCommitClusterStateException: publishing failed
at org.elasticsearch.cluster.coordination.Coordinator.publish(Coordinator.java:1045) ~[elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.cluster.service.MasterService.publish(MasterService.java:252) [elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:238) [elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:142) [elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) [elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) [elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) [elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252) [elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215) [elasticsearch-7.0.0.jar:7.0.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_144]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_144]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: cannot start publishing next value before accepting previous one
at org.elasticsearch.cluster.coordination.CoordinationState.handleClientValue(CoordinationState.java:280) ~[elasticsearch-7.0.0.jar:7.0.0]
at org.elasticsearch.cluster.coordination.Coordinator.publish(Coordinator.java:1030) ~[elasticsearch-7.0.0.jar:7.0.0]
... 11 more
```
This is because it already created the publication request using
`CoordinationState#handleClientValue()` but then it fails before accepting it.
This commit addresses this by performing the serialization before calling
`handleClientValue()`.
Relates #41090, which was the source of such a serialization exception.1 parent 502b395 commit a99fc23
File tree
2 files changed
+68
-2
lines changed- server/src
- main/java/org/elasticsearch/cluster/coordination
- test/java/org/elasticsearch/cluster/coordination
2 files changed
+68
-2
lines changedLines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
478 | 478 | | |
479 | 479 | | |
480 | 480 | | |
481 | | - | |
482 | 481 | | |
483 | 482 | | |
484 | 483 | | |
| |||
997 | 996 | | |
998 | 997 | | |
999 | 998 | | |
1000 | | - | |
1001 | 999 | | |
1002 | 1000 | | |
| 1001 | + | |
| 1002 | + | |
1003 | 1003 | | |
1004 | 1004 | | |
1005 | 1005 | | |
| |||
Lines changed: 66 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
26 | 26 | | |
27 | 27 | | |
28 | 28 | | |
| 29 | + | |
29 | 30 | | |
30 | 31 | | |
31 | 32 | | |
| |||
54 | 55 | | |
55 | 56 | | |
56 | 57 | | |
| 58 | + | |
57 | 59 | | |
58 | 60 | | |
59 | 61 | | |
| |||
63 | 65 | | |
64 | 66 | | |
65 | 67 | | |
| 68 | + | |
66 | 69 | | |
67 | 70 | | |
68 | 71 | | |
| |||
93 | 96 | | |
94 | 97 | | |
95 | 98 | | |
| 99 | + | |
96 | 100 | | |
97 | 101 | | |
98 | 102 | | |
| |||
135 | 139 | | |
136 | 140 | | |
137 | 141 | | |
| 142 | + | |
138 | 143 | | |
139 | 144 | | |
140 | 145 | | |
| |||
1149 | 1154 | | |
1150 | 1155 | | |
1151 | 1156 | | |
| 1157 | + | |
| 1158 | + | |
| 1159 | + | |
| 1160 | + | |
| 1161 | + | |
| 1162 | + | |
| 1163 | + | |
| 1164 | + | |
| 1165 | + | |
| 1166 | + | |
| 1167 | + | |
| 1168 | + | |
| 1169 | + | |
| 1170 | + | |
| 1171 | + | |
| 1172 | + | |
| 1173 | + | |
| 1174 | + | |
| 1175 | + | |
| 1176 | + | |
| 1177 | + | |
| 1178 | + | |
| 1179 | + | |
| 1180 | + | |
| 1181 | + | |
| 1182 | + | |
| 1183 | + | |
| 1184 | + | |
| 1185 | + | |
| 1186 | + | |
| 1187 | + | |
| 1188 | + | |
| 1189 | + | |
| 1190 | + | |
| 1191 | + | |
| 1192 | + | |
| 1193 | + | |
| 1194 | + | |
| 1195 | + | |
| 1196 | + | |
| 1197 | + | |
| 1198 | + | |
| 1199 | + | |
| 1200 | + | |
| 1201 | + | |
| 1202 | + | |
| 1203 | + | |
| 1204 | + | |
| 1205 | + | |
| 1206 | + | |
| 1207 | + | |
| 1208 | + | |
| 1209 | + | |
| 1210 | + | |
| 1211 | + | |
| 1212 | + | |
| 1213 | + | |
| 1214 | + | |
| 1215 | + | |
| 1216 | + | |
| 1217 | + | |
1152 | 1218 | | |
1153 | 1219 | | |
1154 | 1220 | | |
| |||
0 commit comments