Various TimeSeriesLifecycleActionsIT failures, NPE in moveClusterStateToErrorStep #51375

@polyfractal

I'm not really sure how to classify this test failure, but I noticed an NPE, so I figured it was worth tracking. Something probably went off the rails at some point and the rest is just fallout from that, but I don't know enough to tell what the cause is.

Interestingly, the yaml tests later fail due to a timeout, so perhaps this is tied to whatever is causing the yaml tests to time out too.

https://gradle-enterprise.elastic.co/s/474sinachygny

REPRODUCE WITH: ./gradlew ':x-pack:plugin:ilm:qa:multi-node:integTestRunner' --tests "org.elasticsearch.xpack.ilm.TimeSeriesLifecycleActionsIT.testMoveToStepRereadsPolicy" \
  -Dtests.seed=426FF3BECB3DFDF2 \
  -Dtests.security.manager=true \
  -Dtests.locale=mt \
  -Dtests.timezone=America/Glace_Bay \
  -Dcompiler.java=13 \
  -Druntime.java=11

REPRODUCE WITH: ./gradlew ':x-pack:plugin:ilm:qa:multi-node:integTestRunner' --tests "org.elasticsearch.xpack.ilm.TimeSeriesLifecycleActionsIT.testILMRolloverOnManuallyRolledIndex" \
  -Dtests.seed=426FF3BECB3DFDF2 \
  -Dtests.security.manager=true \
  -Dtests.locale=mt \
  -Dtests.timezone=America/Glace_Bay \
  -Dcompiler.java=13 \
  -Druntime.java=11

It looks like one test fails because the index wasn't on the expected step, and another because an alias ended up with two write indices:

 org.elasticsearch.xpack.ilm.TimeSeriesLifecycleActionsIT > testMoveToStepRereadsPolicy FAILED
15:10:41     org.elasticsearch.client.ResponseException: method [POST], host [http://127.0.0.1:32840], URI [_ilm/move/test-1], status line [HTTP/1.1 400 Bad Request]
15:10:41     {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"index [test-1] is not on current step [{\"phase\":\"hot\",\"action\":\"rollover\",\"name\":\"check-rollover-ready\"}]"}],"type":"illegal_argument_exception","reason":"index [test-1] is not on current step [{\"phase\":\"hot\",\"action\":\"rollover\",\"name\":\"check-rollover-ready\"}]"},"status":400}
15:10:41         at __randomizedtesting.SeedInfo.seed([426FF3BECB3DFDF2:AC845E4FC75748A8]:0)
15:10:41         at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:283)
15:10:41         at org.elasticsearch.client.RestClient.performRequest(RestClient.java:261)
15:10:41         at org.elasticsearch.client.RestClient.performRequest(RestClient.java:235)
15:10:41         at org.elasticsearch.xpack.ilm.TimeSeriesLifecycleActionsIT.testMoveToStepRereadsPolicy(TimeSeriesLifecycleActionsIT.java:879)
15:10:42 org.elasticsearch.xpack.ilm.TimeSeriesLifecycleActionsIT > testILMRolloverOnManuallyRolledIndex FAILED
15:10:42     org.elasticsearch.client.ResponseException: method [PUT], host [http://[::1]:41018], URI [/jkpkacsspx-000001], status line [HTTP/1.1 500 Internal Server Error]
15:10:42     {"error":{"root_cause":[{"type":"illegal_state_exception","reason":"alias [alias] has more than one write index [jkpkacsspx-000001,test-000002]"}],"type":"illegal_state_exception","reason":"alias [alias] has more than one write index [jkpkacsspx-000001,test-000002]"},"status":500}
15:10:42         at __randomizedtesting.SeedInfo.seed([426FF3BECB3DFDF2:B06F05619DCCFAE7]:0)
15:10:42         at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:283)
15:10:42         at org.elasticsearch.client.RestClient.performRequest(RestClient.java:261)
15:10:42         at org.elasticsearch.client.RestClient.performRequest(RestClient.java:235)
15:10:42         at org.elasticsearch.xpack.ilm.TimeSeriesLifecycleActionsIT.createIndexWithSettings(TimeSeriesLifecycleActionsIT.java:1561)
15:10:42         at org.elasticsearch.xpack.ilm.TimeSeriesLifecycleActionsIT.testILMRolloverOnManuallyRolledIndex(TimeSeriesLifecycleActionsIT.java:1034)

But there are also errors about too many target shards:

ERROR","creation_date":"1579806667403","step_time":"1579806670611"},"error_details":"{\"type\":\"illegal_argument_exception\",\"reason\":\"the number of target shards [8] must be less that the number of source shards [6]\",\"stack_trace\":\"java.lang.IllegalArgumentException: the number of target shards [8] must be less that the number of source shards [6]\\n\\tat 
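
To make the shard error easier to read: the message implies a guard that rejects a target shard count larger than the source shard count. The sketch below is only an illustration of that kind of check. The class and method names are hypothetical and not taken from the Elasticsearch source; only the message text is copied from the log above (including its "less that" wording).

```java
public class ShardCountCheckSketch {

    /**
     * Hypothetical guard: rejects a target shard count that exceeds the source shard count.
     * The exception message mirrors the wording seen in the log, nothing more.
     */
    static void checkTargetShards(int sourceShards, int targetShards) {
        if (sourceShards < targetShards) {
            throw new IllegalArgumentException(
                "the number of target shards [" + targetShards
                    + "] must be less that the number of source shards [" + sourceShards + "]");
        }
    }

    public static void main(String[] args) {
        // Values from the failing run: 6 source shards, 8 target shards.
        try {
            checkTargetShards(6, 8);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the values from the failing run (6 source shards, 8 target shards) the guard throws, which suggests the step was handed an index with fewer shards than the policy expected.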

And about indices that already exist:

ERROR][o.e.x.i.IndexLifecycleRunner] [integTest-0] policy [AlIWG] for index [dqnlkipzun-000001] failed on step [{"phase":"hot","action":"rollover","name":"check-rollover-ready"}]. Moving to ERROR step
15:14:03 »  org.elasticsearch.ResourceAlreadyExistsException: index [dqnlkipzun-000002/3JcIEtUYSwOwTFC2m2fYnQ] already exists
15:14:03 »  	at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService.validateIndexName(MetaDataCreateIndexService.java:156) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
15:14:03 »  	at org.elasticsearch.action.admin.indices.rollover.TransportRolloverAction.masterOperation(TransportRolloverAction.java:133) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
15:14:03 »  	at org.elasticsearch.action.admin.indices.rollover.TransportRolloverAction.masterOperation(TransportRolloverAction.java:72) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
15:14:03 »  	at 

And at one point there is an NPE somewhere in the error-handling chain:

 Caused by: java.lang.NullPointerException: unable to move to an error step where there is no current step, state: {}
15:14:03 »  	at java.util.Objects.requireNonNull(Objects.java:246) ~[?:?]
15:14:03 »  	at org.elasticsearch.xpack.ilm.IndexLifecycleTransition.moveClusterStateToErrorStep(IndexLifecycleTransition.java:133) ~[?:?]
15:14:03 »  	at org.elasticsearch.xpack.ilm.ExecuteStepsUpdateTask.moveToErrorStep(ExecuteStepsUpdateTask.java:210) ~[?:?]
15:14:03 »  	at org.elasticsearch.xpack.ilm.ExecuteStepsUpdateTask.execute(ExecuteStepsUpdateTask.java:99) ~[?:?]
15:14:03 »  	at org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
15:14:03 »  	at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:702) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
15:14:03 »  	at org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:324) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
15:14:03 »  	at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:219) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
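
The top frame of that trace is `Objects.requireNonNull`, so the NPE looks like a deliberate precondition failure (no current step was recorded for the index) rather than an accidental null dereference. Here is a minimal sketch of that pattern. `StepKey` and `moveToErrorStep` are hypothetical stand-ins, not the real ILM classes; only `Objects.requireNonNull(T, String)` is actual JDK API.

```java
import java.util.Objects;

public class ErrorStepSketch {

    /** Hypothetical stand-in for an ILM step key (phase/action/name). */
    static final class StepKey {
        final String phase;
        final String action;
        final String name;

        StepKey(String phase, String action, String name) {
            this.phase = phase;
            this.action = action;
            this.name = name;
        }
    }

    /**
     * Hypothetical transition: requires a current step before moving to ERROR.
     * A null current step produces an NPE carrying the supplied message.
     */
    static String moveToErrorStep(StepKey currentStep, String state) {
        Objects.requireNonNull(currentStep,
            "unable to move to an error step where there is no current step, state: " + state);
        return currentStep.phase + "/" + currentStep.action + "/ERROR";
    }

    public static void main(String[] args) {
        // Normal case: a current step exists, so the transition succeeds.
        System.out.println(moveToErrorStep(new StepKey("hot", "rollover", "check-rollover-ready"), "{...}"));

        // Failing case: empty lifecycle state, no current step.
        try {
            moveToErrorStep(null, "{}");
        } catch (NullPointerException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the real code follows this shape, the `state: {}` in the message would mean the index had no lifecycle execution state at all when the error step was attempted.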
