@davidkyle davidkyle (Member) commented Jan 23, 2019

TooManyJobsIT occasionally fails with the mysterious message "Had to resort to force-closing job, something went wrong?". What went wrong is that closing all the jobs in the test teardown timed out:

   > Throwable #1: java.lang.RuntimeException: Had to resort to force-closing job, something went wrong?
...
   > Caused by: java.util.concurrent.ExecutionException: RemoteTransportException[[node_t1][127.0.0.1:33786][cluster:admin/xpack/ml/job/close]]; 
nested: IllegalStateException[Timed out when waiting for persistent tasks after 20s];

The close request timed out after 20 seconds, which isn't surprising in this case as there were 285 jobs to close. TooManyJobsIT tests that the xpack.ml.max_open_jobs limit is respected, and it doesn't need to create hundreds of jobs to do so because the test sets xpack.ml.max_open_jobs itself.
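
To put the numbers above in perspective, here is a rough back-of-the-envelope sketch of how little time each job gets when 285 jobs share one 20 second close timeout. The class and method names are hypothetical and not part of Elasticsearch, and the calculation assumes the timeout is simply averaged over the jobs, which ignores any concurrency in how jobs are actually closed.

```java
import java.time.Duration;

// Hypothetical helper for illustration only; not Elasticsearch code.
public final class CloseTimeoutBudget {
    private CloseTimeoutBudget() {}

    /**
     * Average time available per job if every open job must finish
     * closing within a single shared timeout (an assumed, simplified model).
     */
    public static Duration averageBudgetPerJob(Duration timeout, int openJobs) {
        if (openJobs <= 0) {
            throw new IllegalArgumentException("openJobs must be positive");
        }
        return timeout.dividedBy(openJobs);
    }

    public static void main(String[] args) {
        Duration timeout = Duration.ofSeconds(20); // timeout from the error message
        int openJobs = 285;                        // job count from the failing run
        // Prints "70 ms per job"
        System.out.println(averageBudgetPerJob(timeout, openJobs).toMillis() + " ms per job");
    }
}
```

Under this simplified model, either a longer timeout or fewer open jobs widens the per-job budget.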

Closes #30300

@davidkyle davidkyle added the >test (Issues or PRs that are addressing/adding tests), v7.0.0, :ml (Machine learning), and v6.7.0 labels Jan 23, 2019
@elasticmachine (Collaborator)

Pinging @elastic/ml-core

@droberts195 droberts195 left a comment

LGTM

@davidkyle davidkyle merged commit e1226f6 into elastic:master Jan 24, 2019
@davidkyle davidkyle deleted the fix-toomanyjobs branch January 24, 2019 09:18
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Jan 24, 2019
* elastic/master:
  Optimize warning header de-duplication (elastic#37725)
  Bubble exceptions up in ClusterApplierService (elastic#37729)
  SQL: Improve handling of invalid args for PERCENTILE/PERCENTILE_RANK (elastic#37803)
  Remove unused ThreadBarrier class (elastic#37666)
  Add built-in user and role for code plugin (elastic#37030)
  Consolidate testclusters tests into a single project (elastic#37362)
  Fix docs for MappingUpdatedAction
  SQL: Introduce SQL DATE data type (elastic#37693)
  disabling bwc test while backporting elastic#37639
  Mute ClusterDisruptionIT testAckedIndexing
  Set acking timeout to 0 on dynamic mapping update (elastic#31140)
  Remove index audit output type (elastic#37707)
  Mute FollowerFailOverIT testReadRequestsReturnsLatestMappingVersion
  [ML] Increase close job timeout and lower the max number (elastic#37770)
  Remove Custom Listeners from SnapshotsService (elastic#37629)
  Use m_m_nodes from Zen1 master for Zen2 bootstrap (elastic#37701)
  Fix index filtering in follow info api. (elastic#37752)
  Use project dependency instead of substitutions for distributions (elastic#37730)
  Update authenticate to allow unknown fields (elastic#37713)
  Deprecate HLRC EmptyResponse used by security (elastic#37540)
Labels

:ml (Machine learning), >test (Issues or PRs that are addressing/adding tests), v6.7.0, v7.0.0-beta1

Development

Successfully merging this pull request may close these issues.

[CI] Had to resort to force-closing job, something went wrong?

4 participants