@jasontedor
Member

@jasontedor jasontedor commented May 30, 2018

This commit removes ML and persistent task custom metadata from the cluster state API to avoid the possibility of sending such custom metadata to a client that cannot understand it. This can arise in a rolling upgrade to the default distribution from a prior version that did not have X-Pack.

Relates #30731, relates #30857

@jasontedor jasontedor added the >breaking, blocker, review, discuss, :Distributed Coordination/Task Management, v7.0.0, v6.3.0, :ml, and v6.4.0 labels May 30, 2018
@elasticmachine
Collaborator

Pinging @elastic/ml-core

@jasontedor jasontedor requested review from droberts195 and ywelsch May 30, 2018 02:03
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

  @Override
  public Version getMinimalSupportedVersion() {
-     return Version.V_5_4_0;
+     return MINIMAL_SUPPORTED_VERSION;
Are we sure that's OK? This is also used when publishing a cluster state to nodes. What happens when people upgrade from 6.2 + X-Pack to 6.3 + X-Pack in a rolling fashion?
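
To make the concern concrete, here is a minimal sketch of how a minimal-supported-version check can gate which custom metadata is sent to a given receiver. This is not the Elasticsearch implementation: `SimpleVersion`, `Custom`, and `customsSentTo` are hypothetical stand-ins, and the 6.3.0 value is an assumption for illustration, since the thread does not say what `MINIMAL_SUPPORTED_VERSION` resolves to.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CustomMetadataVersionGate {

    /** Hypothetical stand-in for a release version, compared by major, minor, then patch. */
    static final class SimpleVersion {
        final int major;
        final int minor;
        final int patch;

        SimpleVersion(int major, int minor, int patch) {
            this.major = major;
            this.minor = minor;
            this.patch = patch;
        }

        boolean onOrAfter(SimpleVersion other) {
            if (major != other.major) return major > other.major;
            if (minor != other.minor) return minor > other.minor;
            return patch >= other.patch;
        }
    }

    /** Hypothetical custom metadata entry: a name plus the oldest version that understands it. */
    static final class Custom {
        final String name;
        final SimpleVersion minimalSupportedVersion;

        Custom(String name, SimpleVersion minimalSupportedVersion) {
            this.name = name;
            this.minimalSupportedVersion = minimalSupportedVersion;
        }
    }

    /** Returns the names of the customs that a receiver on the given version would be sent. */
    static List<String> customsSentTo(SimpleVersion receiver, List<Custom> customs) {
        List<String> sent = new ArrayList<>();
        for (Custom custom : customs) {
            // Skip anything the receiver is too old to understand.
            if (receiver.onOrAfter(custom.minimalSupportedVersion)) {
                sent.add(custom.name);
            }
        }
        return sent;
    }

    public static void main(String[] args) {
        List<Custom> customs = Arrays.asList(
            // Assumed value: pretend the minimal version was raised from 5.4.0 to 6.3.0.
            new Custom("persistent_tasks", new SimpleVersion(6, 3, 0)),
            new Custom("some_other_custom", new SimpleVersion(5, 0, 0)));

        System.out.println(customsSentTo(new SimpleVersion(6, 2, 0), customs));
        System.out.println(customsSentTo(new SimpleVersion(6, 3, 0), customs));
    }
}
```

Under these assumptions, a 6.2.0 receiver is sent only `some_other_custom`, while a 6.3.0 receiver is sent both entries. If the same check applies to cluster state publication, raising the minimal version would hide persistent tasks from 6.2 nodes during a rolling upgrade, which is the scenario the question is about.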

The effect of losing the persistent tasks in a 6.2 -> 6.3 upgrade would be that running ML jobs and datafeeds would be killed and would have to be manually restarted.

It's currently documented best practice to stop ML jobs during upgrades. We are trying to make things more robust so this advice can be removed, but because the advice exists today, I don't think it would be a complete disaster if ML jobs got killed during a rolling upgrade to 6.3 when the advice wasn't followed. (For upgrades beyond 6.3 this wouldn't be acceptable, as ML will be supported in Cloud and they'll be relying on ML jobs staying open during rolling upgrades from 6.3 to higher versions.)

We could also potentially document the es.persistent_tasks.custom_metadata_minimum_version system property and tell people who want to leave ML jobs running during a rolling upgrade to 6.3 that they need to set it.
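
As a rough illustration of the system property idea, the sketch below resolves a minimum version from `es.persistent_tasks.custom_metadata_minimum_version` and falls back to a default when the property is unset. Only the property name comes from this thread. The `major.minor.patch` value format, the `resolve` helper, and the 6.3.0 fallback are assumptions, not the PR's actual code.

```java
import java.util.Arrays;

final class MinimumVersionProperty {

    // Property name as mentioned in the review comment above.
    static final String PROPERTY = "es.persistent_tasks.custom_metadata_minimum_version";

    /**
     * Hypothetical helper: parses a "major.minor.patch" string into three ints,
     * or returns the fallback when no value was supplied.
     */
    static int[] resolve(String rawValue, int[] fallback) {
        if (rawValue == null || rawValue.trim().isEmpty()) {
            return fallback;
        }
        String[] parts = rawValue.trim().split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected major.minor.patch but got [" + rawValue + "]");
        }
        int[] version = new int[3];
        for (int i = 0; i < parts.length; i++) {
            // Integer.parseInt throws NumberFormatException on non-numeric parts.
            version[i] = Integer.parseInt(parts[i]);
        }
        return version;
    }

    public static void main(String[] args) {
        // Assumed fallback of 6.3.0; the thread does not state the real default.
        int[] fallback = {6, 3, 0};
        int[] minimum = resolve(System.getProperty(PROPERTY), fallback);
        System.out.println("minimum version: " + Arrays.toString(minimum));
    }
}
```

Under these assumptions, running with `-Des.persistent_tasks.custom_metadata_minimum_version=5.4.0` would lower the minimum back to 5.4.0, so that 6.2 nodes keep receiving persistent tasks during the rolling upgrade.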

@jasontedor
Member Author

We are going to end up taking some different approaches.
