[DOCS] Clarify use of CCS on ML nodes #66616
Pinging @elastic/ml-core (:ml)
droberts195
left a comment
I think we should spell out a couple of things more explicitly:
- Adding `remote_cluster_client` to a dedicated ML or transform node still keeps it "dedicated".
- We strongly recommend doing this unless you have a very good reason not to, because we have seen in several support cases that end users are completely baffled when their searches work perfectly from the dev console but fail when used in ML jobs.
You probably don't want to accept my suggestions verbatim, as I don't think they're worded very well, but hopefully you can come up with wording that says the same thing and sounds better.
docs/reference/modules/node.asciidoc
Outdated
<1> The `xpack.ml.enabled` setting is enabled by default.

If you want the node to use {ccs}, add `remote_cluster_client` to the list of
roles. See <<remote-node>>.
I think we need some clarification on whether a node with ml and remote_cluster_client is still considered a dedicated ML node. I think it should still be:
- roles. See <<remote-node>>.
+ roles. See <<remote-node>>. A node with both the `ml` and `remote_cluster_client` roles
+ but no others is still considered a dedicated {ml} node.
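For illustration, the node described by this suggestion could be configured roughly as follows. This is a minimal sketch of an `elasticsearch.yml` fragment: the two role names come from this thread, and it assumes no other node settings are relevant.

```yaml
# elasticsearch.yml (sketch): a dedicated machine learning node.
# Only the `ml` role does ML work; `remote_cluster_client` just lets
# jobs and datafeeds on this node run cross-cluster searches.
node.roles: [ ml, remote_cluster_client ]
```

Because `node.roles` assigns only the roles it lists, leaving out `remote_cluster_client` here would prevent the node from acting as a cross-cluster client.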
docs/reference/modules/node.asciidoc
Outdated
----

If you want the node to use {ccs}, add `remote_cluster_client` to the list of
roles. See <<remote-node>>.
- roles. See <<remote-node>>.
+ roles. See <<remote-node>>. A node with both the `transform` and `remote_cluster_client` roles
+ but no others is still considered a dedicated {transform} node.
If you want the node to use {ccs}, add `remote_cluster_client` to the list of
roles. See <<remote-node>>.
If you are using {ccs} in any way it is strongly recommended to allow {ml} nodes to do remote searches unless you have a very good reason not to. It will likely lead to confusion and frustration if cross cluster searches usually work perfectly but fail when used in {ml} job or {dfeed} configs.
If you want the node to use {ccs}, add `remote_cluster_client` to the list of
roles. See <<remote-node>>.
If you are using {ccs} in any way it is strongly recommended to allow {transform} nodes to do remote searches unless you have a very good reason not to. It will likely lead to confusion and frustration if cross cluster searches usually work perfectly but fail when used in {transform} configs.
Thanks for the feedback, @droberts195! I've updated the node page, as well as the ML settings and transform settings pages. I also carried over a clarification that data nodes are by default transform nodes, since that seemed to be missing from the transform settings page. If that detail is no longer correct, however, I can remove it from both locations.
node, it does not have the `transform` role. However, by default all generic
data nodes are also {transform} nodes.
This is now very complicated. I don't think you should use the phrase "generic data nodes", because what does that mean?
If you specify `node.data: true`, don't specify `node.roles` at all, and don't specify `node.transform` at all, then the node is a transform node. (`node.data` and `node.transform` are old, deprecated settings that will not work at all in 8.x, so the "by default" statement made sense before `node.roles` was introduced.)
If you specify `node.roles: [ data ]`, then the node is not a transform node.
If you don't specify any node role settings (old or new), then the node is both a data node and a transform node (and every other type of node). So the "by default" bit makes sense in this case, but for production it's an edge case.
A data/transform node that specifies `node.roles` needs to explicitly list both roles, for example `node.roles: [ data, transform ]` or `node.roles: [ data_hot, transform, remote_cluster_client ]`.
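The cases above can be sketched as alternative `elasticsearch.yml` fragments. This is only an illustration of the behaviour described in this comment: each entry is meant as a separate file, so the alternatives are commented out to keep the fragment valid YAML.

```yaml
# Sketch only: each `node.roles` line below is an ALTERNATIVE config,
# not several settings for one file.

# Alternative A: data node only. Not a transform node, because
# node.roles assigns exactly the roles that are listed.
# node.roles: [ data ]

# Alternative B: data node that is also a transform node.
# node.roles: [ data, transform ]

# Alternative C: hot-tier data node that runs transforms and can
# search remote clusters.
node.roles: [ data_hot, transform, remote_cluster_client ]

# Alternative D: no node role settings at all (old or new). The node
# then has every role, including data and transform.
```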
Thanks for the explanation! I've removed "All data nodes are also transform nodes" from the node settings page and re-worked the details in the transform and ML settings pages to try to make this clearer.
droberts195
left a comment
The remote_cluster_client role is also on by default if you don't have any node-related settings in your config.
But LGTM apart from that so feel free to merge without further review.
If you use the `node.roles` setting, then all required roles must be explicitly
set. Consult <<modules-node>> to learn more.
By default, every node is a {ml} node. If you set `node.roles`, however,
you must explicitly specify all the required roles for the node. To learn more,
I think this sentence:
"If you set `node.roles`, however, you must explicitly specify all the required roles for the node."
would be better in docs/reference/modules/node.asciidoc
I think it's already covered in the following sentence on that page: "If you set node.roles, the node is assigned only the roles you specify."
Co-authored-by: David Roberts <[email protected]>
On closer examination, the list of default roles is duplicated in this section: https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-node.html#node-roles, so I've added the transform role there and removed the earlier (redundant) paragraph.
Pinging @elastic/es-docs (Team:Docs)
Can we please drop this sentence (and its transform twin)? I believe this adds confusion. In reality it means
Imagine the user who has orchestration in place to deploy their cluster. It is likely, going forwards, that this orchestration would deploy with
@lcawl Sorry about being late to the party.
Related to #66533
This PR updates https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-node.html to suggest adding the remote_cluster_client role to dedicated machine learning and transform nodes.