-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Phase 2 support for operator privileges: Cluster settings #66684
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Phase 2 support for operator privileges: Cluster settings #66684
Conversation
|
@tvernum As talked during the meeting, for this draft PR, I am seeking feedback mainly around the approach that makes the snapshot code aware of the information provided by operator privileges check. As said in the PR description, it is done by having the security code flag a new field on the Of course besides the above main issue, any other comments are always welcome. Thanks! |
I am curious about whether this is ever needed? If the operator controls some settings, I would think we never want to restore those. For instance, if the operator restores an env from production to dev every weekend, I would think the production settings may not be valid for the dev environment (like rate-limiter or autoscaling policies)? Are there examples where this is desired behavior? |
I think this behaviour is useful when moving cluster to different zones or provision (replicate) a cluster from existing ones. Also this behaviour is backward compatible when operator privileges are enabled on Cloud, i.e. there is no new logic (on Cloud side) needed for any downstream/subsequent actions after restore. With this being said, I'll forward this question to Cloud folks to get more definitive answers. |
|
I'd start from the question of what should the behaviour be when operator privileges are not enabled (remembering that it's an Enterprise feature, so many on-prem customers will not have it). I think it would be surprising (to the point of being incorrect) if restoring a cluster from a snapshot did not restore IP Filters (as an example). For example, you attempt a version upgrade, it goes wrong, so you wipe your cluster and restore from snapshot but your IP Filters aren't restored & your clusters is now open. That seems wrong. I would argue that the same is true if I restore a cluster as an operator, when operator privileges are enabled, because
|
|
Pinging @elastic/es-security (Team:Security) |
|
|
||
| public void maybeInterceptRequest(ThreadContext threadContext, TransportRequest request) { | ||
| if (request instanceof RestoreSnapshotRequest) { | ||
| ((RestoreSnapshotRequest) request).skipOperatorOnly(shouldProcess()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code always sets the value of skipOperatorOnly because:
- Similar to other authorization decision, the value is computed and local to every node
- The value is always set properly so that a transport client cannot override this field (it is not exposed to Rest)
| @Override | ||
| public void maybeInterceptRequest(ThreadContext threadContext, TransportRequest request) { | ||
| if (request instanceof RestoreSnapshotRequest) { | ||
| ((RestoreSnapshotRequest) request).skipOperatorOnly(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, this field is always set even when operator privileges are not enabled to retain the default behaviour. See also above.
|
@tvernum @henningandersen This PR is now ready for review. Thanks! |
Yes, the 6 that are changed in this PR are good - thanks! |
tvernum
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
...in/java/org/elasticsearch/action/admin/cluster/snapshots/restore/RestoreSnapshotRequest.java
Outdated
Show resolved
Hide resolved
| } | ||
|
|
||
| public void testOutcomeOfSuperuserPerformingOperatorOnlyActionWillDependOnWhetherFeatureIsEnabled() { | ||
| public void testOperatorOnlyActionOrSettingWillNotBeActionableByNormalSuperuser() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What value do we get from combining these into 1 test case? It seems more natural to have 2 separate cases for action and settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No real value. I split it and also the corresponding one that test the operator user.
| /** | ||
| * Returns <code>true</code> if this setting is dynamically updateable by operators, otherwise <code>false</code> | ||
| */ | ||
| public final boolean isDynamicOperator() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether this should just be isOperatorOnly()
I think it would read better in OperatorOnlyRegistry.checkClusterUpdateSettings (because that code is really focused on "is this an operator setting" rather than "is this dynamic, but only updateable by operators")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes senses since operator settings must be dynamic in the first place, though this assumes some understanding of the concept. But overall I agree it reads better in the call sites.
henningandersen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @ywangd , I added a few minor things to address.
| } | ||
| } | ||
|
|
||
| @SuppressWarnings("unchecked") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not spot the place where this is needed, perhaps it can be removed or alternatively moved to the specific line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! It is indeed a copy/paste artifact ~~
| } | ||
|
|
||
| @SuppressWarnings("unchecked") | ||
| public void testSnapshotRestoreBehaviourOfOperatorSettings() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we may want the opposite test too - checking that without operator privileges enabled, we do restore properties marked as operator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought these would be covered by existing tests already, but I was wrong. The settings are from xpack and most snapshot related tests are in core. I added a new yml test to cover this and PUT _cluster/settings.
| assertTrue((boolean) operatorPrivileges.get("enabled")); | ||
| } | ||
|
|
||
| public void testUpdateOperatorSettings() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we want a test to validate that we can set operator settings when operator privileges are disabled too. Could also be in the single node test rather than rest test, if easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a yml test for it. See also above.
| private boolean includeAliases = true; | ||
| private Settings indexSettings = EMPTY_SETTINGS; | ||
| private String[] ignoreIndexSettings = Strings.EMPTY_ARRAY; | ||
| private boolean skipOperatorOnly = false; // this field does not get serialised because it is always set locally by authz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this is not included in toString, would be nice to add it for debugging purposes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The toString method relies on toXContent, which I intentionally left out for this new field. To support toString, I extracted the internals of the toXContent method into a new toXContentFragment method and use it for the string building.
| if (request.includeGlobalState()) { | ||
| if (metadata.persistentSettings() != null) { | ||
| Settings settings = metadata.persistentSettings(); | ||
| clusterSettings.validateUpdate(settings); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully this never causes an issue, but I think this line should go below the modification of the settings below to ensure the updated settings are valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A good catch, thanks! It may not cause real issue like you said, nevertheless it is better to be safe.
| XContentParser parser = XContentType.JSON.xContent().createParser( | ||
| NamedXContentRegistry.EMPTY, null, BytesReference.bytes(builder).streamInput()); | ||
| Map<String, Object> map = parser.mapOrdered(); | ||
| assertFalse(map.containsKey("skip_operator_only")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it was nicer to verify that toXContent gives same result regardless of the skipOperatorOnly flag? I.e., invoke it on two instances with different skipOperatorOnly flag and verify the maps are identical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
…apshots/restore/RestoreSnapshotRequest.java Co-authored-by: Tim Vernum <[email protected]>
|
@elasticmachine update branch |
Thanks @droberts195 But somehow I counted 9 of them:
Could you please advise? Thanks! |
Yes, sorry, you are right that there are 9. I'm happy with all of those. |
) Add a new OperatorDynamic enum to differentiate between operator-only and regular dynamic cluster settings. The Setting constructor validates that Dynamic and OperatorDynamic cannot be both specified. Operator-only settings behave as the follows: * When the feature is enabled, operator-only settings cannot be updated with PUT cluster settings API unless the user is an operator. * The restore behaviour for operator-only settings will be identical for either operator or non-operator user. That is, when operator privileges are enabled, operator-only settings will not be restored. Otherwise (if the feature is disabled), the behaviour is the same as of today.
…68563) Add a new OperatorDynamic enum to differentiate between operator-only and regular dynamic cluster settings. The Setting constructor validates that Dynamic and OperatorDynamic cannot be both specified. Operator-only settings behave as the follows: * When the feature is enabled, operator-only settings cannot be updated with PUT cluster settings API unless the user is an operator. * The restore behaviour for operator-only settings will be identical for either operator or non-operator user. That is, when operator privileges are enabled, operator-only settings will not be restored. Otherwise (if the feature is disabled), the behaviour is the same as of today.
Add documentation for operator privilegs. The docs cover features delivered by phase 1 (#65256) and 2 (#66684). Co-authored-by: Tim Vernum <[email protected]> Co-authored-by: lcawl <[email protected]> Co-authored-by: Adam Locke <[email protected]>
Add documentation for operator privilegs. The docs cover features delivered by phase 1 (elastic#65256) and 2 (elastic#66684). Co-authored-by: Tim Vernum <[email protected]> Co-authored-by: lcawl <[email protected]> Co-authored-by: Adam Locke <[email protected]>
Add documentation for operator privilegs. The docs cover features delivered by phase 1 (elastic#65256) and 2 (elastic#66684). Co-authored-by: Tim Vernum <[email protected]> Co-authored-by: lcawl <[email protected]> Co-authored-by: Adam Locke <[email protected]>
Add documentation for operator privilegs. The docs cover features delivered by phase 1 (#65256) and 2 (#66684). Co-authored-by: Tim Vernum <[email protected]> Co-authored-by: lcawl <[email protected]> Co-authored-by: Adam Locke <[email protected]>
Add documentation for operator privilegs. The docs cover features delivered by phase 1 (#65256) and 2 (#66684). Co-authored-by: Tim Vernum <[email protected]> Co-authored-by: lcawl <[email protected]> Co-authored-by: Adam Locke <[email protected]>
Phase 2 of operator privileges is to support operator-only settings. This includes:
For the restore process, even if the user is not an operator, the restore will not fail, but rather succeeds by skipping any operator-only settings. This means, if the current cluster state has values for an operator-only setting, it will have the same value after restore. When the user is an operator, the restore behaviour is essentially the same as of today.(EDIT: after discussions, the restore behaviour for operator-only settings will be the same for either operator or non-operator user. When operator privileges are enabled, operator-only settings will not be restored. Otherwise (if the feature is disabled), the behaviour is the same as of today.
Operator and non-operator settings are differentiated with a new
DynamicOperatorproperty. TheSettingconstructor enforces thatDynamicandDynamicOperatorcannot be both specified.I spent quite some time exploring how the restore code should be made aware of operator privileges. The challenge is that they are located in different packages. There are a few options, but at the end I decided to let the
SnapshotRestoreRequestcarry a flag to indicate whether or not to filter operator-only settings. This flag is set inAuthorizationServicewhere operator privileges are enforced. This is similar to howRequestInterceptorworks except it is for a cluster action instead of index actions. I personally think it is a reasonable, lightweight approach that does the job to separate the concerns of security and snapshot/restore. We could choose a more formal and heavier approach by introducing a newPluginmethod (or even interface) to allow the main to pull settings filters from plugins. But I think it's an overkill just for this purpose.The PR does have javaRestTest to test overall features, but otherwise lacks unit tests. I'll be adding them once the overall approach is approved.