Conversation

@grcevski
Contributor

This PR adds support for /_autoscaling/policy in file-based settings. It uses an approach similar to how we handled ILM policies, where the state handlers are loaded through SPI.

Relates to #89183

@grcevski added the >enhancement, :Core/Infra/Core, Team:Core/Infra, Team:Distributed (Obsolete), and v8.5.0 labels on Aug 29, 2022
@elasticsearchmachine
Collaborator

Hi @grcevski, I've created a changelog YAML for you.

@elasticsearchmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@elasticsearchmachine removed the Team:Distributed (Obsolete) label on Aug 29, 2022

private RerouteService rerouteService;

private AllocationService allocationService;
Contributor Author

I'm looking for suggestions on how to improve this. Right now the allocationDeciders are injected into the TransportActions that autoscaling uses, but I can't use injection here because we made a design decision to use SPI for the reserved state handlers.

Essentially, I need a way to get to the allocationDeciders so that I can create the validator and pass it to the transport action transformation methods.
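
To illustrate why SPI and constructor injection do not mix, here is a minimal sketch based on the JDK's java.util.ServiceLoader. The interface and class names are hypothetical stand-ins, and Elasticsearch's actual plugin loading may differ in detail. The point is only that the loader creates the provider instances itself, so the caller cannot hand them dependencies through a constructor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical service interface standing in for ReservedClusterStateHandlerProvider.
interface HandlerProviderSketch {
    List<String> handlerNames();
}

final class SpiDiscoverySketch {
    // Collects handler names from any source of providers.
    static List<String> collect(Iterable<HandlerProviderSketch> providers) {
        List<String> names = new ArrayList<>();
        for (HandlerProviderSketch provider : providers) {
            names.addAll(provider.handlerNames());
        }
        return names;
    }

    // ServiceLoader instantiates every registered provider itself, typically through
    // a public no-arg constructor, so no dependencies can be passed in at creation time.
    static List<String> discover() {
        return collect(ServiceLoader.load(HandlerProviderSketch.class));
    }
}
```

Any dependency such as the allocation deciders therefore has to reach the provider after construction, for example through a setter or a holder that is filled in later.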

/**
* Autoscaling provider implementation for the {@link ReservedClusterStateHandlerProvider} service interface
*/
public class LocalStateReservedAutoscalingStateHandlerProvider implements ReservedClusterStateHandlerProvider {
Contributor Author

This is a clone of the regular plugin SPI state handler provider for integration test purposes.

state = TransportPutAutoscalingPolicyAction.putAutoscalingPolicy(
state,
request,
policyValidatorHolder.get(clusterService.getAllocationService().getAllocationDeciders())
Contributor Author

Perhaps there's a better way to get to the allocationDeciders without injection?

Contributor

@henningandersen left a comment

I mostly focused on how to provide the allocation deciders and left a comment on that. I'd prefer to defer the rest of the review until that is tackled.

return rerouteService;
}

public void setAllocationService(AllocationService allocationService) {
Contributor

I think I'd prefer to either:

  1. Support injection for the reserved state plugins.
  2. Add an explicit AllocationDecidersProvider object that is passed to Plugin.createComponents that can be used to access this.

I think putting it here pollutes this class with responsibilities that do not concern it.

Contributor Author

Thanks for reviewing this, Henning. I took approach 2 and added the AllocationDeciders provider.
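
For readers following along, approach 2 might look roughly like the sketch below. Every type here is a simplified, hypothetical stand-in (the real AllocationDeciders and Plugin.createComponents have much richer signatures). The plugin receives a supplier of the deciders when its components are created and builds the validator from it on demand.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for the real AllocationDeciders class.
final class AllocationDecidersSketch {
    private final List<String> deciderNames;

    AllocationDecidersSketch(List<String> deciderNames) {
        this.deciderNames = List.copyOf(deciderNames);
    }

    List<String> names() {
        return deciderNames;
    }
}

// Hypothetical validator that needs the deciders to check a policy.
final class PolicyValidatorSketch {
    private final AllocationDecidersSketch deciders;

    PolicyValidatorSketch(AllocationDecidersSketch deciders) {
        this.deciders = deciders;
    }

    AllocationDecidersSketch deciders() {
        return deciders;
    }
}

// The plugin is created without arguments and gets its dependencies later.
final class AutoscalingPluginSketch {
    private Supplier<AllocationDecidersSketch> allocationDeciders;

    // Assumed shape only: the real createComponents takes many more parameters.
    void createComponents(Supplier<AllocationDecidersSketch> allocationDecidersSupplier) {
        this.allocationDeciders = allocationDecidersSupplier;
    }

    PolicyValidatorSketch validator() {
        return new PolicyValidatorSketch(allocationDeciders.get());
    }
}
```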

@grcevski requested a review from a team as a code owner on August 31, 2022 23:27
@grcevski requested review from AndersonQ and aleksmaus and removed request for a team on August 31, 2022 23:27
@grcevski
Contributor Author

grcevski commented Sep 1, 2022

@elasticsearchmachine run elasticsearch-ci/bwc

Contributor

@henningandersen left a comment

I got what I asked for 🙂, but unfortunately it resulted in quite a lot of changed files. Can we make the allocationDeciders change in a separate PR first, so that this PR becomes easier to review? I added a couple of comments on the allocationDeciders change that should be carried over to the new PR as well.

  repositoriesServiceReference::get,
- tracer
+ tracer,
+ clusterModule.getAllocationService()::getAllocationDeciders
Contributor

Since getAllocationDeciders returns a final field, I think this means that we already have the list of deciders, so we might as well pass the object directly instead of a supplier for it?

Comment on lines 125 to 126
var capacityServiceHolder = new AutoscalingCalculateCapacityService.Holder(this);
reservedAutoscalingPolicyAction.set(new ReservedAutoscalingPolicyAction(capacityServiceHolder, allocationDecidersSupplier));
Contributor

Since we now have the list of deciders directly, could we not just set it here rather than pass it around in several places? I.e., add:

this.allocationDeciders.set(deciders);

and remove it from the arguments to ReservedAutoscalingPolicyAction, as well as from being injected into the two transport services that get it (only to pass it to the plugin).
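
A minimal sketch of this suggestion, under the assumption that the plugin keeps the deciders in a set-once holder. SetOnceHolder below is written here for illustration and is not the actual holder class used by the plugin; DecidersStub is a hypothetical placeholder for AllocationDeciders.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative set-once holder: the value can be assigned exactly once.
final class SetOnceHolder<T> {
    private final AtomicReference<T> ref = new AtomicReference<>();

    void set(T value) {
        if (value == null) {
            throw new IllegalArgumentException("value must not be null");
        }
        if (!ref.compareAndSet(null, value)) {
            throw new IllegalStateException("already set");
        }
    }

    T get() {
        return ref.get();
    }
}

// Hypothetical placeholder for AllocationDeciders.
final class DecidersStub {
}

final class HolderPluginSketch {
    final SetOnceHolder<DecidersStub> allocationDeciders = new SetOnceHolder<>();

    // The deciders are stored once, at component creation time; everything else
    // reads them from the holder instead of receiving them as an argument.
    void createComponents(DecidersStub deciders) {
        this.allocationDeciders.set(deciders);
    }
}
```

With this shape the action and the transport classes can ask the plugin for the deciders when they need them, which is what removes the extra constructor arguments.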

@grcevski
Contributor Author

grcevski commented Sep 6, 2022

Sounds good. I'll split this off into another PR and mark this one as a draft until that's merged. Thanks again for the review!

@grcevski marked this pull request as draft on September 6, 2022 18:03

  // package private for testing
- Path operatorSettingsDir() {
+ public Path operatorSettingsDir() {
Contributor Author

Exposed for easier test writing.

/**
* Autoscaling provider implementation for the {@link ReservedClusterStateHandlerProvider} service interface
*/
public class LocalStateReservedAutoscalingStateHandlerProvider implements ReservedClusterStateHandlerProvider {
Contributor Author

This is a mock version of the SPI handler provider so that we can write Java integration tests. It implements equals and hashCode so that we can deduplicate the plugin, since it can be discovered multiple times in MockNode because all plugins are loaded by the same classloader.

Contributor

Should we not fix that in MockNode instead? I am not sure I follow the equals/hashCode problem, but it sounds like we are relying on MockNode using a Set to collect these providers?
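
To make the equals/hashCode point concrete, here is a small sketch of how value equality lets a Set collapse providers that were discovered more than once. The class and the pluginName field are hypothetical, and whether MockNode really collects providers in a Set is exactly the open question above.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical mock provider: two instances are equal when they wrap the same plugin.
final class MockProviderSketch {
    private final String pluginName;

    MockProviderSketch(String pluginName) {
        this.pluginName = pluginName;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        return Objects.equals(pluginName, ((MockProviderSketch) o).pluginName);
    }

    @Override
    public int hashCode() {
        return Objects.hash(pluginName);
    }
}

final class DedupSketch {
    // Counts the distinct providers after collecting them into a HashSet.
    static int distinctCount(MockProviderSketch... discovered) {
        Set<MockProviderSketch> unique = new HashSet<>();
        for (MockProviderSketch provider : discovered) {
            unique.add(provider);
        }
        return unique.size();
    }
}
```

Without the overrides, each discovered instance would count as distinct and the same handler would be registered more than once.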

@grcevski marked this pull request as ready for review on September 7, 2022 17:57
@elasticsearchmachine
Collaborator

Hi @grcevski, I've updated the changelog YAML for you.

@grcevski
Contributor Author

grcevski commented Sep 7, 2022

@elasticsearchmachine run elasticsearch-ci/part-1

Contributor

@henningandersen left a comment

LGTM.

I left a number of comments that I'd prefer to see addressed.

import java.util.Objects;

/**
* Autoscaling provider implementation for the {@link ReservedClusterStateHandlerProvider} service interface
Contributor

Can you add a comment explaining why we need this test override? It is not clear to me and I'd rather read it than find out.

this.clusterServiceHolder.set(clusterService);
this.allocationDeciders.set(allocationDeciders);
var capacityServiceHolder = new AutoscalingCalculateCapacityService.Holder(this);
this.reservedAutoscalingPolicyAction.set(new ReservedAutoscalingPolicyAction(capacityServiceHolder));
Contributor

Can we not simply defer the construction of this until reservedClusterStateHandlers? That seems more intuitive to me.

Contributor Author

I avoided doing that because then I'd also have to keep a local reference to the AutoscalingCalculateCapacityService.Holder, which I need to construct the action object.
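
The trade-off can be sketched as follows, with hypothetical stand-in classes. Deferring construction to reservedClusterStateHandlers means the plugin must keep the holder in a field of its own so that it is still available when the handlers are requested.

```java
import java.util.List;

// Hypothetical stand-in for AutoscalingCalculateCapacityService.Holder.
final class CapacityHolderSketch {
}

// Hypothetical stand-in for ReservedAutoscalingPolicyAction.
final class ReservedActionSketch {
    final CapacityHolderSketch holder;

    ReservedActionSketch(CapacityHolderSketch holder) {
        this.holder = holder;
    }
}

final class DeferredPluginSketch {
    // Extra state that deferred construction requires.
    private CapacityHolderSketch capacityServiceHolder;

    void createComponents() {
        this.capacityServiceHolder = new CapacityHolderSketch();
    }

    // The action is only built when the handlers are requested.
    List<ReservedActionSketch> reservedClusterStateHandlers() {
        return List.of(new ReservedActionSketch(capacityServiceHolder));
    }
}
```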


@Override
public TransformState transform(Object source, TransformState prevState) throws Exception {
var requests = prepare(source);
Contributor

Can we do the casting here rather than in prepare? That avoids SuppressWarnings, and it also seems more appropriate to deal with input validation immediately in this method.
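
One way to read this suggestion is to validate the untyped input element by element, which needs no unchecked cast at all. The sketch below assumes the source is a List of request objects; PolicyRequestSketch is a hypothetical stand-in for PutAutoscalingPolicyAction.Request.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for PutAutoscalingPolicyAction.Request.
final class PolicyRequestSketch {
    final String name;

    PolicyRequestSketch(String name) {
        this.name = name;
    }
}

final class CheckedCastSketch {
    // Validates the shape of the input and returns a properly typed list.
    static List<PolicyRequestSketch> checkedRequests(Object source) {
        if (!(source instanceof List<?>)) {
            throw new IllegalArgumentException("expected a list of policy requests");
        }
        List<?> raw = (List<?>) source; // wildcard cast, no unchecked warning
        List<PolicyRequestSketch> requests = new ArrayList<>(raw.size());
        for (Object item : raw) {
            if (!(item instanceof PolicyRequestSketch)) {
                throw new IllegalArgumentException("unexpected element: " + item);
            }
            requests.add((PolicyRequestSketch) item);
        }
        return requests;
    }
}
```

The cost is a copy of the list; the benefit is that malformed input fails with a clear message at the entry point instead of a ClassCastException later.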

}

@SuppressWarnings("unchecked")
public Collection<PutAutoscalingPolicyAction.Request> prepare(Object input) {
Contributor

I think this can be private?

return NAME;
}

@SuppressWarnings("unchecked")
Contributor

Can this be moved to the statement?
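
As a reminder of what moving it to the statement means in Java: @SuppressWarnings may annotate a local variable declaration, so the suppression covers only the one unchecked cast rather than the whole method. A minimal sketch with hypothetical names:

```java
import java.util.List;

final class NarrowSuppressionSketch {
    static List<String> asStrings(Object input) {
        // The annotation applies to this declaration only, not to the rest of the method.
        @SuppressWarnings("unchecked")
        List<String> values = (List<String>) input;
        return values;
    }
}
```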

}

@Override
public TransformState transform(Object source, TransformState prevState) throws Exception {
Contributor

Could this not get the right type instead? It seems to be T at the interface level, and the call in ReservedStateUpdateTask can be massaged to fix this. The service does "guarantee" it, since it only passes the output of fromXContent to this method.

Contributor Author

On the caller side in ReservedStateUpdateTask I have a composite object containing the data for all of the handlers. I parse everything first into a map from String key to the data type required by the handler, but I can't retain the type information because the data types are classes provided by plugins and the server doesn't know about them.
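
The situation described here can be sketched as below. All names are hypothetical and the real handler interface is more involved. Once the parsed values for all handlers sit in one Map<String, Object>, the per-handler type T is no longer visible to the caller, so each handler receives an Object and has to cast it back.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical handler: parsing is typed, but the apply step takes Object.
interface HandlerSketch<T> {
    String name();

    T fromXContent(String raw);

    String transform(Object source);
}

final class IntHandlerSketch implements HandlerSketch<Integer> {
    public String name() {
        return "ints";
    }

    public Integer fromXContent(String raw) {
        return Integer.valueOf(raw.trim());
    }

    public String transform(Object source) {
        // The handler knows its own type and casts the value back.
        return "int:" + (Integer) source;
    }
}

final class UpdateTaskSketch {
    static Map<String, String> run(List<HandlerSketch<?>> handlers, Map<String, String> rawByName) {
        // Parse everything first; the map can only be typed as Object values.
        Map<String, Object> parsed = new LinkedHashMap<>();
        for (HandlerSketch<?> handler : handlers) {
            parsed.put(handler.name(), handler.fromXContent(rawByName.get(handler.name())));
        }
        // Apply afterwards; the static type T of each value is gone at this point.
        Map<String, String> results = new LinkedHashMap<>();
        for (HandlerSketch<?> handler : handlers) {
            results.put(handler.name(), handler.transform(parsed.get(handler.name())));
        }
        return results;
    }
}
```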


prevState = updatedState;
updatedState = processJSON(action, prevState, json);
assertThat(updatedState.keys(), containsInAnyOrder("my_autoscaling_policy", "my_autoscaling_policy_1"));
Contributor

Can we also validate that the roles and deciders are correct in cluster state?

@grcevski merged commit 9d774d9 into elastic:main on Sep 13, 2022
@grcevski deleted the operator/autoscaling branch on September 13, 2022 15:56
Labels

:Core/Infra/Core, >enhancement, Team:Core/Infra, v8.5.0

3 participants