Cost based scaling down of pods #1888
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,382 @@ | ||
| --- | ||
| title: Cost based scaling down of pods | ||
| authors: | ||
| - "@ingvagabund" | ||
| owning-sig: sig-scheduling | ||
| participating-sigs: | ||
| - sig-apps | ||
| reviewers: | ||
| - TBD | ||
| approvers: | ||
| - TBD | ||
| editor: TBD | ||
| creation-date: 2020-06-30 | ||
| last-updated: 2020-07-20 | ||
| status: provisional | ||
| --- | ||
|
|
||
| # Cost based scaling down of pods | ||
|
|
||
| ## Table of Contents | ||
|
|
||
| <!-- toc --> | ||
| - [Summary](#summary) | ||
| - [Motivation](#motivation) | ||
| - [Goals](#goals) | ||
| - [Non-Goals](#non-goals) | ||
| - [Proposal](#proposal) | ||
| - [Examples of strategies](#examples-of-strategies) | ||
| - [Balancing duplicates among topological domains](#balancing-duplicates-among-topological-domains) | ||
| - [Pods not tolerating taints first](#pods-not-tolerating-taints-first) | ||
| - [Minimizing pod anti-affinity](#minimizing-pod-anti-affinity) | ||
| - [Rank normalization and weighted sum](#rank-normalization-and-weighted-sum) | ||
| - [User Stories [optional]](#user-stories-optional) | ||
| - [Story 1](#story-1) | ||
| - [Story 2](#story-2) | ||
| - [Story 3](#story-3) | ||
| - [Implementation Details/Notes/Constraints](#implementation-detailsnotesconstraints) | ||
| - [Phases](#phases) | ||
| - [Option A (field in a pod status)](#option-a-field-in-a-pod-status) | ||
| - [Option B (CRD for a pod group)](#option-b-crd-for-a-pod-group) | ||
| - [Workflow example](#workflow-example) | ||
| - [Risks and Mitigations](#risks-and-mitigations) | ||
| - [Design Details](#design-details) | ||
| - [Test Plan](#test-plan) | ||
| - [Graduation Criteria](#graduation-criteria) | ||
| - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) | ||
| - [Version Skew Strategy](#version-skew-strategy) | ||
| - [Implementation History](#implementation-history) | ||
| - [Alternatives [optional]](#alternatives-optional) | ||
| <!-- /toc --> | ||
|
|
||
| ## Summary | ||
|
|
||
| Ranking pods by cost through an external component and scaling down pods based | ||
|
> Review comment: I would try to be more explicit on what the current state is vs what is being proposed.
||
| on that cost makes it possible to employ various scheduling strategies that keep a cluster | ||
|
> Review comment: descheduling/eviction strategies?
||
| from diverging from an optimal distribution of resources. | ||
| Providing an external solution for selecting the right victim improves the ability | ||
| to preserve various conditions, such as balancing pods among failure domains, staying | ||
| aligned with security requirements, or respecting application policies. | ||
| Keeping controllers free of any scheduling strategy, while still making them aware of | ||
| the impact of removing pods on the overall cluster scheduling plan, helps to reduce | ||
| the cost of re-scheduling resources. | ||
|
|
||
| ## Motivation | ||
|
|
||
| Scaling down a set of pods does not always result in an optimal selection of victims. | ||
| The scheduler relies on filters and scores which may distribute the pods with respect to topology | ||
| spreading and/or load balancing constraints (e.g. pods uniformly balanced among zones). | ||
| Application-specific workloads may prefer to scale down short-running pods and favor long-running pods. | ||
| Selecting a victim with trivial logic can unbalance the topology spread | ||
| or cause jobs that have accumulated work to lose it in vain. | ||
| Given that it is a natural property of a cluster to shift workloads over time, | ||
| a decision made by the scheduler at some point is only as good as its ability to predict future demand. | ||
| The default Kubernetes scheduler was constructed with the goal of providing high throughput | ||
| at the cost of being simple. Thus, it is quite easy for a cluster to diverge from the scheduling plan. | ||
| In contrast, the descheduler helps to re-balance the cluster and bring it closer to | ||
| the scheduler's constraints. Yet, it is designed to run and adjust the cluster periodically (e.g. each hour), | ||
| which makes it unusable for scale-down purposes (which require immediate action). | ||
|
|
||
| On the other hand, each controller with a scale-down operation has its own | ||
| implementation of the victim selection logic. | ||
| That decision-making logic does not take the scheduling plan into account. | ||
| Extending each such controller with additional logic to support various scheduling | ||
| constraints is impractical. In cases where a proprietary solution for scaling down is required, | ||
| it is impossible. Also, controllers do not necessarily have a whole-cluster overview, | ||
| so their decisions are not necessarily optimal. | ||
| Therefore, it is more feasible to locate the logic outside of a controller. | ||
|
|
||
| In order to support a more informed scale-down operation while keeping the scheduling plan in mind, | ||
| additional decision logic that can be extended based on application requirements is needed. | ||
|
|
||
| ### Goals | ||
|
|
||
| - Allow controllers with a scale-down operation to select a victim while still respecting the scheduling plan | ||
| - Provide an external component that can rank pods based on how much their deletion would diverge from the scheduling plan | ||
|
> Review comment: These should be goals for the KEP. Is the plan to just provide the base binary and the framework, or is it also to provide strategies A, B and C?
||
|
|
||
| ### Non-Goals | ||
|
|
||
| - Employing strategies that require cost re-computation after scaling up/down (with support from controllers, e.g. backing off) | ||
|
|
||
| ## Proposal | ||
|
> Review comment: At the high level, I think this is a useful feature for various types of workloads (batch or serving) and placement strategies (spreading, binpacking, etc.).
||
|
|
||
| The proposed solution is to implement an optional cost-based component that watches all | ||
| pods (or a subset of them) and nodes (and potentially other objects) present in a cluster, | ||
| assigning each pod a cost based on a set of scheduling constraints. | ||
| At the same time, controller logic is extended to utilize the pod cost when selecting a victim during a scale-down operation. | ||
|
|
||
| The component will allow a different list of scheduling constraints to be selected for each targeted | ||
| set of pods. Each pod in a set will be given a cost based on how important it is within the set. | ||
| The constraints can follow the same rules as the scheduler (by importing scheduling plugins) | ||
| or be custom made (e.g. with respect to application or proprietary requirements). | ||
| The component will implement a mechanism for ranking pods. | ||
| <!-- Either by annotating a pod, updating its status, setting a new field in pod's spec | ||
| or creating a new CRD which will carry a cost. --> | ||
| Each controller will have the choice to either ignore the cost or take it into account | ||
| when scaling down. | ||
|
|
||
| This way, the logic for selecting a victim for the scale-down operation is | ||
| separated from each controller, allowing each consumer to provide its own | ||
| logic for assigning costs, yet having all controllers consume the cost uniformly. | ||
|
|
||
| Given that the default scheduler is not a source of truth for how a pod should be distributed | ||
| after it has been scheduled, scale-down strategies can exercise completely different approaches. | ||
|
|
||
| Examples of scheduling constraints: | ||
| - choose pods running on a node which has a `PreferNoSchedule` taint first | ||
| - choose youngest/oldest pods first | ||
| - choose pods minimizing topology skew among failure domains (e.g. availability zones) | ||
|
> Review comment (on lines +125 to +128): Does that mean the component will provide different sets of "cost" according to different strategies?
>
> Reply: Yes, each strategy will provide a different set of costs for a pod group. If configured and implemented properly, a suitable weighted sum of ranks can give a final rank, as is done in the scheduler when scoring multiple nodes today.
||
|
|
||
| The goal of the proposal is not to provide specific strategies for a more informed scale-down operation. | ||
| The primary goal is to provide a mechanism and have controllers implement that mechanism, | ||
| allowing consumers of the new component to define their own strategies. | ||
|
|
||
| ### Examples of strategies | ||
|
|
||
| Strategies can be divided into two categories: | ||
| - scaling down/up a pod group does not require rank re-computation | ||
| - scaling down/up a pod group requires rank re-computation | ||
|
|
||
| #### Balancing duplicates among topological domains | ||
|
|
||
| - Evict pods while minimizing skew between topological domains | ||
| - Each pod can be given a cost based on how old/young it is within its domain: | ||
| - if a pod is the first one in the domain, rank the pod with cost `1` | ||
| - if a pod was created second in the domain, rank the pod with cost `2` | ||
| - continue this way until all pods in all domains are ranked | ||
| - the higher the rank of a pod, the sooner the pod gets removed (see the sketch below) | ||
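
A minimal Go sketch of this ordinal ranking. It is illustrative only: it assumes pods expose their topology domain through a label (the label key is arbitrary) and that creation timestamps order pods within a domain; the package and function names are not part of the proposal.

```go
// Illustrative only: ordinal ranking of pods per topology domain.
package ranking

import (
	"sort"

	corev1 "k8s.io/api/core/v1"
)

// rankByDomainOrdinal gives the pod created first in each domain cost 1,
// the second one cost 2, and so on. A higher cost marks a preferred
// scale-down victim.
func rankByDomainOrdinal(pods []*corev1.Pod, domainLabel string) map[string]int {
	byDomain := map[string][]*corev1.Pod{}
	for _, p := range pods {
		domain := p.Labels[domainLabel]
		byDomain[domain] = append(byDomain[domain], p)
	}
	costs := map[string]int{}
	for _, group := range byDomain {
		// Oldest pod first within the domain.
		sort.Slice(group, func(i, j int) bool {
			return group[i].CreationTimestamp.Before(&group[j].CreationTimestamp)
		})
		for i, p := range group {
			costs[p.Name] = i + 1
		}
	}
	return costs
}
```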
|
|
||
| #### Pods not tolerating taints first | ||
|
|
||
| - Evict pods that do not tolerate taints before pods that tolerate taints. | ||
| - Each pod can be given a cost based on how many taints it does not tolerate | ||
| - the higher the rank of a pod, the sooner the pod gets removed (see the sketch below) | ||
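
A sketch of this strategy, assuming the pod's node is already known to the caller and assuming the `ToleratesTaint` helper on `corev1.Toleration` for toleration matching; all names are illustrative.

```go
// Illustrative only: rank pods by the number of node taints they do not tolerate.
package ranking

import corev1 "k8s.io/api/core/v1"

// taintCost returns the number of taints on the pod's node that the pod does
// not tolerate. The more untolerated taints, the higher the cost and the
// sooner the pod should be evicted.
func taintCost(pod *corev1.Pod, node *corev1.Node) int {
	cost := 0
	for i := range node.Spec.Taints {
		taint := node.Spec.Taints[i]
		tolerated := false
		for j := range pod.Spec.Tolerations {
			if pod.Spec.Tolerations[j].ToleratesTaint(&taint) {
				tolerated = true
				break
			}
		}
		if !tolerated {
			cost++
		}
	}
	return cost
}
```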
|
|
||
| #### Minimizing pod anti-affinity | ||
|
|
||
| - Evict pods maximizing anti-affinity first | ||
| - A pod that improves anti-affinity on a node gets a higher rank | ||
| - Given that multiple pod groups can be part of an anti-affinity group, scaling down | ||
| a single pod in a group requires re-computation of the ranks of all pods | ||
| taking part. Also, only a single pod can be scaled down at a time. | ||
| Otherwise, the ranks may no longer provide an optimal victim selection. | ||
|
|
||
| In the provided examples the first two strategies do not require rank re-computation. | ||
|
|
||
| ### Rank normalization and weighted sum | ||
|
> Review comment: Are you suggesting that multiple controllers could influence the cost of evicting a pod?
>
> Reply: I think we can make the assumption that the cost of a pod is updated by a single controller; the implementation detail of that controller is not relevant in this KEP, I think.
>
> Review comment: Checking something. In this proposal the cost of Pod eviction is part of its actual state (and not the desired state)? I'd expect eviction cost to be calculated by one thing using lots of plugins (much like how scoring works for scheduling), and that one thing would form part of the control plane.
>
> Reply: Yes, it would be part of the pod's status. I feel that the implementation detail of the controller updating the cost is not important here. I don't think this KEP needs to take a stand or promote implementing a controller in k/k. Such a controller warrants its own KEP if we feel we need one in-tree.
||
|
|
||
| In order to allow pod ranking by multiple strategies/constraints, it is important | ||
| to normalize ranks. On the other hand, rank normalization requires all strategies | ||
| to re-compute all ranks every time a pod is created/deleted. To eliminate the need | ||
| to re-compute, each strategy can introduce a threshold where every pod rank | ||
| exceeding the threshold gets rounded down to the threshold. | ||
| E.g. if a topology domain has at least 10 pods, the 11th and subsequent pods get the same | ||
| rank as the 10th pod. | ||
| With threshold-based normalization, multiple strategies can rank a pod group, and the results | ||
| can be used to compute a weighted rank across all relevant strategies, as sketched below. | ||
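
A sketch of threshold-based normalization and the weighted sum. The per-strategy thresholds and weights are configuration values invented for this example.

```go
// Illustrative only: combine ranks from several strategies into a single cost.
package ranking

// normalize caps a raw rank at the strategy's threshold and scales it to [0, 1].
// Thresholds are assumed to be positive.
func normalize(rank, threshold int) float64 {
	if rank > threshold {
		rank = threshold
	}
	return float64(rank) / float64(threshold)
}

// weightedCost computes a single cost for one pod from the ranks produced by
// several strategies, similar to how the scheduler combines node scores.
func weightedCost(ranks, thresholds map[string]int, weights map[string]float64) float64 {
	total := 0.0
	for strategy, rank := range ranks {
		total += weights[strategy] * normalize(rank, thresholds[strategy])
	}
	return total
}
```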
|
|
||
| ### User Stories [optional] | ||
|
> Review comment: Thanks!!! This one is a use case for the descheduler directly.
>
> Review comment: How are we going to serve the different stories? Different people want different strategies? Is it going to be a configuration for the controller or something that Pods can define? If that's something we don't want to discuss in this KEP, it should be added to Non-Goals.
||
|
|
||
| #### Story 1 | ||
|
|
||
| From [@pnovotnak](https://github.com/kubernetes/kubernetes/issues/4301#issuecomment-328685358): | ||
|
|
||
| ``` | ||
| I have a number of scientific programs that I've wrapped with code to talk | ||
| to a message broker that do not checkpoint state. The cost of deleting the resource | ||
| increases over time (some of these tasks take hours), until it completes the current unit of work. | ||
|
|
||
| Choosing a pod by most idle resources would also work in my case. | ||
| ``` | ||
|
|
||
| #### Story 2 | ||
|
|
||
| From [@cpwood](https://github.com/kubernetes/kubernetes/issues/4301#issuecomment-436587548) | ||
|
|
||
| ``` | ||
| For my use case, I'd prefer Kubernetes to choose its victims from pods that are running on nodes which have a PreferNoSchedule taint. | ||
| ``` | ||
|
|
||
| #### Story 3 | ||
|
|
||
| From [@barucoh](https://github.com/kubernetes/kubernetes/issues/89922) | ||
|
|
||
| ``` | ||
| A deployment with 3 replicas with anti-affinity rule to spread across 2 AZs scaled down to 2 replicas in only 1 AZ. | ||
| ``` | ||
|
|
||
| ### Implementation Details/Notes/Constraints | ||
|
> Review comment: I don't think we should center the discussion here around the descheduler. What we need to propose is an abstract concept of eviction cost, with the cost being comparable among the pods that belong to the same group/collection (have the same owner ref, for example).
>
> Review comment: If that's the intent of the KEP, it should be specified in Goals/Non-Goals.
||
|
|
||
| Currently, the descheduler does not allow reacting immediately to changes in a cluster. | ||
| Yet, with some modification, another instance of the descheduler (with a different set of strategies) | ||
| might be run in watch mode and rank each pod as it comes. | ||
| Also, once the scheduling framework gets migrated into its own repository, | ||
| scheduling plugins can be vendored as well to provide some of the core scheduling logic. | ||
|
|
||
| The pod ranking is best-effort, so in case a controller is to delete more than one pod, | ||
|
> Review comment: IMO "delete more than one pod" would be a common requirement, and "select all the pods with the smallest cost and remove them" may flip the imbalance situation easily. So is it possible to ensure the "cost" design supports "delete X pods in a batch"? No need to be perfect in the beginning, but keeping the batch-deletion requirement in mind from day 1 is necessary.
>
> Reply: I have added a few words on this topic. There are two categories of strategies defined: one category where there is no need to recompute costs, so it is safe to scale down by 2 or more replicas at the same time, and another category where each time a pod is added/removed, all the costs need to be re-computed. I am mainly focusing on the first category to provide basic functionality, with space to discuss the second category.
||
| it selects all the pods with the highest cost and removes those. | ||
| In case a pod fails to be deleted during the scale-down operation and the operation resumes in the next cycle, | ||
| the pods may get ranked differently and a different set of victim pods may be selected. | ||
|
|
||
| Once a pod is removed, the ranks of other pods might need to be re-computed, | ||
| unless only strategies that do not require re-computation are deployed. | ||
| By default, all pods owned by a controller template have to be ranked. | ||
| Otherwise, a controller falls back to its original victim selection logic; | ||
| alternatively, it can be configured to wait or back off. | ||
| Also, the ranking strategies can be configured to target only selected sets of pods, | ||
| thus allowing a controller to employ cost-based selection only when more sophisticated | ||
| logic is required and available. | ||
|
|
||
| During the alpha phase, each controller utilizing the pod ranking will feature-gate the new logic, | ||
| starting by utilizing a pod annotation (e.g. `scheduling.alpha.kubernetes.io/cost`) | ||
| which can eventually be promoted to a field in the pod's status or moved under a CRD (see further). | ||
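
For illustration, a ranked pod during the alpha phase might look as follows. The pod name, image, and cost value are made up; only the annotation key comes from this proposal.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: guestbook-frontend-7d4f9   # hypothetical pod name
  annotations:
    # Written by the ranking component; a higher value marks a preferred scale-down victim.
    scheduling.alpha.kubernetes.io/cost: "4"
spec:
  containers:
  - name: frontend
    image: example.com/guestbook-frontend:1.0   # placeholder image
```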
|
|
||
| If strategies requiring rank re-computation are employed, it's more practical to define | ||
| a CRD for a pod group and have all the costs in a single place to avoid desynchronization | ||
| of ranks among pods. | ||
|
|
||
| #### Phases | ||
|
|
||
| Phase 1: | ||
| - add support for strategies which do not need rank re-computation of a pod group | ||
| - only a single strategy can be run to rank pods (unless threshold-based normalization is applied) | ||
| - use annotations to hold a single pod cost | ||
|
|
||
| Phase 2A: | ||
| - promote pod cost annotation to a pod status field | ||
| - no synchronization of pods in a pod group, making it harder to support strategies which require rank re-computation | ||
|
|
||
| Phase 2B: | ||
| - use a CRD to hold costs of all pods in a pod group (to synchronize re-computation of ranks) | ||
| - add support for strategies which require rank re-computation | ||
|
|
||
| ### Option A (field in a pod status) | ||
|
> Review comment: +1 to this option. I don't think we need more than that; this is a simple and abstract concept that will be easy to reason about.
||
|
|
||
| Store a pod cost/rank under the pod's status so it can be updated only by a component | ||
| that has permission to update the pod status. | ||
|
|
||
| ```go | ||
| // PodStatus represents information about the status of a pod. Status may trail the actual | ||
| // state of a system, especially if the node that hosts the pod cannot contact the control | ||
| // plane. | ||
| type PodStatus struct { | ||
| ... | ||
| // More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#pod-cost | ||
| // +optional | ||
| Cost int `json:"cost,omitempty" protobuf:"bytes,...,opt,name=cost"` | ||
|
> Review comment: I imagine by default the controller (e.g., the ReplicaSet controller) that creates the pods would set this to the highest conceptual value for new pods, because we assume the scheduler will make the best placement possible. Another controller would be responsible for updating the cost to something lower at runtime based on cluster state.
>
> Review comment: Is it feasible to have the scheduler record its actual opinion of placement, or to leave room for adding that? For example: a three-zone cluster, even pod spreading, scaling from 2 to 3. The scheduler places the Pod in the zone with no replica and scores this as "I could not have done better". Now schedule 3 more Pods. The first two go, leaving this. The scheduler considers the 3rd Pod and sees it doesn't fit on Node …
>
> Review comment: I find the word "cost" very confusing for the intent of this proposal. I think "Penalty" would be more appropriate. Then it is clearer that a 0 penalty means less chance of being evicted.
||
| ... | ||
| } | ||
| ``` | ||
|
|
||
| Very simple: the cost is read directly from the pod status. | ||
| No additional apimachinery logic is needed. | ||
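
A minimal sketch of how a controller could consume the cost when scaling down by `n` replicas, under the assumption that only strategies not requiring re-computation are in use. The `cost` accessor abstracts reading the proposed status field (or the alpha annotation); all names are illustrative.

```go
// Illustrative only: cost-based victim selection inside a controller.
package controller

import (
	"sort"

	corev1 "k8s.io/api/core/v1"
)

// pickVictims returns up to n pods with the highest cost. If any pod is not
// ranked yet, it returns nil so the caller can fall back to its original
// victim selection logic (or back off and retry).
func pickVictims(pods []*corev1.Pod, n int, cost func(*corev1.Pod) (int, bool)) []*corev1.Pod {
	type rankedPod struct {
		pod  *corev1.Pod
		cost int
	}
	ranked := make([]rankedPod, 0, len(pods))
	for _, p := range pods {
		c, ok := cost(p)
		if !ok {
			return nil // at least one pod is missing a cost
		}
		ranked = append(ranked, rankedPod{pod: p, cost: c})
	}
	// Highest cost first.
	sort.Slice(ranked, func(i, j int) bool { return ranked[i].cost > ranked[j].cost })
	victims := make([]*corev1.Pod, 0, n)
	for i := 0; i < n && i < len(ranked); i++ {
		victims = append(victims, ranked[i].pod)
	}
	return victims
}
```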
|
|
||
| ### Option B (CRD for a pod group) | ||
|
> Review comment: I don't see this CRD adding value compared to option 1.
||
|
|
||
| ```yaml | ||
| apiVersion: scheduling.k8s.io/v1alpha1 | ||
| kind: PodGroupCost | ||
| metadata: | ||
| name: rc-guestbook-fronted | ||
| namespace: rc-guestbook-fronted-namespace | ||
| spec: | ||
| owner: | ||
| kind: ReplicationController | ||
| name: rc-guestbook-fronted # may be redundant | ||
| costs: | ||
| "rc-guestbook-fronted-pod1": 4 | ||
| "rc-guestbook-fronted-pod2": 8 | ||
| ... | ||
| "rc-guestbook-fronted-podn": 2 | ||
| ``` | ||
|
|
||
| More suitable for keeping all pod costs from a pod group in sync. | ||
| Controllers will need to take the new CRD into account (adding informers). | ||
| A CR will live in the same namespace as the underlying pod group (RC, RS, etc.). | ||
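
For illustration, Go types mirroring the `PodGroupCost` example above might look like this. The field names follow the YAML sketch and are not a finalized API.

```go
// Illustrative only: Go types for the PodGroupCost CRD sketched above.
package v1alpha1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

type PodGroupCost struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec PodGroupCostSpec `json:"spec"`
}

type PodGroupCostSpec struct {
	// Owner identifies the pod group, e.g. a ReplicationController or ReplicaSet.
	Owner GroupOwner `json:"owner"`
	// Costs maps pod names to their current cost within the group.
	Costs map[string]int64 `json:"costs"`
}

type GroupOwner struct {
	Kind string `json:"kind"`
	// Name may be redundant, as noted in the YAML example.
	Name string `json:"name,omitempty"`
}
```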
|
|
||
| ### Workflow example | ||
|
|
||
| **Scenario**: a pod group of 12 pods, 3 AZs (2 nodes per AZ), pods evenly spread among all zones | ||
|
|
||
| 1. Assume the pod group is supposed to respect topology spreading and the scale-down | ||
| operation is to minimize topology skew between domains. | ||
| 1. **Ranking component**: The component is configured to rank pods based on their presence in a topology domain | ||
| 1. **Ranking component**: The component notices the pods, analyzes the pod group and ranks the pods in the following manner (`{PodName: Rank}`): | ||
| - AZ1: {P1: 1, P2: 2, P3: 3, P4: 4} (P1 getting 1 as it was created first in the domain, P2 getting 2, etc.) | ||
| - AZ2: {P5: 1, P6: 2, P7: 3, P8: 4} | ||
| - AZ3: {P9: 1, P10: 2, P11: 3, P12: 4} | ||
| 1. **A controller**: Scale down operation of the pod group is requested | ||
| 1. **A controller**: Scale down logic of a controller selects one of P4, P8 or P12 as a victim (e.g. P8) | ||
| 1. Topology skew is now `1` | ||
| 1. **Ranking component**: No need to re-compute ranks since the ranking does not depend on the pod group size | ||
| 1. **A controller**: Scaling down one more time selects one of {P4, P12} | ||
| 1. Topology skew is still `1` | ||
|
|
||
| ### Risks and Mitigations | ||
|
> Review comment: Can you separate the risks with a numbered list?
||
|
|
||
| It may happen that the ranking component does not rank all relevant pods in time. | ||
| In that case a controller can either choose to ignore the cost, or it can back off | ||
| with a configurable timeout and retry the scale-down operation once all pods in | ||
| the given set are ranked. | ||
|
|
||
| From the security perspective, malicious code might assign a pod a different cost | ||
| with the goal of removing more vital pods and harming a running application. | ||
| How safe is using an annotation? It might be better to use the pod status, | ||
| so that only clients with pod/status update RBAC are allowed to change the cost. | ||
|
|
||
| In case a strategy needs to re-compute costs after a scale-down operation and | ||
| the component stops working (for any reason), a controller might scale down | ||
| the incorrect pod(s) on the next request. This is one more reason to constrain strategies | ||
| so that they do not need to re-compute pod costs. | ||
|
|
||
| In case the scale-down process is too quick, the component may be too slow to | ||
| recompute all scores and may provide suboptimal/incorrect costs. | ||
|
|
||
| In case two or more controllers own a pod group (through labels), scaling down the group by one | ||
| controller can result in scaling up the same group by another controller, | ||
| entering an endless loop of scaling up and down. This may result in unexpected | ||
| behavior and leave a subgroup of pods unranked. | ||
|
|
||
| Deployment upgrades might have different expectations when exercising a rolling update. | ||
| They could just completely ignore the costs, unless it is acceptable to scale down by one | ||
| and wait until the costs are recomputed when needed. | ||
|
|
||
| ## Design Details | ||
|
|
||
| ### Test Plan | ||
|
|
||
| **Scaling down respects pod ranking**: | ||
| In the simplest case the component ranks pods in a group. | ||
| The goal is to validate all pods are scaled down in an order | ||
| respecting ranks of all pods in a pod group. | ||
|
|
||
| **A controller ignores ranks if at least one pod is missing a rank**: | ||
| Testing the case where not every pod in a pod group is ranked. | ||
| A controller falls back to its original behavior if not all pods | ||
| in a pod group are ranked after a specified timeout (back-off simulation). | ||
|
|
||
| **In case a strategy requires re-computation after the pod group size changes**: | ||
| - a controller will not scale down by two replicas (only by one) | ||
| - a controller will not scale down by one until pod ranks are re-computed after the previous scale-down operation | ||
|
|
||
| ### Graduation Criteria | ||
|
|
||
| - Alpha: Initial support for taking pod cost into account when scaling down in controllers. Disabled by default. | ||
| - Beta: Enabled by default | ||
|
|
||
| ### Upgrade / Downgrade Strategy | ||
|
|
||
| Scaling down based on a pod cost is optional. If no cost is present, scaling down falls back to the original behavior. | ||
|
|
||
| ### Version Skew Strategy | ||
|
|
||
| A controller either recognizes pod's cost or it does not. | ||
|
|
||
| ## Implementation History | ||
|
|
||
| - KEP Started on 06/30/2020 | ||
|
|
||
| ## Alternatives [optional] | ||
|
|
||
| - Controllers might use a webhook and talk to the component directly to select a victim | ||
| - Some controllers might improve their decision logic to cover specific use cases (e.g. introduce new policy for sorting pods based on information located in pod objects) | ||
> Review comment: It looks like you are using the old template. Please update.