KEP-4671 Add docs for Workload API and Gang scheduling #53296
base: dev-1.35
erictune
left a comment
Text looks good.
Should the new concepts page be linked from somewhere?
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: erictune The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Thanks for the PR.
Because Pod is a stable API, you also need to update the Pod documentation. You need to do this work even though the new APIs are only alpha.
Explain that the behavior of Pod depends on whether the reader, a cluster administrator, has or has not enabled the relevant feature gates.
Watch out for putting new documentation in one page. It's tempting to do that because what you are documenting is part of one package of improvements; however, readers learn about different elements of Kubernetes in different pages, and these improvements touch on several of those (not just scheduling).
I would put most of the new content into the Workloads section of the docs, for example by adding a section about Pod groups, at one of:
• https://kubernetes.io/docs/concepts/workloads/pod-groups/
• https://kubernetes.io/docs/concepts/workloads/pods/groups/
(I prefer the former, personally; PodGroup is an API separate from Pod).
Gang scheduling, however, I would place at
• https://kubernetes.io/docs/concepts/scheduling-eviction/gang-scheduling/
You can also, either for alpha or beta, work with SIG Docs to add a new tutorial. If you do, various other pages can and should link there.
weight: 120
---

This page provides an overview of Workload Aware Scheduling (WAS), a Kubernetes feature
- This page provides an overview of Workload Aware Scheduling (WAS), a Kubernetes feature
+ This page provides an overview of _workload aware scheduling_ (WAS), a Kubernetes feature

{{< feature-state feature_gate_name="GenericWorkload" >}}

The `Workload` API resource, available from the `scheduling.k8s.io/v1alpha1` API group, allows you to logically group a set of Pods.
Following the style guide, you wouldn't use backticks around Workload. I recommend removing them.
# ...
```

## Scheduling Policies
We should update the Policies concept page to hyperlink to wherever we end up putting this section.
---

Enables the GangScheduling plugin in kube-scheduler, which implements "all-or-nothing"
scheduling algorithm. The Workload API is used to express the requirements.
(nit) Workload could be a hyperlink.
---

Enables the scheduling.k8s.io/v1alpha1 Workload API to express scheduling requirements
at the workload level. Pods can now reference a specific Workload PodGroup using the spec.workloadRef field.
(nit) PodGroup and Workload could be hyperlinks.
fromVersion: "1.35"
---

Enables the scheduling.k8s.io/v1alpha1 Workload API to express scheduling requirements
No it doesn't, surely. You must also enable that API group separately?
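To illustrate the distinction raised here: a feature gate and an API group version are switched on by different kube-apiserver flags. The fragment below is a minimal sketch, assuming a control plane that runs kube-apiserver as a static Pod. The manifest layout is an assumption; only the gate name and the API group come from this PR.

```yaml
# Sketch of a kube-apiserver static Pod manifest fragment
# (layout assumed; all other required flags omitted).
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    # Feature gate named in this PR.
    - --feature-gates=GenericWorkload=true
    # Alpha API versions are off by default, so the group
    # version has to be enabled separately.
    - --runtime-config=scheduling.k8s.io/v1alpha1=true
```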
spec:
  # controllerRef provides a link to the object that manages this Workload,
  # such as a Kubernetes Job. This is for tooling and observability.
  controllerRef:
We may need to explain the difference between "the Job controller" (which is a controller) and "a Job" (which represents a desired and observed state that the Job controller operates on).
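For readers following the thread, here is a minimal sketch of the objects under discussion. Only `scheduling.k8s.io/v1alpha1`, Workload, `controllerRef`, PodGroup, and `spec.workloadRef` appear in this PR excerpt. All object names and the remaining fields (`apiGroup`, `podGroups`, `policy.gang.minCount`, `workloadRef.podGroup`) are assumptions about the alpha API and should be checked against the API reference. Note that `controllerRef` here points at a Job object, not at the Job controller.

```yaml
# Hypothetical Workload; field names beyond controllerRef are assumptions.
apiVersion: scheduling.k8s.io/v1alpha1
kind: Workload
metadata:
  name: training-job-workload   # hypothetical name
spec:
  # Link to the object that manages this Workload (a Job object).
  controllerRef:
    apiGroup: batch
    kind: Job
    name: training-job          # hypothetical name
  podGroups:
  - name: workers               # hypothetical PodGroup name
    policy:
      gang:
        # Assumed meaning: schedule only if this many Pods fit at once.
        minCount: 4
---
# Hypothetical Pod that references the PodGroup above.
apiVersion: v1
kind: Pod
metadata:
  name: worker-0
spec:
  workloadRef:
    name: training-job-workload
    podGroup: workers
  containers:
  - name: main
    image: registry.example/training:latest   # placeholder image
```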
because no single node has enough capacity for them. The job cannot run,
but the scheduled Pods waste expensive resources that other applications could use.

Workload Aware Scheduling introduces a mechanism for the scheduler to identify and manage a group of Pods as a single, atomic workload.
Aim to write the documentation mostly as if the feature is already generally available, and then garnish it with caveats about it actually being alpha.
Good documentation is often timeless.
I wouldn't add this file at all.
/sig scheduling node |

## What is Workload Aware Scheduling?

The default Kubernetes scheduler makes decisions for one Pod at a time. This model works sufficiently good for stateless applications,
This isn't exactly true. The default scheduler's behavior, at the time this doc is live, depends on whether you have enabled the GangScheduling feature gate.
v1.35 K8s will, of course, support gang scheduling (as alpha), in-tree.
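The dependency on the gate can be made concrete with a scheduler configuration sketch. This assumes kube-scheduler runs as a static Pod and that GenericWorkload needs to be enabled alongside GangScheduling; both are assumptions, and only the gate names come from this PR.

```yaml
# Sketch of a kube-scheduler static Pod manifest fragment
# (layout assumed; all other required flags omitted).
spec:
  containers:
  - name: kube-scheduler
    command:
    - kube-scheduler
    # With GangScheduling off, the scheduler decides one Pod at a time.
    # Assumption: GenericWorkload is enabled together with it.
    - --feature-gates=GenericWorkload=true,GangScheduling=true
```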
@lmktfy thank you for your valuable review. Just to be on the same page:
What Pod documentation are you referring to? Are you talking about mentioning WorkloadReference somewhere in the “https://kubernetes.io/docs/concepts/workloads/pods/” section, or somewhere else?
So I should split the documentation page into two parts: move the part about the PodGroups to https://kubernetes.io/docs/concepts/workloads/pods-groups/, and the part about Gang Scheduling to https://kubernetes.io/docs/concepts/scheduling-eviction/gang-scheduling/, right? Should I describe the part about (whole) Workload API in the PodGroups docs or somewhere else?
Good idea, let's do that for the beta.
Description
This PR adds feature gates docs and a new Workload Aware Scheduling tab to the scheduling docs based on KEP-4671.
Issue
KEP: kubernetes/enhancements#4671