Skip to content

KEP-3008: QoS-class resources #3004

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 92 commits into
base: master
Choose a base branch
from

Conversation

marquiz
Copy link
Contributor

@marquiz marquiz commented Oct 7, 2021

  • One-line PR description:
    New KEP for adding class-based resources to CRI protocol
  • Other comments:

@k8s-ci-robot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Oct 7, 2021
@marquiz
Copy link
Contributor Author

marquiz commented Oct 7, 2021

/cc @kad @haircommander

@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/node Categorizes an issue or PR as relevant to SIG Node. labels Oct 7, 2021
@kikisdeliveryservice
Copy link
Member

kikisdeliveryservice commented Oct 8, 2021

Hi @marquiz

All KEP PRs must have an open issue in k/enhancements (this repo). Please open an issue and fill it out completely and rename this PR to KEP-issue number but in the title of this PR and in your README.md

Thanks!

@marquiz marquiz mentioned this pull request Oct 11, 2021
4 tasks
@marquiz marquiz changed the title KEP-NNN: Class-based resources in CRI KEP-3008: Class-based resources in CRI Oct 11, 2021
@marquiz marquiz force-pushed the devel/kep-class-based-resources branch from 3c8bfb5 to ead1a96 Compare October 11, 2021 06:48
@marquiz
Copy link
Contributor Author

marquiz commented Oct 11, 2021

All KEP PRs must have an open issue in k/enhancements (this repo). Please open an issue and fill it out completely and rename this PR to KEP-issue number but in the title of this PR and in your README.md

Thanks for the guidance @kikisdeliveryservice. Done

@marquiz marquiz force-pushed the devel/kep-class-based-resources branch 3 times, most recently from ff20ea2 to 5f2e9fe Compare October 19, 2021 11:46
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 17, 2022
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 1, 2022
@marquiz
Copy link
Contributor Author

marquiz commented Feb 1, 2022

/retitle KEP-3008: Class-based resources

@k8s-ci-robot k8s-ci-robot changed the title KEP-3008: Class-based resources in CRI KEP-3008: Class-based resources Feb 1, 2022
@marquiz marquiz marked this pull request as ready for review February 11, 2022 13:18
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 11, 2022
@marquiz marquiz force-pushed the devel/kep-class-based-resources branch 3 times, most recently from 8786f80 to bf746cc Compare February 22, 2022 14:39
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: marquiz
Once this PR has been reviewed and has the lgtm label, please assign johnbelamaric for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@marquiz
Copy link
Contributor Author

marquiz commented Feb 1, 2024

Updated details on cluster autoscaler

@liggitt liggitt removed this from the v1.30 milestone Mar 14, 2024
- bump versions in kep.yaml
- fix typos
- added a use case for per-container OOM kill behavior
@marquiz
Copy link
Contributor Author

marquiz commented Apr 10, 2024

Small update, added per-container OOM kill behavior as a use case. There is a growing list of use cases for Kubernetes-managed QoS resources with links to ongoing efforts (e.g. swap and OOM kill)

@mkarrmann
Copy link

@marquiz As the person who's been most actively pushing for the OOM config option and just found about this KEP, I strongly agree that this is a great fit! This design is seems awesome and is a really nice generalization of the specific problem I was hoping to have solved!

container whose memory limit is higher than the requests, but that is treated
by the node as `Guaranteed`.

Taking this idea further, QoS-class resources could also make it possible to
Copy link
Contributor

@kannon92 kannon92 Apr 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe kubernetes/kubernetes#78848 is a good tangigle use case to mention.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kannon92 for the reference, I'll add it to the proposal. This is already speculated in Splitting Pod QoS Class section but I wasn't aware of this open issue.

QoS-class resource. Likely benefits of using the QoS-class resources mechanism
would be to be able to set per-namespace defaults with LimitRanges and allow
permission-control to high-priority classes with ResourceQuotas.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another idea is smarter oom setups (systemd-oomd for example).

systemd-oomd allows one to set settings for oom (psi based pressure and swap usage) on a cgroup slice. I could envision a future where one can create a QoS and that gives knobs to control systemd-oomd for more aggressive oom.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://fedoraproject.org/wiki/Changes/EnableSystemdOomd

I was playing around with this and I explored the ideas of setting different knobs based on the existing kubepods slices.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. So you think a about pre-defined oom kill classes (in contrast to e.g. exposing every possible systemd-oomd knob to the user). At first thought, this sounds like a good fit. I'll add this to the proposal too, as a possible future use case.

that kubelet could evict a running pod that request QoS-class resources that
are no more available on the node. This should be relatively straightforward to
implement as kubelet knows what QoS-class resources are available on the node
and also monitors all running pods.
Copy link
Contributor

@kannon92 kannon92 Apr 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some areas that eviction manager is starting to show its age.

  • If one wants to add more disks to kubelet, we lose the ability for eviction manager to monitor it.
  • PSI based eviction
  • Swap based eviction
  • Moving things out of root partition (logs come in mind)

I've been thinking that eventually we are going to need some kind of pluggable eviction manager based on certain resources. It should be general but I haven't really made much progress on it. wanted to throw this out there as it may be worth considering.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kannon92 thanks for the idea. Do you have any PoC implementation on this? If you have, it would be nice to wire it up to my code and see how it works.


<!-- References -->
[intel-rdt]: https://www.intel.com/content/www/us/en/architecture-and-technology/resource-director-technology.html
[linux-resctrl]: https://www.kernel.org/doc/html/latest/x86/resctrl.html
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This URL now 404s

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New link:

Suggested change
[linux-resctrl]: https://www.kernel.org/doc/html/latest/x86/resctrl.html
[linux-resctrl]: https://docs.kernel.org/arch/x86/resctrl.html

@Madhu-1
Copy link

Madhu-1 commented Jun 26, 2024

Hey @marquiz, it's great to see that this is making progress! Do you have any updates on the timeline for when this version is planned to be released? The KEP indicates it's planned for 1.31, but I was wondering if it might end up being pushed to 1.32 or later instead.

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 24, 2024
@kad
Copy link
Member

kad commented Sep 24, 2024

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 24, 2024
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 23, 2024
@kad
Copy link
Member

kad commented Dec 24, 2024

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 24, 2024
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 7, 2025
@kad
Copy link
Member

kad commented Apr 7, 2025

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 7, 2025
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-review Categorizes an issue or PR as actively needing an API review. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
Status: Assigned
Status: Needs Triage
Development

Successfully merging this pull request may close these issues.