From 07b02d051f6d265b1b59a55c757f8e86ecb184f9 Mon Sep 17 00:00:00 2001 From: Jon Cope Date: Tue, 23 Sep 2025 16:44:50 -0500 Subject: [PATCH 1/8] init microshift featureGate enhancement --- .../enabling-user-specfied-featuregates.md | 447 ++++++++++++++++++ 1 file changed, 447 insertions(+) create mode 100644 enhancements/microshift/enabling-user-specfied-featuregates.md diff --git a/enhancements/microshift/enabling-user-specfied-featuregates.md b/enhancements/microshift/enabling-user-specfied-featuregates.md new file mode 100644 index 0000000000..995bf92e7e --- /dev/null +++ b/enhancements/microshift/enabling-user-specfied-featuregates.md @@ -0,0 +1,447 @@ +--- +title: enabling-user-specified-featuregates +authors: + - TBD +reviewers: + - TBD # MicroShift core team for configuration changes + - TBD # Kubernetes upstream expert for feature gate implications + - TBD # OpenShift platform team for alignment with OpenShift defaults +approvers: + - TBD # MicroShift principal engineer +api-approvers: + - None # Configuration file changes only, no API modifications +creation-date: 2025-01-XX # You'll need to fill in today's date +last-updated: 2025-01-XX +tracking-link: + - TBD # Link to USHIFT-6080 or the main epic ticket +see-also: + - TBD # Any related enhancements if applicable +--- + +# Enabling User-Specified FeatureGates in MicroShift + +## Summary + +MicroShift currently inherits feature gates from its Kubernetes and OpenShift upstream components but lacks a controlled mechanism for users to experiment with additional feature gates or override defaults. This enhancement proposes adding configuration support for Kubernetes and OpenShift feature gates through the MicroShift configuration file, while ensuring that default feature gates remain aligned with OpenShift during automated rebases. 
This capability will enable users to experiment with alpha and beta Kubernetes features like CPUManager's `prefer-align-cpus-by-uncorecache` in a supported and deterministic way, addressing edge computing use cases where users want to evaluate advanced resource management capabilities. + +## Motivation + +MicroShift users in edge computing environments want to experiment with upcoming Kubernetes features that are in alpha or beta stages to evaluate their potential benefits for specific use cases. Currently, users cannot configure feature gates in a supported way, preventing them from experimenting with capabilities like advanced CPU management, enhanced scheduling features, or experimental storage options that might improve performance in their resource-constrained edge environments. + +Additionally, the lack of automated feature gate alignment during OpenShift rebases has caused issues like [USHIFT-2813](https://issues.redhat.com/browse/USHIFT-2813), where conflicting feature gate defaults between different Kubernetes components (such as kube-controller-manager and kube-apiserver) led to compatibility problems during version upgrades. This enhancement addresses both user configuration needs and operational stability requirements. + +### User Stories + +* As a MicroShift administrator, I want to experiment with the CPUManager `prefer-align-cpus-by-uncorecache` feature gate so that I can evaluate whether it improves CPU allocation for my high-performance computing workloads on edge devices with specific CPU topologies. + +* As a MicroShift administrator, I want to configure feature gates through the MicroShift configuration file so that I can experiment with upcoming Kubernetes features in a controlled and supported manner. + +* As a MicroShift developer, I want feature gates to remain automatically aligned with OpenShift defaults during rebases so that I can avoid compatibility issues and manual intervention during version upgrades. 
+ +* As an edge computing platform operator, I want to experiment with specific alpha/beta features across my development and testing environments so that I can evaluate their potential benefits before considering them for production use. + +### Goals + +* Enable user configuration of Kubernetes and OpenShift feature gates through the MicroShift configuration file +* Maintain automatic alignment of default feature gates with OpenShift during rebases +* Provide a controlled and deterministic way to experiment with alpha and beta features +* Prevent feature gate misalignment issues during Kubernetes version upgrades +* Support edge computing experimentation with advanced resource management features + +### Non-Goals + +* Modifying OpenShift's feature gate defaults or upstream Kubernetes behavior +* Supporting feature gates that fundamentally conflict with MicroShift's architecture +* Automatic enablement of experimental features without explicit user configuration for experimentation + +## Proposal + +This enhancement proposes adding feature gate configuration support to MicroShift by extending `/etc/microshift/config.yaml` to mirror OpenShift's FeatureGate custom resource specification. The configuration will support both predefined feature sets and custom feature gate combinations, ensuring consistency with OpenShift's FeatureGate API patterns. + +The implementation includes: + +1. **FeatureGate Configuration Schema**: Extend MicroShift's configuration file to include `featureGates` section matching the OpenShift FeatureGate CRD spec fields (`featureSet` and `customNoUpgrade`) +2. **Predefined Feature Sets**: Support for OpenShift's predefined feature sets like `TechPreviewNoUpgrade` +3. **Custom Feature Gates**: Support for individual feature gate enablement/disablement via `customNoUpgrade` configuration +4. 
**Automated Rebase Integration**: Maintain feature gate alignment with OpenShift defaults during rebases + +This approach ensures that users can experiment with the same feature gate capabilities as OpenShift while maintaining MicroShift's file-based configuration pattern. Default feature gate values will continue to be inherited from OpenShift to ensure consistency across the platform. + +### Workflow Description + +**MicroShift Administrator** is a human user responsible for configuring and managing MicroShift deployments. + +**MicroShift Developer** is a human user responsible for maintaining MicroShift codebase and rebases. + +#### User Configuration Workflow +1. MicroShift Administrator identifies a need for specific feature gates (e.g., `CPUManagerPolicyAlphaOptions`) +2. Administrator chooses between two configuration approaches: + - **Predefined Feature Set**: Configure `featureGates.featureSet: TechPreviewNoUpgrade` for a curated set of preview features + - **Custom Feature Gates**: Configure `featureGates.featureSet: CustomNoUpgrade` and specify individual features in `featureGates.customNoUpgrade.enabled/disabled` lists +3. Administrator updates `/etc/microshift/config.yaml` with the chosen configuration +4. Administrator restarts MicroShift service +5. MicroShift parses the FeatureGate configuration and passes settings to relevant Kubernetes components where validation occurs +6. The features become available according to the configured state + +#### Automated Rebase Workflow +1. CI automation initiates OpenShift rebase process +2. Automated tooling compares feature gate defaults between MicroShift and OpenShift components +3. Conflicts or misalignments are detected and flagged +4. Developer resolves conflicts to ensure MicroShift maintains the same default feature gates as OpenShift +5. Default feature gate alignment with OpenShift is maintained automatically + +### API Extensions + +This enhancement extends MicroShift's configuration file schema only. 
No new CRDs, admission webhooks, conversion webhooks, aggregated API servers, or finalizers are introduced. The configuration file structure will be extended to include a `featureGates` section that mirrors the OpenShift FeatureGate CRD specification, providing consistency with OpenShift's feature gate configuration patterns while maintaining MicroShift's file-based configuration approach. + +### Topology Considerations + +#### Hypershift / Hosted Control Planes + +This enhancement is not applicable to Hypershift/Hosted Control Planes as feature gate configuration in hosted environments would be managed through the hosting cluster's OpenShift FeatureGate API rather than through MicroShift configuration. + +#### Standalone Clusters + +This enhancement is primarily designed for standalone MicroShift deployments where administrators need direct control over feature gate configuration through the local configuration file. + +#### Single-node Deployments or MicroShift + +This enhancement is specific to MicroShift only and does not affect single-node OpenShift (SNO) deployments. + +For MicroShift, feature gates configured through this mechanism will affect all Kubernetes components running within the MicroShift instance, including: + +- kubelet +- kube-apiserver +- kube-controller-manager +- kube-scheduler + +The resource consumption impact will be minimal as this enhancement only adds configuration parsing and pass-through functionality. The actual resource impact will depend on which feature gates are enabled by users and their specific behaviors. 
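The pass-through described above can be sketched as follows. This is a minimal illustration, not MicroShift's implementation: the helper names `resolve_custom_gates` and `to_flag_value` are hypothetical, and the sketch assumes the user supplies lists of enabled and disabled gate names. It renders them in the comma-separated `Name=true` form that the upstream `--feature-gates` flag accepts.

```python
def resolve_custom_gates(enabled, disabled):
    """Map each named gate to True (enabled) or False (disabled).

    Hypothetical helper. If a name appears in both lists, the disabled
    entry wins because it is applied last; that tie-break is a choice of
    this sketch, not something the proposal specifies.
    """
    gates = {name: True for name in enabled}
    gates.update({name: False for name in disabled})
    return gates


def to_flag_value(gates):
    """Render gates as `Name=true,Other=false`, sorted for stable output."""
    return ",".join(
        f"{name}={str(value).lower()}" for name, value in sorted(gates.items())
    )


if __name__ == "__main__":
    gates = resolve_custom_gates(
        enabled=["CPUManagerPolicyAlphaOptions"],
        disabled=["SomeDefaultEnabledFeature"],  # placeholder name
    )
    print(to_flag_value(gates))
```

The same resolved mapping could be written into each component's configuration file instead of a flag. Sorting the names keeps the rendered value deterministic across restarts.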
+ +### Implementation Details/Notes/Constraints + +#### Configuration Schema Extension + +The MicroShift configuration file will be extended to include a new `featureGates` section that mirrors the OpenShift FeatureGate CRD specification: + +**Predefined Feature Set Configuration:** +```yaml +featureGates: + featureSet: TechPreviewNoUpgrade +``` + +**Custom Feature Gates Configuration:** +```yaml +featureGates: + featureSet: CustomNoUpgrade + customNoUpgrade: + enabled: + - "CPUManagerPolicyAlphaOptions" + - "MemoryQoS" + disabled: + - "SomeDefaultEnabledFeature" +``` + +**Configuration Rules:** +- The `featureSet` field is required when configuring feature gates +- When using `customNoUpgrade`, the `featureSet` must be set to `CustomNoUpgrade` +- The `customNoUpgrade` field is only valid when `featureSet: CustomNoUpgrade` + +This configuration will be parsed during MicroShift startup and the feature gate settings will be passed to the appropriate Kubernetes components via their command-line arguments or configuration files. + +#### Component Integration + +Feature gates will be applied to the following MicroShift components, which are integrated into the MicroShift runtime rather than running as separate processes: +- **kubelet**: Feature gates specified in kubelet configuration file +- **kube-apiserver**: Feature gates specified in kube-apiserver configuration file +- **kube-controller-manager**: Feature gates specified in kube-controller-manager configuration file +- **kube-scheduler**: Feature gates specified in kube-scheduler configuration file + +MicroShift will generate or modify the appropriate configuration files for each component based on the user's feature gate settings in the MicroShift configuration file. + +#### Rebase Automation + +To address the requirements from [USHIFT-2813](https://issues.redhat.com/browse/USHIFT-2813), the rebase process will include: + +1. 
**Feature Gate Inventory**: Automated tooling to extract default feature gate settings from each Kubernetes component +2. **Conflict Detection**: Comparison logic to identify conflicting defaults between components +3. **Alignment Verification**: CI checks to ensure MicroShift's defaults match OpenShift's defaults +4. **Override Mechanism**: Use of Kubernetes' `OverrideDefault` method where necessary to resolve conflicts + +#### Validation and Error Handling + +- Invalid feature gate names will be caught by the Kubernetes components themselves +- MicroShift will log configuration parsing errors but delegate feature gate validation to the components +- Conflicting feature gate settings between user configuration and component requirements will result in component startup failures with appropriate error messages + +### Risks and Mitigations + +**Risk: Feature Gate Conflicts Between Components** +Components may have conflicting default values for the same feature gate, as experienced in [USHIFT-2813](https://issues.redhat.com/browse/USHIFT-2813). + +*Mitigation:* Implement automated detection during the rebase process to identify conflicts early and establish a consistent approach to resolve conflicting feature gate defaults across all components. + +**Risk: Experimenting with Unstable Alpha Features** +Users experimenting with alpha-stage feature gates may encounter instability or data loss in their MicroShift deployments. + +*Mitigation:* Emphasize that experimentation should be conducted in non-production environments. Feature gate validation will be handled by the Kubernetes components themselves. + +**Risk: Feature Gate Misalignment During Rebases** +Manual rebase processes may miss feature gate changes in OpenShift, leading to divergent behavior. + +*Mitigation:* Integrate automated feature gate alignment checks into the CI rebase process to ensure MicroShift maintains the same defaults as OpenShift automatically. 
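The alignment check in the mitigation above could start from a comparison like the one sketched here. It assumes a per-component inventory of default feature gate values has already been extracted; the input format, the function name `find_conflicting_defaults`, and the gate names in the example are all hypothetical.

```python
def find_conflicting_defaults(defaults_by_component):
    """Return {gate: {component: default}} for gates whose defaults disagree.

    `defaults_by_component` maps a component name to a {gate: bool} dict.
    Gates known to only one component are never reported as conflicts.
    """
    seen = {}
    for component, gates in defaults_by_component.items():
        for gate, default in gates.items():
            seen.setdefault(gate, {})[component] = default
    return {
        gate: by_component
        for gate, by_component in seen.items()
        if len(set(by_component.values())) > 1
    }


if __name__ == "__main__":
    inventory = {
        "kube-apiserver": {"ExampleGate": True, "OtherGate": False},
        "kube-controller-manager": {"ExampleGate": False, "OtherGate": False},
    }
    for gate, by_component in find_conflicting_defaults(inventory).items():
        print(f"conflicting default for {gate}: {by_component}")
```

A CI job could fail the rebase whenever the returned mapping is non-empty, leaving the resolution to a developer.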
+ +**Risk: Configuration Errors** +Invalid feature gate configurations could prevent MicroShift components from starting. + +*Mitigation:* Leverage Kubernetes component validation for feature gate names and values. Provide clear error messages and documentation for troubleshooting configuration issues. + +**Risk: Security Implications** +Some feature gates may expose new attack vectors or security vulnerabilities. + +*Mitigation:* Security review will follow standard MicroShift processes. Feature gates that fundamentally conflict with MicroShift's security model will be documented as unsupported. + +### Drawbacks + +**Increased Configuration Complexity** +Adding feature gate configuration increases the complexity of MicroShift's configuration surface area. Users must understand both the feature gates themselves and their potential interactions, which could lead to misconfigurations in edge deployments where troubleshooting access is limited. + +**Maintenance Burden** +This enhancement requires ongoing maintenance to keep feature gate handling aligned with OpenShift during rebases. The automated tooling and processes need to be maintained and updated as Kubernetes and OpenShift evolve their feature gate mechanisms. + +**Support Complexity** +Enabling alpha and beta features through user configuration means support teams may encounter issues related to experimental functionality that behaves differently across Kubernetes versions or has incomplete implementations. + +**Edge Device Risk** +Edge deployments often have limited remote access for troubleshooting. If users enable experimental feature gates that cause instability, recovering these devices may require physical access or complex recovery procedures. + +**Upgrade Limitations and Irreversible Changes** +Enabling `TechPreviewNoUpgrade` feature set cannot be undone and prevents both minor version updates and major upgrades. Once enabled, the cluster permanently loses the ability to perform standard updates. 
Similarly, `CustomNoUpgrade` configurations prevent upgrades/updates until reset to default settings. These feature sets are explicitly not recommended for production clusters due to their irreversible nature and update limitations, which conflicts with the typical edge deployment requirement for reliable, long-term operation and maintenance. + +## Alternatives (Not Implemented) + +No significant alternatives were considered for this enhancement. The configuration file approach aligns with MicroShift's existing patterns and provides the required user-configurable feature gates with automated OpenShift alignment. + +## Open Questions [optional] + +This is where to call out areas of the design that require closure before deciding +to implement the design. For instance, + > 1. This requires exposing previously private resources which contain sensitive + information. Can we do this? + +## Test Plan + +**Note:** *Section not required until targeted at a release.* + +Consider the following in developing a test plan for this enhancement: +- Will there be e2e and integration tests, in addition to unit tests? +- How will it be tested in isolation vs with other components? +- What additional testing is necessary to support managed OpenShift service-based offerings? + +No need to outline all of the test cases, just the general strategy. Anything +that would count as tricky in the implementation and anything particularly +challenging to test should be called out. + +All code is expected to have adequate tests (eventually with coverage +expectations). + +## Graduation Criteria + +**Note:** *Section not required until targeted at a release.* + +Define graduation milestones. + +These may be defined in terms of API maturity, or as something else. Initial proposal +should keep this high-level with a focus on what signals will be looked at to +determine graduation. 
+ +Consider the following in developing the graduation criteria for this +enhancement: + +- Maturity levels + - [`alpha`, `beta`, `stable` in upstream Kubernetes][maturity-levels] + - `Dev Preview`, `Tech Preview`, `GA` in OpenShift +- [Deprecation policy][deprecation-policy] + +Clearly define what graduation means by either linking to the [API doc definition](https://kubernetes.io/docs/concepts/overview/kubernetes-api/#api-versioning), +or by redefining what graduation means. + +In general, we try to use the same stages (alpha, beta, GA), regardless how the functionality is accessed. + +[maturity-levels]: https://git.k8s.io/community/contributors/devel/sig-architecture/api_changes.md#alpha-beta-and-stable-versions +[deprecation-policy]: https://kubernetes.io/docs/reference/using-api/deprecation-policy/ + +**If this is a user facing change requiring new or updated documentation in [openshift-docs](https://github.com/openshift/openshift-docs/), +please be sure to include in the graduation criteria.** + +**Examples**: These are generalized examples to consider, in addition +to the aforementioned [maturity levels][maturity-levels]. 
+ +### Dev Preview -> Tech Preview + +- Ability to utilize the enhancement end to end +- End user documentation, relative API stability +- Sufficient test coverage +- Gather feedback from users rather than just developers +- Enumerate service level indicators (SLIs), expose SLIs as metrics +- Write symptoms-based alerts for the component(s) + +### Tech Preview -> GA + +- More testing (upgrade, downgrade, scale) +- Sufficient time for feedback +- Available by default +- Backhaul SLI telemetry +- Document SLOs for the component +- Conduct load testing +- User facing documentation created in [openshift-docs](https://github.com/openshift/openshift-docs/) + +**For non-optional features moving to GA, the graduation criteria must include +end to end tests.** + +### Removing a deprecated feature + +- Announce deprecation and support policy of the existing feature +- Deprecate the feature + +## Upgrade / Downgrade Strategy + +If applicable, how will the component be upgraded and downgraded? Make sure this +is in the test plan. + +Consider the following in developing an upgrade/downgrade strategy for this +enhancement: +- What changes (in invocations, configurations, API use, etc.) is an existing + cluster required to make on upgrade in order to keep previous behavior? +- What changes (in invocations, configurations, API use, etc.) is an existing + cluster required to make on upgrade in order to make use of the enhancement? + +Upgrade expectations: +- Each component should remain available for user requests and + workloads during upgrades. Ensure the components leverage best practices in handling [voluntary + disruption](https://kubernetes.io/docs/concepts/workloads/pods/disruptions/). Any exception to + this should be identified and discussed here. +- Micro version upgrades - users should be able to skip forward versions within a + minor release stream without being required to pass through intermediate + versions - i.e. 
`x.y.N->x.y.N+2` should work without requiring `x.y.N->x.y.N+1` + as an intermediate step. +- Minor version upgrades - you only need to support `x.N->x.N+1` upgrade + steps. So, for example, it is acceptable to require a user running 4.3 to + upgrade to 4.5 with a `4.3->4.4` step followed by a `4.4->4.5` step. +- While an upgrade is in progress, new component versions should + continue to operate correctly in concert with older component + versions (aka "version skew"). For example, if a node is down, and + an operator is rolling out a daemonset, the old and new daemonset + pods must continue to work correctly even while the cluster remains + in this partially upgraded state for some time. + +Downgrade expectations: +- If an `N->N+1` upgrade fails mid-way through, or if the `N+1` cluster is + misbehaving, it should be possible for the user to rollback to `N`. It is + acceptable to require some documented manual steps in order to fully restore + the downgraded cluster to its previous state. Examples of acceptable steps + include: + - Deleting any CVO-managed resources added by the new version. The + CVO does not currently delete resources that no longer exist in + the target version. + +## Version Skew Strategy + +How will the component handle version skew with other components? +What are the guarantees? Make sure this is in the test plan. + +Consider the following in developing a version skew strategy for this +enhancement: +- During an upgrade, we will always have skew among components, how will this impact your work? +- Does this enhancement involve coordinating behavior in the control plane and + in the kubelet? How does an n-2 kubelet without this feature available behave + when this feature is used? +- Will any other components on the node change? For example, changes to CSI, CRI + or CNI may require updating that component before the kubelet. 
+ +## Operational Aspects of API Extensions + +Describe the impact of API extensions (mentioned in the proposal section, i.e. CRDs, +admission and conversion webhooks, aggregated API servers, finalizers) here in detail, +especially how they impact the OCP system architecture and operational aspects. + +- For conversion/admission webhooks and aggregated apiservers: what are the SLIs (Service Level + Indicators) an administrator or support can use to determine the health of the API extensions + + Examples (metrics, alerts, operator conditions) + - authentication-operator condition `APIServerDegraded=False` + - authentication-operator condition `APIServerAvailable=True` + - openshift-authentication/oauth-apiserver deployment and pods health + +- What impact do these API extensions have on existing SLIs (e.g. scalability, API throughput, + API availability) + + Examples: + - Adds 1s to every pod update in the system, slowing down pod scheduling by 5s on average. + - Fails creation of ConfigMap in the system when the webhook is not available. + - Adds a dependency on the SDN service network for all resources, risking API availability in case + of SDN issues. + - Expected use-cases require less than 1000 instances of the CRD, not impacting + general API throughput. + +- How is the impact on existing SLIs to be measured and when (e.g. every release by QE, or + automatically in CI) and by whom (e.g. perf team; name the responsible person and let them review + this enhancement) + +- Describe the possible failure modes of the API extensions. +- Describe how a failure or behaviour of the extension will impact the overall cluster health + (e.g. which kube-controller-manager functionality will stop working), especially regarding + stability, availability, performance and security. +- Describe which OCP teams are likely to be called upon in case of escalation with one of the failure modes + and add them as reviewers to this enhancement. 
+ +## Support Procedures + +Describe how to +- detect the failure modes in a support situation, describe possible symptoms (events, metrics, + alerts, which log output in which component) + + Examples: + - If the webhook is not running, kube-apiserver logs will show errors like "failed to call admission webhook xyz". + - Operator X will degrade with message "Failed to launch webhook server" and reason "WehhookServerFailed". + - The metric `webhook_admission_duration_seconds("openpolicyagent-admission", "mutating", "put", "false")` + will show >1s latency and alert `WebhookAdmissionLatencyHigh` will fire. + +- disable the API extension (e.g. remove MutatingWebhookConfiguration `xyz`, remove APIService `foo`) + + - What consequences does it have on the cluster health? + + Examples: + - Garbage collection in kube-controller-manager will stop working. + - Quota will be wrongly computed. + - Disabling/removing the CRD is not possible without removing the CR instances. Customer will lose data. + Disabling the conversion webhook will break garbage collection. + + - What consequences does it have on existing, running workloads? + + Examples: + - New namespaces won't get the finalizer "xyz" and hence might leak resource X + when deleted. + - SDN pod-to-pod routing will stop updating, potentially breaking pod-to-pod + communication after some minutes. + + - What consequences does it have for newly created workloads? + + Examples: + - New pods in namespace with Istio support will not get sidecars injected, breaking + their networking. + +- Does functionality fail gracefully and will work resume when re-enabled without risking + consistency? + + Examples: + - The mutating admission webhook "xyz" has FailPolicy=Ignore and hence + will not block the creation or updates on objects when it fails. When the + webhook comes back online, there is a controller reconciling all objects, applying + labels that were not applied during admission webhook downtime. 
+ - Namespaces deletion will not delete all objects in etcd, leading to zombie + objects when another namespace with the same name is created. + +## Infrastructure Needed [optional] + +Use this section if you need things from the project. Examples include a new +subproject, repos requested, github details, and/or testing infrastructure. \ No newline at end of file From 2a2ff34e543ca292484cf879d909ce5d9fc78c27 Mon Sep 17 00:00:00 2001 From: Jon Cope Date: Fri, 26 Sep 2025 14:13:03 -0500 Subject: [PATCH 2/8] draft for review --- .../enabling-user-specfied-featuregates.md | 351 +++++++----------- 1 file changed, 124 insertions(+), 227 deletions(-) diff --git a/enhancements/microshift/enabling-user-specfied-featuregates.md b/enhancements/microshift/enabling-user-specfied-featuregates.md index 995bf92e7e..6a4fbae013 100644 --- a/enhancements/microshift/enabling-user-specfied-featuregates.md +++ b/enhancements/microshift/enabling-user-specfied-featuregates.md @@ -1,69 +1,58 @@ --- title: enabling-user-specified-featuregates authors: - - TBD -reviewers: - - TBD # MicroShift core team for configuration changes - - TBD # Kubernetes upstream expert for feature gate implications + - copejon +reviewers: + - pacevedom # MicroShift core team for configuration changes - TBD # OpenShift platform team for alignment with OpenShift defaults approvers: - TBD # MicroShift principal engineer api-approvers: - None # Configuration file changes only, no API modifications -creation-date: 2025-01-XX # You'll need to fill in today's date -last-updated: 2025-01-XX +creation-date: 2025-09-24 # You'll need to fill in today's date +last-updated: 2025-09-24 tracking-link: - - TBD # Link to USHIFT-6080 or the main epic ticket + - # Link to USHIFT-6080 or the main epic ticket see-also: - - TBD # Any related enhancements if applicable + - "" --- # Enabling User-Specified FeatureGates in MicroShift ## Summary -MicroShift currently inherits feature gates from its Kubernetes and OpenShift upstream 
components but lacks a controlled mechanism for users to experiment with additional feature gates or override defaults. This enhancement proposes adding configuration support for Kubernetes and OpenShift feature gates through the MicroShift configuration file, while ensuring that default feature gates remain aligned with OpenShift during automated rebases. This capability will enable users to experiment with alpha and beta Kubernetes features like CPUManager's `prefer-align-cpus-by-uncorecache` in a supported and deterministic way, addressing edge computing use cases where users want to evaluate advanced resource management capabilities. +MicroShift currently inherits feature gates from its OpenShift components but lacks a controlled mechanism for users to experiment with additional feature gates or override defaults. This enhancement proposes adding configuration support for Kubernetes and OpenShift feature gates through the MicroShift configuration file. This capability will enable users to experiment with alpha and beta OpenShift and Kubernetes features like CPUManager's `prefer-align-cpus-by-uncorecache` in a supported and deterministic way, addressing edge computing use cases where users want to evaluate advanced resource management capabilities. ## Motivation MicroShift users in edge computing environments want to experiment with upcoming Kubernetes features that are in alpha or beta stages to evaluate their potential benefits for specific use cases. Currently, users cannot configure feature gates in a supported way, preventing them from experimenting with capabilities like advanced CPU management, enhanced scheduling features, or experimental storage options that might improve performance in their resource-constrained edge environments. 
-Additionally, the lack of automated feature gate alignment during OpenShift rebases has caused issues like [USHIFT-2813](https://issues.redhat.com/browse/USHIFT-2813), where conflicting feature gate defaults between different Kubernetes components (such as kube-controller-manager and kube-apiserver) led to compatibility problems during version upgrades. This enhancement addresses both user configuration needs and operational stability requirements. - ### User Stories -* As a MicroShift administrator, I want to experiment with the CPUManager `prefer-align-cpus-by-uncorecache` feature gate so that I can evaluate whether it improves CPU allocation for my high-performance computing workloads on edge devices with specific CPU topologies. - -* As a MicroShift administrator, I want to configure feature gates through the MicroShift configuration file so that I can experiment with upcoming Kubernetes features in a controlled and supported manner. - -* As a MicroShift developer, I want feature gates to remain automatically aligned with OpenShift defaults during rebases so that I can avoid compatibility issues and manual intervention during version upgrades. - -* As an edge computing platform operator, I want to experiment with specific alpha/beta features across my development and testing environments so that I can evaluate their potential benefits before considering them for production use. +* As a MicroShift administrator, I want to configure feature gates through the MicroShift configuration file so that I can experiment with alpha/beta OpenShift features in a controlled and supported manner. 
 ### Goals * Enable user configuration of Kubernetes and OpenShift feature gates through the MicroShift configuration file -* Maintain automatic alignment of default feature gates with OpenShift during rebases * Provide a controlled and deterministic way to experiment with alpha and beta features -* Prevent feature gate misalignment issues during Kubernetes version upgrades -* Support edge computing experimentation with advanced resource management features ### Non-Goals -* Modifying OpenShift's feature gate defaults or upstream Kubernetes behavior -* Supporting feature gates that fundamentally conflict with MicroShift's architecture -* Automatic enablement of experimental features without explicit user configuration for experimentation +* Modifying OpenShift's feature gate defaults +* Vetting feature gates for compatibility with MicroShift +* Validating custom feature gate settings for correctness, e.g. spelling, case, and punctuation +* Automatic enablement of experimental features without explicit user configuration +* Providing upgrade support to customized clusters ## Proposal -This enhancement proposes adding feature gate configuration support to MicroShift by extending `/etc/microshift/config.yaml` to mirror OpenShift's FeatureGate custom resource specification. The configuration will support both predefined feature sets and custom feature gate combinations, ensuring consistency with OpenShift's FeatureGate API patterns. +This enhancement proposes adding feature gate configuration support to MicroShift by extending `/etc/microshift/config.yaml` with a configuration schema inspired by OpenShift's FeatureGate custom resource specification. The configuration will support both predefined feature sets and custom feature gate combinations, ensuring consistency with OpenShift's FeatureGate API patterns. The implementation includes: -1. 
**FeatureGate Configuration Schema**: Extend MicroShift's configuration file to include `featureGates` section matching the OpenShift FeatureGate CRD spec fields (`featureSet` and `customNoUpgrade`) +1. **FeatureGate Configuration Schema**: Extend MicroShift's configuration file to include `featureGates` section inspired by OpenShift's FeatureGate CRD spec fields (`featureSet` and `customNoUpgrade`) 2. **Predefined Feature Sets**: Support for OpenShift's predefined feature sets like `TechPreviewNoUpgrade` 3. **Custom Feature Gates**: Support for individual feature gate enablement/disablement via `customNoUpgrade` configuration -4. **Automated Rebase Integration**: Maintain feature gate alignment with OpenShift defaults during rebases This approach ensures that users can experiment with the same feature gate capabilities as OpenShift while maintaining MicroShift's file-based configuration pattern. Default feature gate values will continue to be inherited from OpenShift to ensure consistency across the platform. @@ -71,8 +60,6 @@ This approach ensures that users can experiment with the same feature gate capab **MicroShift Administrator** is a human user responsible for configuring and managing MicroShift deployments. -**MicroShift Developer** is a human user responsible for maintaining MicroShift codebase and rebases. - #### User Configuration Workflow 1. MicroShift Administrator identifies a need for specific feature gates (e.g., `CPUManagerPolicyAlphaOptions`) 2. Administrator chooses between two configuration approaches: @@ -81,18 +68,11 @@ This approach ensures that users can experiment with the same feature gate capab 3. Administrator updates `/etc/microshift/config.yaml` with the chosen configuration 4. Administrator restarts MicroShift service 5. MicroShift parses the FeatureGate configuration and passes settings to relevant Kubernetes components where validation occurs -6. 
The features become available according to the configured state - -#### Automated Rebase Workflow -1. CI automation initiates OpenShift rebase process -2. Automated tooling compares feature gate defaults between MicroShift and OpenShift components -3. Conflicts or misalignments are detected and flagged -4. Developer resolves conflicts to ensure MicroShift maintains the same default feature gates as OpenShift -5. Default feature gate alignment with OpenShift is maintained automatically +6. The features are enabled or disabled according to the configured state ### API Extensions -This enhancement extends MicroShift's configuration file schema only. No new CRDs, admission webhooks, conversion webhooks, aggregated API servers, or finalizers are introduced. The configuration file structure will be extended to include a `featureGates` section that mirrors the OpenShift FeatureGate CRD specification, providing consistency with OpenShift's feature gate configuration patterns while maintaining MicroShift's file-based configuration approach. +This enhancement extends MicroShift's configuration file schema only. No new CRDs, admission webhooks, conversion webhooks, aggregated API servers, or finalizers are introduced. The configuration file structure will be extended to include a `featureGates` section inspired by the OpenShift FeatureGate CRD specification, providing consistency with OpenShift's feature gate configuration patterns while maintaining MicroShift's file-based configuration approach.
### Topology Considerations @@ -121,7 +101,7 @@ The resource consumption impact will be minimal as this enhancement only adds co #### Configuration Schema Extension -The MicroShift configuration file will be extended to include a new `featureGates` section that mirrors the OpenShift FeatureGate CRD specification: +The MicroShift configuration file will be extended to include a new `featureGates` section inspired by the OpenShift FeatureGate CRD specification: **Predefined Feature Set Configuration:** ```yaml @@ -158,15 +138,6 @@ Feature gates will be applied to the following MicroShift components, which are MicroShift will generate or modify the appropriate configuration files for each component based on the user's feature gate settings in the MicroShift configuration file. -#### Rebase Automation - -To address the requirements from [USHIFT-2813](https://issues.redhat.com/browse/USHIFT-2813), the rebase process will include: - -1. **Feature Gate Inventory**: Automated tooling to extract default feature gate settings from each Kubernetes component -2. **Conflict Detection**: Comparison logic to identify conflicting defaults between components -3. **Alignment Verification**: CI checks to ensure MicroShift's defaults match OpenShift's defaults -4. **Override Mechanism**: Use of Kubernetes' `OverrideDefault` method where necessary to resolve conflicts - #### Validation and Error Handling - Invalid feature gate names will be caught by the Kubernetes components themselves @@ -175,21 +146,11 @@ To address the requirements from [USHIFT-2813](https://issues.redhat.com/browse/ ### Risks and Mitigations -**Risk: Feature Gate Conflicts Between Components** -Components may have conflicting default values for the same feature gate, as experienced in [USHIFT-2813](https://issues.redhat.com/browse/USHIFT-2813). 
- -*Mitigation:* Implement automated detection during the rebase process to identify conflicts early and establish a consistent approach to resolve conflicting feature gate defaults across all components. - **Risk: Experimenting with Unstable Alpha Features** Users experimenting with alpha-stage feature gates may encounter instability or data loss in their MicroShift deployments. *Mitigation:* Emphasize that experimentation should be conducted in non-production environments. Feature gate validation will be handled by the Kubernetes components themselves. -**Risk: Feature Gate Misalignment During Rebases** -Manual rebase processes may miss feature gate changes in OpenShift, leading to divergent behavior. - -*Mitigation:* Integrate automated feature gate alignment checks into the CI rebase process to ensure MicroShift maintains the same defaults as OpenShift automatically. - **Risk: Configuration Errors** Invalid feature gate configurations could prevent MicroShift components from starting. @@ -205,9 +166,6 @@ Some feature gates may expose new attack vectors or security vulnerabilities. **Increased Configuration Complexity** Adding feature gate configuration increases the complexity of MicroShift's configuration surface area. Users must understand both the feature gates themselves and their potential interactions, which could lead to misconfigurations in edge deployments where troubleshooting access is limited. -**Maintenance Burden** -This enhancement requires ongoing maintenance to keep feature gate handling aligned with OpenShift during rebases. The automated tooling and processes need to be maintained and updated as Kubernetes and OpenShift evolve their feature gate mechanisms. - **Support Complexity** Enabling alpha and beta features through user configuration means support teams may encounter issues related to experimental functionality that behaves differently across Kubernetes versions or has incomplete implementations. 
@@ -223,225 +181,164 @@ No significant alternatives were considered for this enhancement. The configurat ## Open Questions [optional] -This is where to call out areas of the design that require closure before deciding -to implement the design. For instance, - > 1. This requires exposing previously private resources which contain sensitive - information. Can we do this? +1. **How does OpenShift handle upgrades when custom feature gates are configured?** -## Test Plan + This requires clarification of OpenShift's actual implementation behavior: + - Does OpenShift actively **block/prevent** upgrades when TechPreviewNoUpgrade/CustomNoUpgrade is configured? + - Or does OpenShift **allow** upgrades to proceed but the resulting cluster becomes unsupported? -**Note:** *Section not required until targeted at a release.* + Understanding OpenShift's approach will inform whether MicroShift should implement active blocking logic (pre-upgrade checks that fail) or simply document that upgrades with custom feature gates are unsupported while allowing them to proceed technically. -Consider the following in developing a test plan for this enhancement: -- Will there be e2e and integration tests, in addition to unit tests? -- How will it be tested in isolation vs with other components? -- What additional testing is necessary to support managed OpenShift service-based offerings? +2. **How should feature gate compatibility be validated across MicroShift versions?** -No need to outline all of the test cases, just the general strategy. Anything -that would count as tricky in the implementation and anything particularly -challenging to test should be called out. + Unlike OpenShift which has extensive CI testing across feature combinations, MicroShift may have limited resources for testing all feature gate combinations across version upgrades. The approach for ensuring compatibility and providing user guidance needs definition. 
-All code is expected to have adequate tests (eventually with coverage -expectations). +## Test Plan -## Graduation Criteria +The testing strategy focuses on verifying the passthrough functionality - that custom feature gate configurations are correctly parsed and passed to the appropriate Kubernetes components. Since this is strictly a configuration passthrough feature, testing validates the parsing and delivery mechanism rather than feature gate functionality itself. -**Note:** *Section not required until targeted at a release.* +### Unit Tests -Define graduation milestones. +**Configuration Parsing:** +- Validate parsing of `featureSet` values (TechPreviewNoUpgrade, CustomNoUpgrade, Default) +- Test parsing of `customNoUpgrade.enabled` and `customNoUpgrade.disabled` lists +- Verify configuration schema validation and error handling for malformed configurations +- Test default behavior when feature gates section is not configured -These may be defined in terms of API maturity, or as something else. Initial proposal -should keep this high-level with a focus on what signals will be looked at to -determine graduation. 
+**Component Configuration Generation:** +- Test that feature gates are correctly written to kubelet configuration files +- Verify feature gates are properly formatted in kube-apiserver configuration +- Test feature gates are correctly applied to kube-controller-manager configuration +- Validate feature gates are properly set in kube-scheduler configuration +- Test that feature gates are applied to the correct components based on their scope -Consider the following in developing the graduation criteria for this -enhancement: +### Robot Framework Integration Tests -- Maturity levels - - [`alpha`, `beta`, `stable` in upstream Kubernetes][maturity-levels] - - `Dev Preview`, `Tech Preview`, `GA` in OpenShift -- [Deprecation policy][deprecation-policy] +**Passthrough Verification:** +- Test that custom feature gates specified in MicroShift configuration appear in component configurations after service restart +- Verify TechPreviewNoUpgrade preset results in correct feature gates being passed to all components +- Test CustomNoUpgrade configuration with specific enabled/disabled lists are correctly applied to component configurations +- Validate that configuration changes only take effect after MicroShift service restart -Clearly define what graduation means by either linking to the [API doc definition](https://kubernetes.io/docs/concepts/overview/kubernetes-api/#api-versioning), -or by redefining what graduation means. +**Configuration Error Handling:** +- Test MicroShift behavior with invalid feature gate names (passthrough with component validation) +- Verify appropriate error reporting when components reject invalid feature gate configurations +- Test handling of conflicting settings (same feature gate in both enabled and disabled lists) -In general, we try to use the same stages (alpha, beta, GA), regardless how the functionality is accessed. 
+### Testing Scope Limitations -[maturity-levels]: https://git.k8s.io/community/contributors/devel/sig-architecture/api_changes.md#alpha-beta-and-stable-versions -[deprecation-policy]: https://kubernetes.io/docs/reference/using-api/deprecation-policy/ +**Component Behavior Verification:** +This enhancement does not test whether feature gates actually modify Kubernetes component behavior - that is the responsibility of upstream Kubernetes and OpenShift testing. Testing is limited to verifying the configuration passthrough mechanism works correctly. -**If this is a user facing change requiring new or updated documentation in [openshift-docs](https://github.com/openshift/openshift-docs/), -please be sure to include in the graduation criteria.** +**Upgrade Testing:** +Since upgrades are not supported when custom feature gates are configured, no additional upgrade testing is required for this enhancement. Default upgrade behavior without custom feature gates is already covered by existing MicroShift test suites. + +## Graduation Criteria -**Examples**: These are generalized examples to consider, in addition -to the aforementioned [maturity levels][maturity-levels]. +The feature is planned to be released as GA directly. 
### Dev Preview -> Tech Preview -- Ability to utilize the enhancement end to end -- End user documentation, relative API stability -- Sufficient test coverage -- Gather feedback from users rather than just developers -- Enumerate service level indicators (SLIs), expose SLIs as metrics -- Write symptoms-based alerts for the component(s) +N/A ### Tech Preview -> GA -- More testing (upgrade, downgrade, scale) -- Sufficient time for feedback +- Ability to utilize the enhancement end to end +- End user documentation completed and published +- Sufficient test coverage including Robot Framework integration tests - Available by default -- Backhaul SLI telemetry -- Document SLOs for the component -- Conduct load testing -- User facing documentation created in [openshift-docs](https://github.com/openshift/openshift-docs/) - -**For non-optional features moving to GA, the graduation criteria must include -end to end tests.** +- End-to-end tests validating configuration passthrough functionality ### Removing a deprecated feature -- Announce deprecation and support policy of the existing feature -- Deprecate the feature +N/A ## Upgrade / Downgrade Strategy -If applicable, how will the component be upgraded and downgraded? Make sure this -is in the test plan. - -Consider the following in developing an upgrade/downgrade strategy for this -enhancement: -- What changes (in invocations, configurations, API use, etc.) is an existing - cluster required to make on upgrade in order to keep previous behavior? -- What changes (in invocations, configurations, API use, etc.) is an existing - cluster required to make on upgrade in order to make use of the enhancement? - -Upgrade expectations: -- Each component should remain available for user requests and - workloads during upgrades. Ensure the components leverage best practices in handling [voluntary - disruption](https://kubernetes.io/docs/concepts/workloads/pods/disruptions/). 
Any exception to - this should be identified and discussed here. -- Micro version upgrades - users should be able to skip forward versions within a - minor release stream without being required to pass through intermediate - versions - i.e. `x.y.N->x.y.N+2` should work without requiring `x.y.N->x.y.N+1` - as an intermediate step. -- Minor version upgrades - you only need to support `x.N->x.N+1` upgrade - steps. So, for example, it is acceptable to require a user running 4.3 to - upgrade to 4.5 with a `4.3->4.4` step followed by a `4.4->4.5` step. -- While an upgrade is in progress, new component versions should - continue to operate correctly in concert with older component - versions (aka "version skew"). For example, if a node is down, and - an operator is rolling out a daemonset, the old and new daemonset - pods must continue to work correctly even while the cluster remains - in this partially upgraded state for some time. - -Downgrade expectations: -- If an `N->N+1` upgrade fails mid-way through, or if the `N+1` cluster is - misbehaving, it should be possible for the user to rollback to `N`. It is - acceptable to require some documented manual steps in order to fully restore - the downgraded cluster to its previous state. Examples of acceptable steps - include: - - Deleting any CVO-managed resources added by the new version. The - CVO does not currently delete resources that no longer exist in - the target version. +**Default Configuration (no custom feature gates):** +Upgrades and downgrades proceed normally using standard MicroShift procedures with no additional considerations for feature gate handling. -## Version Skew Strategy +**Custom Feature Gate Configurations:** +Upgrades and downgrades are not supported when custom feature gates are configured (TechPreviewNoUpgrade or CustomNoUpgrade). Users must remove all custom feature gate configurations and return to default settings before attempting any version changes. 
-How will the component handle version skew with other components? -What are the guarantees? Make sure this is in the test plan. +This limitation aligns with OpenShift's approach where TechPreviewNoUpgrade and CustomNoUpgrade feature sets explicitly prevent cluster upgrades to avoid compatibility issues with experimental features. -Consider the following in developing a version skew strategy for this -enhancement: -- During an upgrade, we will always have skew among components, how will this impact your work? -- Does this enhancement involve coordinating behavior in the control plane and - in the kubelet? How does an n-2 kubelet without this feature available behave - when this feature is used? -- Will any other components on the node change? For example, changes to CSI, CRI - or CNI may require updating that component before the kubelet. +## Version Skew Strategy -## Operational Aspects of API Extensions +This enhancement introduces upgrade limitations when custom feature gates are configured to prevent compatibility issues across version boundaries. -Describe the impact of API extensions (mentioned in the proposal section, i.e. CRDs, -admission and conversion webhooks, aggregated API servers, finalizers) here in detail, -especially how they impact the OCP system architecture and operational aspects. +### Default Configuration +When no custom feature gates are configured, standard MicroShift version skew handling applies with no additional considerations. -- For conversion/admission webhooks and aggregated apiservers: what are the SLIs (Service Level - Indicators) an administrator or support can use to determine the health of the API extensions +### Custom Feature Gate Limitations +When custom feature gates are configured (TechPreviewNoUpgrade or CustomNoUpgrade), upgrades and downgrades between minor versions are not expected to work. Users must remove custom feature gate configurations before attempting minor version changes. 
- Examples (metrics, alerts, operator conditions) - - authentication-operator condition `APIServerDegraded=False` - - authentication-operator condition `APIServerAvailable=True` - - openshift-authentication/oauth-apiserver deployment and pods health +### Component Version Alignment +All Kubernetes components (kubelet, kube-apiserver, kube-controller-manager, kube-scheduler) are packaged together within each MicroShift release, eliminating internal component version skew concerns. Feature gate configuration is applied during startup with no runtime coordination required between components. -- What impact do these API extensions have on existing SLIs (e.g. scalability, API throughput, - API availability) +### Feature Gate Inconsistencies Between Components +It is possible that one component's feature gate settings disable an existing default feature gate while another component enables it, creating inconsistent behavior across components. However, resolving such inconsistencies is not within the scope of this proposal - this enhancement provides a passthrough mechanism only and does not validate feature gate compatibility between components. - Examples: - - Adds 1s to every pod update in the system, slowing down pod scheduling by 5s on average. - - Fails creation of ConfigMap in the system when the webhook is not available. - - Adds a dependency on the SDN service network for all resources, risking API availability in case - of SDN issues. - - Expected use-cases require less than 1000 instances of the CRD, not impacting - general API throughput. +## Operational Aspects of API Extensions -- How is the impact on existing SLIs to be measured and when (e.g. every release by QE, or - automatically in CI) and by whom (e.g. perf team; name the responsible person and let them review - this enhancement) +This enhancement does not introduce any API extensions (CRDs, admission webhooks, conversion webhooks, aggregated API servers, or finalizers). 
The feature operates entirely through configuration file changes and does not modify the OpenShift API surface or behavior. -- Describe the possible failure modes of the API extensions. -- Describe how a failure or behaviour of the extension will impact the overall cluster health - (e.g. which kube-controller-manager functionality will stop working), especially regarding - stability, availability, performance and security. -- Describe which OCP teams are likely to be called upon in case of escalation with one of the failure modes - and add them as reviewers to this enhancement. +All operational aspects are handled through existing MicroShift configuration mechanisms and component startup procedures. ## Support Procedures -Describe how to -- detect the failure modes in a support situation, describe possible symptoms (events, metrics, - alerts, which log output in which component) +### Detecting Feature Gate Configuration Issues + +**MicroShift Service Startup Failures:** +- **Symptoms**: MicroShift service fails to start after configuration changes +- **Log locations**: `journalctl -u microshift.service` +- **Error patterns**: Component startup failures with feature gate validation errors +- **Detection**: Service status shows failed state, component logs show unknown feature gate names - Examples: - - If the webhook is not running, kube-apiserver logs will show errors like "failed to call admission webhook xyz". - - Operator X will degrade with message "Failed to launch webhook server" and reason "WehhookServerFailed". - - The metric `webhook_admission_duration_seconds("openpolicyagent-admission", "mutating", "put", "false")` - will show >1s latency and alert `WebhookAdmissionLatencyHigh` will fire. 
+**Component-Specific Failures:** +- **kubelet errors**: Check `journalctl -u microshift.service` for kubelet initialization failures +- **kube-apiserver errors**: Look for API server startup errors in MicroShift service logs +- **Controller/scheduler errors**: Component initialization failures logged in MicroShift service output -- disable the API extension (e.g. remove MutatingWebhookConfiguration `xyz`, remove APIService `foo`) +### Disabling Feature Gate Configuration - - What consequences does it have on the cluster health? +**Remove Custom Feature Gates:** +1. Edit `/etc/microshift/config.yaml` +2. Remove or comment out the `featureGates` section +3. Restart MicroShift service: `sudo systemctl restart microshift` - Examples: - - Garbage collection in kube-controller-manager will stop working. - - Quota will be wrongly computed. - - Disabling/removing the CRD is not possible without removing the CR instances. Customer will lose data. - Disabling the conversion webhook will break garbage collection. +**Reset to Default Configuration:** +```yaml +# Remove entire featureGates section or set to: +featureGates: + featureSet: Default +``` - - What consequences does it have on existing, running workloads? +**Consequences of Disabling:** +- **Cluster health**: No impact on core MicroShift functionality +- **Existing workloads**: Workloads using experimental features may lose functionality +- **New workloads**: Will use default feature gate behavior only - Examples: - - New namespaces won't get the finalizer "xyz" and hence might leak resource X - when deleted. - - SDN pod-to-pod routing will stop updating, potentially breaking pod-to-pod - communication after some minutes. +### Edge Environment Troubleshooting - - What consequences does it have for newly created workloads? 
+**Remote Diagnostics:** +- Feature gate configuration issues are logged in standard MicroShift service logs +- Use `oc get nodes` to verify basic cluster functionality +- Check component status through `oc get pods -A` for system pod health - Examples: - - New pods in namespace with Istio support will not get sidecars injected, breaking - their networking. +**Recovery Procedures:** +- Configuration changes only require MicroShift service restart, not full system reboot +- Invalid configurations prevent service startup but do not affect system stability +- Greenboot integration ensures automatic rollback if feature gates prevent successful startup -- Does functionality fail gracefully and will work resume when re-enabled without risking - consistency? +### Graceful Failure and Recovery - Examples: - - The mutating admission webhook "xyz" has FailPolicy=Ignore and hence - will not block the creation or updates on objects when it fails. When the - webhook comes back online, there is a controller reconciling all objects, applying - labels that were not applied during admission webhook downtime. - - Namespaces deletion will not delete all objects in etcd, leading to zombie - objects when another namespace with the same name is created. +**Configuration Changes:** +- Invalid feature gate configurations fail fast during service startup +- No partial application of settings - either all feature gates apply or none do +- Recovery is immediate upon fixing configuration and restarting service +- No data consistency risks from feature gate configuration changes ## Infrastructure Needed [optional] -Use this section if you need things from the project. Examples include a new -subproject, repos requested, github details, and/or testing infrastructure. \ No newline at end of file +No additional infrastructure is needed for this enhancement. The feature uses existing MicroShift configuration mechanisms and testing infrastructure.
\ No newline at end of file From ccd9d4cb89ffe45130957b0b40f2be4fa2c29af9 Mon Sep 17 00:00:00 2001 From: Jon Cope Date: Mon, 29 Sep 2025 19:04:31 -0500 Subject: [PATCH 3/8] corrected feature gate inheritance statement; fixed typo in non-goals; added DevPreviewNoUpgrade to featureSets; featuregates are irreversible; fixed typo in filename --- ...> enabling-user-specified-featuregates.md} | 36 +++++++++---------- 1 file changed, 18 insertions(+), 18 deletions(-) rename enhancements/microshift/{enabling-user-specfied-featuregates.md => enabling-user-specified-featuregates.md} (87%) diff --git a/enhancements/microshift/enabling-user-specfied-featuregates.md b/enhancements/microshift/enabling-user-specified-featuregates.md similarity index 87% rename from enhancements/microshift/enabling-user-specfied-featuregates.md rename to enhancements/microshift/enabling-user-specified-featuregates.md index 6a4fbae013..fcb9381a46 100644 --- a/enhancements/microshift/enabling-user-specfied-featuregates.md +++ b/enhancements/microshift/enabling-user-specified-featuregates.md @@ -3,25 +3,25 @@ title: enabling-user-specified-featuregates authors: - copejon reviewers: - - pacevedom # MicroShift core team for configuration changes - - TBD # OpenShift platform team for alignment with OpenShift defaults + - "@pacevedom, MicroShift Team Lead" + - "@pmtk, MicroShift Team Engineer" approvers: - - TBD # MicroShift principal engineer + - "@jerpeter1" # MicroShift principal engineer api-approvers: - None # Configuration file changes only, no API modifications creation-date: 2025-09-24 # You'll need to fill in today's date -last-updated: 2025-09-24 +last-updated: 2025-09-29 tracking-link: - - # Link to USHIFT-6080 or the main epic ticket + - https://issues.redhat.com/browse/USHIFT-6177 see-also: - "" --- -# Enabling User-Specified FeatureGates in MicroShift +# Enabling User Specified FeatureGates ## Summary -MicroShift currently inherits feature gates from its OpenShift components but lacks a
controlled mechanism for users to experiment with additional feature gates or override defaults. This enhancement proposes adding configuration support for Kubernetes and OpenShift feature gates through the MicroShift configuration file. This capability will enable users to experiment with alpha and beta OpenShift and Kubernetes features like CPUManager's `prefer-align-cpus-by-uncorecache` in a supported and deterministic way, addressing edge computing use cases where users want to evaluate advanced resource management capabilities. +MicroShift disables all feature gates from OpenShift by default while hardcoding only a few relevant ones, and lacks a controlled mechanism for users to experiment with additional feature gates or override defaults. This enhancement proposes adding configuration support for Kubernetes and OpenShift feature gates through the MicroShift configuration file. This capability will enable users to experiment with alpha and beta OpenShift and Kubernetes features like CPUManager's `prefer-align-cpus-by-uncorecache` in a supported and deterministic way, addressing edge computing use cases where users want to evaluate advanced resource management capabilities. ## Motivation @@ -38,8 +38,8 @@ MicroShift users in edge computing environments want to experiment with upcoming ### Non-Goals -* Modify OpenShift's feature gate defaults -* Vetting feature gates for compatibility with MicroShift +* Modify MicroShift's existing feature gate defaults +* Vetting custom feature gates for compatibility with MicroShift * Validating custom feature gate settings for correctness, e.g. spelling, case, and punctuation * Automatic enablement of experimental features without explicit user configuration * Providing upgrade support to customized clusters @@ -51,7 +51,7 @@ This enhancement proposes adding feature gate configuration support to MicroShif The implementation includes: 1. 
**FeatureGate Configuration Schema**: Extend MicroShift's configuration file to include `featureGates` section inspired by OpenShift's FeatureGate CRD spec fields (`featureSet` and `customNoUpgrade`) -2. **Predefined Feature Sets**: Support for OpenShift's predefined feature sets like `TechPreviewNoUpgrade` +2. **Predefined Feature Sets**: Support for OpenShift's predefined feature sets like `TechPreviewNoUpgrade` and `DevPreviewNoUpgrade` 3. **Custom Feature Gates**: Support for individual feature gate enablement/disablement via `customNoUpgrade` configuration This approach ensures that users can experiment with the same feature gate capabilities as OpenShift while maintaining MicroShift's file-based configuration pattern. Default feature gate values will continue to be inherited from OpenShift to ensure consistency across the platform. @@ -63,7 +63,7 @@ This approach ensures that users can experiment with the same feature gate capab #### User Configuration Workflow 1. MicroShift Administrator identifies a need for specific feature gates (e.g., `CPUManagerPolicyAlphaOptions`) 2. Administrator chooses between two configuration approaches: - - **Predefined Feature Set**: Configure `featureGates.featureSet: TechPreviewNoUpgrade` for a curated set of preview features + - **Predefined Feature Set**: Configure `featureGates.featureSet: TechPreviewNoUpgrade` or `DevPreviewNoUpgrade` for a curated set of preview features - **Custom Feature Gates**: Configure `featureGates.featureSet: CustomNoUpgrade` and specify individual features in `featureGates.customNoUpgrade.enabled/disabled` lists 3. Administrator updates `/etc/microshift/config.yaml` with the chosen configuration 4. Administrator restarts MicroShift service @@ -173,7 +173,7 @@ Enabling alpha and beta features through user configuration means support teams Edge deployments often have limited remote access for troubleshooting. 
If users enable experimental feature gates that cause instability, recovering these devices may require physical access or complex recovery procedures. **Upgrade Limitations and Irreversible Changes** -Enabling `TechPreviewNoUpgrade` feature set cannot be undone and prevents both minor version updates and major upgrades. Once enabled, the cluster permanently loses the ability to perform standard updates. Similarly, `CustomNoUpgrade` configurations prevent upgrades/updates until reset to default settings. These feature sets are explicitly not recommended for production clusters due to their irreversible nature and update limitations, which conflicts with the typical edge deployment requirement for reliable, long-term operation and maintenance. +Enabling `TechPreviewNoUpgrade`, `DevPreviewNoUpgrade`, or `CustomNoUpgrade` feature sets cannot be undone and prevents both minor version updates and major upgrades. Once enabled, the cluster permanently loses the ability to perform standard updates. These feature sets are explicitly not recommended for production clusters due to their irreversible nature and update limitations, which conflicts with the typical edge deployment requirement for reliable, long-term operation and maintenance. ## Alternatives (Not Implemented) @@ -184,7 +184,7 @@ No significant alternatives were considered for this enhancement. The configurat 1. **How does OpenShift handle upgrades when custom feature gates are configured?** This requires clarification of OpenShift's actual implementation behavior: - - Does OpenShift actively **block/prevent** upgrades when TechPreviewNoUpgrade/CustomNoUpgrade is configured? + - Does OpenShift actively **block/prevent** upgrades when TechPreviewNoUpgrade/DevPreviewNoUpgrade/CustomNoUpgrade is configured? - Or does OpenShift **allow** upgrades to proceed but the resulting cluster becomes unsupported? 
Understanding OpenShift's approach will inform whether MicroShift should implement active blocking logic (pre-upgrade checks that fail) or simply document that upgrades with custom feature gates are unsupported while allowing them to proceed technically. @@ -200,7 +200,7 @@ The testing strategy focuses on verifying the passthrough functionality - that c ### Unit Tests **Configuration Parsing:** -- Validate parsing of `featureSet` values (TechPreviewNoUpgrade, CustomNoUpgrade, Default) +- Validate parsing of `featureSet` values (TechPreviewNoUpgrade, DevPreviewNoUpgrade, CustomNoUpgrade, Default) - Test parsing of `customNoUpgrade.enabled` and `customNoUpgrade.disabled` lists - Verify configuration schema validation and error handling for malformed configurations - Test default behavior when feature gates section is not configured @@ -216,7 +216,7 @@ The testing strategy focuses on verifying the passthrough functionality - that c **Passthrough Verification:** - Test that custom feature gates specified in MicroShift configuration appear in component configurations after service restart -- Verify TechPreviewNoUpgrade preset results in correct feature gates being passed to all components +- Verify TechPreviewNoUpgrade and DevPreviewNoUpgrade presets result in correct feature gates being passed to all components - Test CustomNoUpgrade configuration with specific enabled/disabled lists are correctly applied to component configurations - Validate that configuration changes only take effect after MicroShift service restart @@ -259,9 +259,9 @@ N/A Upgrades and downgrades proceed normally using standard MicroShift procedures with no additional considerations for feature gate handling. **Custom Feature Gate Configurations:** -Upgrades and downgrades are not supported when custom feature gates are configured (TechPreviewNoUpgrade or CustomNoUpgrade). Users must remove all custom feature gate configurations and return to default settings before attempting any version changes. 
+Upgrades and downgrades are not supported when custom feature gates are configured (TechPreviewNoUpgrade, DevPreviewNoUpgrade, or CustomNoUpgrade). Once custom feature gates are enabled, this configuration cannot be reverted - it is a permanent, one-way operation that permanently disables upgrade capability. -This limitation aligns with OpenShift's approach where TechPreviewNoUpgrade and CustomNoUpgrade feature sets explicitly prevent cluster upgrades to avoid compatibility issues with experimental features. +This limitation aligns with OpenShift's approach where TechPreviewNoUpgrade, DevPreviewNoUpgrade, and CustomNoUpgrade feature sets are irreversible and explicitly prevent cluster upgrades to avoid compatibility issues with experimental features. ## Version Skew Strategy @@ -271,7 +271,7 @@ This enhancement introduces upgrade limitations when custom feature gates are co When no custom feature gates are configured, standard MicroShift version skew handling applies with no additional considerations. ### Custom Feature Gate Limitations -When custom feature gates are configured (TechPreviewNoUpgrade or CustomNoUpgrade), upgrades and downgrades between minor versions are not expected to work. Users must remove custom feature gate configurations before attempting minor version changes. +When custom feature gates are configured (TechPreviewNoUpgrade, DevPreviewNoUpgrade, or CustomNoUpgrade), upgrades and downgrades between minor versions are not expected to work. Users must remove custom feature gate configurations before attempting minor version changes. ### Component Version Alignment All Kubernetes components (kubelet, kube-apiserver, kube-controller-manager, kube-scheduler) are packaged together within each MicroShift release, eliminating internal component version skew concerns. Feature gate configuration is applied during startup with no runtime coordination required between components. 
From cd3c6f37e3d7457a214fca45650d0aa8d844f8b6 Mon Sep 17 00:00:00 2001 From: Jon Cope Date: Fri, 3 Oct 2025 10:03:56 -0500 Subject: [PATCH 4/8] remove references to microshift propagation --- .../enabling-user-specified-featuregates.md | 116 +++++++++++------- 1 file changed, 70 insertions(+), 46 deletions(-) diff --git a/enhancements/microshift/enabling-user-specified-featuregates.md b/enhancements/microshift/enabling-user-specified-featuregates.md index fcb9381a46..c0b2d2d094 100644 --- a/enhancements/microshift/enabling-user-specified-featuregates.md +++ b/enhancements/microshift/enabling-user-specified-featuregates.md @@ -21,7 +21,7 @@ see-also: ## Summary -MicroShift disables all feature gates from OpenShift by default while hardcoding only a few relevant ones, and lacks a controlled mechanism for users to experiment with additional feature gates or override defaults. This enhancement proposes adding configuration support for Kubernetes and OpenShift feature gates through the MicroShift configuration file. This capability will enable users to experiment with alpha and beta OpenShift and Kubernetes features like CPUManager's `prefer-align-cpus-by-uncorecache` in a supported and deterministic way, addressing edge computing use cases where users want to evaluate advanced resource management capabilities. +MicroShift disables most feature gates by default while hardcoding only a few relevant ones, and lacks a controlled mechanism for users to experiment with additional feature gates or override defaults. This enhancement proposes adding configuration support for feature gates through the MicroShift configuration file. In OpenShift, users configure feature gates through the FeatureGate API, where operators independently filter featureGates for their components based on the central FeatureGate API 'cluster' instance. 
In contrast, MicroShift users will specify feature gates directly in the configuration file (`/etc/microshift/config.yaml`), and MicroShift will pass all user-specified featureGates to the kube-apiserver, which then propagates them to other Kubernetes components (kubelet, kube-controller-manager, kube-scheduler). This capability will enable users to experiment with alpha and beta Kubernetes features like CPUManager's `prefer-align-cpus-by-uncorecache` in a supported and deterministic way, addressing edge computing use cases where users want to evaluate advanced resource management capabilities. ## Motivation @@ -29,11 +29,11 @@ MicroShift users in edge computing environments want to experiment with upcoming ### User Stories -* As a MicroShift administrator, I want to configure feature gates through the MicroShift configuration file so that I can experiment with alpha/beta OpenShift features in a controlled and supported manner. +* As a MicroShift administrator, I want to configure feature gates through the MicroShift configuration file (`/etc/microshift/config.yaml`), so that I can experiment with alpha/beta features in a controlled and supported manner consistent with MicroShift's file-based configuration approach. ### Goals -* Enable user configuration of Kubernetes and OpenShift feature gates through the MicroShift configuration file +* Enable user configuration of feature gates through the MicroShift configuration file * Provide a controlled and deterministic way to experiment with alpha and beta features ### Non-Goals @@ -46,15 +46,16 @@ MicroShift users in edge computing environments want to experiment with upcoming ## Proposal -This enhancement proposes adding feature gate configuration support to MicroShift by extending `/etc/microshift/config.yaml` with a configuration schema inspired by OpenShift's FeatureGate custom resource specification. 
The configuration will support both predefined feature sets and custom feature gate combinations, ensuring consistency with OpenShift's FeatureGate API patterns. +This enhancement proposes adding feature gate configuration support to MicroShift by extending `/etc/microshift/config.yaml` with a configuration schema inspired by OpenShift's FeatureGate custom resource specification. In OpenShift, users configure feature gates through the FeatureGate API, and operators independently filter featureGates before applying them to their components. MicroShift takes a different approach aligned with its file-based configuration philosophy: users specify feature gates directly in the configuration file, and MicroShift passes all user-specified featureGates to the kube-apiserver, which then handles propagation to other Kubernetes components. The implementation includes: -1. **FeatureGate Configuration Schema**: Extend MicroShift's configuration file to include `featureGates` section inspired by OpenShift's FeatureGate CRD spec fields (`featureSet` and `customNoUpgrade`) -2. **Predefined Feature Sets**: Support for OpenShift's predefined feature sets like `TechPreviewNoUpgrade` and `DevPreviewNoUpgrade` +1. **FeatureGate Configuration Schema**: Extend MicroShift's configuration file to include `featureGates` section with fields inspired by OpenShift's FeatureGate CRD spec (`featureSet` and `customNoUpgrade`) +2. **Predefined Feature Sets**: Support for predefined feature sets like `TechPreviewNoUpgrade` and `DevPreviewNoUpgrade` 3. **Custom Feature Gates**: Support for individual feature gate enablement/disablement via `customNoUpgrade` configuration +4. 
**API Server Propagation**: All configured featureGates will be passed to the kube-apiserver, which handles propagation to other Kubernetes components (kubelet, kube-controller-manager, kube-scheduler) -This approach ensures that users can experiment with the same feature gate capabilities as OpenShift while maintaining MicroShift's file-based configuration pattern. Default feature gate values will continue to be inherited from OpenShift to ensure consistency across the platform. +This approach ensures that users can experiment with feature gate capabilities while maintaining MicroShift's file-based configuration pattern instead of requiring API interactions. ### Workflow Description @@ -67,12 +68,13 @@ This approach ensures that users can experiment with the same feature gate capab - **Custom Feature Gates**: Configure `featureGates.featureSet: CustomNoUpgrade` and specify individual features in `featureGates.customNoUpgrade.enabled/disabled` lists 3. Administrator updates `/etc/microshift/config.yaml` with the chosen configuration 4. Administrator restarts MicroShift service -5. MicroShift parses the FeatureGate configuration and passes settings to relevant Kubernetes components where validation occurs -6. The features are enabled / disabled according to the configured state +5. MicroShift parses the FeatureGate configuration and passes all settings to the kube-apiserver +6. The kube-apiserver propagates the feature gates to other Kubernetes components (kubelet, kube-controller-manager, kube-scheduler) +7. Each component processes the featureGates and enables/disables the features it supports according to the configured state ### API Extensions -This enhancement extends MicroShift's configuration file schema only. No new CRDs, admission webhooks, conversion webhooks, aggregated API servers, or finalizers are introduced. 
The configuration file structure will be extended to include a `featureGates` section inspired by the OpenShift FeatureGate CRD specification, providing consistency with OpenShift's feature gate configuration patterns while maintaining MicroShift's file-based configuration approach. +This enhancement extends MicroShift's configuration file schema only. No new CRDs, admission webhooks, conversion webhooks, aggregated API servers, or finalizers are introduced. Unlike OpenShift where users interact with the FeatureGate API to configure feature gates, MicroShift users will configure feature gates directly in the `/etc/microshift/config.yaml` file. The configuration file structure will be extended to include a `featureGates` section with a structure inspired by the OpenShift FeatureGate CRD specification, maintaining MicroShift's file-based configuration approach. ### Topology Considerations @@ -101,7 +103,7 @@ The resource consumption impact will be minimal as this enhancement only adds co #### Configuration Schema Extension -The MicroShift configuration file will be extended to include a new `featureGates` section inspired by the OpenShift FeatureGate CRD specification: +The MicroShift configuration file will be extended to include a new `featureGates` section with a structure inspired by the OpenShift FeatureGate CRD specification. While OpenShift users configure feature gates through the Kubernetes API (e.g., `oc edit featuregate cluster`), MicroShift users will configure them directly in `/etc/microshift/config.yaml`: **Predefined Feature Set Configuration:** ```yaml @@ -128,21 +130,51 @@ featureGates: This configuration will be parsed during MicroShift startup and the feature gate settings will be passed to the appropriate Kubernetes components via their command-line arguments or configuration files. +#### FeatureSet Definitions + +Each OpenShift release image provides one manifest per FeatureSet profile. 
This enables the existing MicroShift rebase automation to keep current with OpenShift feature-set lists. The pertinent manifests for MicroShift are: + +- `0000_50_cluster-config-api_featureGate-SelfManagedHA-Default.yaml` +- `0000_50_cluster-config-api_featureGate-SelfManagedHA-DevPreviewNoUpgrade.yaml` +- `0000_50_cluster-config-api_featureGate-SelfManagedHA-TechPreviewNoUpgrade.yaml` + #### Component Integration -Feature gates will be applied to the following MicroShift components, which are integrated into the MicroShift runtime rather than running as separate processes: -- **kubelet**: Feature gates specified in kubelet configuration file -- **kube-apiserver**: Feature gates specified in kube-apiserver configuration file -- **kube-controller-manager**: Feature gates specified in kube-controller-manager configuration file -- **kube-scheduler**: Feature gates specified in kube-scheduler configuration file +In OpenShift, users configure feature gates by creating FeatureGate API objects and operators independently filter featureGates for their respective components. MicroShift adopts a different model aligned with its file-based configuration approach: users specify feature gates in `/etc/microshift/config.yaml`, and MicroShift passes all user-specified featureGates to the kube-apiserver, which then handles the propagation to other components. This approach ensures all components receive the necessary feature gate settings without requiring MicroShift to implement complex filtering logic. + +The propagation flow works as follows: +1. **MicroShift → kube-apiserver**: MicroShift passes all configured feature gates to the kube-apiserver +2. 
**kube-apiserver → Other Components**: The kube-apiserver propagates feature gates to: + - **kubelet**: Through the Node configuration + - **kube-controller-manager**: Through internal cluster configuration + - **kube-scheduler**: Through internal cluster configuration + +Each component will then internally process these settings according to its capabilities. This leverages Kubernetes' native propagation mechanisms rather than requiring MicroShift to directly configure each component. + +#### Comparison with OpenShift's FeatureGate Architecture + +**OpenShift Approach:** +- Users configure feature gates through the FeatureGate API by creating/modifying FeatureGate instances +- The FeatureGate API instance named 'cluster' serves as the single source of truth for all featureGates across the cluster +- Each operator independently reads the 'cluster' FeatureGate instance and filters the featureGates relevant to its managed components +- Operators determine which featureGates to pass to their components and handle component restarts when featureGate values change +- This provides fine-grained control but requires complex operator logic for filtering and lifecycle management -MicroShift will generate or modify the appropriate configuration files for each component based on the user's feature gate settings in the MicroShift configuration file. +**MicroShift Approach:** +- Users configure feature gates through the configuration file (`/etc/microshift/config.yaml`) rather than through an API +- Configuration file-based featureGate specification without a central API object +// TODO this is unclear on openshift. i saw that the MCO watches the FeatureGate API and will restart kubelets, but I don't know if this applies to all components. 
It's probably not worth mentioning here though since it doesn't really change the design +- Single-point propagation through kube-apiserver to all other Kubernetes components +- Simpler implementation leveraging kube-apiserver's native propagation mechanisms +- Component restart handled through MicroShift service restart rather than individual operator reconciliation #### Validation and Error Handling -- Invalid feature gate names will be caught by the Kubernetes components themselves -- MicroShift will log configuration parsing errors but delegate feature gate validation to the components -- Conflicting feature gate settings between user configuration and component requirements will result in component startup failures with appropriate error messages +- **Configuration Parsing**: MicroShift will validate the structural correctness of the configuration (YAML syntax, required fields) +- **API Server Validation**: The kube-apiserver does not validate the feature gates it receives from MicroShift before propagating them +- **Component-level Validation**: Each Kubernetes component will validate the feature gates it recognizes +- **Error Reporting**: Components will log errors or warnings for invalid feature gate configurations +- **Startup Failures**: May occur when featureGate settings conflict (i.e. a featureGate is both enabled and disabled) ### Risks and Mitigations @@ -152,9 +184,9 @@ Users experimenting with alpha-stage feature gates may encounter instability or *Mitigation:* Emphasize that experimentation should be conducted in non-production environments. Feature gate validation will be handled by the Kubernetes components themselves. **Risk: Configuration Errors** -Invalid feature gate configurations could prevent MicroShift components from starting. +Invalid feature gate configurations in the MicroShift configuration file could prevent MicroShift components from starting. 
-*Mitigation:* Leverage Kubernetes component validation for feature gate names and values. Provide clear error messages and documentation for troubleshooting configuration issues. +*Mitigation:* Kubernetes components inherently ignore unrecognized feature gate names, so typos or incorrect names will not cause failures. Only invalid values for recognized gates can cause issues. Components provide clear error messages for such cases, and documentation will guide troubleshooting. **Risk: Security Implications** Some feature gates may expose new attack vectors or security vulnerabilities. @@ -195,7 +227,7 @@ No significant alternatives were considered for this enhancement. The configurat ## Test Plan -The testing strategy focuses on verifying the passthrough functionality - that custom feature gate configurations are correctly parsed and passed to the appropriate Kubernetes components. Since this is strictly a configuration passthrough feature, testing validates the parsing and delivery mechanism rather than feature gate functionality itself. +The testing strategy focuses on verifying the propagation functionality - that custom feature gate configurations are correctly parsed from the MicroShift configuration file and passed to the kube-apiserver, which then handles propagation to other Kubernetes components. Testing validates the parsing and delivery mechanism rather than feature gate functionality itself. 
### Unit Tests @@ -205,30 +237,24 @@ The testing strategy focuses on verifying the passthrough functionality - that c - Verify configuration schema validation and error handling for malformed configurations - Test default behavior when feature gates section is not configured -**Component Configuration Generation:** -- Test that feature gates are correctly written to kubelet configuration files -- Verify feature gates are properly formatted in kube-apiserver configuration -- Test feature gates are correctly applied to kube-controller-manager configuration -- Validate feature gates are properly set in kube-scheduler configuration -- Test that feature gates are applied to the correct components based on their scope +**API Server Configuration:** +- Verify feature gates are properly formatted in the kube-apiserver configuration ### Robot Framework Integration Tests -**Passthrough Verification:** -- Test that custom feature gates specified in MicroShift configuration appear in component configurations after service restart -- Verify TechPreviewNoUpgrade and DevPreviewNoUpgrade presets result in correct feature gates being passed to all components -- Test CustomNoUpgrade configuration with specific enabled/disabled lists are correctly applied to component configurations -- Validate that configuration changes only take effect after MicroShift service restart +**Universal Propagation Verification:** +- Test that custom feature gates specified in MicroShift configuration appear after service restart +- Verify TechPreviewNoUpgrade and DevPreviewNoUpgrade presets result in their feature gates being passed to kube-apiserver **Configuration Error Handling:** -- Test MicroShift behavior with invalid feature gate names (passthrough with component validation) -- Verify appropriate error reporting when components reject invalid feature gate configurations -- Test handling of conflicting settings (same feature gate in both enabled and disabled lists) +- Verify error reporting from
embedded components in MicroShift logs +- Test handling of conflicting settings (same feature gate in both enabled and disabled lists) at the kube-apiserver level +- Verify that configuration file parsing errors are clearly reported to users ### Testing Scope Limitations **Component Behavior Verification:** -This enhancement does not test whether feature gates actually modify Kubernetes component behavior - that is the responsibility of upstream Kubernetes and OpenShift testing. Testing is limited to verifying the configuration passthrough mechanism works correctly. +This enhancement does not test whether feature gates actually modify Kubernetes component behavior - that is the responsibility of upstream Kubernetes testing. Testing is limited to verifying that MicroShift correctly passes feature gates to the kube-apiserver and that the kube-apiserver's native propagation mechanism distributes them to other components correctly. **Upgrade Testing:** Since upgrades are not supported when custom feature gates are configured, no additional upgrade testing is required for this enhancement. Default upgrade behavior without custom feature gates is already covered by existing MicroShift test suites. @@ -261,7 +287,7 @@ Upgrades and downgrades proceed normally using standard MicroShift procedures wi **Custom Feature Gate Configurations:** Upgrades and downgrades are not supported when custom feature gates are configured (TechPreviewNoUpgrade, DevPreviewNoUpgrade, or CustomNoUpgrade). Once custom feature gates are enabled, this configuration cannot be reverted - it is a permanent, one-way operation that permanently disables upgrade capability. -This limitation aligns with OpenShift's approach where TechPreviewNoUpgrade, DevPreviewNoUpgrade, and CustomNoUpgrade feature sets are irreversible and explicitly prevent cluster upgrades to avoid compatibility issues with experimental features. 
+Similar to OpenShift, the TechPreviewNoUpgrade, DevPreviewNoUpgrade, and CustomNoUpgrade feature sets are irreversible and explicitly prevent cluster upgrades to avoid compatibility issues with experimental features. ## Version Skew Strategy @@ -274,14 +300,14 @@ When no custom feature gates are configured, standard MicroShift version skew ha When custom feature gates are configured (TechPreviewNoUpgrade, DevPreviewNoUpgrade, or CustomNoUpgrade), upgrades and downgrades between minor versions are not expected to work. Users must remove custom feature gate configurations before attempting minor version changes. ### Component Version Alignment -All Kubernetes components (kubelet, kube-apiserver, kube-controller-manager, kube-scheduler) are packaged together within each MicroShift release, eliminating internal component version skew concerns. Feature gate configuration is applied during startup with no runtime coordination required between components. +All Kubernetes components (kubelet, kube-apiserver, kube-controller-manager, kube-scheduler) are packaged together within each MicroShift release, eliminating internal component version skew concerns. Feature gate configuration is read from the MicroShift configuration file and passed to the kube-apiserver during startup, which then handles propagation to other components using Kubernetes' native mechanisms. -### Feature Gate Inconsistencies Between Components -It is possible that one component's feature gate settings disable an existing default feature gate while another component enables it, creating inconsistent behavior across components. However, resolving such inconsistencies is not within the scope of this proposal - this enhancement provides a passthrough mechanism only and does not validate feature gate compatibility between components. +### Feature Gate Consistency Across Components +The kube-apiserver's native propagation mechanism ensures consistent feature gate distribution to all Kubernetes components. 
While individual components may recognize different subsets of feature gates based on their capabilities, the kube-apiserver ensures all components receive the same feature gate configuration from the MicroShift configuration file. This enhancement relies on the kube-apiserver's propagation logic and does not implement additional validation for feature gate compatibility between components. ## Operational Aspects of API Extensions -This enhancement does not introduce any API extensions (CRDs, admission webhooks, conversion webhooks, aggregated API servers, or finalizers). The feature operates entirely through configuration file changes and does not modify the OpenShift API surface or behavior. +// TODO the configuration schema is being modified. Backwards compatibility must be maintained All operational aspects are handled through existing MicroShift configuration mechanisms and component startup procedures. @@ -296,9 +322,7 @@ All operational aspects are handled through existing MicroShift configuration me - **Detection**: Service status shows failed state, component logs show unknown feature gate names **Component-Specific Failures:** -- **kubelet errors**: Check `journalctl -u microshift.service` for kubelet initialization failures -- **kube-apiserver errors**: Look for API server startup errors in MicroShift service logs -- **Controller/scheduler errors**: Component initialization failures logged in MicroShift service output +- **kube-apiserver errors**: Look for API server startup errors in `journalctl -u microshift.service` - these are critical as the apiserver handles propagation ### Disabling Feature Gate Configuration From 524fa28fcab016ebf649fc9e0c3c6da411be2307 Mon Sep 17 00:00:00 2001 From: Jon Cope Date: Fri, 3 Oct 2025 12:10:47 -0500 Subject: [PATCH 5/8] clarify lack of support and upgrade/rollback for clusters with custom feature gates --- .../enabling-user-specified-featuregates.md | 99 ++++++------------- 1 file changed, 28 insertions(+), 71 
deletions(-) diff --git a/enhancements/microshift/enabling-user-specified-featuregates.md b/enhancements/microshift/enabling-user-specified-featuregates.md index c0b2d2d094..e2a6545500 100644 --- a/enhancements/microshift/enabling-user-specified-featuregates.md +++ b/enhancements/microshift/enabling-user-specified-featuregates.md @@ -21,7 +21,7 @@ see-also: ## Summary -MicroShift disables most feature gates by default while hardcoding only a few relevant ones, and lacks a controlled mechanism for users to experiment with additional feature gates or override defaults. This enhancement proposes adding configuration support for feature gates through the MicroShift configuration file. In OpenShift, users configure feature gates through the FeatureGate API, where operators independently filter featureGates for their components based on the central FeatureGate API 'cluster' instance. In contrast, MicroShift users will specify feature gates directly in the configuration file (`/etc/microshift/config.yaml`), and MicroShift will pass all user-specified featureGates to the kube-apiserver, which then propagates them to other Kubernetes components (kubelet, kube-controller-manager, kube-scheduler). This capability will enable users to experiment with alpha and beta Kubernetes features like CPUManager's `prefer-align-cpus-by-uncorecache` in a supported and deterministic way, addressing edge computing use cases where users want to evaluate advanced resource management capabilities. +MicroShift disables most feature gates by default while hardcoding only a few relevant ones, and lacks a controlled mechanism for users to experiment with additional feature gates or override defaults. This enhancement proposes adding configuration support for feature gates through the MicroShift configuration file. In OpenShift, users configure feature gates through the FeatureGate API. 
In contrast, MicroShift users will specify feature gates directly in the configuration file (`/etc/microshift/config.yaml`), and MicroShift will pass all user-specified featureGates to the kube-apiserver, which then propagates them to other Kubernetes components (kubelet, kube-controller-manager, kube-scheduler). This capability will enable users to experiment with alpha and beta Kubernetes features like CPUManager's `prefer-align-cpus-by-uncorecache` in a supported and deterministic way, addressing edge computing use cases where users want to evaluate advanced resource management capabilities. ## Motivation @@ -34,14 +34,12 @@ MicroShift users in edge computing environments want to experiment with upcoming ### Goals * Enable user configuration of feature gates through the MicroShift configuration file -* Provide a controlled and deterministic way to experiment with alpha and beta features ### Non-Goals * Modify MicroShift's existing feature gate defaults -* Vetting custom feature gates for compatibility with MicroShift -* Validating custom feature gate settings for correctness, e.g. spelling, case, and punctuation -* Automatic enablement of experimental features without explicit user configuration +* Vet custom feature gates for compatibility with MicroShift +* Validate custom feature gate settings for correctness, e.g. spelling, case, and punctuation * Providing upgrade support to customized clusters ## Proposal @@ -74,7 +72,7 @@ This approach ensures that users can experiment with feature gate capabilities w ### API Extensions -This enhancement extends MicroShift's configuration file schema only. No new CRDs, admission webhooks, conversion webhooks, aggregated API servers, or finalizers are introduced. Unlike OpenShift where users interact with the FeatureGate API to configure feature gates, MicroShift users will configure feature gates directly in the `/etc/microshift/config.yaml` file. 
The configuration file structure will be extended to include a `featureGates` section with a structure inspired by the OpenShift FeatureGate CRD specification, maintaining MicroShift's file-based configuration approach. +This enhancement extends MicroShift's configuration file schema only. No new CRDs, admission webhooks, conversion webhooks, aggregated API servers, or finalizers are introduced. Unlike OpenShift where users interact with the FeatureGate API to configure feature gates, MicroShift users will configure feature gates directly in the `/etc/microshift/config.yaml` file. ### Topology Considerations @@ -134,13 +132,12 @@ This configuration will be parsed during MicroShift startup and the feature gate Each OpenShift release image provides one manifest per FeatureSet profile. This enables the existing MicroShift rebase automation to keep current with OpenShift feature-set lists. The pertinent manifests for MicroShift are: -- `0000_50_cluster-config-api_featureGate-SelfManagedHA-Default.yaml` - `0000_50_cluster-config-api_featureGate-SelfManagedHA-DevPreviewNoUpgrade.yaml` - `0000_50_cluster-config-api_featureGate-SelfManagedHA-TechPreviewNoUpgrade.yaml` #### Component Integration -In OpenShift, users configure feature gates by creating FeatureGate API objects and operators independently filter featureGates for their respective components. MicroShift adopts a different model aligned with its file-based configuration approach: users specify feature gates in `/etc/microshift/config.yaml`, and MicroShift passes all user-specified featureGates to the kube-apiserver, which then handles the propagation to other components. This approach ensures all components receive the necessary feature gate settings without requiring MicroShift to implement complex filtering logic. +In OpenShift, users configure feature gates by creating FeatureGate API objects and operators independently filter featureGates for their respective components. 
MicroShift adopts a different model aligned with its file-based configuration approach: users specify feature gates in `/etc/microshift/config.yaml`, and MicroShift passes all user-specified featureGates to the kube-apiserver. The propagation flow works as follows: 1. **MicroShift → kube-apiserver**: MicroShift passes all configured feature gates to the kube-apiserver @@ -155,43 +152,37 @@ Each component will then internally process these settings according to its capa **OpenShift Approach:** - Users configure feature gates through the FeatureGate API by creating/modifying FeatureGate instances -- The FeatureGate API instance named 'cluster' serves as the single source of truth for all featureGates across the cluster -- Each operator independently reads the 'cluster' FeatureGate instance and filters the featureGates relevant to its managed components -- Operators determine which featureGates to pass to their components and handle component restarts when featureGate values change +- The FeatureGate API instance named 'cluster' serves as the default source of truth for all featureGates across the cluster +- The kube-apiserver detects a CRUD event on the FeatureGate API, parses all FeatureGate API instances, and communicates the FeatureGate values to cluster components +- Operators like the Machine Config Operator also detect the CRUD event and will restart the operand component if necessary - This provides fine-grained control but requires complex operator logic for filtering and lifecycle management **MicroShift Approach:** - Users configure feature gates through the configuration file (`/etc/microshift/config.yaml`) rather than through an API -- Configuration file-based featureGate specification without a central API object -// TODO this is unclear on openshift. i saw that the MCO watches the FeatureGate API and will restart kubelets, but I don't know if this applies to all components.
It's probably not worth mentioning here though since it doesn't really change the design -- Single-point propagation through kube-apiserver to all other Kubernetes components +- If the new config specifies custom feature gates, MicroShift passes them to the kube-apiserver via the kube-apiserver config file +- If the new config is for a feature set, MicroShift extracts the feature gates from the respective feature set manifest (embedded) and passes them to the kube-apiserver via the kube-apiserver config file - Simpler implementation leveraging kube-apiserver's native propagation mechanisms - Component restart handled through MicroShift service restart rather than individual operator reconciliation #### Validation and Error Handling - **Configuration Parsing**: MicroShift will validate the structural correctness of the configuration (YAML syntax, required fields) -- **API Server Validation**: The kube-apiserver does not validate the feature gates it receives from MicroShift before propagating them +- **API Server Validation**: The kube-apiserver does not validate the feature gates it receives from MicroShift before propagating them. This behavior is the same on OpenShift - **Component-level Validation**: Each Kubernetes component will validate the feature gates it recognizes - **Error Reporting**: Components will log errors or warnings for invalid feature gate configurations - **Startup Failures**: May occur when featureGate settings conflict (i.e. a featureGate is both enabled and disabled) ### Risks and Mitigations -**Risk: Experimenting with Unstable Alpha Features** -Users experimenting with alpha-stage feature gates may encounter instability or data loss in their MicroShift deployments. +**Risk: Experimenting with Features** +Users experimenting with feature gates may encounter instability or data loss in their MicroShift deployments. *Mitigation:* Emphasize that experimentation should be conducted in non-production environments.
Feature gate validation will be handled by the Kubernetes components themselves. **Risk: Configuration Errors** Invalid feature gate configurations in the MicroShift configuration file could prevent MicroShift components from starting. -*Mitigation:* Kubernetes components inherently ignore unrecognized feature gate names, so typos or incorrect names will not cause failures. Only invalid values for recognized gates can cause issues. Components provide clear error messages for such cases, and documentation will guide troubleshooting. - -**Risk: Security Implications** -Some feature gates may expose new attack vectors or security vulnerabilities. - -*Mitigation:* Security review will follow standard MicroShift processes. Feature gates that fundamentally conflict with MicroShift's security model will be documented as unsupported. +*Mitigation:* Kubernetes components inherently ignore unrecognized feature gate names, so typos or misspellings may not cause failures. Only invalid values for recognized gates can cause issues. Components provide clear error messages for such cases, and documentation will guide troubleshooting. It is recommended that users run `microshift-cleanup-script`, delete the custom feature gates from `/etc/microshift/config.yaml`, and restart the MicroShift service. ### Drawbacks @@ -209,7 +200,7 @@ Enabling `TechPreviewNoUpgrade`, `DevPreviewNoUpgrade`, or `CustomNoUpgrade` fea ## Alternatives (Not Implemented) -No significant alternatives were considered for this enhancement. The configuration file approach aligns with MicroShift's existing patterns and provides the required user-configurable feature gates with automated OpenShift alignment. +Utilizing the FeatureGate API on MicroShift is rejected as an alternative approach because it requires additional operators to manage both the API and the Kubernetes components. At best, this would increase the complexity of cluster component lifecycle management and increase cluster overhead.
This approach would also be a departure from the current model for user-defined configuration. ## Open Questions [optional] @@ -225,6 +216,8 @@ No significant alternatives were considered for this enhancement. The configurat Unlike OpenShift which has extensive CI testing across feature combinations, MicroShift may have limited resources for testing all feature gate combinations across version upgrades. The approach for ensuring compatibility and providing user guidance needs definition. + **Answer:** OpenShift does not validate feature gate compatibility and designates any customization of feature gate flags as unsupported. MicroShift will adopt this philosophy as well. + ## Test Plan The testing strategy focuses on verifying the propagation functionality - that custom feature gate configurations are correctly parsed from the MicroShift configuration file and passed to the kube-apiserver, which then handles propagation to other Kubernetes components. Testing validates the parsing and delivery mechanism rather than feature gate functionality itself.
@@ -232,13 +225,13 @@ The testing strategy focuses on verifying the propagation functionality - that c ### Unit Tests **Configuration Parsing:** -- Validate parsing of `featureSet` values (TechPreviewNoUpgrade, DevPreviewNoUpgrade, CustomNoUpgrade, Default) +- Validate parsing of `featureSet` values (TechPreviewNoUpgrade, DevPreviewNoUpgrade, CustomNoUpgrade) - Test parsing of `customNoUpgrade.enabled` and `customNoUpgrade.disabled` lists - Verify configuration schema validation and error handling for malformed configurations - Test default behavior when feature gates section is not configured **API Server Configuration:** -- Verify feature gates are properly formatted in the kube-apiserver configuration +- Verify feature gate pass-through retains string formatting in the kube-apiserver configuration ### Robot Framework Integration Tests @@ -247,14 +240,14 @@ The testing strategy focuses on verifying the propagation functionality - that c - Verify TechPreviewNoUpgrade and DevPreviewNoUpgrade presets results in their feature gates being passed to kube-apiserver **Configuration Error Handling:** -- Verify error reporting from embedded components in MicroShift logs -- Test handling of conflicting settings (same feature gate in both enabled and disabled lists) at the kube-apiserver level +- Verify error reporting from embedded components in MicroShift logs +- Test handling of conflicting settings (same feature gate in both enabled and disabled lists) by MicroShift - Verify that configuration file parsing errors are clearly reported to users ### Testing Scope Limitations **Component Behavior Verification:** -This enhancement does not test whether feature gates actually modify Kubernetes component behavior - that is the responsibility of upstream Kubernetes testing. Testing is limited to verifying that MicroShift correctly passes feature gates to the kube-apiserver and that the kube-apiserver's native propagation mechanism distributes them to other components correctly.
+This enhancement does not test whether feature gates actually modify Kubernetes component behavior - that is the responsibility of upstream Kubernetes testing. Testing is limited to verifying that MicroShift correctly passes feature gates to the kube-apiserver. **Upgrade Testing:** Since upgrades are not supported when custom feature gates are configured, no additional upgrade testing is required for this enhancement. Default upgrade behavior without custom feature gates is already covered by existing MicroShift test suites. @@ -293,23 +286,20 @@ Similar to OpenShift, the TechPreviewNoUpgrade, DevPreviewNoUpgrade, and CustomN This enhancement introduces upgrade limitations when custom feature gates are configured to prevent compatibility issues across version boundaries. +Feature Sets defined by OpenShift are included in the OCP release image. Rebase automation will be extended to pull in these manifests and they will be embedded into the MicroShift binary at build time. + ### Default Configuration When no custom feature gates are configured, standard MicroShift version skew handling applies with no additional considerations. ### Custom Feature Gate Limitations When custom feature gates are configured (TechPreviewNoUpgrade, DevPreviewNoUpgrade, or CustomNoUpgrade), upgrades and downgrades between minor versions are not expected to work. Users must remove custom feature gate configurations before attempting minor version changes. -### Component Version Alignment -All Kubernetes components (kubelet, kube-apiserver, kube-controller-manager, kube-scheduler) are packaged together within each MicroShift release, eliminating internal component version skew concerns. Feature gate configuration is read from the MicroShift configuration file and passed to the kube-apiserver during startup, which then handles propagation to other components using Kubernetes' native mechanisms. 
- ### Feature Gate Consistency Across Components -The kube-apiserver's native propagation mechanism ensures consistent feature gate distribution to all Kubernetes components. While individual components may recognize different subsets of feature gates based on their capabilities, the kube-apiserver ensures all components receive the same feature gate configuration from the MicroShift configuration file. This enhancement relies on the kube-apiserver's propagation logic and does not implement additional validation for feature gate compatibility between components. +Feature gate skew can occur between embedded components. On OpenShift, this is a non-issue. On MicroShift, it is a known issue that one component's default may be to disable a feature, while another component enables it. This problem is tracked by [USHIFT-2813](https://issues.redhat.com/browse/USHIFT-2813). Solving this issue is outside the scope of this proposal. ## Operational Aspects of API Extensions -// TODO the configuration schema is being modified. Backwards compatibility must be maintained - -All operational aspects are handled through existing MicroShift configuration mechanisms and component startup procedures. +Any changes to the MicroShift configuration schema must be backwards compatible by at least y-2 minor versions. ## Support Procedures @@ -321,48 +311,15 @@ All operational aspects are handled through existing MicroShift configuration me - **Error patterns**: Component startup failures with feature gate validation errors - **Detection**: Service status shows failed state, component logs show unknown feature gate names -**Component-Specific Failures:** -- **kube-apiserver errors**: Look for API server startup errors in `journalctl -u microshift.service` - these are critical as the apiserver handles propagation - -### Disabling Feature Gate Configuration - -**Remove Custom Feature Gates:** -1. Edit `/etc/microshift/config.yaml` -2. Remove or comment out the `featureGates` section -3.
Restart MicroShift service: `sudo systemctl restart microshift` +### Reverting Custom Feature Gate Configurations To Default -**Reset to Default Configuration:** -```yaml -# Remove entire featureGates section or set to: -featureGates: - featureSet: Default -``` - -**Consequences of Disabling:** -- **Cluster health**: No impact on core MicroShift functionality -- **Existing workloads**: Workloads using experimental features may lose functionality -- **New workloads**: Will use default feature gate behavior only - -### Edge Environment Troubleshooting - -**Remote Diagnostics:** -- Feature gate configuration issues are logged in standard MicroShift service logs -- Use `microshift get nodes` to verify basic cluster functionality -- Check component status through `microshift get pods -A` for system pod health +**Reverting the cluster to its default feature gates is unsupported and not recommended.** **Recovery Procedures:** - Configuration changes only require MicroShift service restart, not full system reboot - Invalid configurations prevent service startup but do not affect system stability - Greenboot integration ensures automatic rollback if feature gates prevent successful startup -### Graceful Failure and Recovery - -**Configuration Changes:** -- Invalid feature gate configurations fail fast during service startup -- No partial application of settings - either all feature gates apply or none do -- Recovery is immediate upon fixing configuration and restarting service -- No data consistency risks from feature gate configuration changes - ## Infrastructure Needed [optional] No additional infrastructure is needed for this enhancement. The feature uses existing MicroShift configuration mechanisms and testing infrastructure.
\ No newline at end of file From 2438baf735697c729f7ef72fafd37e6bcb99109f Mon Sep 17 00:00:00 2001 From: Jon Cope Date: Wed, 8 Oct 2025 12:58:52 -0500 Subject: [PATCH 6/8] Clarify upgrade limitations and the irreversible nature of custom feature gates Provide implementation details on how cluster upgrades and changes to configured feature gate will be prevented. --- .../enabling-user-specified-featuregates.md | 91 +++++++++++++------ 1 file changed, 64 insertions(+), 27 deletions(-) diff --git a/enhancements/microshift/enabling-user-specified-featuregates.md b/enhancements/microshift/enabling-user-specified-featuregates.md index e2a6545500..547e7457a6 100644 --- a/enhancements/microshift/enabling-user-specified-featuregates.md +++ b/enhancements/microshift/enabling-user-specified-featuregates.md @@ -44,31 +44,63 @@ MicroShift users in edge computing environments want to experiment with upcoming ## Proposal -This enhancement proposes adding feature gate configuration support to MicroShift by extending `/etc/microshift/config.yaml` with a configuration schema inspired by OpenShift's FeatureGate custom resource specification. In OpenShift, users configure feature gates through the FeatureGate API, and operators independently filter featureGates before applying them to their components. MicroShift takes a different approach aligned with its file-based configuration philosophy: users specify feature gates directly in the configuration file, and MicroShift passes all user-specified featureGates to the kube-apiserver, which then handles propagation to other Kubernetes components. +This enhancement proposes adding feature gate configuration support to MicroShift by extending `/etc/microshift/config.yaml` with a configuration inspired by OpenShift's FeatureGate custom resource specification. In OpenShift, users configure feature gates through the FeatureGate API, which is then propagated to sub-components (e.g. kube-apiserver, kubelet).
In some cases, sub-component operators are also involved in the propagation of feature gate configurations and service restarts, such as the MCO configuring and restarting kubelets. + +MicroShift does not deploy these operators and must take a different approach that is aligned with its file-based configuration philosophy: users specify feature gates directly in the configuration file, and MicroShift passes all user-specified featureGates to the kube-apiserver, which then handles propagation to other Kubernetes components. Service restarts are executed by the cluster admin by restarting the MicroShift process. + +> **Important!** The use of custom feature gates on OpenShift is irreversible and renders a cluster unable to be upgraded. This feature should only be used for testing alpha/beta features and should never be used in production. The implementation includes: -1. **FeatureGate Configuration Schema**: Extend MicroShift's configuration file to include `featureGates` section with fields inspired by OpenShift's FeatureGate CRD spec (`featureSet` and `customNoUpgrade`) +1. **FeatureGate Configuration**: Extend MicroShift's configuration file to include a `featureGates` section with fields inspired by OpenShift's FeatureGate CRD spec (`featureSet` and `customNoUpgrade`) 2. **Predefined Feature Sets**: Support for predefined feature sets like `TechPreviewNoUpgrade` and `DevPreviewNoUpgrade` 3. **Custom Feature Gates**: Support for individual feature gate enablement/disablement via `customNoUpgrade` configuration -4. **API Server Propagation**: All configured featureGates will be passed to the kube-apiserver, which handles propagation to other Kubernetes components (kubelet, kube-controller-manager, kube-scheduler) +4. **API Server Propagation**: All configured featureGates will be passed to the kube-apiserver, which handles propagation to other Kubernetes components (kubelet, kube-controller-manager, kube-scheduler).
Service restarts are the responsibility of the cluster admin. +5. **Prevent Feature Gate Config Changes**: OpenShift prevents users from reverting custom feature gates via spec validation rules. This is not an option for the MicroShift config. Instead, MicroShift will check for custom feature gates at startup. If customizations exist, MicroShift will write a sentinel file to `/var/lib/microshift/`. This file will contain the custom feature gates. When MicroShift next restarts, it will check for this file and overwrite the in-memory config's feature gate settings with those stored in the sentinel file. + + **Note**: MicroShift will not overwrite `/etc/microshift/config.yaml`. Only the in-memory config will be affected. + +6. **Preventing Cluster Upgrades**: Upgrades on OpenShift are prevented at the cluster level by the cluster-version-operator, in conjunction with other OpenShift operators. However, MicroShift lacks these operators. Instead, MicroShift's install/upgrade logic will re-use the sentinel file described in item 5. If the file exists, the cluster is un-upgradeable. -This approach ensures that users can experiment with feature gate capabilities while maintaining MicroShift's file-based configuration pattern instead of requiring API interactions. +This approach ensures that users can experiment with feature gate capabilities while maintaining MicroShift's file-based configuration pattern and still getting the same validation behavior as OpenShift. ### Workflow Description **MicroShift Administrator** is a human user responsible for configuring and managing MicroShift deployments. #### User Configuration Workflow + +##### First Time Configuring Feature Gates 1. MicroShift Administrator identifies a need for specific feature gates (e.g., `CPUManagerPolicyAlphaOptions`) 2.
Administrator chooses between two configuration approaches: - **Predefined Feature Set**: Configure `featureGates.featureSet: TechPreviewNoUpgrade` or `DevPreviewNoUpgrade` for a curated set of preview features - **Custom Feature Gates**: Configure `featureGates.featureSet: CustomNoUpgrade` and specify individual features in `featureGates.customNoUpgrade.enabled/disabled` lists 3. Administrator updates `/etc/microshift/config.yaml` with the chosen configuration 4. Administrator restarts MicroShift service -5. MicroShift parses the FeatureGate configuration and passes all settings to the kube-apiserver -6. The kube-apiserver propagates the feature gates to other Kubernetes components (kubelet, kube-controller-manager, kube-scheduler) -7. Each component processes the featureGates and enables/disables the features it supports according to the configured state +5. MicroShift detects the custom FeatureGate configuration. +6. MicroShift writes a sentinel file to `/var/lib/microshift/`, containing the feature gate config. +7. The kube-apiserver propagates the feature gates to other Kubernetes components (kubelet, kube-controller-manager, kube-scheduler) +8. Each component processes the featureGates and enables/disables the features it supports according to the configured state + +##### Attempt to Revert Custom Feature Gates +1. Administrator decides to revert custom feature gates (e.g., wants to return to default settings) +2. Administrator modifies `/etc/microshift/config.yaml` to remove or change feature gate configuration +3. Administrator restarts MicroShift service +4. MicroShift detects the sentinel file exists at `/var/lib/microshift/` containing previous custom feature gates +5. MicroShift overrides the configuration file settings with those stored in the sentinel file +6. MicroShift logs a warning that custom feature gates cannot be reverted once applied +7. 
The cluster continues to run with the original custom feature gates despite the configuration change attempt + +##### Attempt to Upgrade Cluster with Custom Feature Gates +1. Administrator attempts to upgrade MicroShift to a new version (e.g., via RPM upgrade) +2. MicroShift upgrade process checks for the existence of the sentinel file at `/var/lib/microshift/` +3. If sentinel file exists (indicating custom feature gates are configured): + - The upgrade process detects the cluster is marked as non-upgradeable + - Upgrade is blocked with an error message indicating custom feature gates prevent upgrades +4. Upgrade fails to proceed, preserving the current MicroShift version +5. Administrator must either: + - Continue using the current version with custom feature gates + - Wipe MicroShift's state (`$ sudo microshift-cleanup-data --all`) and restart MicroShift service (`$ sudo systemctl restart microshift`) ### API Extensions @@ -126,7 +158,7 @@ featureGates: - When using `customNoUpgrade`, the `featureSet` must be set to `CustomNoUpgrade` - The `customNoUpgrade` field is only valid when `featureSet: CustomNoUpgrade` -This configuration will be parsed during MicroShift startup and the feature gate settings will be passed to the appropriate Kubernetes components via their command-line arguments or configuration files. 
+See [Validation and Error Handling](#validation-and-error-handling) for config validation details #### FeatureSet Definitions @@ -152,7 +184,7 @@ Each component will then internally process these settings according to its capa **OpenShift Approach:** - Users configure feature gates through the FeatureGate API by creating/modifying FeatureGate instances -- The FeatureGate API instance named 'cluster' serves as the default source of truth for all featureGates across the cluster +- The FeatureGate API instance named 'cluster' serves as the source of truth for all featureGates across the cluster - The kube-apiserver detects a CRUD event on the FeatureGate API, parses all FeatureGate API instances, and communicates the FeatureGate values to cluster components - Operators like the Machine Config Operator also detect the CRUD event and will restart the the operand component if necessary - This provides fine-grained control but requires complex operator logic for filtering and lifecycle management @@ -166,11 +198,14 @@ Each component will then internally process these settings according to its capa #### Validation and Error Handling -- **Configuration Parsing**: MicroShift will validate the structural correctness of the configuration (YAML syntax, required fields) +- **Configuration Parsing**: MicroShift will replicate OpenShift's schema rules as start-time validation checks: + - **Conflicting Feature Gate Settings**: A feature gate appears in both `.customNoUpgrade.enabled` and `.customNoUpgrade.disabled` + - **Conflicting Feature Set Settings**: Feature gates are defined under `.customNoUpgrade.[enabled|disabled]` but `.featureSet:` is not `CustomNoUpgrade`. - **API Server Validation**: The kube-apiserver does not validate the feature gates it receives from MicroShift before propagating them.
This behavior is the same on OpenShift -- **Component-level Validation**: Each Kubernetes component will validate the feature gates it recognizes -- **Error Reporting**: Components will log errors or warnings for invalid feature gate configurations +- **Component-level Validation**: Unrecognized feature gate values are ignored by components. The component will only log them as a warning - **Startup Failures**: May occur when featureGate settings conflict (i.e. a featureGate is both enabled and disabled) +- **Upgrade Failure**: RPM install pre-checks detect, from the sentinel file written to `/var/lib/microshift/`, that feature customizations have already been made, and the upgrade fails +- **Custom Features cannot be Reverted or Changed**: MicroShift logs an error that user customizations have changed, then overwrites the changes with the user's original feature gates. This prevents the cluster from becoming unstable. This is also how OpenShift handles this scenario ### Risks and Mitigations @@ -182,12 +217,12 @@ Users experimenting feature gates may encounter instability or data loss in thei **Risk: Configuration Errors** Invalid feature gate configurations in the MicroShift configuration file could prevent MicroShift components from starting. -*Mitigation:* Kubernetes components inherently ignore unrecognized feature gate names, so typos or mispellings may not cause failures. Only invalid values for recognized gates can cause issues. Components provide clear error messages for such cases, and documentation will guide troubleshooting. Recommended that users run `microshift-cleanup-script`, delete the custom feature gates from `/etc/microshift/config.yaml` and restart the MicroShift service. +*Mitigation:* Kubernetes components inherently ignore unrecognized feature gate names, so typos or misspellings may not cause failures. Components provide clear warning messages for such cases, and documentation will guide troubleshooting.
It is recommended that users run `microshift-cleanup-script`, correct the invalid config values in `/etc/microshift/config.yaml`, then restart the service. ### Drawbacks **Increased Configuration Complexity** -Adding feature gate configuration increases the complexity of MicroShift's configuration surface area. Users must understand both the feature gates themselves and their potential interactions, which could lead to misconfigurations in edge deployments where troubleshooting access is limited. +Adding feature gate configuration increases the complexity of MicroShift's configuration surface area. Users must understand both the feature gates themselves and their potential interactions, which could lead to misconfigurations in edge deployments where troubleshooting access is limited. Again, users must be aware that custom feature gates are for experimentation only, are unsupported, irreversible, and make a cluster un-upgradeable. **Support Complexity** Enabling alpha and beta features through user configuration means support teams may encounter issues related to experimental functionality that behaves differently across Kubernetes versions or has incomplete implementations. @@ -196,7 +231,7 @@ Enabling alpha and beta features through user configuration means support teams Edge deployments often have limited remote access for troubleshooting. If users enable experimental feature gates that cause instability, recovering these devices may require physical access or complex recovery procedures. **Upgrade Limitations and Irreversible Changes** -Enabling `TechPreviewNoUpgrade`, `DevPreviewNoUpgrade`, or `CustomNoUpgrade` feature sets cannot be undone and prevents both minor version updates and major upgrades. Once enabled, the cluster permanently loses the ability to perform standard updates.
These feature sets are explicitly not recommended for production clusters due to their irreversible nature and update limitations, which conflicts with the typical edge deployment requirement for reliable, long-term operation and maintenance. +Once enabled, `TechPreviewNoUpgrade`, `DevPreviewNoUpgrade`, or `CustomNoUpgrade` feature sets CANNOT be undone and the cluster CANNOT be upgraded. These feature sets are NOT RECOMMENDED FOR PRODUCTION CLUSTERS. ## Alternatives (Not Implemented) @@ -210,7 +245,7 @@ Utilizing the FeatureGate API on MicroShift is rejected as an alternative approa - Does OpenShift actively **block/prevent** upgrades when TechPreviewNoUpgrade/DevPreviewNoUpgrade/CustomNoUpgrade is configured? - Or does OpenShift **allow** upgrades to proceed but the resulting cluster becomes unsupported? - Understanding OpenShift's approach will inform whether MicroShift should implement active blocking logic (pre-upgrade checks that fail) or simply document that upgrades with custom feature gates are unsupported while allowing them to proceed technically. + OpenShift actively prevents upgrades of clusters with customized features. OpenShift operators work together to communicate whether any component has had a custom feature gate applied. If so, the cluster-version-operator marks the cluster as un-upgradeable. 2.
**How should feature gate compatibility be validated across MicroShift versions?** @@ -228,7 +263,6 @@ The testing strategy focuses on verifying the propagation functionality - that c - Validate parsing of `featureSet` values (TechPreviewNoUpgrade, DevPreviewNoUpgrade, CustomNoUpgrade) - Test parsing of `customNoUpgrade.enabled` and `customNoUpgrade.disabled` lists - Verify configuration schema validation and error handling for malformed configurations -- Test default behavior when feature gates section is not configured **API Server Configuration:** - Verify feature gate pass-through retains string formating in the kube-apiserver configuration @@ -249,8 +283,16 @@ The testing strategy focuses on verifying the propagation functionality - that c **Component Behavior Verification:** This enhancement does not test whether feature gates actually modify Kubernetes component behavior - that is the responsibility of upstream Kubernetes testing. Testing is limited to verifying that MicroShift correctly passes feature gates to the kube-apiserver. -**Upgrade Testing:** -Since upgrades are not supported when custom feature gates are configured, no additional upgrade testing is required for this enhancement. Default upgrade behavior without custom feature gates is already covered by existing MicroShift test suites. 
+**Upgrade Prevention Testing:** +A test scenario will verify that MicroShift properly blocks upgrades when custom feature gates are configured: +- Validate that clusters with TechPreviewNoUpgrade, DevPreviewNoUpgrade, or CustomNoUpgrade cannot be upgraded +- Test that upgrade failures provide clear error messages indicating custom feature gates prevent upgrades + +**Custom Feature Gate Immutability Testing:** +A test scenario verifies that custom feature gate configurations cannot be modified or reverted once applied: +- Verify that customized feature gates result in the creation of the sentinel file and that it's contents are correct +- Test that MicroShift correctly overwrites configuration changes with stored sentinel values +- Verify proper logging of warnings when users attempt to revert or change custom feature gates ## Graduation Criteria @@ -313,13 +355,8 @@ Any changes to the MicroShift configuration schema must be backwards compatible ### Reverting Custom Feature Gate Configurations To Default -**Reverting the cluster to it's default feature-gates is unsupported and not recommended.** - **Recovery Procedures:** -- Configuration changes only require MicroShift service restart, not full system reboot -- Invalid configurations prevent service startup but do not affect system stability -- Greenboot integration ensures automatic rollback if feature gates prevent successful startup - -## Infrastructure Needed [optional] +- To restore MicroShift to a stable and supported state, users must run `$ sudo microshift-cleanup-data --all`, set `.featureGates: {}`, and restart MicroShift -No additional infrastructure is needed for this enhancement. The feature uses existing MicroShift configuration mechanisms and testing infrastructure. \ No newline at end of file +### Upgrade / Rollback +- Upgrades are actively blocked when custom feature gates are configured. 
See [Attempt to Upgrade Cluster with Custom Feature Gates](#attempt-to-upgrade-cluster-with-custom-feature-gates). From c42e0722295b85025fc534226bd49ab80d2a3219 Mon Sep 17 00:00:00 2001 From: Jon Cope Date: Tue, 14 Oct 2025 14:02:14 -0500 Subject: [PATCH 7/8] Update enhancements/microshift/enabling-user-specified-featuregates.md Co-authored-by: Shauna Diaz --- enhancements/microshift/enabling-user-specified-featuregates.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/enhancements/microshift/enabling-user-specified-featuregates.md b/enhancements/microshift/enabling-user-specified-featuregates.md index 547e7457a6..f6c98ebb1c 100644 --- a/enhancements/microshift/enabling-user-specified-featuregates.md +++ b/enhancements/microshift/enabling-user-specified-featuregates.md @@ -202,7 +202,7 @@ Each component will then internally process these settings according to its capa - **Conflicting Feature Gate Settings**: A feature gate appears in both `.customNoUpgrade.enabled` and `.customNoUpgrade.disabled` - **Conflicting Feature Set Settings**: Feature gates are defined under `.customNoUpgrade.[enabled|disabled]` but `.featureSet:` is not `customNoUpgrade`. - **API Server Validation**: The kube-apiserver does not validate the feature gates it receives from MicroShift before propagating them. This behavior is the same on OpenShift -- **Component-level Validation**: Unrecognized featuer-gate values are ignored by components. The component will only log them as a warning +- **Component-level Validation**: Unrecognized feature-gate values are ignored by components. The component will only log them as a warning - **Startup Failures**: May occur when featureGate settings conflict (i.e. 
a featureGate is both enabled and disabled) - **Upgrade Failure**: RPM install pre-checks detect feature customizations have already been made because of sentinel file written to `/var/lib/microshift/`, and the upgrade fails - **Custom Features cannot be Reverted or Changed**: MicroShift logs an error that user customizations have changed, then overwrites the changes with the user's original feature gates. This prevents the cluster from becoming unstable. This is also how OpenShift handles this scenario From 259470630143fa5ad9590b22fad5849c7e5fe8aa Mon Sep 17 00:00:00 2001 From: Jon Cope Date: Tue, 14 Oct 2025 16:38:44 -0500 Subject: [PATCH 8/8] when fgs are set, changes to fgs and upgrades will cause microshift to fail --- .../enabling-user-specified-featuregates.md | 96 ++++++++++--------- 1 file changed, 53 insertions(+), 43 deletions(-) diff --git a/enhancements/microshift/enabling-user-specified-featuregates.md b/enhancements/microshift/enabling-user-specified-featuregates.md index f6c98ebb1c..f14183bac0 100644 --- a/enhancements/microshift/enabling-user-specified-featuregates.md +++ b/enhancements/microshift/enabling-user-specified-featuregates.md @@ -21,15 +21,29 @@ see-also: ## Summary -MicroShift disables most feature gates by default while hardcoding only a few relevant ones, and lacks a controlled mechanism for users to experiment with additional feature gates or override defaults. This enhancement proposes adding configuration support for feature gates through the MicroShift configuration file. In OpenShift, users configure feature gates through the FeatureGate API. In contrast, MicroShift users will specify feature gates directly in the configuration file (`/etc/microshift/config.yaml`), and MicroShift will pass all user-specified featureGates to the kube-apiserver, which then propagates them to other Kubernetes components (kubelet, kube-controller-manager, kube-scheduler). 
This capability will enable users to experiment with alpha and beta Kubernetes features like CPUManager's `prefer-align-cpus-by-uncorecache` in a supported and deterministic way, addressing edge computing use cases where users want to evaluate advanced resource management capabilities. +MicroShift disables most feature gates by default while hardcoding only a few relevant ones, and lacks a controlled +mechanism for users to experiment with additional feature gates or override defaults. This enhancement proposes adding +configuration support for feature gates through the MicroShift configuration file. In OpenShift, users configure feature +gates through the FeatureGate API. In contrast, MicroShift users will specify feature gates directly in the +configuration file (`/etc/microshift/config.yaml`), and MicroShift will pass all user-specified featureGates to the +kube-apiserver, which then propagates them to other Kubernetes components (kubelet, kube-controller-manager, +kube-scheduler). This capability will enable users to experiment with alpha and beta Kubernetes features like +CPUManager's `prefer-align-cpus-by-uncorecache` in a supported and deterministic way, addressing edge computing use +cases where users want to evaluate advanced resource management capabilities. ## Motivation -MicroShift users in edge computing environments want to experiment with upcoming Kubernetes features that are in alpha or beta stages to evaluate their potential benefits for specific use cases. Currently, users cannot configure feature gates in a supported way, preventing them from experimenting with capabilities like advanced CPU management, enhanced scheduling features, or experimental storage options that might improve performance in their resource-constrained edge environments. +MicroShift users in edge computing environments want to experiment with upcoming Kubernetes features that are in alpha +or beta stages to evaluate their potential benefits for specific use cases. 
Currently, users cannot configure feature +gates in a supported way, preventing them from experimenting with capabilities like advanced CPU management, enhanced +scheduling features, or experimental storage options that might improve performance in their resource-constrained edge +environments. ### User Stories -* As a MicroShift administrator, I want to configure feature gates through the MicroShift configuration file (`/etc/microshift/config.yaml`), so that I can experiment with alpha/beta features in a controlled and supported manner consistent with MicroShift's file-based configuration approach. +* As a MicroShift administrator, I want to configure feature gates through the MicroShift configuration file + (`/etc/microshift/config.yaml`), so that I can experiment with alpha/beta features in a controlled and supported + manner consistent with MicroShift's file-based configuration approach. ### Goals @@ -40,7 +54,7 @@ MicroShift users in edge computing environments want to experiment with upcoming * Modify MicroShift's existing feature gate defaults * Vet custom feature gates for compatibility with MicroShift * Validate custom feature gate settings for correctness, e.g. spelling, case, and punctuation -* Providing upgrade support to customized clusters +* Providing upgrade support to customized nodes ## Proposal @@ -48,7 +62,7 @@ This enhancement proposes adding feature gate configuration support to MicroShif MicroShift does not deploy these operators and must a different approach which is aligned with its file-based configuration philosophy: users specify feature gates directly in the configuration file, and MicroShift passes all user-specified featureGates to the kube-apiserver, which then handles propagation to other Kubernetes components. Service restarts are executed by the cluster admin by restarting the MicroShift process. -> **Important!** The use of custom feature gates on OpenShift is irreversible and renders a cluster unable to be upgraded. 
This feature should only be used for testing alpha/beta features and should never be used in productions. +> **Important!** The use of custom feature gates on MicroShift is irreversible and renders a cluster unable to be upgraded. This feature should only be used for testing alpha/beta features and should never be used in production. Upgraded clusters will fail to start. The implementation includes: @@ -56,13 +70,8 @@ The implementation includes: 2. **Predefined Feature Sets**: Support for predefined feature sets like `TechPreviewNoUpgrade` and `DevPreviewNoUpgrade` 3. **Custom Feature Gates**: Support for individual feature gate enablement/disablement via `customNoUpgrade` configuration 4. **API Server Propagation**: All configured featureGates will be passed to the kube-apiserver, which handles propagation to other Kubernetes components (kubelet, kube-controller-manager, kube-scheduler). Service restarts are the responsibility of the cluster admin. -5. **Prevent Feature Gate Config Changes**: OpenShift prevents users from reverting custom feature gates via spec validation rules. This is an not option for the MicroShift config. Instead, MicroShift will check for custom feature gates at startup. If customizations exist, MicroShift will write a sentinel file to `/var/lib/microshift/`.This file will contain the custom feature gates. When MicroShift next restarts, it will check for this file and overwrite the in-memory config's feature gate settings with those stored in the sentinel file. - - **Note**: MicroShift will not overwrite `/etc/microshift/config.yaml`. Only the in-memory config will be affected. - -6. **Preventing Clusters Upgrades**: Upgrades on OpenShift are prevented at the cluster level by the cluster-version-operator, in conjunction with other OpenShift operators. However, MicroShift lacks these operators. Instead, MicroShift's install/upgrade logic will re-use the sentinel file described in #5. If the file exists, the cluster is un-upgradeable.
- -This approach ensures that users can experiment with feature gate capabilities while maintaining MicroShift's file-based configuration pattern while still getting the same validation behavior as OpenShift. +5. **Feature Gate Config Changes**: OpenShift prevents users from reverting custom feature gates via spec validation rules. This is not an option for the MicroShift config. Instead, MicroShift will check for custom feature gates at startup. If customizations exist, MicroShift will write the feature gate config data to `/var/lib/microshift/no-upgrade`. During initialization MicroShift will read `/var/lib/microshift/no-upgrade` and compare it to `/etc/microshift/config.yaml` and `/etc/microshift/config.d/`. If the user's feature gate configuration differs from `/var/lib/microshift/no-upgrade`, MicroShift will log a fatal error. +6. **Breaking Node Upgrades**: Upgrades on OpenShift are preempted at the cluster level by the cluster-version-operator, in conjunction with other OpenShift operators. The lack of OpenShift operators on MicroShift means there is no reliable way to preempt node upgrades. MicroShift will check for `/var/lib/microshift/no-upgrade` and check the current MicroShift version against `/var/lib/microshift/version`. If `/var/lib/microshift/no-upgrade` exists and the version differs, MicroShift will log the error and fail. ### Workflow Description @@ -78,7 +87,7 @@ This approach ensures that users can experiment with feature gate capabilities w 3. Administrator updates `/etc/microshift/config.yaml` with the chosen configuration 4. Administrator restarts MicroShift service 5. MicroShift detects the custom FeatureGate configuration. -6. MicroShift writes a sentinel file to `/var/lib/microshift/`, containing the feature gate config. +6. MicroShift writes feature gate config to `/var/lib/microshift/no-upgrade`, and logs the event 7.
The kube-apiserver propagates the feature gates to other Kubernetes components (kubelet, kube-controller-manager, kube-scheduler) 8. Each component processes the featureGates and enables/disables the features it supports according to the configured state @@ -86,21 +95,17 @@ This approach ensures that users can experiment with feature gate capabilities w 1. Administrator decides to revert custom feature gates (e.g., wants to return to default settings) 2. Administrator modifies `/etc/microshift/config.yaml` to remove or change feature gate configuration 3. Administrator restarts MicroShift service -4. MicroShift detects the sentinel file exists at `/var/lib/microshift/` containing previous custom feature gates -5. MicroShift overrides the configuration file settings with those stored in the sentinel file -6. MicroShift logs a warning that custom feature gates cannot be reverted once applied -7. The cluster continues to run with the original custom feature gates despite the configuration change attempt - -##### Attempt to Upgrade Cluster with Custom Feature Gates -1. Administrator attempts to upgrade MicroShift to a new version (e.g., via RPM upgrade) -2. MicroShift upgrade process checks for the existence of the sentinel file at `/var/lib/microshift/` -3. If sentinel file exists (indicating custom feature gates are configured): - - The upgrade process detects the cluster is marked as non-upgradeable - - Upgrade is blocked with an error message indicating custom feature gates prevent upgrades -4. Upgrade fails to proceed, preserving the current MicroShift version +4. MicroShift detects `/var/lib/microshift/no-upgrade` with differing feature gate config. +5. MicroShift logs a fatal error that custom feature gates cannot be reverted or changed once applied + +##### Attempt to Upgrade Node with Custom Feature Gates +1. Administrator deploys a node upgrade. +2. Administrator restarts MicroShift service +3. MicroShift detects the custom FeatureGate configuration. +4.
If `/var/lib/microshift/no-upgrade` exists (indicating custom feature gates are configured), then the upgrade is blocked with a fatal error indicating custom feature gates break upgrades 5. Administrator must either: - - Continue using the current version with custom feature gates - - Wipe MicroShift's state (`$ sudo microshift-cleanup-data --all`) and restart MicroShift service (`$ sudo systemctl restart microshift`) + - Revert the upgrade back to the prior version, OR + - Wipe MicroShift's state (`$ sudo microshift-cleanup-data --all`) and restart MicroShift service (`$ sudo systemctl restart microshift`). This returns the node to a supported state. ### API Extensions @@ -114,7 +119,7 @@ This enhancement is not applicable to Hypershift/Hosted Control Planes as featur #### Standalone Clusters -This enhancement is primarily designed for standalone MicroShift deployments where administrators need direct control over feature gate configuration through the local configuration file. +This enhancement is not applicable to standalone clusters. #### Single-node Deployments or MicroShift @@ -133,7 +138,7 @@ The resource consumption impact will be minimal as this enhancement only adds co #### Configuration Schema Extension -The MicroShift configuration file will be extended to include a new `featureGates` section with a structure inspired by the OpenShift FeatureGate CRD specification. While OpenShift users configure feature gates through the Kubernetes API (e.g., `oc edit featuregate cluster`), MicroShift users will configure them directly in `/etc/microshift/config.yaml`: +The MicroShift configuration file will be extended to include a new `featureGates` section with a structure inspired by the OpenShift FeatureGate CRD specification. 
MicroShift users will configure feature gates in `/etc/microshift/config.yaml`: **Predefined Feature Set Configuration:** ```yaml @@ -204,8 +209,8 @@ Each component will then internally process these settings according to its capa - **API Server Validation**: The kube-apiserver does not validate the feature gates it receives from MicroShift before propagating them. This behavior is the same on OpenShift - **Component-level Validation**: Unrecognized feature-gate values are ignored by components. The component will only log them as a warning - **Startup Failures**: May occur when featureGate settings conflict (i.e. a featureGate is both enabled and disabled) -- **Upgrade Failure**: RPM install pre-checks detect feature customizations have already been made because of sentinel file written to `/var/lib/microshift/`, and the upgrade fails -- **Custom Features cannot be Reverted or Changed**: MicroShift logs an error that user customizations have changed, then overwrites the changes with the user's original feature gates. This prevents the cluster from becoming unstable. This is also how OpenShift handles this scenario +- **Upgrade Failure**: `/var/lib/microshift/no-upgrade` exists and MicroShift version does not equal `/var/lib/microshift/version` +- **Custom Features cannot be Reverted or Changed**: `/var/lib/microshift/no-upgrade` data does not match `/etc/microshift/config` and `/etc/microshift/config.d` ### Risks and Mitigations @@ -217,12 +222,12 @@ Users experimenting feature gates may encounter instability or data loss in thei **Risk: Configuration Errors** Invalid feature gate configurations in the MicroShift configuration file could prevent MicroShift components from starting. -*Mitigation:* Kubernetes components inherently ignore unrecognized feature gate names, so typos or mispellings may not cause failures. Components provide clear warning messages for such cases, and documentation will guide troubleshooting. 
Recommended that users run `microshift-cleanup-script`, correct the invalid config values in `/etc/microshift/config.yaml`, then restart the service. +*Mitigation:* Kubernetes components inherently ignore unrecognized feature gate names, so typos or misspellings may not cause failures. Components provide clear warning messages for such cases, and documentation will guide troubleshooting. ### Drawbacks **Increased Configuration Complexity** -Adding feature gate configuration increases the complexity of MicroShift's configuration surface area. Users must understand both the feature gates themselves and their potential interactions, which could lead to misconfigurations in edge deployments where troubleshooting access is limited. Again, users must be aware that custom feature gates are for experimentation only, are unsupported, irreversible, and make a cluster un-upgradeable. +Adding feature gate configuration increases the complexity of MicroShift's configuration surface area. Users must understand both the feature gates themselves and their potential interactions, which could lead to misconfigurations in edge deployments where troubleshooting access is limited. Again, users must be aware that custom feature gates are for experimentation only, are irreversible, and make a cluster un-upgradeable. **Support Complexity** Enabling alpha and beta features through user configuration means support teams may encounter issues related to experimental functionality that behaves differently across Kubernetes versions or has incomplete implementations. @@ -245,7 +250,7 @@ Utilizing the FeatureGate API on MicroShift is rejected as an alternative approa - Does OpenShift actively **block/prevent** upgrades when TechPreviewNoUpgrade/DevPreviewNoUpgrade/CustomNoUpgrade is configured? - Or does OpenShift **allow** upgrades to proceed but the resulting cluster becomes unsupported? - OpenShift actively prevents upgrades of clusters with customized features.
OpenShift operators work together to communicate if any component has a had a custom feature gate applied. If so, the cluster-version-operator marks the cluster as un-upgradeable. + OpenShift actively prevents upgrades of clusters with customized feature gates. OpenShift operators work together to communicate if any component has had a custom feature gate applied. If so, the cluster-version-operator marks the cluster as un-upgradeable. 2. **How should feature gate compatibility be validated across MicroShift versions?** @@ -267,6 +272,11 @@ The testing strategy focuses on verifying the propagation functionality - that c **API Server Configuration:** - Verify feature gate pass-through retains string formating in the kube-apiserver configuration +### Documentation Validation + +**Validating Example Configs** +- Validation testing of featureGate config example(s) in documentation for correct syntax. + ### Robot Framework Integration Tests **Universal Propagation Verification:** @@ -283,16 +293,15 @@ The testing strategy focuses on verifying the propagation functionality - that c **Component Behavior Verification:** This enhancement does not test whether feature gates actually modify Kubernetes component behavior - that is the responsibility of upstream Kubernetes testing. Testing is limited to verifying that MicroShift correctly passes feature gates to the kube-apiserver.
-**Upgrade Prevention Testing:** -A test scenario will verify that MicroShift properly blocks upgrades when custom feature gates are configured: -- Validate that clusters with TechPreviewNoUpgrade, DevPreviewNoUpgrade, or CustomNoUpgrade cannot be upgraded -- Test that upgrade failures provide clear error messages indicating custom feature gates prevent upgrades +**Upgrade Breaking Test:** +A test scenario will verify that MicroShift fails after upgrades when custom feature gates are configured: +- Verify that nodes with TechPreviewNoUpgrade, DevPreviewNoUpgrade, or CustomNoUpgrade fail to start after upgrade +- Test that upgrade failures provide clear error messages indicating custom feature gates are the cause **Custom Feature Gate Immutability Testing:** A test scenario verifies that custom feature gate configurations cannot be modified or reverted once applied: -- Verify that customized feature gates result in the creation of the sentinel file and that it's contents are correct -- Test that MicroShift correctly overwrites configuration changes with stored sentinel values -- Verify proper logging of warnings when users attempt to revert or change custom feature gates +- Verify that customized feature gates are written to `/var/lib/microshift/no-upgrade` and that its contents are correct +- Test that MicroShift logs a fatal error when `/var/lib/microshift/no-upgrade` data does not match user config ## Graduation Criteria @@ -322,7 +331,8 @@ Upgrades and downgrades proceed normally using standard MicroShift procedures wi **Custom Feature Gate Configurations:** Upgrades and downgrades are not supported when custom feature gates are configured (TechPreviewNoUpgrade, DevPreviewNoUpgrade, or CustomNoUpgrade). Once custom feature gates are enabled, this configuration cannot be reverted - it is a permanent, one-way operation that permanently disables upgrade capability.
-Similar to OpenShift, the TechPreviewNoUpgrade, DevPreviewNoUpgrade, and CustomNoUpgrade feature sets are irreversible and explicitly prevent cluster upgrades to avoid compatibility issues with experimental features. +The TechPreviewNoUpgrade, DevPreviewNoUpgrade, and CustomNoUpgrade feature sets are irreversible and +will cause MicroShift to fail after any upgrade. ## Version Skew Strategy @@ -359,4 +369,4 @@ Any changes to the MicroShift configuration schema must be backwards compatible - To restore MicroShift to a stable and supported state, users must run `$ sudo microshift-cleanup-data --all`, set `.featureGates: {}`, and restart MicroShift ### Upgrade / Rollback -- Upgrades when custom feature gates are configured cause MicroShift to fail to start. See [Attempt to Upgrade Node with Custom Feature Gates](#attempt-to-upgrade-node-with-custom-feature-gates).
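
The startup behavior described in the final revision of this proposal (write `/var/lib/microshift/no-upgrade` on first use, then fail on any later config change or version change) can be sketched roughly as follows. This is an illustrative Python sketch only, not MicroShift's implementation: the function names, the `FeatureGateError` type, the JSON encoding of the stored data, and the assumption that the `version` file holds a plain version string are all hypothetical, and the data directory is passed in as a parameter so the sketch can run outside a real node.

```python
import json
from pathlib import Path

# Feature sets that the proposal treats as irreversible and upgrade-breaking.
NO_UPGRADE_SETS = {"TechPreviewNoUpgrade", "DevPreviewNoUpgrade", "CustomNoUpgrade"}


class FeatureGateError(Exception):
    """Hypothetical stand-in for the fatal error MicroShift would log."""


def validate(fg: dict) -> None:
    """Reject the two conflicting configurations named in the proposal."""
    custom = fg.get("customNoUpgrade") or {}
    enabled = set(custom.get("enabled") or [])
    disabled = set(custom.get("disabled") or [])
    both = enabled & disabled
    if both:
        raise FeatureGateError(f"gates both enabled and disabled: {sorted(both)}")
    if (enabled or disabled) and fg.get("featureSet") != "CustomNoUpgrade":
        raise FeatureGateError("customNoUpgrade lists require featureSet: CustomNoUpgrade")


def startup_check(fg: dict, running_version: str, data_dir: Path) -> None:
    """Apply the no-upgrade rules to the merged featureGates config `fg`.

    `data_dir` plays the role of /var/lib/microshift. The on-disk format
    (JSON here) is an assumption made for this sketch.
    """
    validate(fg)
    sentinel = data_dir / "no-upgrade"
    version_file = data_dir / "version"
    if sentinel.exists():
        # Custom gates were applied earlier: they may not change or be removed.
        if json.loads(sentinel.read_text()) != fg:
            raise FeatureGateError("custom feature gates cannot be reverted or changed")
        # A version mismatch means the node was upgraded after customization.
        if version_file.exists() and version_file.read_text().strip() != running_version:
            raise FeatureGateError("upgrade is not possible with custom feature gates")
    elif fg.get("featureSet") in NO_UPGRADE_SETS:
        # First start with custom gates: record them for later comparisons.
        sentinel.write_text(json.dumps(fg, sort_keys=True))
```

Calling `startup_check` with an empty `featureGates` mapping on a fresh data directory leaves no file behind, which matches the default, upgradeable case. Once a `no-upgrade` file exists, the same call raises, mirroring the "Custom Features cannot be Reverted or Changed" failure mode.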