Skip to content

Conversation

@ricky-rav
Copy link
Contributor

Enhancement for no-overlay mode,

Upstream enhancement: ovn-kubernetes/ovn-kubernetes#5289

@openshift-ci openshift-ci bot requested review from danwinship and jcaamano October 9, 2025 09:46
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 9, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign abhat for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ricky-rav ricky-rav changed the title Enhancement for no-overlay mode CORENET-6133:Enhancement for no-overlay mode Oct 9, 2025
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Oct 9, 2025
@openshift-ci-robot
Copy link

openshift-ci-robot commented Oct 9, 2025

@ricky-rav: This pull request references CORENET-6133 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

In response to this:

Enhancement for no-overlay mode,

Upstream enhancement: ovn-kubernetes/ovn-kubernetes#5289

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

- "managed": delegate to OCP the configuration of the BGP routers or reflectors necessary to advertise the network pod subnet to all cluster nodes;
- "unmanaged": use the FRRConfig and RouteAdvertisements provided by the cluster administrator to implement the no-overlay mode

For CUDNs these parameters will be added to the CUDN CRD and can be configured by the cluster administrator when creating a CUDN. For the default network, these parameters must be input by the cluster administrator at installation time and passed over to ovn-kubernetes by the Cluster Network Operator.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine to overview the support for CUDNs here (enough to understand testing requirements) here but I wouldn't focus on it too much on this section (Proposal) of document.
In general, impacts for CNO are:

  • configuration of the CDN
  • configuration of the internal BGP fabric, which might be employed for CDN and CUDNs.

Or maybe we can have different chapters describing the functionality vs CNO impacts. It's kinda mixed now.

Comment on lines 344 to 324
For the default network, we need a separate CI lane where we can enable the no-overlay mode at installation time. This lane will cover only the unmanaged scenario (with, say, outboundSNAT=enable) and will run upstream conformance tests on the default network:
- e2e-metal-ipi-ovn-dualstack-no-overlay-unmanaged-techpreview

For the managed scenario, we need yet two CI lanes, where we can test no-overlay mode with the route reflector topology and with the full mesh topology, for both the default network and CUDNs. In order to keep the number of lanes manageable, we can have one lane run ovn-kubernetes in shared gateway mode and the other in local gateway mode. In both lanes the default network is in no-overlay mode and we can create CUDNs in no-overlay mode on the fly, with outboundSNAT=enable and then outboundSNAT=disable, thus covering both cases in the same lane.
- e2e-metal-ipi-ovn-dualstack-no-overlay-managed-route-reflector-techpreview
- e2e-metal-ipi-ovn-dualstack-local-gw-no-overlay-managed-full-mesh-techpreview
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we only test managed mode with CDN? It should be a subset of unmanaged mode, it might be a good compromise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize when writing this that we can actually mix networks in managed mode and networks in unmanaged mode. This is possible with the APIs we agreed upon, right @pliurh ? If so, I'll update this section accordingly.

@ricky-rav ricky-rav force-pushed the no_overlay_enhancement branch from 9d6d724 to 6bbff43 Compare October 17, 2025 18:49
@ricky-rav
Copy link
Contributor Author

I've just pushed a new commit that should address all comments from above.

A couple of points that I want to call out:

  • The transport parameter for the default network is now called DefaultNetworkTransport and similarly we now have DefaultNetworkNoOverlayOptions. These names might not be the prettiest and I'm open to suggestions, but we have to convey that we're configuring the default network only
  • I made DefaultNetworkNoOverlayOptions and BGPManagedConfig immutable after cluster installation. If we allow users to change between managed and unmanaged or between OutboundSNAT enabled and OutboundSNAT disabled, we should provide a way to migrate seamlessly from one configuration to another, which as far as I understood is out of scope for now

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 17, 2025

@ricky-rav: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/markdownlint 6bbff43 link true /test markdownlint

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Comment on lines +89 to +94
- enable BGP for the cluster:
- spec.additionalRoutingCapabilities.providers: "FRR"
- spec.defaultNEtwork.ovnKubernetesConfig.routeAdvertisements: "enable"
- enable no-overlay mode for the default network (if desired) and configure it with the per-network parameters:
- outboundSNAT: "enable","disable"
- "routing": "managed", "unmanaged"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so these parameters should be set in install-config.yaml that openshift-installer will need to support this?
or we need to set in the manifest/network-02-config.yaml ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In d/s, CNO shall be doing this work.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should call out how or where to set these parameters and RA and Frrconfiguration when deployment cluster

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zhaozhanqi, these are parameters that will be found in operator.openshift.io/v1 Network.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean do we need a installer api to set this in install-config.yaml ?

- asNumber
- bgpTopology: "fullMesh"

It's important to note that there can be networks in no-overlay mode running in "managed" mode, and networks in no-overlay mode running in "unmanaged" mode, coexisting in the same cluster. The manifests that are necessary for "unmanaged" mode, whether it is for the default network or for CUDNs, must be provided on day 0 by the cluster administrator. Similarly, the parameters for "managed" mode must be provided on day 0, even if the CUDNs to which no-overlay mode will be applied will be created later on.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only things that should be provided day 0 is those things that are needed to configure no overlay for the CDN if that is the case.
If, for example, you only plan to use no overlay for CUDNs, then you should be able to configure anything that it is needed for it on day 2:

  • Currently, you can enable FRR and RouteASdvertisements on day 2.
  • Currently, you can configure FRRConfigrations on day 2, and you would be able to use that for unmanaged no overlay CUDNs created after.
  • You should be able to configure the managed fabric on day 2 as well and use that for unmanaged no overlay CUDNs created after.

That's to say: day-0 of a CUDN != day-0 of the CDN.

So this statement is not quite correct I think

The manifests that are necessary for "unmanaged" mode, whether it is for the default network or for CUDNs, must be provided on day 0 by the cluster administrator.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants