-
Notifications
You must be signed in to change notification settings - Fork 59
Allow silos to have restricted permissions for networking resources #9227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…VPC and then checking permissions on it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool! I have some suggestions here, some of which are a bit deeper. Mentioning here for the record that we also discussed in chat an approach that used a new role rather than a custom silo property. Let me know if you want to talk through any of this.
| /// When true, restricts networking actions to Silo Admins only | ||
| restrict_network_actions: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest making this an enum with two explicit values, like:
enum ProjectRolesConfigureNetworking {
Allowed,
Disallowed,
}| # NOTE: No permission rules defined here! | ||
| # All permissions controlled by custom networking restriction | ||
| # rules in omicron.polar (can_modify_networking_resource) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious why you can't put the stuff that's copy/pasted for each networking resource in omicron.polar in here instead.
| admin_group_name: None, | ||
| tls_certificates: vec![], | ||
| mapped_fleet_roles: Default::default(), | ||
| restrict_network_actions: None, // Default: no restrictions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be worthwhile to modify this test to run all the silo resources (or at least all the networking ones) twice: once with this set and once without. You could probably skip some of the hand-written tests, it should automatically catch everything, and then we'll have the current behavior documented in the test output file.
| ALTER TABLE omicron.public.silo | ||
| ADD COLUMN IF NOT EXISTS restrict_network_actions BOOL | ||
| NOT NULL | ||
| DEFAULT FALSE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you want a separate step here that removes the DEFAULT:
https://github.com/oxidecomputer/omicron/tree/a65cda6f7d1e969caa7fd72423e2ed8aa91cb386/schema/crdb#211-adding-a-new-column-without-a-default-value
| // Only create default VPC if allowed (i.e., networking is not restricted | ||
| // or the actor is a Silo Admin) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // Only create default VPC if allowed (i.e., networking is not restricted | |
| // or the actor is a Silo Admin) | |
| // Only create default VPC if requested. |
I'd suggest that the question of whether this is set is a policy choice by the caller and this code doesn't need to care about when or why it's set.
| .internal_context("creating a Project")?; | ||
| opctx.authorize(authz::Action::CreateChild, &authz_silo).await?; | ||
|
|
||
| // Determine if we should create a default VPC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is potentially fine but I find it pretty confusing when code behaves differently depending on what permissions you have (aside from failing a permission check, I mean) so I'm just thinking out loud if there's a cleaner way to do this. Some ideas:
- Do this all the time if networking is not restricted to silo admins. Never do this if networking is restricted to silo admins. In this world, the "restrict networking config to silo admins" is more like "silos have two modes: one where networking is configured separately from projects and only silo admins can do it, and one where networking is fully configurable within a project and project admins can do it too".
- Make this an option in the API and either:
- put this logic in the console and/or CLI so it behaves the same way. This sounds the same but I think it's a big difference if the API's behavior is deterministic given its input vs. using other system state to determine what to do.
- make the user choose this (i.e., have a checkbox)
I admit I might be overthinking this. With the behavior as coded, I guess the story is: "this creates a project and, if you have permission to do so, creates the VPC too." I've certainly seen tools that do that, but I usually assume it's a bug that they didn't fail or warn me if they couldn't do some of the stuff they normally do.
| if let Some(policy) = opctx.authn.silo_authn_policy() { | ||
| if policy.restrict_network_actions() { | ||
| // Networking is restricted - only create VPC if user is Silo Admin | ||
| // (i.e., has Modify permission on the Silo) | ||
| opctx | ||
| .authorize(authz::Action::Modify, &authz_silo) | ||
| .await | ||
| .is_ok() | ||
| } else { | ||
| // No networking restrictions, create VPC | ||
| true | ||
| } | ||
| } else { | ||
| // No policy, create VPC | ||
| true | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels like this is all duplicating policy logic that should exist in the Polar file, though it's admittedly tricky to figure out which resource and action we'd want to authorize here. In an ideal world I feel like this would be something like:
let create_default_vpc = opctx.authorize(authz::Action::Create, authz_project.vpc_list()).await.is_ok();That's not quite right -- you don't want to ignore all errors here, only authz errors.
And vpc_list() doesn't exist -- I was imagining we created a per-project synthetic resource which was "this project's list of VPCs" (or "this project's networking config") and that this function returned that.
Speaking of which, I wonder if that approach would simplify a lot of this? If there were a synthetic resource within the project that referred to the networking config, that's where you could put the special Polar rules, and all the other networking resources would have a simple snippet that's like "you have this permission if you have it on the parent project's networking config".
| /// | ||
| /// Returns Err if the silo restricts networking and the actor is not | ||
| /// a Silo Admin. | ||
| pub(crate) async fn check_networking_restrictions( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also feels like it should be expressible as a call to authorize.
This PR implements silo-level networking restrictions that allow administrators to limit networking operations to Silo Admins only. When a silo's
restrict_network_actionsvalue is set to TRUE, only users with Silo Admin privileges cancreate,modify, ordeletenetworking resources (VPCs, subnets, routers, routes, internet gateways, firewall rules, and IP pool address attachments).readandlistoperations remain available to all users with appropriate project-level permissions. If therestrict_network_actionsvalue on the silo isfalse, then normal rules apply, and project collaborators would be able to configure networking resources.So how does it work? A little background will help …
Authorization in Omicron uses the Polar policy engine. Apart from the
createmethod for VPCs — which we'll get into — this does the same. See the end of omicron.polar for these new rules. This required a new Polar snippet for networking resources (look forPolarSnippet::InProjectNetworking). In Polar, permissions stack, and if any of the snippets grant permission to an actor, that actor can execute the requested action. That means we needed to not use the existing (and permissive)InProjectsnippet, and instead build up a snippet (InProjectNetworking) that assumes no permissions unless explicitly granted.This pattern is applied to:
The curious case of VPC Create
There's a problem with the Polar implementation, though. Our existing Polar rules ask a question before granting
createpermissions to a user: "Does this actor havecreate_childpermissions at the parent layer?" Because VPCs are children of projects, the Polar question there is simply saying "can this user (let's assume they're a project collaborator) create 'children' of projects?" "Children" here would be … disks, images, and … VPCs. There's no way to specify in the Polar check that we're asking about creating a specific kind of resource — namely a VPC. So! For the special case of VPCs, we have an app-level Rust check —self.check_networking_restrictions(opctx).await?;to determine if the user has the appropriate permissions, based on the silo'srestrict_networking_actionsvalue.But wait! There's more!
What's the deal with sagas?
When a project is created in Omicron, we have a saga that automatically creates the networking resources that are children of that project. This is fine in a world where all project collaborators have create permissions on networking resources, but can cause problems in the world we're moving into — if a user isn't supposed to have the permission to create networking resources, it'd be a convenient end-run to allow them to show up after project creation.
So … the saga now accepts a
create_default_vpcparameter that controls whether the VPC subsaga runs or not. Before creating the saga, the app layer (nexus/src/app/project.rs) determines this value, based on the silo'srestrict_network_actionsvalue:restrict_network_actionsisFALSEsilos: Always create default VPC (the default / current situation)restrict_network_actionsisTRUEsilos, Silo Admin: Create default VPCrestrict_network_actionsisTRUEsilos, non-admin: Skip default VPC creation entirelyThe saga builder (nexus/src/app/sagas/project_create.rs) conditionally constructs the DAG, omitting the VPC subsaga and all related nodes for non-admins in restricted silos. Since the VPC subsaga creates all child resources (routers, routes, subnets, internet gateway, firewall rules), this single conditional prevents creation of the entire networking stack.
Database changes
API changes
Testing
For the individual resources (
test_vpc_networking_restrictions), the basic pattern here is "set up a silo with therestrict_network_actionsvalue set to true. Then have a user with admin permissions create and update a networking resource, to verify that that's possible. Downgrade them to a project collaborator. Attempt to create the resource (should fail), attempt to modify the admin-created resource (should fail), attempt to delete the admin-created resource (should fail), upgrade to an admin again, and delete the resource (should succeed).There are also some permission matrices run on both a restricted silo (
test_vpc_networking_permissions_restricted) and an unrestricted silo (test_vpc_networking_permissions_unrestricted).Pertains to https://github.com/oxidecomputer/customer-support/issues/416