Skip to content

gunicorn spends 80% of each request in get_object_permissions() #7399

@heroin-moose

Description

@heroin-moose

NetBox version

v3.0.3

Python version

3.9

Steps to Reproduce

  1. Configure Netbox with AD authentication (a few thousands users + heavy nesting)
  2. Create a bunch of devices (~500 servers in my case, ~500 cables, ~50 VLANs, ~500 IP addresses, ~40 prefixes, ~40 racks in 3 sites)
  3. Try to access https://netbox.example.com/dcim/devices/1234
  4. See that gunicorn process is suddenly eats 100% CPU and the request takes more than a second

Expected Behavior

The request should be completed way faster.

Observed Behavior

For me the situtation looks like this: the HTTP request takes ~1.6s to complete and ~1.3s of that request was spent in authentication.py:get_object_permissions(). I am terrible with Python, but I came up with a patch:

@@ -11,6 +11,9 @@ from django.db.models import Q
 from users.models import ObjectPermission
 from utilities.permissions import permission_is_exempt, resolve_permission, resolve_permission_ct

+import time
+import syslog
+
 UserModel = get_user_model()


@@ -38,11 +42,18 @@ class ObjectPermissionMixin():

         # Create a dictionary mapping permissions to their constraints
         perms = defaultdict(list)
+        i = 0
+        start = time.time()
         for obj_perm in object_permissions:
             for object_type in obj_perm.object_types.all():
                 for action in obj_perm.actions:
                     perm_name = f"{object_type.app_label}.{action}_{object_type.model}"
                     perms[perm_name].extend(obj_perm.list_constraints())
+                    syslog.syslog(perm_name)
+                    i += 1
+        end = time.time()
+        t = end - start
+        syslog.syslog("took " + str(t) + " seconds with " + str(i) + " iterations")

         return perms

The output:

Sep 29 22:55:18 netbox.example.com gunicorn[78629]: took 1.325146198272705 seconds with 10397 iterations

Wow. That's a lot. Okay, more dirty patching:

@@ -24,7 +27,7 @@ class ObjectPermissionMixin():
         return user_obj._object_perm_cache

     def get_permission_filter(self, user_obj):
-        return Q(users=user_obj) | Q(groups__user=user_obj)
+        return Q(users=user_obj)

     def get_object_permissions(self, user_obj):
         """

And now I have this:

Sep 29 22:57:51 netbox.example.com gunicorn[78889]: took 0.17675495147705078 seconds with 1126 iterations

Still a lot (the request itself took ~400ms), but a lot better.

I'm not familiar with Netbox internals at all, so I'm not sure where to start digging.

Metadata

Metadata

Assignees

No one assigned

    Labels

    status: under reviewFurther discussion is needed to determine this issue's scope and/or implementationtype: bugA confirmed report of unexpected behavior in the application

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions