Improve performance of Job queries #18403

@atownson

Description

NetBox version

v4.2.1

Feature type

Change to existing functionality

Triage priority

I volunteer to perform this work (if approved)

Proposed functionality

If a script's output is relatively large and the script has been run multiple times (see the MOP below), the UI struggles to retrieve the completed script-related jobs efficiently (or sometimes at all), and out-of-memory exceptions are prevalent. This FR proposes optimizing the queries for Jobs by not retrieving the data field unless it is needed. The data field cannot be displayed in the list table and is not needed for the delete form or even the housekeeping task, so omitting it will improve performance while retaining existing functionality.

Here is a MOP and illustration to replicate the underlying issue:

  1. Create a fresh install of NetBox with no database objects.
  2. Add a script that is capable of generating a large output, such as:

```python
from extras.scripts import Script, IntegerVar


class GenerateTestString(Script):

    class Meta:
        name = 'Generate test string'
        description = 'A test script used to generate a string output of the given length'
        scheduling_enabled = True

    # Fields
    length = IntegerVar(
        label='Text Length',
        description='Enter the length of the string generated.',
        min_value=1,
        max_value=1000000000,
    )

    def run(self, data, commit):
        return '0' * data['length']
```

  3. Run the script roughly 60 times (either manually or by scheduling it) with the length field set to 100,000,000, which will generate an output string of 100 million characters.
  4. Test the performance of the Job list view and attempt to bulk delete the Jobs.
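
The steps above imply a memory footprint that explains the out-of-memory behavior. A back-of-envelope sketch (assuming CPython string overhead and ignoring database and serialization costs; these figures are illustrative, not measured NetBox usage):

```python
# Rough estimate of the memory needed to materialize all 60 job rows at
# once, as an unoptimized query would.
import sys

output = '0' * 100_000_000        # one job's 100M-character output
per_job = sys.getsizeof(output)   # ~100 MB per job in memory
total = per_job * 60              # all 60 jobs loaded by one query
print(f'{total / 10**9:.1f} GB')  # roughly 6.0 GB for the data field alone
```

Loading roughly 6 GB of job output just to render a list view (which never displays the data field) is what drives the failures in the illustration below.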

Here's an illustration using the above MOP and the results:

[Screenshot: netbox_job_performance]

The fix would likely use the defer() method, as in Job.objects.defer('data'), rather than the only() method with the list of included fields shown in the illustration.
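
For intuition, the deferred-loading pattern that defer() implements can be sketched in plain Python. The class and counter below are illustrative stand-ins, not NetBox or Django code: the heavy field is skipped in the initial load and fetched (then cached) only on first access.

```python
class DeferredJob:
    """Stand-in for a model row loaded with the heavy `data` field deferred."""

    def __init__(self, pk, name):
        self.pk = pk
        self.name = name
        self.deferred_loads = 0  # counts simulated follow-up queries

    def __getattr__(self, attr):
        # Called only when normal attribute lookup fails, i.e. for the
        # deferred field. A real ORM would run a second query here,
        # e.g. SELECT data FROM job WHERE id = <pk>.
        if attr == 'data':
            self.deferred_loads += 1
            self.data = '0' * 100  # cache the (stand-in) payload
            return self.data
        raise AttributeError(attr)

job = DeferredJob(1, 'Generate test string')
print(job.name)            # cheap fields were loaded up front
print(job.deferred_loads)  # 0 -- `data` has not been fetched
print(len(job.data))       # 100: first access triggers the extra fetch
print(job.deferred_loads)  # 1 -- and the value is now cached
```

This is why deferring is a win for the list and delete views, which never touch the data field, while any code that does read it still works unchanged, at the cost of one extra query per row.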

I also believe these large jobs can hinder the completion of the housekeeping task that purges the jobs, allowing the job count to grow without limit.
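
One way the purge could sidestep the same problem is to delete by primary key in batches, so the large data column is never read at all. A generic sketch of that pattern (the function names and batch size here are hypothetical, not the actual NetBox housekeeping code):

```python
def purge_in_batches(fetch_expired_pks, delete_by_pks, batch_size=1000):
    """Delete rows in pk-only batches; the payload column is never loaded.

    fetch_expired_pks(limit) -> list of up to `limit` expired pks
    delete_by_pks(pks)       -> deletes the rows with those pks
    """
    deleted = 0
    while True:
        pks = fetch_expired_pks(batch_size)
        if not pks:
            return deleted
        delete_by_pks(pks)
        deleted += len(pks)

# Toy usage against an in-memory "table" of 25 rows
rows = {pk: '0' * 10 for pk in range(25)}
purged = purge_in_batches(
    lambda limit: list(rows)[:limit],
    lambda pks: [rows.pop(pk) for pk in pks],
    batch_size=10,
)
print(purged, len(rows))  # 25 0
```

With Django's ORM the pk-only fetch would correspond to something like values_list('pk', flat=True), keeping each batch's memory bounded regardless of how large the job outputs are.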

Use case

Our environment uses a few scheduled scripts to provide data output to API integrations that combine models (such as Devices and Virtual Machines) and traverse relationships (such as Prefixes with their associated IP Ranges, and IP Addresses with their associated Interfaces and MAC Addresses). The output of these scripts is inherently larger than in typical script use cases. Granted, the GraphQL API solves for this use case to some extent, but the script output is highly customizable, and there are other open issues related to the functionality and performance of the GraphQL API. By optimizing the Job queries, these completed script jobs could be interacted with more efficiently.

Database changes

No changes to the database design

External dependencies

None

Metadata

Labels

  - complexity: low (requires minimal effort to implement)
  - status: accepted (this issue has been accepted for implementation)
  - type: feature (introduction of new functionality to the application)
