The RQ worker should verify that a job exists in the database before attempting to start it #18880

@jeremystretch

NetBox version

v4.2.5

Feature type

New functionality

Proposed functionality

Extend the handle() method of the base JobRunner class to confirm that a Job exists in the database before attempting to call start().

If the Job does not exist yet, the worker should retry several times, waiting progressively longer between attempts (e.g. 0.5 seconds, 1 second, 2 seconds).
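To illustrate the retry behavior described above, here is a minimal sketch in plain Python. It is not taken from the NetBox codebase: `wait_for_job`, `job_exists`, and the `sleep` parameter are hypothetical names chosen for this example. In a real worker, `job_exists` would presumably wrap a database existence check for the Job; that wiring is an assumption and is left out here.

```python
import time
from typing import Callable, Iterable


def wait_for_job(
    job_exists: Callable[[], bool],
    delays: Iterable[float] = (0.5, 1.0, 2.0),
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once job_exists() succeeds, or False if all retries fail.

    Hypothetical helper: the delays mirror the 0.5 s / 1 s / 2 s example
    from the proposal. `sleep` is injectable so the demo runs instantly.
    """
    # Check once up front so the common case incurs no delay at all.
    if job_exists():
        return True
    # Otherwise wait, then re-check, with a longer pause on each attempt.
    for delay in delays:
        sleep(delay)
        if job_exists():
            return True
    return False


# Simulated lookup: the Job only becomes visible on the third check,
# as if the creating transaction committed shortly after enqueuing.
checks = iter([False, False, True])
waited = []  # records the requested delays instead of really sleeping
found = wait_for_job(lambda: next(checks), sleep=waited.append)
print(found, waited)
```

In this sketch the caller decides what to do when the function returns False (for example, logging an error and abandoning the task); the proposal does not specify that behavior.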

Use case

When a background RQ task is enqueued, a serialized representation of a Job object is stored with it. When the task executes, it does so with the assumption that its associated Job already exists in the database. However, this may not hold true due to various race conditions, such as the one captured by netboxlabs/netbox-branching#193. (I've also seen this happen with data source syncing.)

Implementing a short delay mitigates the race condition that occurs between enqueuing a background task in Redis and committing the PostgreSQL transaction within which its corresponding Job object was created.

Database changes

N/A

External dependencies

N/A

Labels

complexity: medium (Requires a substantial but not unusual amount of effort to implement)
status: accepted (This issue has been accepted for implementation)
type: feature (Introduction of new functionality to the application)
