Description
NetBox version
v4.2.5
Feature type
New functionality
Proposed functionality
Extend the handle() method of the base JobRunner class to confirm that a Job exists in the database before attempting to call start().
If the Job does not exist yet, handle() should retry several times, waiting progressively longer between attempts (e.g. 0.5 seconds, 1 second, 2 seconds).
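A minimal sketch of the retry logic, independent of NetBox internals. The helper name `wait_for_job`, its parameters, and the default delays are hypothetical and only illustrate the idea; how the check would actually be wired into handle() is left open.

```python
import time
from typing import Callable, Iterable


def wait_for_job(
    job_exists: Callable[[], bool],
    delays: Iterable[float] = (0.5, 1.0, 2.0),
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Hypothetical helper: poll job_exists(), backing off between attempts.

    Returns True as soon as job_exists() reports the Job is present, or
    False once every delay has been used without success.
    """
    # Check immediately: in the common case the Job is already committed.
    if job_exists():
        return True
    # Otherwise wait through each delay in turn and check again.
    for delay in delays:
        sleep(delay)
        if job_exists():
            return True
    return False
```

In handle(), `job_exists` could plausibly be something like `lambda: Job.objects.filter(pk=job.pk).exists()` (an assumption, not confirmed by this issue), with start() called only when the helper returns True and an error raised or logged otherwise.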
Use case
When a background RQ task is enqueued, a serialized representation of a Job object is stored with it. When the task executes, it does so with the assumption that its associated Job already exists in the database. However, this may not hold true due to various race conditions, such as the one captured by netboxlabs/netbox-branching#193. (I've also seen this happen with data source syncing.)
Implementing a short delay mitigates the race condition that occurs between enqueuing a background task in Redis and committing the PostgreSQL transaction within which its corresponding Job object was created.
Database changes
N/A
External dependencies
N/A