Skip to content

Conversation

@JinRiYao2001
Copy link
Contributor

@JinRiYao2001 JinRiYao2001 commented Feb 9, 2025

Description

This PR fixes a critical scheduling issue in django-celery-beat when a crontab-type task is configured with a start_time set to a future time.

Problem Analysis

  • When initializing the scheduler's heap (self._heap), tasks are sorted by their event_t.time field.
  • For crontab tasks with start_time > now, the initial event_t.time was incorrectly set to start_time instead of the next valid execution time based on the crontab schedule.
  • This caused the task to persistently occupy the heap's top position (since start_time was the smallest future timestamp), blocking other tasks from being executed until the start_time was reached.

Solution

  • Modified the event_t.time generation logic during heap initialization for crontab tasks with future start_time values.
  • Instead of directly using start_time, the event_t.time is now calculated as the next valid execution time derived from the crontab schedule relative to start_time.
    • Example: If a task is set to run every hour (0 * * * *) with start_time=2024-01-01 12:00:00 (future time), the initial event_t.time will be 2024-01-01 12:00:00 (first valid run) instead of the current time.
  • Ensures the heap prioritizes tasks based on their actual next execution time rather than start_time, preventing heap blockage.

Changes

  • Enhanced the is_due method in schedulers.py to:
    • Add special handling for tasks with non-null start_time

Expected Behavior

  • Tasks with future start_time will no longer block the scheduler heap.
  • The heap now correctly prioritizes tasks based on their dynamically calculated next execution time, allowing other interval/crontab tasks to execute as expected.

fixes #843

@JinRiYao2001
Copy link
Contributor Author

Changes:

  • Modified the delay calculation logic in is_due()
  • Now, delay is calculated based on start_time, now, and schedule.remaining_estimate()
  • This ensures a more accurate computation of the next execution time

Why is this change needed?

  • Previously, delay was calculated as start_time - now, which is not always accurate
  • The new logic ensures that the delay aligns with the scheduler's expected execution time

Impact:

  • The unit test test_task_with_start_time may fail due to the updated delay calculation
  • The test still assumes delay = start_time - now, which does not match the new logic
  • It is necessary to update the test to align with the schedule.remaining_estimate() calculation

Suggested Solution:

  • I propose updating test_task_with_start_time to validate the new delay calculation logic

Question

  • Should I submit an updated unit test along with this change?

@Nusnus Nusnus marked this pull request as draft February 9, 2025 16:46
@Nusnus
Copy link
Member

Nusnus commented Feb 9, 2025

Changed to draft until the CI passes

@JinRiYao2001
Copy link
Contributor Author

I'll take the time to fix this code

@JinRiYao2001
Copy link
Contributor Author

I found that self.schedule.remaining_estimate cannot calculate the next execution time based on start_time, but this part of the code is implemented in Celery. Therefore, I should modify Celery's code and implement a method in Celery's code to calculate the next execution time for each task.

After implementing the corresponding method, I will call it within django-celery-beat.

@JinRiYao2001
Copy link
Contributor Author

JinRiYao2001 commented Feb 10, 2025

I created a method for the PeriodicTask model to calculate the accurate start_time, and it calls is_due to calculate the delay when the task is crontab.

This way, I don't need to modify the Celery source code, and I fixed this issue in django-celery-beat. I'm not sure if this kind of modification follows the standard practices?

@JinRiYao2001
Copy link
Contributor Author

I ran the unit tests for django-celery-beat, and they pass on my Mac but fail on Windows. I would like to use the official CI to run the tests.

@codecov
Copy link

codecov bot commented Feb 10, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.87%. Comparing base (c34fdcd) to head (f5aea43).
Report is 15 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #844      +/-   ##
==========================================
+ Coverage   87.42%   87.87%   +0.45%     
==========================================
  Files          32       32              
  Lines         954      965      +11     
  Branches       76       77       +1     
==========================================
+ Hits          834      848      +14     
+ Misses        102      100       -2     
+ Partials       18       17       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@JinRiYao2001
Copy link
Contributor Author

I discovered a bug in my code while trying to write unit tests. I will fix it later and conduct the corresponding tests.

@auvipy auvipy self-requested a review February 11, 2025 09:50
@JinRiYao2001
Copy link
Contributor Author

JinRiYao2001 commented Feb 11, 2025

In the process of supplementing the test cases, I found that my code still has some time zone issues, which I will modify tomorrow.

@JinRiYao2001
Copy link
Contributor Author

I think I solved the time zone issue.

@JinRiYao2001 JinRiYao2001 marked this pull request as ready for review February 11, 2025 18:06
@Nusnus Nusnus requested a review from cclauss February 11, 2025 18:23
@JinRiYao2001 JinRiYao2001 requested a review from auvipy February 12, 2025 10:59
@JinRiYao2001 JinRiYao2001 requested a review from cclauss February 13, 2025 08:24
@auvipy
Copy link
Member

auvipy commented Mar 11, 2025

I will re review it tomorrow

Copy link
Member

@auvipy auvipy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would you mind updating the PR description which fits the latest changes and finding from your work, please? also do you thin this should work on python 3.8?

@JinRiYao2001
Copy link
Contributor Author

@auvipy
I've updated the PR description accordingly and reviewed the compatibility concerns and confirmed that my code currently only uses ZoneInfo in test cases. I'll address this by implementing a fix to ensure full compatibility with Python 3.8.

@auvipy
Copy link
Member

auvipy commented Mar 29, 2025

thanks! that's neat!

@auvipy auvipy requested review from auvipy and Copilot and removed request for cclauss March 29, 2025 08:38
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a critical scheduling issue in django-celery-beat by ensuring that crontab tasks with a future start_time initialize with the next valid execution time instead of the raw start_time. Key changes include:

  • Updating the is_due logic to compute delay using due_start_time based on the crontab schedule.
  • Introducing new helper methods in models.py for calculating the due start time.
  • Adding tests in t/unit/test_schedulers.py to verify correct behavior for different start_time scenarios and time zones.

Reviewed Changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 1 comment.

File Description
t/unit/test_schedulers.py New tests covering crontab tasks with future start_time and time zone handling
django_celery_beat/schedulers.py Updated is_due method to compute delay using due_start_time
django_celery_beat/models.py Added helper methods for calculating due start time
Files not reviewed (1)
  • requirements/test.txt: Language not supported
Comments suppressed due to low confidence (2)

django_celery_beat/schedulers.py:119

  • [nitpick] Consider renaming 'now_tz' to 'current_tz' for clarity, better reflecting that it represents the current timezone from now.tzinfo.
now_tz = now.tzinfo

django_celery_beat/models.py:393

  • [nitpick] Consider renaming the parameter 'start_time' to 'initial_start_time' to more clearly indicate its role in calculating the next valid execution time.
def due_start_time(self, start_time, tz):

@auvipy
Copy link
Member

auvipy commented Mar 29, 2025

django_celery_beat/schedulers.py:121:80: E501 line too long (81 > 79 characters)

@auvipy
Copy link
Member

auvipy commented Mar 29, 2025

related #867

@auvipy auvipy self-requested a review March 29, 2025 08:59
@auvipy
Copy link
Member

auvipy commented Mar 29, 2025

Thanks for incorporating new pytest timeout features

@auvipy auvipy merged commit c454c65 into celery:main Mar 30, 2025
28 checks passed
@auvipy
Copy link
Member

auvipy commented Apr 21, 2025

hey, we are having regression report related to this pr. can you please check?

@JinRiYao2001
Copy link
Contributor Author

@auvipy
Thank you for the feedback. The issue may be related to my PR, but due to a heavy workload, I’m unable to investigate thoroughly at the moment. I’ll look into it as soon as my schedule allows.

@auvipy
Copy link
Member

auvipy commented Apr 21, 2025

Thanks

Azurency added a commit to Azurency/django-celery-beat that referenced this pull request Apr 22, 2025
Azurency added a commit to Azurency/django-celery-beat that referenced this pull request Apr 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Celery Task with start_time Set in the Future Causes Blocking of Other Tasks

4 participants