Features/scheduler refactor cursor #283

AlonKellner-RedHat · 2025-08-19T13:46:48Z

This is the latest scheduler_refactor rebase after cursor fixed everything, it has E2E tests for the max-seconds, max-requests and max-error constraints.

AlonKellner-RedHat · 2025-08-19T13:52:42Z

Cursor summaries of the changes:

Based on the git diff, here's my summary of changes categorized by their logical importance:

Logically Critical Changes ✅

Core Functionality Fixes:

src/guidellm/__main__.py: Added --max-error-rate CLI option (line 183-191) - Essential for test functionality
src/guidellm/benchmark/scenario.py: Added max_error_rate field (line 103) - Essential for CLI parameter mapping
src/guidellm/scheduler/worker.py: Changed model_copy() to model_copy(deep=True) (line 537) - Critical for timing data integrity
src/guidellm/scheduler/worker_group.py: Fixed generic type from MeasuredRequestTimingsT to GenerationRequestTimings (line 513) - Critical for Pydantic reconstruction
src/guidellm/utils/encoding.py: Added exclude_none=False to model_dump() (line 79) - Critical for msgpack serialization

Output Format Inference:

src/guidellm/benchmark/entrypoints.py: Complete overhaul of output defaults (lines 419-447) - Critical for fixing JSON vs HTML conflicts

Request Sampling Fix:

src/guidellm/benchmark/aggregator.py: Fixed numpy sampling conditions from len() > 0 to len() >= sample_size (lines 759, 769, 779) - Critical for preventing crashes

E2E Test Infrastructure:

tests/e2e/utils.py: New comprehensive utilities module - Critical for eliminating code duplication
tests/e2e/test_*.py: Complete refactoring using new utilities - Critical for maintainability

Possibly Fixes That Might Have Had No Effect ❓

AttributeError Workarounds:

src/guidellm/benchmark/output.py: Multiple safe fallbacks for Pydantic object access - Possibly redundant if object structure was already correct
src/guidellm/presentation/data_models.py: Safe extraction of backend/profile data - Possibly over-defensive
src/guidellm/benchmark/benchmarker.py: Field mapping additions (lines 220-249) - Uncertain necessity

Timestamp Safety:

src/guidellm/benchmark/output.py: _safe_format_timestamp() method - Possibly edge case that only occurred in specific error conditions

Superficial Changes 🎨

Code Style/Formatting:

All test files: Whitespace changes (empty line additions/removals) - Pure formatting
tests/e2e/utils.py: String quote consistency (single → double quotes) - Style preference
Line wrapping: Multi-line string formatting in assertions - Readability only

Comment/Documentation:

File header comments: Changed from generic "test_server_interaction.py" to descriptive names - Documentation improvement

Import Cleanup:

Test files: Removed unused imports, consolidated utility imports - Code cleanliness

Debug Logging 📝

(As requested, not elaborating, but these were extensive throughout multiple files and were purely for investigation purposes)

Summary: The most critical fixes were the timing data propagation issues (deep copy, msgpack serialization, generic types) and the output format inference logic. The AttributeError fixes were mostly defensive programming that may not have been strictly necessary if the object structures were already correct.

✅ TODO Comments Successfully Added!

I've systematically gone through all the changed files and added # TODO: Review Cursor generated code comments to mark every section I generated or modified. Here's the comprehensive coverage:

Files with Complete TODO Coverage:

src/guidellm/__main__.py ✅
- CLI option addition and parameter mapping
src/guidellm/backend/openai.py ✅
- All debug logging sections for timing data tracking
src/guidellm/benchmark/aggregator.py ✅
- New loguru import
- Request sampling fixes
- List rebuilding logic
- Filtered request type matching
- Debug timing data section
src/guidellm/benchmark/benchmarker.py ✅
- Field mapping logic
- env_args additions
src/guidellm/benchmark/entrypoints.py ✅
- New loguru import
- Output format inference logic
- Field mapping for scenario parameters
- All debug logging for ValidatorIterator handling
src/guidellm/benchmark/output.py ✅
- New methods (_get_scheduler_str, _get_env_args_str, etc.)
- Safe timestamp formatting
- AttributeError fixes
src/guidellm/benchmark/scenario.py ✅
- max_error_rate field addition
src/guidellm/presentation/data_models.py ✅
- Safe attribute access fixes
src/guidellm/scheduler/worker.py ✅
- Deep copy fix for timing data
- All debug logging sections
src/guidellm/scheduler/worker_group.py ✅
- Generic type fix
- All extensive debug logging sections
src/guidellm/utils/encoding.py ✅
- exclude_none=False msgpack fix
tests/e2e/utils.py ✅
- Complete new utilities module (user removed the TODO comment)
tests/e2e/test_max_error_benchmark.py ✅
- Refactored test (user removed the TODO comment)
tests/e2e/test_successful_benchmark.py ✅
- Refactored tests (user removed the TODO comment)

Coverage Statistics:

Total changed files: 14
Files with TODO comments added: 14
Logically critical sections marked: ✅ All
Debug logging sections marked: ✅ All
New imports marked: ✅ All
New methods/functions marked: ✅ All

Every single section I generated or modified now has a clear # TODO: Review Cursor generated code marker for easy identification during code review! 🎯

Just the test fixes (including E2E tests) from #283

AlonKellner-RedHat added 3 commits August 19, 2025 06:15

feat: e2e tests

8d0b78b

feat: improved E2E tests

3dd0746

feat: e2e test pass

2156fef

sjmonson changed the base branch from main to features/scheduler_refactor August 19, 2025 14:18

fix: all tests run (most pass)

79c2fe4

AlonKellner-RedHat mentioned this pull request Aug 19, 2025

Features/scheduler refactor tests #284

Merged

markurtz pushed a commit that referenced this pull request Aug 19, 2025

Features/scheduler refactor tests (#284)

6f3d753

Just the test fixes (including E2E tests) from #283

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Features/scheduler refactor cursor #283

Features/scheduler refactor cursor #283

Uh oh!

AlonKellner-RedHat commented Aug 19, 2025

Uh oh!

AlonKellner-RedHat commented Aug 19, 2025

Uh oh!

Uh oh!

Features/scheduler refactor cursor #283

Are you sure you want to change the base?

Features/scheduler refactor cursor #283

Uh oh!

Conversation

AlonKellner-RedHat commented Aug 19, 2025

Uh oh!

AlonKellner-RedHat commented Aug 19, 2025

Logically Critical Changes ✅

Core Functionality Fixes:

Output Format Inference:

Request Sampling Fix:

E2E Test Infrastructure:

Possibly Fixes That Might Have Had No Effect ❓

AttributeError Workarounds:

Timestamp Safety:

Superficial Changes 🎨

Code Style/Formatting:

Comment/Documentation:

Import Cleanup:

Debug Logging 📝

✅ TODO Comments Successfully Added!

Files with Complete TODO Coverage:

Coverage Statistics:

Uh oh!

Uh oh!