-
-
Notifications
You must be signed in to change notification settings - Fork 836
feat(webapp): completing spans server-side no longer write-after-read, improving efficiency and perf #2530
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🦋 Changeset detectedLatest commit: ea71d05 The changes in this PR will be included in the next version bump. This PR includes changesets to release 23 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
WalkthroughReworks event finalization to be run-centric: replaces a single completeEvent with multiple specialized EventRepository methods (successful, cached, failed, expired, attempt-failed, cancel) and introduces CompleteableTaskRun and attemptNumber in event types. Updates run engine handlers and many services to pre-fetch TaskRun records and call the new methods (many calls wrapped with tryCatch). Adjusts event merging/ancestor override logic in eventRepository. Adds run.traceId exposure in presenters and admin UI, introduces cachedRunId on cachedRunCompleted events, adds AttemptFailedSpanEvent to OpenTelemetry schemas, tweaks idempotency spanId computation, and removes an AbortSignal argument from startActiveSpan calls. Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Rationale: Broad, heterogeneous changes touching persistence/event merging, public APIs/types, engine handlers, multiple services, internal engine events, and cross-package schemas — requires end-to-end reasoning and consistency checks. Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/webapp/app/v3/services/crashTaskRun.server.ts (1)
34-37
: Unreachable crash finalization due to conflicting returnsYou return early when overrideCompletion is true (Lines 34–37) and also return early when it's false (Lines 84–86). This makes the crash finalization/event recording block unreachable in all cases.
Apply this diff to keep the flow working while deprecating the flag:
- if (options?.overrideCompletion) { - logger.error("CrashTaskRunService.call: overrideCompletion is deprecated", { runId }); - return; - } + if (options?.overrideCompletion) { + logger.warn("CrashTaskRunService.call: overrideCompletion is deprecated (ignored)", { runId }); + // continue; don't return + } ... - if (!opts.overrideCompletion) { - return; - } + // Continue with crash finalization regardless of overrideCompletionAlso applies to: 84-86
🧹 Nitpick comments (6)
apps/webapp/app/v3/services/crashTaskRun.server.ts (1)
124-133
: Normalize exception payload via helper for consistent event schemaUse createExceptionPropertiesFromError(sanitizeError(...)) to match other paths and avoid leaking internal code as the exception “type”.
+import { createExceptionPropertiesFromError } from "../eventRepository.server"; ... - const [createAttemptFailedEventError] = await tryCatch( - eventRepository.completeFailedRunEvent({ - run: crashedTaskRun, - endTime: opts.crashedAt, - exception: { - type: opts.errorCode ?? TaskRunErrorCodes.TASK_RUN_CRASHED, - message: opts.reason, - stacktrace: opts.logs, - }, - }) - ); + const [createAttemptFailedEventError] = await tryCatch( + eventRepository.completeFailedRunEvent({ + run: crashedTaskRun, + endTime: opts.crashedAt ?? new Date(), + exception: createExceptionPropertiesFromError( + sanitizeError({ + type: "INTERNAL_ERROR", + code: opts.errorCode ?? TaskRunErrorCodes.TASK_RUN_CRASHED, + message: opts.reason, + stackTrace: opts.logs, + }) + ), + }) + );apps/webapp/app/v3/services/cancelTaskRunV1.server.ts (1)
104-121
: Guard optional projectRef when emitting cancel eventproject.externalRef may be null/undefined depending on migration state. Guard it or let the repository infer from run to avoid runtime type errors.
- projectRef: cancelledTaskRun.project.externalRef, + projectRef: cancelledTaskRun.project?.externalRef ?? undefined,apps/webapp/app/v3/runEngineHandlers.server.ts (2)
22-43
: Prefer a shared taskRun select to avoid driftThe same select shape is duplicated across handlers. Extract a const TASK_RUN_SELECT to ensure field changes stay consistent.
185-201
: Validate span.id format assumptionsSplitting span.id on ":" assumes "parent:child". Add a lightweight helper or type guard to avoid subtle bugs if the delimiter or format changes.
Example:
function parseSpanId(id: string): { parentSpanId: string; spanId: string } | null { const parts = id.split(":"); return parts.length === 2 && parts[0] && parts[1] ? { parentSpanId: parts[0], spanId: parts[1] } : null; }apps/webapp/app/v3/eventRepository.server.ts (2)
345-394
: Clarify the cached run event semantics.The
completeCachedRunEvent
method takes bothrun
andblockedRun
parameters, but usesblockedRun.traceId
andblockedRun.friendlyId
while usingrun.taskIdentifier
. This mixing of data from two different runs might be confusing. Consider adding a comment explaining this pattern.+ /** + * Records a cached run event in the context of a blocked run's trace. + * @param run - The original run that was cached + * @param blockedRun - The run that was blocked and whose trace should contain the cached event + * @param spanId - The span ID to use for the cached event + * @param parentSpanId - The parent span ID for proper trace hierarchy + * @param spanCreatedAt - When the span was originally created + * @param isError - Whether the cached run had an error + */ async completeCachedRunEvent({
2290-2294
: Consider extracting common duration calculation.The
calculateDurationFromStart
function is used throughout the new methods. While it's already a utility function, its repeated usage pattern with date conversion suggests it could benefit from a higher-level abstraction.Consider creating a helper that encapsulates the common pattern:
function calculateRunDuration(run: CompleteableTaskRun, endTime?: Date): number { const startTime = convertDateToNanoseconds(run.createdAt); return calculateDurationFromStart(startTime, endTime ?? new Date()); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
references/hello-world/src/trigger/example.ts
is excluded by!references/**
references/hello-world/src/trigger/idempotency.ts
is excluded by!references/**
📒 Files selected for processing (15)
apps/webapp/app/presenters/v3/SpanPresenter.server.ts
(1 hunks)apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam.spans.$spanParam/route.tsx
(1 hunks)apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts
(1 hunks)apps/webapp/app/v3/eventRepository.server.ts
(21 hunks)apps/webapp/app/v3/runEngineHandlers.server.ts
(2 hunks)apps/webapp/app/v3/services/cancelAttempt.server.ts
(0 hunks)apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
(4 hunks)apps/webapp/app/v3/services/completeAttempt.server.ts
(4 hunks)apps/webapp/app/v3/services/crashTaskRun.server.ts
(2 hunks)apps/webapp/app/v3/services/expireEnqueuedRun.server.ts
(2 hunks)apps/webapp/app/v3/taskEventStore.server.ts
(6 hunks)internal-packages/run-engine/src/engine/eventBus.ts
(1 hunks)internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
(1 hunks)packages/core/src/v3/schemas/openTelemetry.ts
(3 hunks)packages/core/src/v3/workers/taskExecutor.ts
(1 hunks)
💤 Files with no reviewable changes (1)
- apps/webapp/app/v3/services/cancelAttempt.server.ts
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}
: Always prefer using isomorphic code like fetch, ReadableStream, etc. instead of Node.js specific code
For TypeScript, we usually use types over interfaces
Avoid enums
No default exports, use function declarations
Files:
internal-packages/run-engine/src/engine/eventBus.ts
apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts
apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam.spans.$spanParam/route.tsx
packages/core/src/v3/workers/taskExecutor.ts
internal-packages/run-engine/src/engine/systems/waitpointSystem.ts
apps/webapp/app/v3/taskEventStore.server.ts
apps/webapp/app/presenters/v3/SpanPresenter.server.ts
packages/core/src/v3/schemas/openTelemetry.ts
apps/webapp/app/v3/services/expireEnqueuedRun.server.ts
apps/webapp/app/v3/services/crashTaskRun.server.ts
apps/webapp/app/v3/services/completeAttempt.server.ts
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
apps/webapp/app/v3/runEngineHandlers.server.ts
apps/webapp/app/v3/eventRepository.server.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
We use zod a lot in packages/core and in the webapp
Files:
apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts
apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam.spans.$spanParam/route.tsx
packages/core/src/v3/workers/taskExecutor.ts
apps/webapp/app/v3/taskEventStore.server.ts
apps/webapp/app/presenters/v3/SpanPresenter.server.ts
packages/core/src/v3/schemas/openTelemetry.ts
apps/webapp/app/v3/services/expireEnqueuedRun.server.ts
apps/webapp/app/v3/services/crashTaskRun.server.ts
apps/webapp/app/v3/services/completeAttempt.server.ts
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
apps/webapp/app/v3/runEngineHandlers.server.ts
apps/webapp/app/v3/eventRepository.server.ts
apps/webapp/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
When importing from @trigger.dev/core in the webapp, never import the root package path; always use one of the documented subpath exports from @trigger.dev/core’s package.json
Files:
apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts
apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam.spans.$spanParam/route.tsx
apps/webapp/app/v3/taskEventStore.server.ts
apps/webapp/app/presenters/v3/SpanPresenter.server.ts
apps/webapp/app/v3/services/expireEnqueuedRun.server.ts
apps/webapp/app/v3/services/crashTaskRun.server.ts
apps/webapp/app/v3/services/completeAttempt.server.ts
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
apps/webapp/app/v3/runEngineHandlers.server.ts
apps/webapp/app/v3/eventRepository.server.ts
{apps/webapp/app/**/*.server.{ts,tsx},apps/webapp/app/routes/**/*.ts}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
Access environment variables only via the env export from app/env.server.ts; do not reference process.env directly
Files:
apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts
apps/webapp/app/v3/taskEventStore.server.ts
apps/webapp/app/presenters/v3/SpanPresenter.server.ts
apps/webapp/app/v3/services/expireEnqueuedRun.server.ts
apps/webapp/app/v3/services/crashTaskRun.server.ts
apps/webapp/app/v3/services/completeAttempt.server.ts
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
apps/webapp/app/v3/runEngineHandlers.server.ts
apps/webapp/app/v3/eventRepository.server.ts
apps/webapp/app/**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
Modules intended for test consumption under apps/webapp/app/**/*.ts must not read environment variables; accept configuration via options instead
Files:
apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts
apps/webapp/app/v3/taskEventStore.server.ts
apps/webapp/app/presenters/v3/SpanPresenter.server.ts
apps/webapp/app/v3/services/expireEnqueuedRun.server.ts
apps/webapp/app/v3/services/crashTaskRun.server.ts
apps/webapp/app/v3/services/completeAttempt.server.ts
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
apps/webapp/app/v3/runEngineHandlers.server.ts
apps/webapp/app/v3/eventRepository.server.ts
🧠 Learnings (5)
📚 Learning: 2025-07-18T17:49:24.468Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-18T17:49:24.468Z
Learning: Applies to internal-packages/database/**/*.{ts,tsx} : We use prisma in internal-packages/database for our database interactions using PostgreSQL
Applied to files:
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Import Trigger.dev APIs from "trigger.dev/sdk/v3" when writing tasks or related utilities
Applied to files:
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
apps/webapp/app/v3/eventRepository.server.ts
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use schemaTask({ schema, run, ... }) to validate payloads when input validation is required
Applied to files:
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Define tasks using task({ id, run, ... }) with a unique id per project
Applied to files:
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
apps/webapp/app/v3/eventRepository.server.ts
📚 Learning: 2025-08-29T10:06:49.293Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-08-29T10:06:49.293Z
Learning: Prefer Run Engine 2.0 via internal/run-engine; avoid extending legacy run engine code
Applied to files:
apps/webapp/app/v3/runEngineHandlers.server.ts
🧬 Code graph analysis (9)
apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam.spans.$spanParam/route.tsx (1)
apps/webapp/app/services/runsReplicationService.server.ts (2)
run
(779-834)run
(836-844)
packages/core/src/v3/workers/taskExecutor.ts (3)
apps/webapp/app/presenters/v3/SpanPresenter.server.ts (1)
traceContext
(601-621)apps/webapp/app/runEngine/services/triggerTask.server.ts (1)
traceContext
(386-430)packages/core/src/v3/trace-context-api.ts (1)
traceContext
(5-5)
apps/webapp/app/presenters/v3/SpanPresenter.server.ts (1)
apps/webapp/app/services/runsReplicationService.server.ts (2)
run
(779-834)run
(836-844)
packages/core/src/v3/schemas/openTelemetry.ts (1)
apps/webapp/app/v3/eventRepository.server.ts (1)
event
(1305-1324)
apps/webapp/app/v3/services/expireEnqueuedRun.server.ts (1)
apps/webapp/app/v3/eventRepository.server.ts (1)
eventRepository
(1769-1769)
apps/webapp/app/v3/services/crashTaskRun.server.ts (2)
apps/webapp/app/v3/eventRepository.server.ts (1)
eventRepository
(1769-1769)packages/core/src/v3/schemas/common.ts (2)
TaskRunErrorCodes
(197-197)TaskRunErrorCodes
(198-198)
apps/webapp/app/v3/services/completeAttempt.server.ts (3)
packages/core/src/v3/tryCatch.ts (1)
tryCatch
(8-15)apps/webapp/app/v3/eventRepository.server.ts (2)
eventRepository
(1769-1769)createExceptionPropertiesFromError
(1879-1908)apps/webapp/app/services/logger.server.ts (1)
logger
(49-59)
apps/webapp/app/v3/runEngineHandlers.server.ts (4)
apps/webapp/app/v3/runEngine.server.ts (1)
engine
(9-9)apps/webapp/app/db.server.ts (1)
$replica
(103-106)apps/webapp/app/v3/eventRepository.server.ts (2)
eventRepository
(1769-1769)createExceptionPropertiesFromError
(1879-1908)packages/core/src/v3/errors.ts (2)
sanitizeError
(252-283)createJsonErrorObject
(222-249)
apps/webapp/app/v3/eventRepository.server.ts (4)
packages/core/src/v3/schemas/common.ts (2)
TaskRun
(209-234)TaskRun
(236-236)apps/webapp/app/utils/taskEvent.ts (2)
calculateDurationFromStart
(408-412)getDateFromNanoseconds
(432-434)packages/core/src/v3/schemas/openTelemetry.ts (8)
AttemptFailedSpanEvent
(31-39)AttemptFailedSpanEvent
(41-41)SpanEvent
(51-56)SpanEvent
(58-58)SpanEvents
(60-60)SpanEvents
(62-62)ExceptionSpanEvent
(11-17)ExceptionSpanEvent
(19-19)apps/webapp/app/components/runs/v3/SpanEvents.tsx (1)
SpanEvents
(20-34)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (22)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: typecheck / typecheck
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (30)
internal-packages/run-engine/src/engine/eventBus.ts (1)
290-290
: LGTM! Clean extension of event payload.The addition of the optional
cachedRunId
field to thecachedRunCompleted
event payload is well-structured and follows TypeScript best practices. This change aligns with the broader PR objective to provide better traceability for cached run completion events.internal-packages/run-engine/src/engine/systems/waitpointSystem.ts (1)
163-163
: LGTM! Consistent implementation of the new event payload.The addition of
cachedRunId: waitpoint.completedByTaskRunId ?? undefined
correctly populates the new optional field introduced in the event bus. The logic is sound: if the waitpoint was completed by a task run, include that ID; otherwise, leave it undefined.apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam.spans.$spanParam/route.tsx (1)
775-782
: LGTM! Proper admin-only display of tracing information.The addition of Trace ID and Span ID fields to the admin section is well-implemented:
- Correctly gated behind
isAdmin
check for security- Properly placed within the existing admin section structure
- Uses consistent
Property.Item
pattern matching the surrounding code- Aligns with the PR's goal to improve observability and tracing capabilities
The fields rely on
run.traceId
andrun.spanId
which are populated by the backend changes in this PR.apps/webapp/app/presenters/v3/SpanPresenter.server.ts (1)
254-254
: LGTM! Essential field for trace correlation.Adding
traceId
to the run payload is a crucial enhancement that enables:
- Cross-component trace correlation
- Better observability in the UI (as shown in the route file)
- Alignment with the PR's shift to more comprehensive run-centric event handling
This change is consistent with the existing pattern of exposing tracing metadata and supports the broader goal of improving efficiency and traceability.
packages/core/src/v3/schemas/openTelemetry.ts (3)
31-41
: LGTM! Well-structured addition of attempt failure telemetry.The
AttemptFailedSpanEvent
schema is properly designed:
- Uses consistent naming convention with
attempt_failed
literal- Includes all necessary context fields (
exception
,attemptNumber
,runId
)- Follows the existing pattern established by other span events
- Proper type inference and export
This supports the PR's goal of more granular attempt-level observability.
51-56
: LGTM! Proper integration into the SpanEvent union.The extension of the
SpanEvent
union to includeAttemptFailedSpanEvent
maintains type safety and follows the existing pattern. The order is logical, placing it between the other specific event types and the genericOtherSpanEvent
.
72-74
: LGTM! Clean type guard implementation.The
isAttemptFailedSpanEvent
type guard follows the established pattern and provides proper type narrowing for the new event type.apps/webapp/app/v3/taskEventStore.server.ts (3)
24-24
: LGTM! Consistent addition of attemptNumber to trace event types.The addition of
attemptNumber
to bothTraceEvent
andDetailedTraceEvent
types is correctly implemented and maintains consistency across the type definitions.Also applies to: 51-51
193-195
: LGTM! Proper database field selection for partitioned events.The inclusion of
attemptNumber
in the SELECT clauses for both partitioned and non-partitioned queries ensures that the new field is properly retrieved from the database and matches the updated type definitions.Also applies to: 226-228
280-282
: LGTM! Consistent field selection in detailed trace queries.The addition of
attemptNumber
to the detailed trace event queries maintains consistency with the simpler trace queries and ensures all necessary attempt context is available for detailed analysis.Also applies to: 319-321
apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts (2)
93-98
: LGTM! Improved span ID computation for waitpoint completion.The enhanced logic for computing the correct
spanId
for completion is well-thought-out:
- Replay case: Uses
event.spanId
directly whenparentAsLinkType === "replay"
- Traceparent case: Concatenates
traceparent.spanId
andevent.spanId
for hierarchical span relationships- Default case: Falls back to
event.spanId
This change aligns with the PR's goal of improving span completion accuracy and supports the shift to more granular event handling.
104-104
: LGTM! Consistent usage of computed spanId.The replacement of
event.spanId
with the computedspanId
variable ensures that the correct span is targeted for completion in the waitpoint blocking mechanism.packages/core/src/v3/workers/taskExecutor.ts (1)
355-355
: Verify removal ofsignal
from startActiveSpanLocal search produced no matches for
startActiveSpan
/its implementation — cannot verify impact. Confirm the removal does not change span lifecycle or cancellation by doing the following:
- Locate the
startActiveSpan
implementation (e.g.,TriggerTracer.startActiveSpan
) and confirm its signature/callers no longer expect an AbortSignal; update callers and tests if needed.- Ensure AbortSignal-driven cancellation still closes/ends spans — either forward the signal into tracer span creation options or handle abort inside the span callback.
- Run these checks locally: rg -n '\bstartActiveSpan\b' -S; rg -n 'TriggerTracer' -S; rg -n '\bstartSpan\b' -S --type=ts; inspect tests referencing signals.
apps/webapp/app/v3/services/completeAttempt.server.ts (3)
168-181
: Good: event completion errors are contained and loggedWrapping eventRepository.completeSuccessfulRunEvent in tryCatch decouples persistence failures from run finalization. This reduces blast radius.
319-333
: Good: failed run event uses normalized exception mappingUsing createExceptionPropertiesFromError(sanitizedError) yields consistent event shapes and avoids leaking raw internals.
36-36
: Verify tryCatch import subpathpackages/core/package.json exports include "./utils" (line 25), so '@trigger.dev/core/utils' is a valid subpath; no re-export/definition of tryCatch was found in packages/core — confirm tryCatch is exported from that subpath or switch the import to the documented export.
apps/webapp/app/v3/runEngineHandlers.server.ts (1)
287-337
: LGTM: Expiry handler correctly bails when ttl is missingMatches the intended “no read-before-write” flow and avoids false expirations.
apps/webapp/app/v3/eventRepository.server.ts (13)
5-5
: LGTM! Good addition for attempt-failed span events.The import of
AttemptFailedSpanEvent
aligns with the new functionality for handling failed attempts in run completions.
24-24
: Well-designed type for run completion context.The
CompleteableTaskRun
type is a well-chosen subset ofTaskRun
fields that provides all the necessary context for span completion without requiring the full entity. This supports the PR's goal of avoiding unnecessary reads.Also applies to: 57-71
156-156
: Good addition ofattemptNumber
to queried events.Adding
attemptNumber
to the selected fields enables proper tracking of attempt-specific events, which is essential for the new attempt-failed functionality.
502-556
: Excellent use of UNSPECIFIED kind for invisible events.The
createAttemptFailedRunEvent
method cleverly useskind: "UNSPECIFIED"
to create invisible events that get merged with the main span. The use of thesatisfies
operator on line 546 ensures type safety for the AttemptFailedSpanEvent structure.
688-695
: Consistent and correct merging of UNSPECIFIED events.The handling of UNSPECIFIED events in both
getTraceSummary
andgetTraceDetailedSummary
is consistent and correctly concatenates events from invisible spans into their parent spans.Also applies to: 810-818
714-730
: Well-implemented ancestor override logic.The application of ancestor overrides in both summary methods properly handles cancelled and error states from parent spans, ensuring that partial spans inherit the correct final state.
Also applies to: 847-863
960-962
: Good filtering of UNSPECIFIED events in getRunEvents.Properly filters out UNSPECIFIED events from the run events list, as these are only meant to be merged into their parent spans.
1273-1283
: Smart event merging strategy.The dual merging approach - first merging partial with final events, then applying overrides - ensures that all event data is preserved while maintaining the correct final state. The helper functions properly handle empty fields and concatenate events.
Also applies to: 1290-1324
1990-2115
: Comprehensive ancestor traversal logic.The ancestor traversal functions (
getAncestorOverrides
,findCancelledAncestor
,findAttemptFailedAncestor
) properly walk the span hierarchy to determine the final state of partial spans. The logic correctly prioritizes cancellation over error states.
2304-2306
: Simple helper for date to nanoseconds conversion.The
convertDateToNanoseconds
helper properly converts dates to nanoseconds for consistency with the startTime format.
2563-2577
: Useful empty-checking utilities.The
isEmpty
andisEmptyObject
helper functions properly check for various empty states including null, undefined, and empty objects, which is important for the merge logic.
1088-1144
: Complex but correct attempt-failed event handling — confirm attemptNumber is populated.attemptNumber exists in packages/core/src/v3/schemas/messages.ts and runEngine.ts but is optional/nullish; ensure the 'attempt_failed' span events and the compared event have a non-null attemptNumber or add a defensive guard before comparing to avoid missed matches.
310-343
: Verify span ID propagation for distributed tracing.Search only found an INSERT writing spanIdToComplete (internal-packages/run-engine/src/engine/systems/waitpointSystem.ts); I couldn't find where TaskRun.spanId or run.parentSpanId are assigned — ensure TaskRun creation or its caller populates spanId and parentSpanId so trace continuity is preserved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts (2)
104-110
: Good: cancel event write without pre-reads; minor polish on tryCatch usage.If tryCatch returns [err, result], consider capturing result for metrics/debug or comment why it’s intentionally ignored.
- const [cancelRunEventError] = await tryCatch( + const [cancelRunEventError, _cancelResult] = await tryCatch( eventRepository.cancelRunEvent({ reason: opts.reason, run: cancelledTaskRun, cancelledAt: opts.cancelledAt, }) );
112-117
: Enhance error context for triage.Include environment and reason to speed up on-call debugging.
- logger.error("[CancelTaskRunServiceV1] Failed to cancel run event", { - error: cancelRunEventError, - runId: cancelledTaskRun.id, - }); + logger.error("[CancelTaskRunServiceV1] Failed to cancel run event", { + error: cancelRunEventError, + runId: cancelledTaskRun.id, + envType: cancelledTaskRun.runtimeEnvironment?.type, + reason: opts.reason, + cancelledAt: opts.cancelledAt?.toISOString(), + });
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
(4 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}
: Always prefer using isomorphic code like fetch, ReadableStream, etc. instead of Node.js specific code
For TypeScript, we usually use types over interfaces
Avoid enums
No default exports, use function declarations
Files:
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
We use zod a lot in packages/core and in the webapp
Files:
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
apps/webapp/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
When importing from @trigger.dev/core in the webapp, never import the root package path; always use one of the documented subpath exports from @trigger.dev/core’s package.json
Files:
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
{apps/webapp/app/**/*.server.{ts,tsx},apps/webapp/app/routes/**/*.ts}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
Access environment variables only via the env export from app/env.server.ts; do not reference process.env directly
Files:
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
apps/webapp/app/**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
Modules intended for test consumption under apps/webapp/app/**/*.ts must not read environment variables; accept configuration via options instead
Files:
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
🧠 Learnings (4)
📚 Learning: 2025-07-18T17:49:24.468Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-18T17:49:24.468Z
Learning: Applies to internal-packages/database/**/*.{ts,tsx} : We use prisma in internal-packages/database for our database interactions using PostgreSQL
Applied to files:
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Import Trigger.dev APIs from "trigger.dev/sdk/v3" when writing tasks or related utilities
Applied to files:
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use schemaTask({ schema, run, ... }) to validate payloads when input validation is required
Applied to files:
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Define tasks using task({ id, run, ... }) with a unique id per project
Applied to files:
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: typecheck / typecheck
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (3)
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts (3)
1-1
: Type-only Prisma import is correct.Reduces runtime bundle and avoids accidental value import.
12-13
: Verified — './utils' exported and tryCatch accepts a Promise; no change required.packages/core/package.json exports include "./utils". tryCatch is declared to accept a Promise (packages/core/src/utils.ts and packages/core/src/v3/tryCatch.ts), so keep the current await tryCatch(eventRepository.cancelRunEvent(...)) usage (do NOT wrap in a thunk).
95-96
: Cannot verify — eventRepository.server.ts missing; confirm if include.project is required
- File: apps/webapp/app/v3/services/cancelTaskRunV1.server.ts (lines 95–96) contains project: true.
- Could not locate apps/webapp/app/v3/services/eventRepository.server.ts to confirm whether cancelRunEvent reads run.project — verify whether cancelRunEvent accesses run.project; if it does not, remove the project include to avoid an extra join.
- Quick checks to run:
- rg -nP 'cancelRunEvent\s*(' apps/webapp -C3
- rg -nP '\brun.project\b' apps/webapp -n -C2
- Inspect apps/webapp/app/v3/services/finalizeTaskRun.server.ts to confirm FinalizeTaskRunService.call supports include.project.
...anizationSlug.projects.$projectParam.env.$envParam.runs.$runParam.spans.$spanParam/route.tsx
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/webapp/app/v3/eventRepository.server.ts (1)
569-607
: MissingattemptNumber
incancelRunEvent
.This method is also missing the
attemptNumber
field.await this.insertImmediate({ message: run.taskIdentifier, serviceName: "api server", serviceNamespace: "trigger.dev", level: "TRACE", kind: "SERVER", traceId: run.traceId, spanId: run.spanId, parentId: run.parentSpanId, runId: run.friendlyId, taskSlug: run.taskIdentifier, projectRef: "", projectId: run.projectId, environmentId: run.runtimeEnvironmentId, environmentType: run.environmentType ?? "DEVELOPMENT", organizationId: run.organizationId ?? "", isPartial: false, isError: true, isCancelled: true, status: "ERROR", runIsTest: run.isTest, events: [ { name: "cancellation", time: cancelledAt, properties: { reason, }, }, ], startTime, properties: {}, metadata: undefined, style: undefined, duration: calculateDurationFromStart(startTime, cancelledAt), output: undefined, payload: undefined, payloadType: undefined, + attemptNumber: undefined, // or get from run if available });
🧹 Nitpick comments (2)
apps/webapp/app/v3/eventRepository.server.ts (2)
2207-2209
: Make the helper function more robust against edge cases.The
convertDateToNanoseconds
helper should handle potential edge cases.function convertDateToNanoseconds(date: Date) { + if (!date || !(date instanceof Date) || isNaN(date.getTime())) { + throw new Error("Invalid date provided to convertDateToNanoseconds"); + } return BigInt(date.getTime()) * BigInt(1_000_000); }
2466-2480
: Optimize empty object check and improve code consistency.The current implementation uses a for-in loop which can be inefficient and doesn't follow the existing code style patterns in the codebase.
function isEmptyObject(obj: object) { - for (var prop in obj) { - if (Object.prototype.hasOwnProperty.call(obj, prop)) { - return false; - } - } - - return true; + return Object.keys(obj).length === 0; }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
apps/webapp/app/v3/eventRepository.server.ts
(21 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}
: Always prefer using isomorphic code like fetch, ReadableStream, etc. instead of Node.js specific code
For TypeScript, we usually use types over interfaces
Avoid enums
No default exports, use function declarations
Files:
apps/webapp/app/v3/eventRepository.server.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
We use zod a lot in packages/core and in the webapp
Files:
apps/webapp/app/v3/eventRepository.server.ts
apps/webapp/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
When importing from @trigger.dev/core in the webapp, never import the root package path; always use one of the documented subpath exports from @trigger.dev/core’s package.json
Files:
apps/webapp/app/v3/eventRepository.server.ts
{apps/webapp/app/**/*.server.{ts,tsx},apps/webapp/app/routes/**/*.ts}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
Access environment variables only via the env export from app/env.server.ts; do not reference process.env directly
Files:
apps/webapp/app/v3/eventRepository.server.ts
apps/webapp/app/**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
Modules intended for test consumption under apps/webapp/app/**/*.ts must not read environment variables; accept configuration via options instead
Files:
apps/webapp/app/v3/eventRepository.server.ts
🧠 Learnings (2)
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Import Trigger.dev APIs from "trigger.dev/sdk/v3" when writing tasks or related utilities
Applied to files:
apps/webapp/app/v3/eventRepository.server.ts
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Define tasks using task({ id, run, ... }) with a unique id per project
Applied to files:
apps/webapp/app/v3/eventRepository.server.ts
🧬 Code graph analysis (1)
apps/webapp/app/v3/eventRepository.server.ts (3)
packages/core/src/v3/schemas/common.ts (2)
TaskRun
(209-234)TaskRun
(236-236)apps/webapp/app/utils/taskEvent.ts (2)
calculateDurationFromStart
(408-412)getDateFromNanoseconds
(432-434)packages/core/src/v3/schemas/openTelemetry.ts (8)
AttemptFailedSpanEvent
(31-39)AttemptFailedSpanEvent
(41-41)SpanEvent
(51-56)SpanEvent
(58-58)SpanEvents
(60-60)SpanEvents
(62-62)ExceptionSpanEvent
(11-17)ExceptionSpanEvent
(19-19)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (22)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: typecheck / typecheck
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
🔇 Additional comments (5)
apps/webapp/app/v3/eventRepository.server.ts (5)
515-555
: LGTM! Proper use ofattemptNumber
increateAttemptFailedRunEvent
.This method correctly includes the
attemptNumber
field in both the event creation and the AttemptFailedSpanEvent properties. The use ofUNSPECIFIED
kind for invisible events is also appropriate.
689-695
: Clear and well-documented handling of UNSPECIFIED events.The logic properly merges invisible events by concatenating their event arrays while keeping the original event data. The comment clearly explains the intention.
1121-1144
: Good implementation of attempt failure handling in ancestry walk.The logic correctly identifies attempt-failed events that match the current attempt number and creates appropriate exception events from them.
1290-1303
: Well-designed merge strategy for partial and final events.The
#mergePartialWithFinalEvent
method implements an intelligent merge strategy that uses the final event as the base while preserving meaningful data from the partial event when the final event has missing/empty fields.
24-24
: Keep TaskRun import — it's used and safe.TaskRun is required to build CompleteableTaskRun in apps/webapp/app/v3/eventRepository.server.ts; internal-packages/database re-exports the Prisma-generated types (internal-packages/database/src/index.ts -> ../generated/prisma) and does not import core/webapp, so this import does not create a circular dependency.
Issues attributed to commits in this pull requestThis pull request was merged and Sentry observed the following issues:
|
This change will prevent inefficient queries finding “in progress” span events to close server-side, instead we’ve now taken an approach that doesn’t not requiring task event reads before doing writes to close spans. This effects both v3 and v4, and is a platform-only change. This should lead to much more efficient cancellation, WAY fewer queries reading the TaskEvent/Partitioned tables, and lays the ground work for moving to ClickHouse for storing this data.