
Deadlock in SentryContextWrapper with Java 24 Virtual Threads and Spring Boot 3.5 #4872

@randyhbh

Description

We're experiencing a complete application freeze in our Kubernetes environments (both dev and stg) caused by a deadlock in Sentry's OpenTelemetry integration when using Java 24 virtual threads. All HTTP request processing threads become blocked waiting for the same lock in SentryContextWrapper, making the application unresponsive.

Environment

  • Sentry SDK Version: 8.25.0
  • Java Version: Java 24 (OpenJDK 24)
  • Spring Boot Version: 3.5.6
  • Application Server: Tomcat with VirtualThreadExecutor (virtual threads enabled via spring.threads.virtual.enabled=true, supported since Spring Boot 3.2)
  • Deployment: Kubernetes (both dev and stg environments affected)
  • Sentry Integration:
    • Sentry Java Agent (sentry-opentelemetry-agent via -javaagent)
    • Sentry Spring Boot Starter (sentry-spring-boot-jakarta)

Configuration

# application.yml
spring:
  datasource:
    hikari:
      register-mbeans: true
      allow-pool-suspension: true
      leak-detection-threshold: 20000

# Sentry configuration
sentry:
  dsn: "https://[email protected]/4504878114340864"
  sample-rate: 1.0
  traces-sample-rate: 0.2  # Performance tracing at 20% sampling
  environment: ${ENVIRONMENT_NAME}
  send-default-pii: true
  max-request-body-size: always
  logging:
    minimum-event-level: warn
    minimum-breadcrumb-level: info

# Micrometer tracing configuration
management:
  tracing:
    sampling:
      probability: 0.2  # 20% trace sampling

Environment variables:

SENTRY_AUTO_INIT=false
OTEL_LOGS_EXPORTER=none
OTEL_METRICS_EXPORTER=none
OTEL_TRACES_EXPORTER=none

Problem Description

Symptoms

  1. Application becomes completely unresponsive in K8s environments (both dev and stg)
  2. No HTTP requests can be processed
  3. Hikari connection pool appears exhausted (but is actually blocked from starting operations)
  4. Issue does NOT occur in local development (lower concurrency, Sentry disabled)

Root Cause

Multiple threads (Tomcat NIO poller, virtual thread workers, and master poller) are deadlocked waiting for the same ReentrantLock object (<0x00000007fe0d06d0>) in SentryContextWrapper.forkCurrentScopeInternal().

The deadlock occurs when (a simplified sketch follows this list):

  1. Tomcat tries to create a new virtual thread for an HTTP request
  2. Sentry's OpenTelemetry integration tries to fork the current scope
  3. Lock contention occurs in SynchronizedQueue.toArray()
  4. All virtual threads become pinned/blocked waiting for this lock
  5. No new HTTP requests can be processed
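
For illustration, here is a minimal sketch of the contention pattern the stack traces suggest, using simplified stand-ins for the Sentry classes. Only the class and method names taken from the traces are real; the bodies are assumptions, not Sentry's actual implementation. The key point is that every caller funnels through one shared ReentrantLock:

import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.ReentrantLock;

// Simplified stand-in for the path in the dumps:
//   Context.root() -> SentryContextWrapper.forkCurrentScope()
//     -> Scopes.forkedScopes() -> Scope.clone() -> SynchronizedQueue.toArray()
final class ContentionSketch {

    // One lock shared by every caller (plays the role of <0x00000007fe0d06d0>).
    static final ReentrantLock QUEUE_LOCK = new ReentrantLock();
    static final Queue<String> BREADCRUMBS = new ArrayDeque<>();

    // Stand-in for io.sentry.SynchronizedQueue#toArray (SynchronizedQueue.java:148)
    static Object[] toArray() {
        QUEUE_LOCK.lock();              // <- the line all three dumps are parked behind
        try {
            return BREADCRUMBS.toArray();
        } finally {
            QUEUE_LOCK.unlock();
        }
    }

    // Stand-in for Scope.clone() / Scopes.forkedScopes()
    static void forkCurrentScope() {
        Object[] breadcrumbsCopy = toArray(); // copying the queue requires the lock
        // ... build the forked scope from breadcrumbsCopy ...
    }

    public static void main(String[] args) {
        // Each virtual-thread start goes through Context.root() -> forkCurrentScope(),
        // so every new request thread contends on QUEUE_LOCK.
        for (int i = 0; i < 10_000; i++) {
            Thread.ofVirtual().start(ContentionSketch::forkCurrentScope);
        }
    }
}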

Thread Dump Evidence

Blocked Thread #1: Tomcat NIO Poller (line 1048)

"http-nio-8088-Poller" #153 [147] daemon prio=5 os_prio=0 cpu=2798.94ms elapsed=10104.08s tid=0x00007f7ee2cc5430 nid=147 waiting on condition  [0x00007f7fc4ffe000]
   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park(java.base@24/Native Method)
	- parking to wait for  <0x00000007fe0d06d0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
	at java.util.concurrent.locks.LockSupport.park(java.base@24/LockSupport.java:223)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@24/AbstractQueuedSynchronizer.java:789)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@24/AbstractQueuedSynchronizer.java:1029)
	at java.util.concurrent.locks.ReentrantLock$Sync.lock(java.base@24/ReentrantLock.java:154)
	at java.util.concurrent.locks.ReentrantLock.lock(java.base@24/ReentrantLock.java:323)
	at io.sentry.util.AutoClosableReentrantLock.acquire(AutoClosableReentrantLock.java:12)
	at io.sentry.SynchronizedQueue.toArray(SynchronizedQueue.java:148)
	at io.sentry.Scope.<init>(Scope.java:138)
	at io.sentry.Scope.clone(Scope.java:1099)
	at io.sentry.Scopes.forkedScopes(Scopes.java:110)
	at io.sentry.Sentry.forkedRootScopes(Sentry.java:129)
	at io.sentry.opentelemetry.SentryContextWrapper.forkCurrentScopeInternal(SentryContextWrapper.java:75)
	at io.sentry.opentelemetry.SentryContextWrapper.forkCurrentScope(SentryContextWrapper.java:46)
	at io.sentry.opentelemetry.SentryContextWrapper.wrap(SentryContextWrapper.java:94)
	at io.sentry.opentelemetry.SentryContextStorage.root(SentryContextStorage.java:44)
	at io.opentelemetry.javaagent.shaded.io.opentelemetry.context.Context.root(Context.java:105)
	at io.opentelemetry.javaagent.shaded.io.opentelemetry.context.Context.current(Context.java:93)
	at io.opentelemetry.javaagent.bootstrap.Java8BytecodeBridge.currentContext(Java8BytecodeBridge.java:23)
	at java.util.concurrent.ForkJoinPool.execute(java.base@24/ForkJoinPool.java:3100)
	at java.lang.VirtualThread.submitRunContinuation(java.base@24/VirtualThread.java:350)
	at java.lang.VirtualThread.externalSubmitRunContinuationOrThrow(java.base@24/VirtualThread.java:435)
	at java.lang.VirtualThread.start(java.base@24/VirtualThread.java:710)
	at java.lang.VirtualThread.start(java.base@24/VirtualThread.java:721)
	at java.lang.ThreadBuilders$VirtualThreadBuilder.start(java.base@24/ThreadBuilders.java:262)
	at org.apache.tomcat.util.threads.VirtualThreadExecutor.execute(VirtualThreadExecutor.java:52)
	at org.apache.tomcat.util.net.AbstractEndpoint.processSocket(AbstractEndpoint.java:1360)
	at org.apache.tomcat.util.net.NioEndpoint$Poller.processKey(NioEndpoint.java:842)
	at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:809)

Blocked Thread #2: ForkJoinPool Worker (line 1174)

"ForkJoinPool-1-worker-1" #162 [155] daemon prio=5 os_prio=0 cpu=130508.48ms elapsed=10103.54s tid=0x00007f7ee2d45ad0 nid=155 waiting on condition  [0x00007f7fc44fe000]
   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park(java.base@24/Native Method)
	- parking to wait for  <0x00000007fe0d06d0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
	at java.util.concurrent.locks.LockSupport.park(java.base@24/LockSupport.java:223)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@24/AbstractQueuedSynchronizer.java:789)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@24/AbstractQueuedSynchronizer.java:1029)
	at java.util.concurrent.locks.ReentrantLock$Sync.lock(java.base@24/ReentrantLock.java:154)
	at java.util.concurrent.locks.ReentrantLock.lock(java.base@24/ReentrantLock.java:323)
	at io.sentry.util.AutoClosableReentrantLock.acquire(AutoClosableReentrantLock.java:12)
	at io.sentry.SynchronizedQueue.toArray(SynchronizedQueue.java:148)
	at io.sentry.Scope.<init>(Scope.java:138)
	at io.sentry.Scope.clone(Scope.java:1099)
	at io.sentry.Scopes.forkedScopes(Scopes.java:110)
	at io.sentry.Sentry.forkedRootScopes(Sentry.java:129)
	at io.sentry.opentelemetry.SentryContextWrapper.forkCurrentScopeInternal(SentryContextWrapper.java:75)
	at io.sentry.opentelemetry.SentryContextWrapper.forkCurrentScope(SentryContextWrapper.java:46)
	at io.sentry.opentelemetry.SentryContextWrapper.wrap(SentryContextWrapper.java:94)
	at io.sentry.opentelemetry.SentryContextStorage.root(SentryContextStorage.java:44)
	at io.opentelemetry.javaagent.shaded.io.opentelemetry.context.Context.root(Context.java:105)
	at io.opentelemetry.javaagent.bootstrap.executors.ExecutorAdviceHelper.shouldPropagateContext(ExecutorAdviceHelper.java:53)
	at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(java.base@24/ScheduledThreadPoolExecutor.java:543)
	at java.lang.VirtualThread.schedule(java.base@24/VirtualThread.java:1450)
	at java.lang.VirtualThread.afterYield(java.base@24/VirtualThread.java:571)
	at java.lang.VirtualThread.runContinuation(java.base@24/VirtualThread.java:309)

Blocked Thread #3: Master Poller (line 1226)

"MasterPoller" #164 [157] daemon prio=5 os_prio=0 cpu=25987.63ms elapsed=10103.52s tid=0x0000564951d91730 nid=157 waiting on condition  [0x00007f7fc42fe000]
   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park(java.base@24/Native Method)
	- parking to wait for  <0x00000007fe0d06d0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
	at java.util.concurrent.locks.LockSupport.park(java.base@24/LockSupport.java:223)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@24/AbstractQueuedSynchronizer.java:789)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@24/AbstractQueuedSynchronizer.java:1029)
	at java.util.concurrent.locks.ReentrantLock$Sync.lock(java.base@24/ReentrantLock.java:154)
	at java.util.concurrent.locks.ReentrantLock.lock(java.base@24/ReentrantLock.java:323)
	at io.sentry.util.AutoClosableReentrantLock.acquire(AutoClosableReentrantLock.java:12)
	at io.sentry.SynchronizedQueue.toArray(SynchronizedQueue.java:148)
	at io.sentry.Scope.<init>(Scope.java:138)
	at io.sentry.Scope.clone(Scope.java:1099)
	at io.sentry.Scopes.forkedScopes(Scopes.java:110)
	at io.sentry.Sentry.forkedRootScopes(Sentry.java:129)
	at io.sentry.opentelemetry.SentryContextWrapper.forkCurrentScopeInternal(SentryContextWrapper.java:75)

Key Observations

  • Only 1 Hikari thread present: HikariPool-1:housekeeper (maintenance thread)
  • No threads waiting for database connections
  • All blocking occurs at the same lock: <0x00000007fe0d06d0>
  • Bottleneck: SynchronizedQueue.toArray() at line 148

Analysis

Why This Happens

  1. High virtual thread concurrency: Java 24 + Spring Boot 3.5 with spring.threads.virtual.enabled=true runs each HTTP request on its own virtual thread
  2. Performance tracing enabled: With traces-sample-rate: 0.2, 20% of HTTP requests create traces
  3. Lock contention in scope forking: Every virtual thread creation triggers SentryContextWrapper.forkCurrentScopeInternal() for trace context propagation (even for non-traced requests)
  4. Scope cloning bottleneck: Scope.clone() calls SynchronizedQueue.toArray() which acquires a lock
  5. Cascade effect: Under high concurrency, many virtual threads compete for the same lock (a small standalone demo follows this list)
  6. Complete freeze: Even with 20% sampling, the lock contention is severe enough to deadlock the system
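
The cascade can be reproduced in miniature with nothing Sentry-specific at all. The demo below is purely illustrative (the thread count, sleep, and the stalled lock holder are made up), but it produces the same thread-dump shape, with every virtual thread parked on one ReentrantLock:

import java.util.concurrent.locks.ReentrantLock;

// Illustration only: one shared ReentrantLock that every "request" thread must take.
// While the lock is held -- or simply heavily contended -- every new virtual thread
// parks on it, which is the state the dumps show for <0x00000007fe0d06d0>.
public class CascadeDemo {
    public static void main(String[] args) throws InterruptedException {
        ReentrantLock sharedLock = new ReentrantLock();

        sharedLock.lock(); // simulate one holder that cannot make progress
        for (int i = 0; i < 1_000; i++) {
            Thread.ofVirtual().start(() -> {
                sharedLock.lock();   // every new thread parks here, like the poller and workers in the dumps
                try {
                    // ... the "scope fork" work would happen here ...
                } finally {
                    sharedLock.unlock();
                }
            });
        }

        Thread.sleep(2_000);
        // A thread dump taken now (jcmd <pid> Thread.dump_to_file) shows every virtual
        // thread parked on the same ReentrantLock$NonfairSync, mirroring the dumps above.
        sharedLock.unlock(); // releasing the lock lets the backlog drain
    }
}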

Why It Appears as "Connection Pool Exhaustion"

  • Virtual threads never start → No database operations begin
  • Hikari pool shows as "exhausted" because no connections are being used OR released
  • This is a secondary symptom, not the root cause

Reproduction Steps

  1. Deploy Spring Boot 3.5+ application with Sentry 8.25.0 to Kubernetes
  2. Enable both Sentry Java Agent and Spring Boot Starter
  3. Configure:
    • sentry.traces-sample-rate=0.2 (20% performance tracing)
    • management.tracing.sampling.probability=0.2 (Micrometer at 20%)
  4. Run on Java 24 with virtual threads enabled (spring.threads.virtual.enabled=true)
  5. Send moderate concurrent HTTP traffic (10-20 concurrent requests); a sample load generator is sketched after this list
  6. Observe application freeze after 10-30 minutes
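
For step 5, a small load generator along these lines is enough; the endpoint and request counts are placeholders (port 8088 is taken from the thread dump):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

// Illustrative load generator (not part of the application): keeps roughly 20
// requests in flight. Point it at any URL that touches the database so Hikari
// and the tracing path are both exercised.
public class LoadGenerator {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8088/some-endpoint")) // placeholder endpoint
                .GET()
                .build();

        Semaphore inFlight = new Semaphore(20); // ~20 concurrent requests
        try (var pool = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 1_000_000; i++) { // run long enough (10-30 minutes) to hit the freeze
                inFlight.acquire();
                pool.submit(() -> {
                    try {
                        client.send(request, HttpResponse.BodyHandlers.discarding());
                    } catch (Exception ignored) {
                        // connection errors are expected once the application freezes
                    } finally {
                        inFlight.release();
                    }
                });
            }
        }
    }
}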

Note: Both our dev and stg environments with identical configuration (20% sampling) experience this deadlock.

Expected Behavior

Sentry should handle virtual thread scope propagation without lock contention, allowing high concurrency without deadlocks.

Actual Behavior

Application freezes completely. All virtual threads block waiting for the same lock in SentryContextWrapper, preventing any HTTP request processing.

Workarounds Tested

Important: The deadlock occurs even with 20% trace sampling (traces-sample-rate=0.2), indicating the issue is not simply about sampling rate but about fundamental lock contention in virtual thread scope propagation.

Under Investigation:

  1. Further reduce trace sampling - Test with sentry.traces-sample-rate=0.05 (5%)
    • Current 20% sampling still causes the deadlock
    • Testing whether very low sampling avoids the issue (a programmatic sampler sketch follows this list)
  2. Remove the Sentry Java Agent - Keep only the Spring Boot starter (sentry-spring-boot-jakarta)
    • Testing whether the agent's bytecode instrumentation causes the contention
    • May lose some automatic instrumentation features
  3. Remove the sentry-reactor dependency - Already removed; testing in progress
    • Unlikely to help (we use Spring MVC, not WebFlux)
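
For item 1, sampling can also be driven programmatically so the rate can be tuned without redeploying static configuration. A minimal sketch, assuming the Spring Boot starter picks up a SentryOptions.TracesSamplerCallback bean (the 5% rate mirrors the value under test):

import io.sentry.SamplingContext;
import io.sentry.SentryOptions.TracesSamplerCallback;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Sketch only: a traces sampler bean, as an alternative to sentry.traces-sample-rate.
// Assumes the starter auto-detects a TracesSamplerCallback bean; adjust to taste.
@Configuration
public class SentrySamplingConfig {

    @Bean
    public TracesSamplerCallback tracesSampler() {
        return (SamplingContext context) -> {
            // Honor an existing parent decision if present, otherwise sample at 5%.
            Boolean parentSampled = context.getTransactionContext().getParentSampled();
            if (parentSampled != null) {
                return parentSampled ? 1.0 : 0.0;
            }
            return 0.05;
        };
    }
}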

Related Issues

Questions

  1. Is SentryContextWrapper.forkCurrentScopeInternal() optimized for high-concurrency virtual thread scenarios?
  2. Could scope cloning be made lock-free or use more granular locking?
  3. Should the dual setup (Java Agent + Spring Boot Starter) be avoided with virtual threads?

Additional Context

We're using:

  • PostgreSQL with HikariCP (default 10 connections)
  • Elasticsearch for search
  • MongoDB for document storage
  • Spring MVC (not WebFlux)
  • Kubernetes deployment with resource limits
