@zhengyu123
Contributor

@zhengyu123 zhengyu123 commented Oct 16, 2025

What Does This Do

Eagerly initializes java.nio on the main thread to avoid a race that may crash the JVM.

Motivation

Improve stability.

Additional Notes

This is a workaround for an upstream JDK bug: https://bugs.openjdk.org/browse/JDK-8345810

Contributor Checklist

Jira ticket: https://datadoghq.atlassian.net/browse/PROF-12749

@datadog-datadog-prod-us1
Contributor

datadog-datadog-prod-us1 bot commented Oct 16, 2025

🎯 Code Coverage
Patch Coverage: 0.00%
Total Coverage: 38.88% (-20.94%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: c96d579

@pr-commenter

pr-commenter bot commented Oct 16, 2025

Benchmarks

Startup

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master PROF-12749
git_commit_date 1760943758 1760968422
git_commit_sha 92a857d c96d579
release_version 1.55.0-SNAPSHOT~92a857db10 1.55.0-SNAPSHOT~c96d5798e2
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1760970585 1760970585
ci_job_id 1187479743 1187479743
ci_pipeline_id 79777501 79777501
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-0-iz56aw0m 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-0-iz56aw0m 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module Agent Agent
parent None None

Summary

Found 0 performance improvements and 1 performance regressions! Performance is the same for 59 metrics, 5 unstable metrics.

| scenario | Δ mean execution_time | candidate mean execution_time | baseline mean execution_time |
|---|---|---|---|
| scenario:startup:petclinic:tracing:Debugger | worse [+143.237µs; +311.826µs] or [+2.286%; +4.977%] | 6.493ms | 6.266ms |

Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.55.0-SNAPSHOT~c96d5798e2, baseline=1.55.0-SNAPSHOT~92a857db10

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.02 s) : 0, 1019973
Total [baseline] (10.719 s) : 0, 10718636
Agent [candidate] (1.017 s) : 0, 1017226
Total [candidate] (10.786 s) : 0, 10786442
section appsec
Agent [baseline] (1.194 s) : 0, 1194234
Total [baseline] (10.788 s) : 0, 10787823
Agent [candidate] (1.198 s) : 0, 1197621
Total [candidate] (11.062 s) : 0, 11062415
section iast
Agent [baseline] (1.156 s) : 0, 1156250
Total [baseline] (11.16 s) : 0, 11159547
Agent [candidate] (1.149 s) : 0, 1149055
Total [candidate] (11.029 s) : 0, 11028752
section profiling
Agent [baseline] (1.163 s) : 0, 1163376
Total [baseline] (10.841 s) : 0, 10840856
Agent [candidate] (1.16 s) : 0, 1160109
Total [candidate] (11.06 s) : 0, 11060161
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.02 s -
Agent appsec 1.194 s 174.261 ms (17.1%)
Agent iast 1.156 s 136.277 ms (13.4%)
Agent profiling 1.163 s 143.403 ms (14.1%)
Total tracing 10.719 s -
Total appsec 10.788 s 69.187 ms (0.6%)
Total iast 11.16 s 440.911 ms (4.1%)
Total profiling 10.841 s 122.22 ms (1.1%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.017 s -
Agent appsec 1.198 s 180.395 ms (17.7%)
Agent iast 1.149 s 131.829 ms (13.0%)
Agent profiling 1.16 s 142.883 ms (14.0%)
Total tracing 10.786 s -
Total appsec 11.062 s 275.973 ms (2.6%)
Total iast 11.029 s 242.31 ms (2.2%)
Total profiling 11.06 s 273.719 ms (2.5%)
gantt
    title petclinic - break down per module: candidate=1.55.0-SNAPSHOT~c96d5798e2, baseline=1.55.0-SNAPSHOT~92a857db10

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.454 ms) : 0, 1454
crashtracking [candidate] (1.46 ms) : 0, 1460
BytebuddyAgent [baseline] (695.632 ms) : 0, 695632
BytebuddyAgent [candidate] (692.101 ms) : 0, 692101
GlobalTracer [baseline] (243.241 ms) : 0, 243241
GlobalTracer [candidate] (242.145 ms) : 0, 242145
AppSec [baseline] (32.176 ms) : 0, 32176
AppSec [candidate] (32.585 ms) : 0, 32585
Debugger [baseline] (6.266 ms) : 0, 6266
Debugger [candidate] (6.493 ms) : 0, 6493
Remote Config [baseline] (676.361 µs) : 0, 676
Remote Config [candidate] (694.949 µs) : 0, 695
Telemetry [baseline] (9.227 ms) : 0, 9227
Telemetry [candidate] (9.361 ms) : 0, 9361
Flare Poller [baseline] (10.124 ms) : 0, 10124
Flare Poller [candidate] (11.25 ms) : 0, 11250
section appsec
crashtracking [baseline] (1.48 ms) : 0, 1480
crashtracking [candidate] (1.468 ms) : 0, 1468
BytebuddyAgent [baseline] (718.384 ms) : 0, 718384
BytebuddyAgent [candidate] (720.37 ms) : 0, 720370
GlobalTracer [baseline] (234.681 ms) : 0, 234681
GlobalTracer [candidate] (235.35 ms) : 0, 235350
AppSec [baseline] (174.715 ms) : 0, 174715
AppSec [candidate] (175.259 ms) : 0, 175259
Debugger [baseline] (6.096 ms) : 0, 6096
Debugger [candidate] (6.191 ms) : 0, 6191
Remote Config [baseline] (647.368 µs) : 0, 647
Remote Config [candidate] (628.305 µs) : 0, 628
Telemetry [baseline] (8.565 ms) : 0, 8565
Telemetry [candidate] (8.428 ms) : 0, 8428
Flare Poller [baseline] (3.847 ms) : 0, 3847
Flare Poller [candidate] (3.801 ms) : 0, 3801
IAST [baseline] (24.669 ms) : 0, 24669
IAST [candidate] (24.997 ms) : 0, 24997
section iast
crashtracking [baseline] (1.464 ms) : 0, 1464
crashtracking [candidate] (1.469 ms) : 0, 1469
BytebuddyAgent [baseline] (818.594 ms) : 0, 818594
BytebuddyAgent [candidate] (814.028 ms) : 0, 814028
GlobalTracer [baseline] (232.157 ms) : 0, 232157
GlobalTracer [candidate] (231.073 ms) : 0, 231073
AppSec [baseline] (35.468 ms) : 0, 35468
AppSec [candidate] (35.161 ms) : 0, 35161
Debugger [baseline] (6.21 ms) : 0, 6210
Debugger [candidate] (6.139 ms) : 0, 6139
Remote Config [baseline] (616.19 µs) : 0, 616
Remote Config [candidate] (604.001 µs) : 0, 604
Telemetry [baseline] (8.858 ms) : 0, 8858
Telemetry [candidate] (8.63 ms) : 0, 8630
Flare Poller [baseline] (4.258 ms) : 0, 4258
Flare Poller [candidate] (4.199 ms) : 0, 4199
IAST [baseline] (27.058 ms) : 0, 27058
IAST [candidate] (26.249 ms) : 0, 26249
section profiling
ProfilingAgent [baseline] (109.194 ms) : 0, 109194
ProfilingAgent [candidate] (107.757 ms) : 0, 107757
crashtracking [baseline] (1.468 ms) : 0, 1468
crashtracking [candidate] (1.426 ms) : 0, 1426
BytebuddyAgent [baseline] (720.536 ms) : 0, 720536
BytebuddyAgent [candidate] (720.324 ms) : 0, 720324
GlobalTracer [baseline] (218.745 ms) : 0, 218745
GlobalTracer [candidate] (217.916 ms) : 0, 217916
AppSec [baseline] (32.224 ms) : 0, 32224
AppSec [candidate] (32.354 ms) : 0, 32354
Debugger [baseline] (6.675 ms) : 0, 6675
Debugger [candidate] (6.529 ms) : 0, 6529
Remote Config [baseline] (723.004 µs) : 0, 723
Remote Config [candidate] (770.011 µs) : 0, 770
Telemetry [baseline] (15.225 ms) : 0, 15225
Telemetry [candidate] (15.845 ms) : 0, 15845
Flare Poller [baseline] (4.912 ms) : 0, 4912
Flare Poller [candidate] (4.089 ms) : 0, 4089
Profiling [baseline] (109.812 ms) : 0, 109812
Profiling [candidate] (108.887 ms) : 0, 108887
Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.55.0-SNAPSHOT~c96d5798e2, baseline=1.55.0-SNAPSHOT~92a857db10

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.027 s) : 0, 1026684
Total [baseline] (8.694 s) : 0, 8694468
Agent [candidate] (1.016 s) : 0, 1015546
Total [candidate] (8.652 s) : 0, 8652309
section iast
Agent [baseline] (1.154 s) : 0, 1153575
Total [baseline] (9.289 s) : 0, 9289323
Agent [candidate] (1.149 s) : 0, 1149308
Total [candidate] (9.286 s) : 0, 9286406
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.027 s -
Agent iast 1.154 s 126.89 ms (12.4%)
Total tracing 8.694 s -
Total iast 9.289 s 594.854 ms (6.8%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.016 s -
Agent iast 1.149 s 133.761 ms (13.2%)
Total tracing 8.652 s -
Total iast 9.286 s 634.097 ms (7.3%)
gantt
    title insecure-bank - break down per module: candidate=1.55.0-SNAPSHOT~c96d5798e2, baseline=1.55.0-SNAPSHOT~92a857db10

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.461 ms) : 0, 1461
crashtracking [candidate] (1.452 ms) : 0, 1452
BytebuddyAgent [baseline] (699.42 ms) : 0, 699420
BytebuddyAgent [candidate] (691.358 ms) : 0, 691358
GlobalTracer [baseline] (244.594 ms) : 0, 244594
GlobalTracer [candidate] (241.444 ms) : 0, 241444
AppSec [baseline] (32.649 ms) : 0, 32649
AppSec [candidate] (32.298 ms) : 0, 32298
Debugger [baseline] (6.377 ms) : 0, 6377
Debugger [candidate] (6.419 ms) : 0, 6419
Remote Config [baseline] (685.501 µs) : 0, 686
Remote Config [candidate] (693.666 µs) : 0, 694
Telemetry [baseline] (9.313 ms) : 0, 9313
Telemetry [candidate] (9.249 ms) : 0, 9249
Flare Poller [baseline] (10.93 ms) : 0, 10930
Flare Poller [candidate] (11.556 ms) : 0, 11556
section iast
crashtracking [baseline] (1.506 ms) : 0, 1506
crashtracking [candidate] (1.476 ms) : 0, 1476
BytebuddyAgent [baseline] (816.873 ms) : 0, 816873
BytebuddyAgent [candidate] (814.244 ms) : 0, 814244
GlobalTracer [baseline] (231.904 ms) : 0, 231904
GlobalTracer [candidate] (231.07 ms) : 0, 231070
AppSec [baseline] (35.417 ms) : 0, 35417
AppSec [candidate] (35.019 ms) : 0, 35019
Debugger [baseline] (6.087 ms) : 0, 6087
Debugger [candidate] (6.108 ms) : 0, 6108
Remote Config [baseline] (602.293 µs) : 0, 602
Remote Config [candidate] (607.498 µs) : 0, 607
Telemetry [baseline] (8.691 ms) : 0, 8691
Telemetry [candidate] (8.696 ms) : 0, 8696
Flare Poller [baseline] (4.306 ms) : 0, 4306
Flare Poller [candidate] (4.228 ms) : 0, 4228
IAST [baseline] (26.752 ms) : 0, 26752
IAST [candidate] (26.379 ms) : 0, 26379

Load

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master PROF-12749
git_commit_date 1760943758 1760968422
git_commit_sha 92a857d c96d579
release_version 1.55.0-SNAPSHOT~92a857db10 1.55.0-SNAPSHOT~c96d5798e2
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1760970256 1760970256
ci_job_id 1187479746 1187479746
ci_pipeline_id 79777501 79777501
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-1-d8j6o8e6 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-1-d8j6o8e6 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 2 performance regressions! Performance is the same for 10 metrics, 12 unstable metrics.

| scenario | Δ mean http_req_duration | Δ mean throughput | candidate mean http_req_duration | candidate mean throughput | baseline mean http_req_duration | baseline mean throughput |
|---|---|---|---|---|---|---|
| scenario:load:insecure-bank:iast_FULL:high_load | worse [+383.283µs; +1019.865µs] or [+2.690%; +7.157%] | unstable [-48.503op/s; +17.941op/s] or [-14.833%; +5.486%] | 14.952ms | 311.719op/s | 14.250ms | 327.000op/s |
| scenario:load:petclinic:tracing:high_load | worse [+0.985ms; +1.823ms] or [+2.195%; +4.063%] | unstable [-10.247op/s; +3.947op/s] or [-9.824%; +3.784%] | 46.271ms | 101.150op/s | 44.867ms | 104.300op/s |

Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.55.0-SNAPSHOT~c96d5798e2, baseline=1.55.0-SNAPSHOT~92a857db10
    dateFormat X
    axisFormat %s
section baseline
no_agent (4.352 ms) : 4302, 4402
.   : milestone, 4352,
iast (9.639 ms) : 9468, 9810
.   : milestone, 9639,
iast_FULL (14.25 ms) : 13961, 14540
.   : milestone, 14250,
iast_GLOBAL (10.53 ms) : 10345, 10715
.   : milestone, 10530,
profiling (8.859 ms) : 8723, 8994
.   : milestone, 8859,
tracing (8.09 ms) : 7967, 8212
.   : milestone, 8090,
section candidate
no_agent (4.403 ms) : 4353, 4453
.   : milestone, 4403,
iast (9.374 ms) : 9221, 9527
.   : milestone, 9374,
iast_FULL (14.952 ms) : 14651, 15254
.   : milestone, 14952,
iast_GLOBAL (10.804 ms) : 10608, 10999
.   : milestone, 10804,
profiling (8.676 ms) : 8532, 8820
.   : milestone, 8676,
tracing (7.857 ms) : 7743, 7972
.   : milestone, 7857,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 4.352 ms [4.302 ms, 4.402 ms] -
iast 9.639 ms [9.468 ms, 9.81 ms] 5.287 ms (121.5%)
iast_FULL 14.25 ms [13.961 ms, 14.54 ms] 9.898 ms (227.4%)
iast_GLOBAL 10.53 ms [10.345 ms, 10.715 ms] 6.178 ms (142.0%)
profiling 8.859 ms [8.723 ms, 8.994 ms] 4.506 ms (103.5%)
tracing 8.09 ms [7.967 ms, 8.212 ms] 3.738 ms (85.9%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 4.403 ms [4.353 ms, 4.453 ms] -
iast 9.374 ms [9.221 ms, 9.527 ms] 4.971 ms (112.9%)
iast_FULL 14.952 ms [14.651 ms, 15.254 ms] 10.549 ms (239.6%)
iast_GLOBAL 10.804 ms [10.608 ms, 10.999 ms] 6.401 ms (145.4%)
profiling 8.676 ms [8.532 ms, 8.82 ms] 4.273 ms (97.0%)
tracing 7.857 ms [7.743 ms, 7.972 ms] 3.454 ms (78.5%)
Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.55.0-SNAPSHOT~c96d5798e2, baseline=1.55.0-SNAPSHOT~92a857db10
    dateFormat X
    axisFormat %s
section baseline
no_agent (37.95 ms) : 37640, 38261
.   : milestone, 37950,
appsec (49.439 ms) : 49009, 49869
.   : milestone, 49439,
code_origins (44.505 ms) : 44120, 44890
.   : milestone, 44505,
iast (45.252 ms) : 44849, 45655
.   : milestone, 45252,
profiling (48.791 ms) : 48364, 49218
.   : milestone, 48791,
tracing (44.867 ms) : 44487, 45248
.   : milestone, 44867,
section candidate
no_agent (37.238 ms) : 36946, 37529
.   : milestone, 37238,
appsec (49.068 ms) : 48640, 49497
.   : milestone, 49068,
code_origins (44.008 ms) : 43622, 44395
.   : milestone, 44008,
iast (45.095 ms) : 44699, 45491
.   : milestone, 45095,
profiling (48.625 ms) : 48172, 49078
.   : milestone, 48625,
tracing (46.271 ms) : 45873, 46670
.   : milestone, 46271,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 37.95 ms [37.64 ms, 38.261 ms] -
appsec 49.439 ms [49.009 ms, 49.869 ms] 11.489 ms (30.3%)
code_origins 44.505 ms [44.12 ms, 44.89 ms] 6.555 ms (17.3%)
iast 45.252 ms [44.849 ms, 45.655 ms] 7.302 ms (19.2%)
profiling 48.791 ms [48.364 ms, 49.218 ms] 10.841 ms (28.6%)
tracing 44.867 ms [44.487 ms, 45.248 ms] 6.917 ms (18.2%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 37.238 ms [36.946 ms, 37.529 ms] -
appsec 49.068 ms [48.64 ms, 49.497 ms] 11.83 ms (31.8%)
code_origins 44.008 ms [43.622 ms, 44.395 ms] 6.771 ms (18.2%)
iast 45.095 ms [44.699 ms, 45.491 ms] 7.857 ms (21.1%)
profiling 48.625 ms [48.172 ms, 49.078 ms] 11.387 ms (30.6%)
tracing 46.271 ms [45.873 ms, 46.67 ms] 9.034 ms (24.3%)

Dacapo

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master PROF-12749
git_commit_date 1760943758 1760968422
git_commit_sha 92a857d c96d579
release_version 1.55.0-SNAPSHOT~92a857db10 1.55.0-SNAPSHOT~c96d5798e2
See matching parameters
Baseline Candidate
application biojava biojava
ci_job_date 1760970798 1760970798
ci_job_id 1187479748 1187479748
ci_pipeline_id 79777501 79777501
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-1-5r4aw7al 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-1-5r4aw7al 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 12 metrics, 0 unstable metrics.

Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.55.0-SNAPSHOT~c96d5798e2, baseline=1.55.0-SNAPSHOT~92a857db10
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.472 ms) : 1461, 1484
.   : milestone, 1472,
appsec (2.455 ms) : 2403, 2506
.   : milestone, 2455,
iast (2.197 ms) : 2133, 2260
.   : milestone, 2197,
iast_GLOBAL (2.243 ms) : 2179, 2306
.   : milestone, 2243,
profiling (2.05 ms) : 1999, 2102
.   : milestone, 2050,
tracing (2.023 ms) : 1974, 2072
.   : milestone, 2023,
section candidate
no_agent (1.471 ms) : 1460, 1483
.   : milestone, 1471,
appsec (2.447 ms) : 2396, 2497
.   : milestone, 2447,
iast (2.197 ms) : 2134, 2260
.   : milestone, 2197,
iast_GLOBAL (2.242 ms) : 2178, 2305
.   : milestone, 2242,
profiling (2.046 ms) : 1995, 2097
.   : milestone, 2046,
tracing (2.013 ms) : 1964, 2062
.   : milestone, 2013,
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.472 ms [1.461 ms, 1.484 ms] -
appsec 2.455 ms [2.403 ms, 2.506 ms] 982.457 µs (66.7%)
iast 2.197 ms [2.133 ms, 2.26 ms] 724.49 µs (49.2%)
iast_GLOBAL 2.243 ms [2.179 ms, 2.306 ms] 770.469 µs (52.3%)
profiling 2.05 ms [1.999 ms, 2.102 ms] 578.12 µs (39.3%)
tracing 2.023 ms [1.974 ms, 2.072 ms] 550.727 µs (37.4%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.471 ms [1.46 ms, 1.483 ms] -
appsec 2.447 ms [2.396 ms, 2.497 ms] 975.569 µs (66.3%)
iast 2.197 ms [2.134 ms, 2.26 ms] 725.834 µs (49.3%)
iast_GLOBAL 2.242 ms [2.178 ms, 2.305 ms] 770.689 µs (52.4%)
profiling 2.046 ms [1.995 ms, 2.097 ms] 574.948 µs (39.1%)
tracing 2.013 ms [1.964 ms, 2.062 ms] 542.262 µs (36.9%)
Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.55.0-SNAPSHOT~c96d5798e2, baseline=1.55.0-SNAPSHOT~92a857db10
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.592 s) : 15592000, 15592000
.   : milestone, 15592000,
appsec (15.029 s) : 15029000, 15029000
.   : milestone, 15029000,
iast (18.59 s) : 18590000, 18590000
.   : milestone, 18590000,
iast_GLOBAL (17.799 s) : 17799000, 17799000
.   : milestone, 17799000,
profiling (15.093 s) : 15093000, 15093000
.   : milestone, 15093000,
tracing (14.867 s) : 14867000, 14867000
.   : milestone, 14867000,
section candidate
no_agent (15.625 s) : 15625000, 15625000
.   : milestone, 15625000,
appsec (15.049 s) : 15049000, 15049000
.   : milestone, 15049000,
iast (18.786 s) : 18786000, 18786000
.   : milestone, 18786000,
iast_GLOBAL (18.168 s) : 18168000, 18168000
.   : milestone, 18168000,
profiling (15.139 s) : 15139000, 15139000
.   : milestone, 15139000,
tracing (15.212 s) : 15212000, 15212000
.   : milestone, 15212000,
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.592 s [15.592 s, 15.592 s] -
appsec 15.029 s [15.029 s, 15.029 s] -563.0 ms (-3.6%)
iast 18.59 s [18.59 s, 18.59 s] 2.998 s (19.2%)
iast_GLOBAL 17.799 s [17.799 s, 17.799 s] 2.207 s (14.2%)
profiling 15.093 s [15.093 s, 15.093 s] -499.0 ms (-3.2%)
tracing 14.867 s [14.867 s, 14.867 s] -725.0 ms (-4.6%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.625 s [15.625 s, 15.625 s] -
appsec 15.049 s [15.049 s, 15.049 s] -576.0 ms (-3.7%)
iast 18.786 s [18.786 s, 18.786 s] 3.161 s (20.2%)
iast_GLOBAL 18.168 s [18.168 s, 18.168 s] 2.543 s (16.3%)
profiling 15.139 s [15.139 s, 15.139 s] -486.0 ms (-3.1%)
tracing 15.212 s [15.212 s, 15.212 s] -413.0 ms (-2.6%)

@zhengyu123 zhengyu123 added the comp: crash tracking Crash tracking label Oct 16, 2025
@zhengyu123 zhengyu123 changed the title from "Race on late initializing crash tracking results JVM to crash" to "Eagerly initializing java.nio to prevent a race that may crash JVM" Oct 16, 2025
@zhengyu123 zhengyu123 requested a review from jbachorik October 16, 2025 15:05
@zhengyu123 zhengyu123 marked this pull request as ready for review October 16, 2025 15:05
@zhengyu123 zhengyu123 requested a review from a team as a code owner October 16, 2025 15:05
@zhengyu123 zhengyu123 requested a review from smola October 16, 2025 15:05
@github-actions
Contributor

Hi! 👋 Thanks for your pull request! 🎉

To help us review it, please make sure to:

  • Add at least one type, and one component or instrumentation label to the pull request

If you need help, please check our contributing guidelines.

@zhengyu123 zhengyu123 added the type: bug Bug report and fix label Oct 16, 2025
@zhengyu123 zhengyu123 marked this pull request as draft October 17, 2025 12:12
Contributor

@mcculls mcculls left a comment

As @amarziali pointed out, I'd much prefer for crash-tracking to use a simpler java.io approach than have to apply hacks to forcibly load java.nio native code earlier than would normally be necessary.

@zhengyu123 zhengyu123 changed the title from "Eagerly initializing java.nio to prevent a race that may crash JVM" to "Eagerly loading/initializing pthread library to prevent a race that may crash JVM" Oct 20, 2025
@zhengyu123 zhengyu123 marked this pull request as ready for review October 20, 2025 13:55
@zhengyu123
Contributor Author

> As @amarziali pointed out, I'd much prefer for crash-tracking to use a simpler java.io approach than have to apply hacks to forcibly load java.nio native code earlier than would normally be necessary

Please see my comment: #9780 (comment)

@mcculls
Contributor

mcculls commented Oct 20, 2025

> > As @amarziali pointed out, I'd much prefer for crash-tracking to use a simpler java.io approach than have to apply hacks to forcibly load java.nio native code earlier than would normally be necessary
>
> Please see my comment: #9780 (comment)

I saw that, I still believe this is the wrong fix - pre-loading FileSystems.getDefault() will incur a startup cost for everyone, whether they would actually be affected by the JDK bug or not.

Fixing crash-tracking to use java.io would mitigate the main driver for this fix without having such a wide impact.

@jbachorik
Contributor

@mcculls My worry is that we might unknowingly add some other code that loads the pthread library under the hood, and if that happens off the main thread, we would end up with the same intermittent (although pretty rare) crash.

I don't have a great solution for this, though :(

@mcculls
Contributor

mcculls commented Oct 20, 2025

I would much prefer to address the known situation first, which should be a straightforward fix of just switching to use java.io in the crash-tracking code.
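A minimal sketch of what "switching to java.io" could look like for a single file-writing step. The class and method names (`JavaIoOnlyWriter`, `writeExecutable`) are hypothetical and are not taken from the crash-tracking code; the sketch only shows calling code that references nothing from `java.nio.file`. It says nothing about which classes the JDK itself loads internally.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical helper, for illustration only.
final class JavaIoOnlyWriter {

    // Writes the given text to the target file and marks it executable for the owner,
    // using only classic java.io types (no Files, Path, or Paths).
    static File writeExecutable(File target, String content) {
        // The charset is passed by name so the sketch does not reference java.nio.charset.
        try (Writer out = new OutputStreamWriter(new FileOutputStream(target), "UTF-8")) {
            out.write(content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Best effort: the return value is ignored because some file systems do not
        // support the executable bit.
        target.setExecutable(true, true);
        return target;
    }
}
```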

@mcculls
Contributor

mcculls commented Oct 20, 2025

@jbachorik @zhengyu123 for example, calling FileSystems.getDefault() early during premain could break applications that set the java.nio.file.spi.DefaultFileSystemProvider system property in their main method and expect that to control the default filesystem used by the application.
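The lock-in effect can be illustrated with a small sketch. The class name `DefaultFileSystemLockIn` and the provider class name in the property value are hypothetical; only `FileSystems.getDefault()` and the `java.nio.file.spi.DefaultFileSystemProvider` property come from the discussion. Once the default file system has been created, changing the property has no effect on later calls:

```java
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;

// Illustration only: shows that the default file system is fixed on first use.
final class DefaultFileSystemLockIn {
    private static final String KEY = "java.nio.file.spi.DefaultFileSystemProvider";

    // Returns true if a property change made after the first getDefault() call is ignored.
    static boolean lockedInBeforePropertyChange() {
        // Stands in for an early call made by an agent during premain.
        FileSystem early = FileSystems.getDefault();
        String previous = System.getProperty(KEY);
        // Hypothetical class name; it is not consulted because the default already exists.
        System.setProperty(KEY, "com.example.HypotheticalFileSystemProvider");
        try {
            // Stands in for the application asking for the default file system later.
            return FileSystems.getDefault() == early;
        } finally {
            // Restore the property so the sketch has no lasting side effect.
            if (previous == null) {
                System.clearProperty(KEY);
            } else {
                System.setProperty(KEY, previous);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("locked in: " + lockedInBeforePropertyChange());
    }
}
```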

We already avoid touching java.util.logging and JMX for very similar reasons, so I'd like to avoid introducing another potential pitfall. (Side note: we have tests to detect regressions where new code unintentionally loads java.util.logging during premain, we could look into a similar test for java.nio)

https://docs.oracle.com/javase/8/docs/api/java/nio/file/FileSystems.html#getDefault--
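One possible shape for such a regression check, as a rough sketch. How the existing java.util.logging tests work is not shown in this thread, so this is an assumption and not a description of them. The helper name `LoadedClassCheck.loadedUnder` is hypothetical; in an agent, the input array could come from `Instrumentation.getAllLoadedClasses()` captured at the end of premain.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class LoadedClassCheck {

    // Returns the names of all classes in the array whose name starts with the given prefix.
    static List<String> loadedUnder(Class<?>[] loadedClasses, String packagePrefix) {
        List<String> hits = new ArrayList<>();
        for (Class<?> c : loadedClasses) {
            if (c.getName().startsWith(packagePrefix)) {
                hits.add(c.getName());
            }
        }
        return hits;
    }
}
```

A test could then assert that the returned list is empty for a prefix such as `java.util.logging.`.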

@jbachorik
Contributor

@mcculls Fair point. Premain is tricky. I will try to think about how we could have a more systematic workaround, but OK, let's start with not using nio in the crashtracking initialization first.

@zhengyu123
Contributor Author

> @jbachorik @zhengyu123 for example, calling FileSystems.getDefault() early during premain could break applications that set the java.nio.file.spi.DefaultFileSystemProvider system property in their main method and expect that to control the default filesystem used by the application.

@mcculls Thanks for the insight. I did not realize the side-effect.

> We already avoid touching java.util.logging and JMX for very similar reasons, so I'd like to avoid introducing another potential pitfall. (Side note: we have tests to detect regressions where new code unintentionally loads java.util.logging, we could look into a similar test for java.nio)

Could you articulate what the "very similar reasons" are for not touching java.util.logging and JMX?

I suspect that checking java.nio is not sufficient. Other components, such as socket, also have direct dependencies on pthread, and it is not clear to me if indirect dependencies could also trigger the race ... Let's have a chat if possible.

Thanks

@mcculls
Contributor

mcculls commented Oct 20, 2025

> Could you articulate what is the "very similar reasons" for not touching java.util.logging and JMX?

Some frameworks set system properties to select a different JUL implementation, or a custom JMX builder. They may set these system properties on the command-line, but some webapp servers set them after main. Attempting to use JUL or JMX during premain can cause the wrong implementation to become locked-in, breaking the framework / application.

Using JUL / JMX can also cause log-spam and startup delays when the chosen implementation class is not available (for example if the web-app expects to set up a context class-loader to load the implementation before JUL / JMX is used.)

We've learnt the hard way that you have to be very careful about what is loaded during premain - and the general direction has been to move as much as possible out of premain.
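As a small illustration of the JUL case: the `java.util.logging.manager` system property is read when the `LogManager` is first initialized, so setting it afterwards has no effect. This is a sketch only; the class name `LogManagerLockIn` and the manager class name in the property value are hypothetical.

```java
import java.util.logging.LogManager;

// Illustration only: shows that the global LogManager is fixed on first use.
final class LogManagerLockIn {
    private static final String KEY = "java.util.logging.manager";

    // Returns true if a property change made after first use of JUL is ignored.
    static boolean lockedInBeforePropertyChange() {
        // Stands in for agent code touching JUL during premain.
        LogManager early = LogManager.getLogManager();
        String previous = System.getProperty(KEY);
        // Hypothetical class name; it is not consulted because the manager already exists.
        System.setProperty(KEY, "com.example.HypotheticalLogManager");
        try {
            // Stands in for a framework selecting its own LogManager later on.
            return LogManager.getLogManager() == early;
        } finally {
            // Restore the property so the sketch has no lasting side effect.
            if (previous == null) {
                System.clearProperty(KEY);
            } else {
                System.setProperty(KEY, previous);
            }
        }
    }
}
```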

> I suspect that checking java.nio is not sufficient. Other components, such as socket, also have direct dependencies on pthread, and it is not clear to me if indirect dependencies could also trigger the race ... Let's have a chat if possible.

We have a known data point - crash-tracking's use of java.nio can lead to a rare crash. We have a simple solution available which we know won't impact other users: refactor crash-tracking to use java.io. And arguably crash-tracking should always use as few classes/resources as possible.

With that in place we can then monitor the situation - I suspect that this will address the situation, i.e. no further action is required. Meanwhile the underlying JDK issue is being fixed and backported as we speak.

@zhengyu123
Contributor Author

zhengyu123 commented Oct 20, 2025

> > Could you articulate what is the "very similar reasons" for not touching java.util.logging and JMX?
>
> Some frameworks set system properties to select a different JUL implementation, or a custom JMX builder. They may set these system properties on the command-line, but some webapp servers set them after main. Attempting to use JUL or JMX during premain can cause the wrong implementation to become locked-in, breaking the framework / application.
>
> Using JUL / JMX can also cause log-spam and startup delays when the chosen implementation class is not available (for example if the web-app expects to set up a context class-loader to load the implementation before JUL / JMX is used.)
>
> We've learnt the hard way that you have to be very careful about what is loaded during premain - and the general direction has been to move as much as possible out of premain.

I see. It was not my intention to introduce a side effect. I will try to find a replacement without side effects.

> > I suspect that checking java.nio is not sufficient. Other components, such as socket, also have direct dependencies on pthread, and it is not clear to me if indirect dependencies could also trigger the race ... Let's have a chat if possible.
>
> We have a known data point - crash-tracking's use of java.nio can lead to a rare crash. We have a simple solution available which we know won't impact other users: refactor crash-tracking to use java.io. And arguably crash-tracking should always use as few classes/resources as possible.

Yes, we have one known case, but we cannot guarantee that it covers all cases. We just happened to be able to get the artifacts of this crash because it came from one of our internal services, and we were lucky that it crashed at a spot where we could pinpoint the defect.

I have been seeing some very strange crashes, e.g. a TLS value suddenly disappearing, which may or may not be related to this defect, but it would be good to be sure.

I believe the crash might be just one symptom of this unsynchronized initialization. Another symptom I can think of is that one thread sees a partially initialized key, or overwrites the key, which might not crash immediately but could cause problems further down the road, such as a TLS value disappearing.

> With that in place we can then monitor the situation - I suspect that this will address the situation, i.e. no further action is required. Meanwhile the underlying JDK issue is being fixed and backported as we speak.

Yes, we can backport the fix, but can we force our customers to upgrade?

BTW, I am open to refactoring crashtracking, if using java.io is more desirable.

@mcculls
Contributor

mcculls commented Oct 21, 2025

@zhengyu123 btw, if there's a feature in crash-tracking that cannot be reimplemented with java.io then a compromise solution would be to call FileSystems.getDefault(); in Agent.startCrashTracking, specifically just before this line:

AgentTaskScheduler.get().execute(Agent::initializeCrashTrackingDefault);

along with a comment explaining why it was added and referencing the JDK bug.

This should address the current issue while minimizing the overall impact - and avoids introducing a pre-load call for all users during premain which would then be hard to remove later on.
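A rough sketch of that compromise, under stated assumptions: `CrashTrackingStartSketch`, the `Executor` parameter, and the `Runnable` parameter are hypothetical stand-ins for `Agent.startCrashTracking`, `AgentTaskScheduler.get()`, and `Agent::initializeCrashTrackingDefault`, whose real signatures are not shown in this thread.

```java
import java.nio.file.FileSystems;
import java.util.concurrent.Executor;

// Hypothetical stand-in for the agent code discussed above.
final class CrashTrackingStartSketch {

    static void startCrashTracking(Executor scheduler, Runnable initializeCrashTracking) {
        // Workaround for https://bugs.openjdk.org/browse/JDK-8345810:
        // touch the default file system on the calling thread before handing off,
        // so the background task is not the first to initialize it.
        FileSystems.getDefault();
        scheduler.execute(initializeCrashTracking);
    }
}
```

Keeping the call next to the scheduling line limits it to code paths that actually start crash tracking, and the comment makes it easy to remove once the JDK fix is widely deployed.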

@zhengyu123 zhengyu123 marked this pull request as draft October 21, 2025 15:01
@zhengyu123
Contributor Author

I refactored crashtracking to use only classic java.io, but nio still shows up in early startup.
I then built a custom JDK to find out where, and by whom, the nio library is loaded. It turns out that nio is loaded very early on the premain thread, before AgentTaskScheduler is started.

The following threads were running at the time the nio library was loaded.

Thread: Thread[main,5,main]
        java.base/java.lang.Thread.dumpThreads(Native Method)
        java.base/java.lang.Thread.getAllStackTraces(Thread.java:1671)
        java.base/sun.nio.fs.UnixNativeDispatcher.<clinit>(UnixNativeDispatcher.java:37)
        java.base/sun.nio.fs.UnixFileSystem.<init>(UnixFileSystem.java:65)
        java.base/sun.nio.fs.LinuxFileSystem.<init>(LinuxFileSystem.java:39)
        java.base/sun.nio.fs.LinuxFileSystemProvider.newFileSystem(LinuxFileSystemProvider.java:46)
        java.base/sun.nio.fs.LinuxFileSystemProvider.newFileSystem(LinuxFileSystemProvider.java:39)
        java.base/sun.nio.fs.UnixFileSystemProvider.<init>(UnixFileSystemProvider.java:55)
        java.base/sun.nio.fs.LinuxFileSystemProvider.<init>(LinuxFileSystemProvider.java:41)
        java.base/sun.nio.fs.DefaultFileSystemProvider.<clinit>(DefaultFileSystemProvider.java:35)
        java.base/java.util.zip.ZipFile$Source.<clinit>(ZipFile.java:1426)
        java.base/java.util.zip.ZipFile$CleanableResource.<init>(ZipFile.java:717)
        java.base/java.util.zip.ZipFile.<init>(ZipFile.java:251)
        java.base/java.util.zip.ZipFile.<init>(ZipFile.java:180)
        java.base/java.util.jar.JarFile.<init>(JarFile.java:346)
        java.base/jdk.internal.loader.URLClassPath$JarLoader.getJarFile(URLClassPath.java:827)
        java.base/jdk.internal.loader.URLClassPath$JarLoader$1.run(URLClassPath.java:771)
        java.base/jdk.internal.loader.URLClassPath$JarLoader$1.run(URLClassPath.java:764)
        java.base/java.security.AccessController.executePrivileged(AccessController.java:807)
        java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
        java.base/jdk.internal.loader.URLClassPath$JarLoader.ensureOpen(URLClassPath.java:763)
        java.base/jdk.internal.loader.URLClassPath$JarLoader.<init>(URLClassPath.java:737)
        java.base/jdk.internal.loader.URLClassPath$3.run(URLClassPath.java:501)
        java.base/jdk.internal.loader.URLClassPath$3.run(URLClassPath.java:484)
        java.base/java.security.AccessController.executePrivileged(AccessController.java:807)
        java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
        java.base/jdk.internal.loader.URLClassPath.getLoader(URLClassPath.java:483)
        java.base/jdk.internal.loader.URLClassPath.getLoader(URLClassPath.java:451)
        java.base/jdk.internal.loader.URLClassPath.getResource(URLClassPath.java:320)
        java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:757)
        java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:681)
        java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:639)
        java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)

Thread: Thread[Common-Cleaner,8,InnocuousThreadGroup]
        java.base/java.lang.Object.wait(Native Method)
        java.base/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:155)
        java.base/jdk.internal.ref.CleanerImpl.run(CleanerImpl.java:140)
        java.base/java.lang.Thread.run(Thread.java:840)
        java.base/jdk.internal.misc.InnocuousThread.run(InnocuousThread.java:162)

Thread: Thread[Finalizer,8,system]
        java.base/java.lang.Object.wait(Native Method)
        java.base/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:155)
        java.base/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:176)
        java.base/java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:172)

Thread: Thread[Signal Dispatcher,9,system]

Thread: Thread[Reference Handler,10,system]
        java.base/java.lang.ref.Reference.waitForReferencePendingList(Native Method)
        java.base/java.lang.ref.Reference.processPendingReferences(Reference.java:253)
        java.base/java.lang.ref.Reference$ReferenceHandler.run(Reference.java:215)
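The top frames of the main thread suggest the dump was produced by a `Thread.getAllStackTraces()` call placed in the static initializer of `UnixNativeDispatcher` in the custom JDK build. The exact code used is not shown, so the following is only a sketch that prints a dump in the same layout; the class and method names (`ThreadDumpSketch`, `dumpAllThreads`) are hypothetical.

```java
import java.util.Map;

// Illustration only: formats a dump of all live threads and their stack frames.
final class ThreadDumpSketch {

    // Builds one "Thread: ..." header per live thread, followed by its indented frames.
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            sb.append("Thread: ").append(entry.getKey()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("        ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.err.print(dumpAllThreads());
    }
}
```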

So, both assumptions:

  • There is a race to load the nio library
  • Refactoring crashtracking to use only classic java.io prevents nio from being loaded

are wrong, so I am closing this PR.

Labels

comp: crash tracking Crash tracking type: bug Bug report and fix

5 participants