A crash related to ElfParser::loadSymbolTable #191

@yanglong1010

Hi,

I encountered a crash today. After some investigation, I think I have found the cause.

I ran java-profiler using the command below. Running with the Datadog Java agent can probably trigger this crash too (not tested).

/usr/lib/jvm/java-8-openjdk-amd64/bin/java -agentpath:/root/java-profiler/ddprof-lib/build/lib/main/release/linux/x64/libjavaProfiler.so=start,cpu=10ms,file=/tmp/ap.jfr -cp java Demo (Any Java code can reproduce)

openjdk version "1.8.0_442"
OpenJDK Runtime Environment (build 1.8.0_442-8u442-b06~us1-0ubuntu1~24.04-b06)
OpenJDK 64-Bit Server VM (build 25.442-b06, mixed mode)

#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007ffff7c4527e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007ffff7c288ff in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007ffff6df6f0b in os::abort(bool) [clone .cold] () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#6  0x00007ffff7757a6d in VMError::report_and_die() () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#7  0x00007ffff759d0fd in JVM_handle_linux_signal () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#8  0x00007ffff759024c in signalHandler(int, siginfo_t*, void*) () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#9  <signal handler called>
#10 0x0000000000000000 in ?? ()
#11 0x00007ffff6b64e67 in J9Ext::GetOSThreadID (thread=0x7ffff027b930) at /root/java-profiler/ddprof-lib/src/main/cpp/j9Ext.h:97
#12 VMThread::nativeThreadId (jni=jni@entry=0x7ffff026e260, thread=thread@entry=0x7ffff027b930) at /root/java-profiler/ddprof-lib/src/main/cpp/vmStructs.cpp:713
#13 0x00007ffff6b3a811 in Profiler::updateThreadName (this=this@entry=0x7ffff0005180, jvmti=jvmti@entry=0x7ffff0019930, jni=jni@entry=0x7ffff026e260, thread=thread@entry=0x7ffff027b930, self=self@entry=true) at /root/java-profiler/ddprof-lib/src/main/cpp/profiler.cpp:935
#14 0x00007ffff6b3a92e in Profiler::onThreadStart (this=0x7ffff0005180, jvmti=0x7ffff0019930, jni=0x7ffff026e260, thread=0x7ffff027b930) at /root/java-profiler/ddprof-lib/src/main/cpp/profiler.cpp:111
#15 0x00007ffff73e522e in JvmtiExport::post_thread_start(JavaThread*) () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#16 0x00007ffff732c0d8 in JNI_CreateJavaVM () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#17 0x00007ffff7f8b45a in JavaMain () from /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/../lib/amd64/jli/libjli.so
#18 0x00007ffff7f8f961 in call_continuation () from /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/../lib/amd64/jli/libjli.so
#19 0x00007ffff7c9caa4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#20 0x00007ffff7d29c3c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

According to the stack trace, something must be wrong with the VMStructs parsing: J9Ext should not be called on OpenJDK 8.

int VMThread::nativeThreadId(JNIEnv *jni, jthread thread) {
  if (_has_native_thread_id) {
    VMThread *vm_thread = fromJavaThread(jni, thread);
    return vm_thread != NULL ? vm_thread->osThreadId() : -1;
  }
  return J9Ext::GetOSThreadID(thread);
}
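
Frame #10 in the trace above is at address 0x0, which suggests that the J9 fallback ended up calling through a null function pointer. As a minimal sketch of a defensive guard, assuming the J9 entry point is held as a function pointer that stays null on HotSpot: the type GetOSThreadIdFn and the helper osThreadIdOrMinusOne below are hypothetical and are not taken from j9Ext.h.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical callback type, standing in for a JVMTI-extension style
// function that writes the OS thread id and returns 0 on success.
using GetOSThreadIdFn = int (*)(void *thread, std::int64_t *os_thread_id);

// Returns the OS thread id, or -1 when the extension is unavailable
// (null pointer) or reports an error.
inline int osThreadIdOrMinusOne(GetOSThreadIdFn fn, void *thread) {
  if (fn == nullptr) {
    return -1; // not running on J9: there is nothing to call
  }
  std::int64_t id = 0;
  return fn(thread, &id) == 0 ? static_cast<int>(id) : -1;
}

// Usage: a null entry point yields -1 instead of a jump to address 0.
inline void exampleUsage() {
  assert(osThreadIdOrMinusOne(nullptr, nullptr) == -1);
}
```

A guard like this would only turn the crash into a missing thread id. It would not fix the underlying problem, which is that the HotSpot path is not taken at all.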

Debugging with a debug build of java-profiler, I found that gHotSpotVMStructs cannot be found, so VMStructs::initOffsets returns early at vmStructs.cpp:164.

void VMStructs::initOffsets() {
  uintptr_t entry = readSymbol("gHotSpotVMStructs");
  uintptr_t stride = readSymbol("gHotSpotVMStructEntryArrayStride");
  uintptr_t type_offset = readSymbol("gHotSpotVMStructEntryTypeNameOffset");
  uintptr_t field_offset = readSymbol("gHotSpotVMStructEntryFieldNameOffset");
  uintptr_t offset_offset = readSymbol("gHotSpotVMStructEntryOffsetOffset");
  uintptr_t address_offset = readSymbol("gHotSpotVMStructEntryAddressOffset");
  if (entry == 0 || stride == 0) {
    return;
  }

After further debugging, I found that some symbols are skipped at symbols_linux.cpp:357.

if (_length == 0 || (sym->st_name < _length && sym->st_value < _length)) {

void ElfParser::loadSymbolTable(const char *symbols, size_t total_size,
                                size_t ent_size, const char *strings) {
  for (const char *symbols_end = symbols + total_size; symbols < symbols_end;
       symbols += ent_size) {
    ElfSymbol *sym = (ElfSymbol *)symbols;
    if (sym->st_name != 0 && sym->st_value != 0) {
      // sanity check the offsets not to exceed the file size
      if (_length == 0 || (sym->st_name < _length && sym->st_value < _length)) {
        // Skip special AArch64 mapping symbols: $x and $d
        if (sym->st_size != 0 || sym->st_info != 0 ||
            strings[sym->st_name] != '$') {
          _cc->add(_base + sym->st_value, (int)sym->st_size,
                   strings + sym->st_name);
        }
      }
    }
  }
}

In my case, the symbols are all stripped from libjvm.so and stored in a separate file, which can be installed via apt-get install openjdk-8-dbg.
However, symbols_linux.cpp:357 compares the virtual address offset (sym->st_value, 0xdd82b8 = 14516920) with the debug file size (2675232). Since 14516920 is greater than 2675232, the symbol gHotSpotVMStructs is skipped.

I think the virtual address offset (sym->st_value) should not be compared with the debug file size (_length).
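
To make the arithmetic concrete, here is a small sketch that replays the condition from symbols_linux.cpp:357 with the values printed by gdb in this issue. The free function passesSizeCheck and the constants are hypothetical stand-ins; the real code reads these fields from the ElfSymbol and from the ElfParser instance.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the condition at symbols_linux.cpp:357.
inline bool passesSizeCheck(std::size_t length, std::uint32_t st_name,
                            std::uint64_t st_value) {
  return length == 0 || (st_name < length && st_value < length);
}

// Values reported by gdb and ls in this issue.
const std::size_t kDebugFileSize = 2675232; // _length
const std::uint32_t kStName = 1685507;      // sym->st_name
const std::uint64_t kStValue = 0xdd82b8;    // sym->st_value

inline void checkIssueNumbers() {
  assert(kStValue == 14516920);     // 0xdd82b8 in decimal
  assert(kStName < kDebugFileSize); // the name offset alone would pass
  // The address is larger than the file size, so the symbol is rejected.
  assert(!passesSizeCheck(kDebugFileSize, kStName, kStValue));
}
```

Only the st_value half of the condition fails here. The st_name offset is well below the size of the debug file.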

file /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=734da9c23b83138419928d48e1cc65c9b75facc3, stripped

ls -rtl /usr/lib/debug/.build-id/73/4da9c23b83138419928d48e1cc65c9b75facc3.debug
-rw-r--r-- 1 root root 2675232 Jan 26 15:38 /usr/lib/debug/.build-id/73/4da9c23b83138419928d48e1cc65c9b75facc3.debug

readelf -s -W /usr/lib/debug/.build-id/73/4da9c23b83138419928d48e1cc65c9b75facc3.debug|grep gHotSpotVMStructs
 41073: 0000000000dd82b8     8 OBJECT  GLOBAL DEFAULT   25 gHotSpotVMStructs

(gdb) p _length
$1 = 2675232

(gdb) p sym->st_name
$3 = 1685507

(gdb) p sym->st_value
$4 = 14516920

(gdb) p/x sym->st_value
$5 = 0xdd82b8

#0  ElfParser::loadSymbolTable (this=0x7ffff7bfd550, symbols=0x7ffff4af0d20 "\003\270\031", total_size=986328, ent_size=24, strings=0x7ffff4af0f60 "") at /root/java-profiler/ddprof-lib/src/main/cpp/symbols_linux.cpp:366
#1  0x00007ffff7a9be3d in ElfParser::loadSymbols (this=0x7ffff7bfd550, use_debug=false) at /root/java-profiler/ddprof-lib/src/main/cpp/symbols_linux.cpp:258
#2  0x00007ffff7a9b4f8 in ElfParser::parseFile (cc=0x7ffff00d7250, base=0x7ffff6c00000 "\177ELF\002\001\001", file_name=0x7ffff7bfd5f0 "/usr/lib/debug/.build-id/73/4da9c23b83138419928d48e1cc65c9b75facc3.debug", use_debug=false)
    at /root/java-profiler/ddprof-lib/src/main/cpp/symbols_linux.cpp:85
#3  0x00007ffff7a9c125 in ElfParser::loadSymbolsUsingBuildId (this=0x7ffff7bfe6b0) at /root/java-profiler/ddprof-lib/src/main/cpp/symbols_linux.cpp:304
#4  0x00007ffff7a9be65 in ElfParser::loadSymbols (this=0x7ffff7bfe6b0, use_debug=true) at /root/java-profiler/ddprof-lib/src/main/cpp/symbols_linux.cpp:263
#5  0x00007ffff7a9b4f8 in ElfParser::parseFile (cc=0x7ffff00d7250, base=0x7ffff6c00000 "\177ELF\002\001\001", file_name=0x7ffff0028349 "/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so", use_debug=true)
    at /root/java-profiler/ddprof-lib/src/main/cpp/symbols_linux.cpp:85
#6  0x00007ffff7a9cee4 in parseLibrariesCallback (info=0x7ffff7bfe830, size=64, data=0x7ffff7af9e20 <Libraries::instance()::instance>) at /root/java-profiler/ddprof-lib/src/main/cpp/symbols_linux.cpp:508
#7  0x00007ffff7d84002 in __GI___dl_iterate_phdr (callback=0x7ffff7a9cb6b <parseLibrariesCallback(dl_phdr_info*, size_t, void*)>, data=0x7ffff7af9e20 <Libraries::instance()::instance>) at ./elf/dl-iteratephdr.c:74
#8  0x00007ffff7a9d172 in Symbols::parseLibraries (array=0x7ffff7af9e20 <Libraries::instance()::instance>, kernel_symbols=false) at /root/java-profiler/ddprof-lib/src/main/cpp/symbols_linux.cpp:550
#9  0x00007ffff7abdaa5 in Libraries::updateSymbols (this=0x7ffff7af9e20 <Libraries::instance()::instance>, kernel_symbols=false) at /root/java-profiler/ddprof-lib/src/main/cpp/libraries.cpp:36
#10 0x00007ffff7a990cc in VM::initShared (vm=0x7ffff79d16c0 <main_vm>) at /root/java-profiler/ddprof-lib/src/main/cpp/vmEntry.cpp:205
#11 0x00007ffff7a99593 in VM::initProfilerBridge (vm=0x7ffff79d16c0 <main_vm>, attach=false) at /root/java-profiler/ddprof-lib/src/main/cpp/vmEntry.cpp:302
#12 0x00007ffff7a9a282 in Agent_OnLoad (vm=0x7ffff79d16c0 <main_vm>, options=0x7ffff0003f40 "start,cpu=10ms,file=/tmp/ap.jfr", reserved=0x0) at /root/java-profiler/ddprof-lib/src/main/cpp/vmEntry.cpp:551
#13 0x00007ffff76f6674 in Threads::create_vm_init_agents() () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#14 0x00007ffff76f92e2 in Threads::create_vm(JavaVMInitArgs*, bool*) () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#15 0x00007ffff732c010 in JNI_CreateJavaVM () from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#16 0x00007ffff7f8b45a in JavaMain () from /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/../lib/amd64/jli/libjli.so
#17 0x00007ffff7f8f961 in call_continuation () from /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/../lib/amd64/jli/libjli.so
#18 0x00007ffff7c9caa4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#19 0x00007ffff7d29c3c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

The test does not crash on Java 11, Java 17, or Java 21, but I think that is just a coincidence: the virtual address offset just happens to be smaller than the debug file size on those JDKs.

I don't know why this check was added. If there is no real-world case that needs it, the simple fix is to remove the check, and I can submit a PR.
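
If some sanity check turns out to be worth keeping, one possible alternative to removing it entirely is to bound only the string-table offset and leave st_value alone. The sketch below is an illustration under assumptions, not a patch for symbols_linux.cpp: Sym64 mirrors the 24-byte Elf64_Sym layout, and collectSymbols, FoundSymbol, and the strings_size parameter are hypothetical names that do not exist in the project.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Same field layout as Elf64_Sym (24 bytes), redeclared so the sketch
// does not depend on <elf.h>.
struct Sym64 {
  std::uint32_t st_name;
  std::uint8_t st_info;
  std::uint8_t st_other;
  std::uint16_t st_shndx;
  std::uint64_t st_value;
  std::uint64_t st_size;
};

struct FoundSymbol {
  std::uint64_t value;
  std::uint64_t size;
  std::string name;
};

// Hypothetical variant of the loop in loadSymbolTable. strings_size is an
// assumed extra parameter: the size in bytes of the string table.
inline std::vector<FoundSymbol> collectSymbols(const Sym64 *symbols,
                                               std::size_t count,
                                               const char *strings,
                                               std::size_t strings_size) {
  assert(symbols != nullptr || count == 0);
  std::vector<FoundSymbol> out;
  for (std::size_t i = 0; i < count; i++) {
    const Sym64 &sym = symbols[i];
    if (sym.st_name == 0 || sym.st_value == 0) {
      continue;
    }
    // Bound only the name offset. st_value is a virtual address offset,
    // not a file offset, so it is not compared with any file size.
    if (sym.st_name >= strings_size) {
      continue;
    }
    const char *name = strings + sym.st_name;
    // Require a terminating NUL inside the string table.
    if (std::memchr(name, '\0', strings_size - sym.st_name) == nullptr) {
      continue;
    }
    // Skip special AArch64 mapping symbols: $x and $d
    if (sym.st_size == 0 && sym.st_info == 0 && name[0] == '$') {
      continue;
    }
    out.push_back({sym.st_value, sym.st_size, name});
  }
  return out;
}
```

With this shape, a symbol such as gHotSpotVMStructs is kept even when its address exceeds the size of the separate debug file, while a corrupt name offset is still rejected.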

Thanks.
