Skip to content

Conversation

@aokblast
Copy link
Contributor

FreeBSD coredump uses program headers to store mmap information. It is possible for program to use more than PN_XNUM mmaps. Therefore, we implement the support of PN_XNUM in readelf.

@llvmbot
Copy link
Member

llvmbot commented Oct 27, 2025

@llvm/pr-subscribers-llvm-binary-utilities

Author: None (aokblast)

Changes

FreeBSD coredump uses program headers to store mmap information. It is possible for program to use more than PN_XNUM mmaps. Therefore, we implement the support of PN_XNUM in readelf.


Full diff: https://github.com/llvm/llvm-project/pull/165278.diff

4 Files Affected:

  • (added) llvm/test/tools/llvm-readobj/ELF/Inputs/many-segments.o.gz ()
  • (added) llvm/test/tools/llvm-readobj/ELF/invalid-e_phnum.test (+39)
  • (added) llvm/test/tools/llvm-readobj/ELF/many-segments.test (+79)
  • (modified) llvm/tools/llvm-readobj/ELFDumper.cpp (+50-22)
diff --git a/llvm/test/tools/llvm-readobj/ELF/Inputs/many-segments.o.gz b/llvm/test/tools/llvm-readobj/ELF/Inputs/many-segments.o.gz
new file mode 100644
index 0000000000000..0709ed1d6389e
Binary files /dev/null and b/llvm/test/tools/llvm-readobj/ELF/Inputs/many-segments.o.gz differ
diff --git a/llvm/test/tools/llvm-readobj/ELF/invalid-e_phnum.test b/llvm/test/tools/llvm-readobj/ELF/invalid-e_phnum.test
new file mode 100644
index 0000000000000..a174742af7192
--- /dev/null
+++ b/llvm/test/tools/llvm-readobj/ELF/invalid-e_phnum.test
@@ -0,0 +1,39 @@
+# RUN: yaml2obj --docnum=1 %s -o %t.o
+
+# RUN: llvm-readobj --headers %t.o 2>&1 | FileCheck %s --check-prefix=CASE-INVALID
+
+# CASE-INVALID: SectionHeaderOffset: 0
+# CASE-INVALID: ProgramHeaderCount: 65535 (corrupt)
+# CASE-INVALID: unable to dump program headers: program headers are longer than binary of size 336: e_phoff = 0x40, e_phnum = 65535, e_phentsize = 56
+
+--- !ELF
+FileHeader:
+  Class:   ELFCLASS64
+  Data:    ELFDATA2LSB
+  Type:    ET_EXEC
+  Machine: EM_X86_64
+  EPhNum:  65535
+  EShOff:  0
+ProgramHeaders:
+  - Type: PT_LOAD
+
+# RUN: yaml2obj --docnum=2 %s -o %t2.o
+
+# RUN: llvm-readobj --headers %t2.o 2>&1 | FileCheck %s --check-prefix=CASE-VALID
+
+# CASE-VALID: SectionHeaderOffset: 0
+# CASE-VALID: ProgramHeaderCount: 65535 (65536)
+# CASE-VALID: unable to dump program headers: program headers are longer than binary of size 336: e_phoff = 0x40, e_phnum = 65536, e_phentsize = 56
+
+--- !ELF
+FileHeader:
+  Class:   ELFCLASS64
+  Data:    ELFDATA2LSB
+  Type:    ET_EXEC
+  Machine: EM_X86_64
+  EPhNum:  65535
+Sections:
+  - Type: SHT_NULL
+    Info: 65536
+ProgramHeaders:
+  - Type: PT_LOAD
diff --git a/llvm/test/tools/llvm-readobj/ELF/many-segments.test b/llvm/test/tools/llvm-readobj/ELF/many-segments.test
new file mode 100644
index 0000000000000..2f154ddf53899
--- /dev/null
+++ b/llvm/test/tools/llvm-readobj/ELF/many-segments.test
@@ -0,0 +1,79 @@
+## Show that llvm-readelf can handle an input file with many segments.
+
+RUN: %python %p/../../llvm-objcopy/Inputs/ungzip.py %p/Inputs/many-segments.o.gz > %t
+RUN: llvm-readobj --file-headers --sections --segments %t | FileCheck %s
+RUN: llvm-readelf --segments %t | FileCheck --check-prefix=SYMS %s
+
+## The ELF header should have e_phnum == PN_XNUM
+# CHECK:        ProgramHeaderCount: 65535 (66549)
+## The first section header should store the real program header count in its fields.
+# CHECK:      Section {
+# CHECK-NEXT:   Index: 0
+# CHECK-NEXT:   Name:
+# CHECK-NEXT:   Type: SHT_NULL
+# CHECK-NEXT:   Flags [
+# CHECK-NEXT:   ]
+# CHECK-NEXT:   Address:
+# CHECK-NEXT:   Offset:
+# CHECK-NEXT:   Size:
+# CHECK-NEXT:   Link:
+# CHECK-NEXT:   Info: 66549
+
+## Show that the symbols with segments indexes around the reserved range still
+## have the right segment indexes afterwards.
+# 65535th segment
+# CHECK:         Offset: 0x1183B000
+# CHECK-NEXT:	 VirtualAddress: 0x349139F3000
+# CHECK:		 }
+# CHECK-NEXT  ProgramHeader {
+# CHECK-NEXT    Type: PT_LOAD (0x1)
+# CHECK-NEXT    Offset: 0x1183C000
+# CHECK-NEXT    VirtualAddress: 0x349139F4000
+# CHECK-NEXT    PhysicalAddress: 0x0
+# CHECK-NEXT    FileSize: 4096
+# CHECK-NEXT    MemSize: 4096
+# CHECK-NEXT    Flags [ (0x4)
+# CHECK-NEXT      PF_R (0x4)
+# CHECK-NEXT    ]
+# CHECK-NEXT    Alignment: 4096
+# CHECK-NEXT  }
+# CHECK-NEXT  ProgramHeader {
+# CHECK-NEXT    Type: PT_LOAD (0x1)
+# CHECK-NEXT    Offset: 0x1183D000
+# CHECK-NEXT    VirtualAddress: 0x349139F5000
+# CHECK-NEXT    PhysicalAddress: 0x0
+# CHECK-NEXT    FileSize: 4096
+# CHECK-NEXT    MemSize: 4096
+# CHECK-NEXT    Flags [ (0x6)
+# CHECK-NEXT      PF_R (0x4)
+# CHECK-NEXT      PF_W (0x2)
+# CHECK-NEXT    ]
+# CHECK-NEXT    Alignment: 4096
+# CHECK-NEXT  }
+# CHECK-NEXT  ProgramHeader {
+# CHECK-NEXT    Type: PT_LOAD (0x1)
+# CHECK-NEXT    Offset: 0x1183E000
+# CHECK-NEXT    VirtualAddress: 0x349139F6000
+# CHECK-NEXT    PhysicalAddress: 0x0
+# CHECK-NEXT    FileSize: 4096
+# CHECK-NEXT    MemSize: 4096
+# CHECK-NEXT    Flags [ (0x4)
+# CHECK-NEXT      PF_R (0x4)
+# CHECK-NEXT    ]
+# CHECK-NEXT    Alignment: 4096
+# CHECK-NEXT  }
+# CHECK        ProgramHeader {
+# CHECK-NEXT    Type: PT_LOAD (0x1)
+# CHECK-NEXT    Offset: 0x11C31000
+# CHECK-NEXT    VirtualAddress: 0x30D8E7868000
+# CHECK-NEXT    PhysicalAddress: 0x0
+# CHECK-NEXT    FileSize: 8192
+# CHECK-NEXT    MemSize: 8192
+# CHECK-NEXT    Flags [ (0x6)
+# CHECK-NEXT      PF_R (0x4)
+# CHECK-NEXT      PF_W (0x2)
+# CHECK-NEXT    ]
+# CHECK-NEXT    Alignment: 4096
+# CHECK-NEXT  }
+
+# SYMS: There are 66549 program headers, starting at offset 64
diff --git a/llvm/tools/llvm-readobj/ELFDumper.cpp b/llvm/tools/llvm-readobj/ELFDumper.cpp
index 9c9b2dd79e686..22431c4396ca5 100644
--- a/llvm/tools/llvm-readobj/ELFDumper.cpp
+++ b/llvm/tools/llvm-readobj/ELFDumper.cpp
@@ -3572,45 +3572,62 @@ static inline void printFields(formatted_raw_ostream &OS, StringRef Str1,
   OS.flush();
 }
 
+template <class ELFT>
+static std::string getProgramHeadersNumString(const ELFFile<ELFT> &Obj,
+                                              StringRef FileName) {
+
+  if (Obj.getHeader().e_phnum != ELF::PN_XNUM)
+    return to_string(Obj.getHeader().e_phnum);
+
+  Expected<uint32_t> PhNumOrErr = Obj.getPhNum();
+  if (!PhNumOrErr) {
+    // In this case we can ignore an error, because we have already reported a
+    // warning about the broken section header table earlier.
+    consumeError(PhNumOrErr.takeError());
+    return "<?>";
+  }
+
+  if (*PhNumOrErr == ELF::PN_XNUM)
+    return "65535 (corrupt)";
+  return "65535 (" + to_string(*PhNumOrErr) + ")";
+}
+
 template <class ELFT>
 static std::string getSectionHeadersNumString(const ELFFile<ELFT> &Obj,
                                               StringRef FileName) {
-  const typename ELFT::Ehdr &ElfHeader = Obj.getHeader();
-  if (ElfHeader.e_shnum != 0)
-    return to_string(ElfHeader.e_shnum);
+  if (Obj.getHeader().e_shnum != 0)
+    return to_string(Obj.getHeader().e_shnum);
 
-  Expected<ArrayRef<typename ELFT::Shdr>> ArrOrErr = Obj.sections();
-  if (!ArrOrErr) {
+  Expected<uint64_t> ShNumOrErr = Obj.getShNum();
+  if (!ShNumOrErr) {
     // In this case we can ignore an error, because we have already reported a
     // warning about the broken section header table earlier.
-    consumeError(ArrOrErr.takeError());
+    consumeError(ShNumOrErr.takeError());
     return "<?>";
   }
 
-  if (ArrOrErr->empty())
+  if (*ShNumOrErr == 0)
     return "0";
-  return "0 (" + to_string((*ArrOrErr)[0].sh_size) + ")";
+  return "0 (" + to_string(*ShNumOrErr) + ")";
 }
 
 template <class ELFT>
 static std::string getSectionHeaderTableIndexString(const ELFFile<ELFT> &Obj,
                                                     StringRef FileName) {
-  const typename ELFT::Ehdr &ElfHeader = Obj.getHeader();
-  if (ElfHeader.e_shstrndx != SHN_XINDEX)
-    return to_string(ElfHeader.e_shstrndx);
+  if (Obj.getHeader().e_shstrndx != ELF::SHN_XINDEX)
+    return to_string(Obj.getHeader().e_shstrndx);
 
-  Expected<ArrayRef<typename ELFT::Shdr>> ArrOrErr = Obj.sections();
-  if (!ArrOrErr) {
+  Expected<uint32_t> ShStrNdxOrErr = Obj.getShStrNdx();
+  if (!ShStrNdxOrErr) {
     // In this case we can ignore an error, because we have already reported a
     // warning about the broken section header table earlier.
-    consumeError(ArrOrErr.takeError());
+    consumeError(ShStrNdxOrErr.takeError());
     return "<?>";
   }
 
-  if (ArrOrErr->empty())
+  if (*ShStrNdxOrErr == ELF::SHN_XINDEX)
     return "65535 (corrupt: out of range)";
-  return to_string(ElfHeader.e_shstrndx) + " (" +
-         to_string((*ArrOrErr)[0].sh_link) + ")";
+  return "65535 (" + to_string(*ShStrNdxOrErr) + ")";
 }
 
 static const EnumEntry<unsigned> *getObjectFileEnumEntry(unsigned Type) {
@@ -3765,7 +3782,7 @@ template <class ELFT> void GNUELFDumper<ELFT>::printFileHeaders() {
   printFields(OS, "Size of this header:", Str);
   Str = to_string(e.e_phentsize) + " (bytes)";
   printFields(OS, "Size of program headers:", Str);
-  Str = to_string(e.e_phnum);
+  Str = getProgramHeadersNumString(this->Obj, this->FileName);
   printFields(OS, "Number of program headers:", Str);
   Str = to_string(e.e_shentsize) + " (bytes)";
   printFields(OS, "Size of section headers:", Str);
@@ -4778,8 +4795,10 @@ void GNUELFDumper<ELFT>::printProgramHeaders(
     return;
 
   if (PrintProgramHeaders) {
-    const Elf_Ehdr &Header = this->Obj.getHeader();
-    if (Header.e_phnum == 0) {
+    Expected<uint32_t> PhNumOrErr = this->Obj.getPhNum();
+    if (!PhNumOrErr) {
+      OS << '\n' << errorToErrorCode(PhNumOrErr.takeError()).message() << '\n';
+    } else if (*PhNumOrErr == 0) {
       OS << "\nThere are no program headers in this file.\n";
     } else {
       printProgramHeaders();
@@ -4795,10 +4814,18 @@ template <class ELFT> void GNUELFDumper<ELFT>::printProgramHeaders() {
   const Elf_Ehdr &Header = this->Obj.getHeader();
   Field Fields[8] = {2,         17,        26,        37 + Bias,
                      48 + Bias, 56 + Bias, 64 + Bias, 68 + Bias};
+  uint32_t PhNum;
+  if (Expected<uint32_t> PhNumOrErr = this->Obj.getPhNum())
+    PhNum = *PhNumOrErr;
+  else {
+    OS << '\n' << errorToErrorCode(PhNumOrErr.takeError()).message() << '\n';
+    return;
+  }
+
   OS << "\nElf file type is "
      << enumToString(Header.e_type, ArrayRef(ElfObjectFileType)) << "\n"
      << "Entry point " << format_hex(Header.e_entry, 3) << "\n"
-     << "There are " << Header.e_phnum << " program headers,"
+     << "There are " << PhNum << " program headers,"
      << " starting at offset " << Header.e_phoff << "\n\n"
      << "Program Headers:\n";
   if (ELFT::Is64Bits)
@@ -7470,7 +7497,8 @@ template <class ELFT> void LLVMELFDumper<ELFT>::printFileHeaders() {
       W.printFlags("Flags", E.e_flags);
     W.printNumber("HeaderSize", E.e_ehsize);
     W.printNumber("ProgramHeaderEntrySize", E.e_phentsize);
-    W.printNumber("ProgramHeaderCount", E.e_phnum);
+    W.printString("ProgramHeaderCount",
+                  getProgramHeadersNumString(this->Obj, this->FileName));
     W.printNumber("SectionHeaderEntrySize", E.e_shentsize);
     W.printString("SectionHeaderCount",
                   getSectionHeadersNumString(this->Obj, this->FileName));

@jh7370
Copy link
Collaborator

jh7370 commented Oct 29, 2025

@aokblast, is this ready for review? It's currently still attempting to merge into another of your user branches, whereas if it's ready for review, I'd expect it to be merging into main.

@aokblast
Copy link
Contributor Author

aokblast commented Oct 29, 2025

@aokblast, is this ready for review? It's currently still attempting to merge into another of your user branches, whereas if it's ready for review, I'd expect it to be merging into main.

Hello, yes, it is ready to review. But I create a user branch based on the one you approved yesterday temporarily since MaskRay haven't accepted that patch.
If he doesn't responese in tomorrow, I will merge that patch so that this patch can be merged into main.

In ELF file, there is a possible extended header for those phnum, shnum,
and shstrndx larger than the maximum of 16 bits. This extended header
use section 0 to record these fields in 32 bits.

We implment this feature so that programs rely on ELFFile::program_headers() can get the
correct number of segments. Also, the consumers don't have to check the
section 0 themselve, insteead, they can use the getPhNum() as an
alternative.
@aokblast aokblast force-pushed the users/aokblast/elf/fix_65535err branch from d661311 to cb48096 Compare October 30, 2025 16:38
…ments.

FreeBSD coredump uses program headers to store mmap information. It is possible
for program to use more than PN_XNUM mmaps. Therefore, we implement the support
of PN_XNUM in readelf.
@aokblast aokblast force-pushed the users/aokblast/readelf/pxnum_support branch from 1dc7fc0 to fe99baf Compare October 30, 2025 16:46
Expected<uint32_t> PhNumOrErr = Obj.getPhNum();
if (!PhNumOrErr) {
// In this case we can ignore an error, because we have already reported a
// warning about the broken section header table earlier.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels fragile and it's also not the right use-case for consumeError. llvm-readobj has reportUniqueWarning to ensure the same warning isn't printed more than once.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't resolve comment threads that I have initiated.

StringRef FileName) {

if (Obj.getHeader().e_phnum != ELF::PN_XNUM)
return to_string(Obj.getHeader().e_phnum);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be using Obj.getPhNum(), as we discussed in the other review that clients should avoid referencing e_phnum directly (although I acknowledge that the check versus PN_XNUM is still needed in this context).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we still need to ignore the error for the following reason.

I have the following change in my local tree:

+std::string
+ELFDumper<ELFT>::getProgramHeadersNumString(const ELFFile<ELFT> &Obj,
+                                            StringRef FileName) {
 
   Expected<uint32_t> PhNumOrErr = Obj.getPhNum();
   if (!PhNumOrErr) {
     // In this case we can ignore an error, because we have already reported a
     // warning about the broken section header table earlier.
-    consumeError(PhNumOrErr.takeError());
+    this->reportUniqueWarning(PhNumOrErr.takeError());
     return "<?>";
   }
 
-  if (*PhNumOrErr == ELF::PN_XNUM)
+  uint32_t PhNum;
+  PhNum = *PhNumOrErr;
+  if (PhNum < ELF::PN_XNUM)
+    return to_string(PhNum);
+  if (PhNum == ELF::PN_XNUM)
     return "65535 (corrupt)";
-  return "65535 (" + to_string(*PhNumOrErr) + ")";
+  return "65535 (" + to_string(PhNum) + ")";

If the section is invalid and need the Shdr0 info, it outputs twice as the following:

Format: elf64-unknown
Arch: unknown
AddressSize: 64bit
LoadName: <Not found>
ElfHeader {
  Ident {
    Magic: (7F 45 4C 46)
    Class: 64-bit (0x2)
    DataEncoding: LittleEndian (0x1)
    FileVersion: 1
    OS/ABI: SystemV (0x0)
    ABIVersion: 0
    Unused: (00 00 00 00 00 00 00)
  }
  Type: Relocatable (0x1)
  Machine: EM_NONE (0x0)
  Version: 1
  Entry: 0x0
  ProgramHeaderOffset: 0x0
  SectionHeaderOffset: 0x1000
  Flags [ (0x0)
  ]
  HeaderSize: 64
  ProgramHeaderEntrySize: 0
/home/blast/Gits/llvm-project/build/bin/llvm-readobj: warning: '/home/blast/Gits/llvm-project/build/test/tools/llvm-readobj/ELF/Output/file-headers.test.tmp.invalid2': section header table goes past the end of the file: e_shoff = 0x1000
  ProgramHeaderCount: <?>
  SectionHeaderEntrySize: 64
  SectionHeaderCount: <?>
  StringTableSectionIndex: <?>
}
/home/blast/Gits/llvm-project/build/bin/llvm-readobj: error: '/home/blast/Gits/llvm-project/build/test/tools/llvm-readobj/ELF/Output/file-headers.test.tmp.invalid2': unable to continue dumping, the file is corrupt: section header table goes past the end of the file: e_shoff = 0x1000

If the section 0 is invalid but we don't need shdr0 info, it outputs once since we don't trigger the getSection(0) error in getPhdr():

File: /home/blast/Gits/llvm-project/build/test/tools/llvm-readobj/ELF/Output/file-headers.test.tmp.invalid1
Format: elf64-unknown
Arch: unknown
AddressSize: 64bit
LoadName: <Not found>
ElfHeader {
  Ident {
    Magic: (7F 45 4C 46)
    Class: 64-bit (0x2)
    DataEncoding: LittleEndian (0x1)
    FileVersion: 1
    OS/ABI: SystemV (0x0)
    ABIVersion: 0
    Unused: (00 00 00 00 00 00 00)
  }
  Type: Relocatable (0x1)
  Machine: EM_NONE (0x0)
  Version: 1
  Entry: 0x0
  ProgramHeaderOffset: 0x0
  SectionHeaderOffset: 0x1000
  Flags [ (0x0)
  ]
  HeaderSize: 64
  ProgramHeaderEntrySize: 0
  ProgramHeaderCount: 0
  SectionHeaderEntrySize: 64
  SectionHeaderCount: 8192
  StringTableSectionIndex: 12288
}
/home/blast/Gits/llvm-project/build/bin/llvm-readobj: error: '/home/blast/Gits/llvm-project/build/test/tools/llvm-readobj/ELF/Output/file-headers.test.tmp.invalid1': unable to continue dumping, the file is corrupt: section header table goes past the end of the file: e_shoff = 0x1000

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it turns out I think there's some llvm-readobj behaviour that I think is actually slightly suboptimal. Take a look at the code around

if (Error ContentErr = Obj.initContent())
and
if (!ContentErrString.empty())
. A change was made a few years ago to allow dumping the file header even if the rest of the object was corrupt in some form. As you'll see from the linked lines, this was achieved by capturing an error produced by initContent, which attempts to read some section data, but delays printing the error until after the file header printing has happened.

The reason the error is reported at that point is to prevent other options from executing, which would require the properties populated by initContent. However, this error reporting occurs even if no other options have been specified. I think we could make a case for changing that behaviour (in a separate PR). This would mean the warning isn't duplicated in any form.

That being said, it doesn't really solve the duplicate warning problem, because if someone wants both file header and e.g. section header printing, we'll still want the initContent error to be reported after dumping the file header. I'm inclined to accept that though, given the additional context that the "cannot continue" error introduces. What do you think?

Assuming you agree with me, I think we should actually land this PR with the duplicate warnings in, then add the other PR I've suggested to change the behaviour.

if (ElfHeader.e_shnum != 0)
return to_string(ElfHeader.e_shnum);
if (Obj.getHeader().e_shnum != 0)
return to_string(Obj.getHeader().e_shnum);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, don't resolve comments that I initiated.

That being said, this code probably shouldn't change in this PR, as it's conflating PN_XNUM support with updating llvm-readobj to use the new interface we added. I suggest a separate PR that makes use of getShNum and getShStrndx.

if (ElfHeader.e_shstrndx != SHN_XINDEX)
return to_string(ElfHeader.e_shstrndx);
if (Obj.getHeader().e_shstrndx != ELF::SHN_XINDEX)
return to_string(Obj.getHeader().e_shstrndx);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

// In this case we can ignore an error, because we have already reported a
// warning about the broken section header table earlier.
consumeError(ArrOrErr.takeError());
consumeError(ShStrNdxOrErr.takeError());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

if (Header.e_phnum == 0) {
Expected<uint32_t> PhNumOrErr = this->Obj.getPhNum();
if (!PhNumOrErr) {
OS << '\n' << errorToErrorCode(PhNumOrErr.takeError()).message() << '\n';
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't how warnings are printed in llvm-readobj. Please review the existing code and fix it accordingly.

if (Expected<uint32_t> PhNumOrErr = this->Obj.getPhNum())
PhNum = *PhNumOrErr;
else {
OS << '\n' << errorToErrorCode(PhNumOrErr.takeError()).message() << '\n';
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above re. warnings.

@@ -0,0 +1,79 @@
## Show that llvm-readelf can handle an input file with many segments.

RUN: %python %p/../../llvm-objcopy/Inputs/ungzip.py %p/Inputs/many-segments.o.gz > %t
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not rely on a large file to test this behaviour when we don't have to.

As noted in a previous comment somewhere, you can use yaml2obj to craft an object with e_phnum set to PN_XNUM and then a low number in the SHT_NULL section header.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A question is that it seems not a valid ELF under your suggestions since gABI would require you have more than or equal to PN_XNUM sections when using sectionc 0.

It causes the following problem.
For example, if we have 65535 in phnum in ELF header and 4 in section 0. getPhNum returns 4.

That means we are not able to ouput 65535(4) in readelf without checking the getHeader().e_phnum == PN_XNUM since getPhNum always returns the true number.

As a result, we might not able to test 65535 (N) case without actually creating more than 65535 segments.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A question is that it seems not a valid ELF under your suggestions since gABI would require you have more than or equal to PN_XNUM sections when using sectionc 0.

As I believed both @MaskRay and I have explained in different places, although the gABI requires an object to have >= PN_XNUM program headers for this functionality to be active, there is no reason to enforce that in our code, so we can use a smaller number without issue.

That means we are not able to ouput 65535(4) in readelf without checking the getHeader().e_phnum == PN_XNUM since getPhNum always returns the true number.

I may not have been clear in my other comments, but it is acceptable to check whether getHeader().e_phnum == PN_XNUM, to allow it to distinguish between the different cases, which means it should be possible to check the 65535 (N) case without lots of segments.


# CASE-INVALID: SectionHeaderOffset: 0
# CASE-INVALID: ProgramHeaderCount: 65535 (corrupt)
# CASE-INVALID: unable to dump program headers: program headers are longer than binary of size 336: e_phoff = 0x40, e_phnum = 65535, e_phentsize = 56
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't feel like the right warning to be printing. We should just be printing something about the section header table missing or similar. We should probably not continue to try to print the program header table, if the program header count cannot be determined due to an invalid section header table.

@aokblast aokblast changed the base branch from users/aokblast/elf/fix_65535err to main November 6, 2025 06:14
@aokblast
Copy link
Contributor Author

aokblast commented Nov 6, 2025

Sorry for the late reply, I give some feedback and please check it when you have time. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants