Skip to content

Conversation

@satyajanga
Copy link
Contributor

@satyajanga satyajanga commented Jul 8, 2025

We noticed this issue when processing a Meta internal RISCV coredump, currently the RISCV instruction emulation is not handling the Prologue and Epilogue instructions.

Also function unwinding using the instruction emulation is also not implemented.

This PR handles both of these issues.

NOTE: Not sure of the historic reason why this is done this way. This is done in https://reviews.llvm.org/D131759
in contrast MIPS, PPC, ARM all support this.

Test Plan:
Unfortunately there is no easy way to add testing for this. No RISCV hardware at the disposal. I welcome the suggestions.

cc: @clayborg

@satyajanga satyajanga changed the title Address gaps in RISCV function unwinding Implement RISCV function unwinding using instruction emulation Jul 8, 2025
@satyajanga satyajanga marked this pull request as ready for review July 8, 2025 16:44
@satyajanga satyajanga requested a review from JDevlieghere as a code owner July 8, 2025 16:44
@llvmbot llvmbot added the lldb label Jul 8, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 8, 2025

@llvm/pr-subscribers-backend-risc-v

@llvm/pr-subscribers-lldb

Author: None (satyajanga)

Changes

We noticed this issue when processing a Meta internal RISCV coredump, currently the RISCV instruction emulation is not handling the Prologue and Epilogue instructions.

Also function unwinding using the instruction emulation is also not implemented.

This PR handles both of these issues.

NOTE: Not sure of the historic reason why this is done this way. This is done in https://reviews.llvm.org/D131759
in contrast MIPS, PPC, ARM all support this.

Test Plan:
Unfortunately there is no easy way to add testing for this. No RISCV hardware at the disposal. I welcome the suggestions.


Full diff: https://github.com/llvm/llvm-project/pull/147434.diff

2 Files Affected:

  • (modified) lldb/source/Plugins/Instruction/RISCV/EmulateInstructionRISCV.cpp (+20)
  • (modified) lldb/source/Plugins/Instruction/RISCV/EmulateInstructionRISCV.h (+7-4)
diff --git a/lldb/source/Plugins/Instruction/RISCV/EmulateInstructionRISCV.cpp b/lldb/source/Plugins/Instruction/RISCV/EmulateInstructionRISCV.cpp
index 2adde02aca3a1..90537587c0b23 100644
--- a/lldb/source/Plugins/Instruction/RISCV/EmulateInstructionRISCV.cpp
+++ b/lldb/source/Plugins/Instruction/RISCV/EmulateInstructionRISCV.cpp
@@ -1899,4 +1899,24 @@ RISCVSingleStepBreakpointLocationsPredictor::HandleAtomicSequence(
   return bp_addrs;
 }
 
+bool EmulateInstructionRISCV::CreateFunctionEntryUnwind(
+    UnwindPlan &unwind_plan) {
+  unwind_plan.Clear();
+  unwind_plan.SetRegisterKind(eRegisterKindLLDB);
+
+  UnwindPlan::Row row;
+
+  // Our previous Call Frame Address is the stack pointer
+  row.GetCFAValue().SetIsRegisterPlusOffset(gpr_sp_riscv, 0);
+  row.SetRegisterLocationToSame(gpr_fp_riscv, /*must_replace=*/false);
+
+  unwind_plan.AppendRow(std::move(row));
+  unwind_plan.SetSourceName("EmulateInstructionRISCV");
+  unwind_plan.SetSourcedFromCompiler(eLazyBoolNo);
+  unwind_plan.SetUnwindPlanValidAtAllInstructions(eLazyBoolYes);
+  unwind_plan.SetUnwindPlanForSignalTrap(eLazyBoolNo);
+  unwind_plan.SetReturnAddressRegister(gpr_ra_riscv);
+  return true;
+}
+
 } // namespace lldb_private
diff --git a/lldb/source/Plugins/Instruction/RISCV/EmulateInstructionRISCV.h b/lldb/source/Plugins/Instruction/RISCV/EmulateInstructionRISCV.h
index 3578a4ab03053..f5692efb03bd9 100644
--- a/lldb/source/Plugins/Instruction/RISCV/EmulateInstructionRISCV.h
+++ b/lldb/source/Plugins/Instruction/RISCV/EmulateInstructionRISCV.h
@@ -57,11 +57,12 @@ class EmulateInstructionRISCV : public EmulateInstruction {
 
   static bool SupportsThisInstructionType(InstructionType inst_type) {
     switch (inst_type) {
-    case eInstructionTypeAny:
-    case eInstructionTypePCModifying:
+    case lldb_private::eInstructionTypeAny:
+    case lldb_private::eInstructionTypePrologueEpilogue:
       return true;
-    case eInstructionTypePrologueEpilogue:
-    case eInstructionTypeAll:
+
+    case lldb_private::eInstructionTypePCModifying:
+    case lldb_private::eInstructionTypeAll:
       return false;
     }
     llvm_unreachable("Fully covered switch above!");
@@ -94,6 +95,8 @@ class EmulateInstructionRISCV : public EmulateInstruction {
   std::optional<RegisterInfo> GetRegisterInfo(lldb::RegisterKind reg_kind,
                                               uint32_t reg_num) override;
 
+  bool CreateFunctionEntryUnwind(UnwindPlan &unwind_plan) override;
+
   std::optional<DecodeResult> ReadInstructionAt(lldb::addr_t addr);
   std::optional<DecodeResult> Decode(uint32_t inst);
   bool Execute(DecodeResult inst, bool ignore_cond);

@DavidSpickett DavidSpickett changed the title Implement RISCV function unwinding using instruction emulation [lldb] Implement RISCV function unwinding using instruction emulation Jul 9, 2025
@DavidSpickett
Copy link
Collaborator

Unfortunately there is no easy way to add testing for this. No RISCV hardware at the disposal. I welcome the suggestions.

Make a core file that does not include any internal data and then it can be tested on any system. If you have to satisfy internal policies around that, consider obj2yaml-ing the file, and then either using the yaml in the test, or just using the yaml format as a way to hack out any internal data then convert it back to an object.

I know I personally ok'd a RISC-V core file change before that did not include a test, but that was a mistake and I should have pushed harder for a test case.

(for the avoidance of doubt, this next part is from the perspective of an upstream maintainer of LLDB who cares purely about the health of this project overall, and it does not represent the opinion of my employer)

Side note: If your organisation is using RISC-V tools more and more, consider supporting upstream testing of those tools. For example via Alex Bradbury's work on testing clang and llvm using qemu-system. There's only so long RISC-V LLDB can go on with spot testing, eventually it needs something comprehensive.

@JDevlieghere JDevlieghere requested a review from jasonmolenda July 9, 2025 22:56
Copy link
Collaborator

@jasonmolenda jasonmolenda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will not work. The EmulateInstructionRISCV plugin today is capable of tracking where a given instruction will branch, so lldb can do a software-based instruction step operation (put a breakpoint on the target instruction, resume the cpu) for processors that do not have an instruction-step primitive. However, it does not track the information that UnwindAssemblyInstEmulation needs to create an UnwindPlan that accurately reflects when a function is using the stack pointer, when it is using a frame pointer, where it spills register to stack and where it loads them back into registers again. None of those features are present in EmulateInstructionRISCV today, and the UnwindPlans that this produces will not be correct. A good way to see this is load a binary in a riscv core with DWARF eh_frame or debug_frame. Then do image show-unwind -n FUNCNAME and lldb will show the UnwindPlan instructions at different offsets in the function for both the eh_frame/debug_frame UnwindPlans AND for the assembly emulation UnwindPlan, and if there is any divergence, it is likely a bug in the assembly emulation.

switch (inst_type) {
case eInstructionTypeAny:
case eInstructionTypePCModifying:
case lldb_private::eInstructionTypeAny:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is inside a namespace lldb_private, please don't add lldb_private:: specifiers.

case eInstructionTypePrologueEpilogue:
case eInstructionTypeAll:

case lldb_private::eInstructionTypePCModifying:
Copy link
Collaborator

@jasonmolenda jasonmolenda Jul 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EmulateInstructionRISCV is currently used to correctly predict where a given instruction will branch to next, for instruction stepping on hardware that does not support instruction step. By changing eInstructionTypePCModifying to return false from SupportsThisInstructionType, you're going to break that use of this. Did you copy and paste this from another EmulateInstruction plugin? Modifying this plugin without a much deeper understanding of what it is doing & testing it carefully against a corpus of functions is not going to work; this isn't a minor edit.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I told Satya to try this, but it seems they must have been already emulating PC modifying instructions. We don't get any backtrace if we don't say we handle eInstructionTypePrologueEpilogue and when we changed it we did get a backtrace. If we don't return true for eInstructionTypePrologueEpilogue then EmulateInstructionRISCV::CreateFunctionEntryUnwind() never gets called. But it sounds like we need to modify the instruction emulator to handle the prologue opcodes for this patch.

@jasonmolenda
Copy link
Collaborator

If you look at an existing EmulateInstruction target plugin that we live on today, such as ARM64, it returns context types for UnwindAssemblyInstEmulation like

eContextAdjustBaseRegister
eContextAdjustStackPointer
eContextAdvancePC
eContextImmediate
eContextPopRegisterOffStack
eContextPushRegisterOnStack
eContextReadOpcode
eContextRegisterLoad
eContextRegisterStore
eContextRelativeBranchImmediate
eContextRestoreStackPointer
eContextSetFramePointer

as it is emulating the prologue and epilogue instructions on ARM64. I don't think the existing EmulateInstructionRISCV plugin provides any of these, or at least not the important ones that I spot checked quickly.

@clayborg
Copy link
Collaborator

clayborg commented Jul 15, 2025

Unfortunately there is no easy way to add testing for this. No RISCV hardware at the disposal. I welcome the suggestions.

Make a core file that does not include any internal data and then it can be tested on any system. If you have to satisfy internal policies around that, consider obj2yaml-ing the file, and then either using the yaml in the test, or just using the yaml format as a way to hack out any internal data then convert it back to an object.

obj2yaml doesn't work well at all for core files. Core files have only program headers and obj2yaml doesn't allow program header to specify data. Only sections in the section header can have data. So if you obj2yaml a core file, you end up with a useless file that won't recreate things correctly.

@JDevlieghere
Copy link
Member

I think we've reached a point where we need to look into first-class support for building and running RISC-V tests on QEMU. I can't sign up to do the work, but I'm happy to help with reviews etc.

@DavidSpickett helpfully created a page about QEMU testing for system emulation. Maybe a first step could be to extend that for RISC-V?

@labath
Copy link
Collaborator

labath commented Jul 16, 2025

Just to throw a couple of other ideas:

  • write a unit test for the instruction emulator (other architectures have one)
  • extend yaml2obj to work on core files
  • use minidumps which are supported in yaml2obj (for unwind plan generation/instruction emulation to work, you don't actually need a core file that contains anything. It just needs to be there so that lldb can construct the register context)
  • teach lldb to emulate instructions without a core file. (by removing the requirement to have a (real) register context for unwind plan construction)

@jasonmolenda
Copy link
Collaborator

The general discussion of how to test this class of change is interesting, but I don't want to lose sight of the fact that this PR will not work. It does nothing but break existing functionality for RISC-V users (disabling instruction emulation used to do instruction stepping). It's not just a matter of how to automated test; it was not tested while it was being developed at-desk, let alone in CI. There's nothing here, currently, to test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants