Skip to content

Conversation

@sdimitro
Copy link

No description provided.

osandov and others added 30 commits November 27, 2020 01:27
If virtual address translation isn't implemented for the target
architecture, then we shouldn't add the page table memory reader. If we
do, we get a DRGN_ERROR_INVALID_ARGUMENT error from
linux_helper_read_vm() instead of a DRGN_ERROR_FAULT error as expected.

Signed-off-by: Omar Sandoval <[email protected]>
Python 3.9 was released back in October. No changes to drgn are
required.

Signed-off-by: Omar Sandoval <[email protected]>
There are several places where we'd like to enforce that every
enumeration is handled in a switch. Add SWITCH_ENUM() and
SWITCH_ENUM_DEFAULT() macros for that and use them.

Signed-off-by: Omar Sandoval <[email protected]>
I'd like to use the name drgn_object_kind to distinguish between values
and references. "Encoding" is more accurate than "kind", anyways.

Signed-off-by: Omar Sandoval <[email protected]>
To prepare for a new kind of object, replace the is_reference bool with
an enum drgn_object_kind.

Signed-off-by: Omar Sandoval <[email protected]>
There are some situations where we can find an object but can't
determine its value, like local variables that have been optimized out,
inlined functions without a concrete instance, and pure virtual methods.
It's still useful to get some information from these objects, namely
their types. Let's add the concept of an "unavailable" object, which is
an object with a known type but unknown value/address.

Signed-off-by: Omar Sandoval <[email protected]>
…r address

Now that we have the concept of unavailable objects, use it for DWARF
where appropriate.

Signed-off-by: Omar Sandoval <[email protected]>
On old kernels, we set the initial frame as containing only rbp and let
libdwfl unwind it assuming frame pointers from there. This means that
the initial frame has a garbage rip. Follow the frame pointer and set
the previous rbp and return address ourselves instead.

Signed-off-by: Omar Sandoval <[email protected]>
THREAD_SIZE is still broken and I haven't looked into the root cause
(see commit 95be142 ("tests: disable THREAD_SIZE test")). We don't
need it anymore anyways, so let's remove it entirely.

Signed-off-by: Omar Sandoval <[email protected]>
This is a little cleaner and saves on conversions back and forth between
C values and objects.

Signed-off-by: Omar Sandoval <[email protected]>
…n 3.9

Python 3.9 stopped emitting ast.Index nodes, which broke skipping
parentheses around tuples when they're used as subscripts (e.g., for
generic type annotations). Fix it by removing ast.Index nodes in the
pretransformation step on old versions and then handling the new layout
where the ast.Tuple node is directly in ast.Subscript.slice. While we're
here, make sure that we don't skip the parentheses for an empty tuple in
a subscript.

Signed-off-by: Omar Sandoval <[email protected]>
Having the signature in the class line is awkward, especially when
__init__() is overloaded. Instead, document __init__() separately, but
refer to it by the name of the class. There might still be a better way
to represent this, but this is at least better than before.

Signed-off-by: Omar Sandoval <[email protected]>
Instead of the current tangle of requirements on arguments, use
overloads.

Signed-off-by: Omar Sandoval <[email protected]>
There are a couple of reasons that it was the wrong choice to have a
bit_offset for value objects:

1. When we store a buffer with a bit_offset, we're storing useless
   padding bits.
2. bit_offset describes a location, or in other words, part of an
   address. This makes sense for references, but not for values, which
   are just a bag of bytes.

Get rid of union drgn_value.bit_offset in libdrgn, make
Object.bit_offset None for value objects, and disallow passing
bit_offset to the Object() constructor when creating a value. bit_offset
can still be passed when creating an object from a buffer, but we'll
shift the bytes down as necessary to store the value with no offset.

Signed-off-by: Omar Sandoval <[email protected]>
We can get struct drgn_object down from 40 bytes to 32 bytes (on x86-64)
by moving the bit_offset and little_endian members out of the value and
reference structs.

Signed-off-by: Omar Sandoval <[email protected]>
…eters

The TypeMember and TypeParameter instances referring to a libdrgn
drgn_lazy_type are only valid as long as the Type containing them is
still alive. Hold a reference on the containing Type from LazyType. We
can do this without growing LazyType by getting rid of the enum state
and using sentinel values for LazyType::lazy_type as the state.

Signed-off-by: Omar Sandoval <[email protected]>
Now that types are associated with their program, we don't need to pass
the program separately to drgn_program_member_info() and can replace it
with a more natural drgn_type_find_member() API that takes only the type
and member name. While we're at it, get rid of drgn_member_info and
return the drgn_type_member and bit_offset directly. This also fixes a
bug that drgn_error_member_not_found() ignores the member name length.

Signed-off-by: Omar Sandoval <[email protected]>
In Python, looking up a member in a drgn Type by name currently looks
something like:

  member = [member for member in type.members if member.name == "foo"][0]

Add a Type.member(name) method, which is both easier and more efficient.

Signed-off-by: Omar Sandoval <[email protected]>
Add drgn_type_has_member() to libdrgn and Type.has_member() to the
Python bindings. This can simplify some version checks, like the one in
_for_each_block_device() since commit 9a10a92 ("helpers: fix
for_each_{disk,partition}() on kernels >= v5.1").

Signed-off-by: Omar Sandoval <[email protected]>
osandov and others added 22 commits March 10, 2021 02:07
An upcoming change will introduce a similar function for when the
section isn't known. Rename the original so that the new one can take
its name.

Signed-off-by: Omar Sandoval <[email protected]>
Revive it from all the way back in commit 90fbec0 ("dwarfindex:
delete unused read_sleb128() and read_strlen()") and add an _into_u64
variant. These will be used for parsing .debug_frame, .eh_frame, and
DWARF expressions.

Signed-off-by: Omar Sandoval <[email protected]>
Along with _into_s64 and _into_u64 variants. These will be used for
parsing .eh_frame and DWARF expressions.

Signed-off-by: Omar Sandoval <[email protected]>
These will be used for parsing .debug_frame, .eh_frame, and DWARF
expressions.

Signed-off-by: Omar Sandoval <[email protected]>
It will be used to copy register values.

Signed-off-by: Omar Sandoval <[email protected]>
These sections are needed for stack unwinding. However, .debug_frame and
.eh_frame don't need to be read right away, and .text and .got don't
need to be read at all, so partition them accordingly. Also, check that
the sections are specifically SHT_PROGBITS rather than not SHT_NOBITS.

Signed-off-by: Omar Sandoval <[email protected]>
Stack unwinding depends on some platform-specific information. If for
some reason a program has debugging information with different
platforms, then we need to make sure that while we're unwinding the
stack, we don't end up in a frame with a different platform, because the
registers won't make sense. Additionally, we should parse debugging
information using the module's platform rather than the program's
platform, which may not match. So, cache the platform derived from each
module's ELF file.

Signed-off-by: Omar Sandoval <[email protected]>
libdwfl stores registers in an array of uint64_t indexed by the DWARF
register number. This is suboptimal for a couple of reasons:

1. Although the DWARF specification states that registers should be
   numbered for "optimal density", in practice this isn't the case. ABIs
   include unused ranges of numbers and don't order registers based on
   how likely they are to be known (e.g., caller-saved registers usually
   aren't recovered while unwinding the stack, but they are often
   numbered before callee-saved registers).
2. This precludes support for registers larger than 64 bits, like SSE
   registers.

For our own unwinder, we want to store registers in an
architecture-specific format to solve both of these problems.

So, have each architecture define its layout with registers arranged for
space efficiency and convenience when parsing saved registers from core
dumps. Instead of generating an arch_foo.c file from arch_foo.c.in,
separately define the logical register order in an arch_foo.defs file,
and use it to generate an arch_foo.inc file that is included from
arch_foo.c. The layout is defined as a macro in arch_foo.c. While we're
here, drop some register definitions that aren't useful at the moment.

Then, define struct drgn_register_state to efficiently store registers
in the defined format.

Signed-off-by: Omar Sandoval <[email protected]>
In preparation for adding our own unwinder, add support for parsing and
finding DWARF/EH call frame information. Use a generic representation of
call frame information so that we can support other formats like ORC in
the future.

Signed-off-by: Omar Sandoval <[email protected]>
For DW_CFA_def_cfa_expression, DW_CFA_expression, and
DW_CFA_val_expression, we need to be evaluate a DWARF expression. Add an
interface for this. It doesn't yet support operations that aren't
applicable to CFI or some more exotic operations.

Signed-off-by: Omar Sandoval <[email protected]>
The elfutils DWARF unwinder has a couple of limitations:

1. libdwfl doesn't have an interface for getting register values, so we
   have to bundle a patched version of elfutils with drgn.
2. Error handling is very awkward: dwfl_getthread_frames() can return an
   error even on success, so we have to squirrel away our own errors in
   the callback.

Furthermore, there are a couple of things that will be easier with our
own unwinder:

1. Integrating unwinding using ORC will be easier when we're handling
   unwinding ourselves.
2. Support for local variables isn't too far away now that we have DWARF
   expression evaluation.

Now that we have the register state, CFI, and DWARF expression pieces in
place, stitch them together with the new unwinder, and tweak the public
API a bit to reflect it.

Signed-off-by: Omar Sandoval <[email protected]>
We currently bundle a version of elfutils with patches to export
additional stack tracing functionality. This has a few drawbacks:

- Most of drgn's build time is actually building elfutils.
- Distributions don't like packages that bundle verions of other
  packages.
- elfutils, and thus drgn, can't be built with clang.

Now that we've replaced the elfutils DWARF unwinder with our own, we
don't need the patches, so we can drop the bundled elfutils and fix
these issues.

Signed-off-by: Omar Sandoval <[email protected]>
I missed this when I removed the code that used it.

Fixes: eec6776 ("libdrgn: replace elfutils DWARF unwinder with our own")
Signed-off-by: Omar Sandoval <[email protected]>
There's no reason to use GCC's zero-length array extension for this. Use
a standard flexible array instead.

Signed-off-by: Omar Sandoval <[email protected]>
If a member is a bit field, then we should format it with the underlying
Object so that it shows the bit field size.

Signed-off-by: Omar Sandoval <[email protected]>
The elfutils package in Ubuntu 18.04 does not include the
libdw-dev package required by libdrgn. Add this package
to our list of dependencies and add all other dependencies
of elfutils explicitly as an attempt to avoid similar
breakages in the future.
Currently libdrgn requires libelf to be of version 0.175 or
later. This patch allows the library to be compiled with libelf
0.170 (the newest version supported by Ubuntu 18.04 LTS).

Signed-off-by: Serapheim Dimitropoulos <[email protected]>
The oldest LTS version of Ubuntu, 16.04, has elfutils 0.165. This
version is missing some ELF and DWARF definitions used by drgn. Add
copies of elf.h from glibc 2.33 and dwarf.h and elfutils/known-dwarf.h
from elfutils 0.183 to get the latest definitions and drop the minimum
required version of elfutils further to 0.165.

Signed-off-by: Omar Sandoval <[email protected]>
debian packaging: be explicit about our dependencies
@sdimitro sdimitro requested review from ahrens and prakashsurya March 23, 2021 16:54
@prakashsurya
Copy link

Why do this, rather than let linux-pkg do the merge/update?

@prakashsurya
Copy link

Oh, sorry, nevermind.. linux-pkg doesn't handle master to 6.0/stage.

Copy link

@prakashsurya prakashsurya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should auto-backport things? E.g. enable the "sync-with-master" github action?

@sdimitro
Copy link
Author

@prakashsurya yes we should activate that!

@sdimitro sdimitro merged commit fd321b5 into 6.0/stage Mar 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

7 participants