forked from osandov/drgn
Merge master to 6.0/stage #4
Merged
I've been wanting to add type hints for the _drgn C extension for a while. The main blocker was that there is a large overlap between the documentation (in docs/api_reference.rst) and the stub file, and I really didn't want to duplicate the information. Therefore, it was a requirement that the documentation could be generated from the stub file, or vice versa. Unfortunately, none of the existing tools that I could find supported this very well. So, I bit the bullet and wrote my own Sphinx extension that uses the stub file as the source of truth (and subsumes my old autopackage extension and gen_docstrings script). The stub file is probably incomplete/inaccurate in places, but this should be a good starting point to improve on. Closes #22.
String annotations (i.e., forward references) need to be parsed into an ast node. Do it as a transformation step immediately after parsing the source. We can also squash the constant node transformation into this one.
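As a rough illustration of the idea (not the code from this commit), the sketch below uses the standard `ast` module to replace string annotations with the parsed expression they contain, as a pass run right after parsing. The names `StringAnnotationTransformer` and `parse_stub` are hypothetical, and the sketch assumes Python 3.8+ and only handles annotations that are entirely a string (not strings nested inside, e.g., a subscript).

```python
import ast


class StringAnnotationTransformer(ast.NodeTransformer):
    """Hypothetical pass: turn string annotations into real AST nodes."""

    def _parse(self, node):
        # Only a bare string constant is treated as a forward reference.
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            parsed = ast.parse(node.value, mode="eval").body
            return ast.copy_location(parsed, node)
        return node

    def visit_arg(self, node):
        self.generic_visit(node)
        if node.annotation is not None:
            node.annotation = self._parse(node.annotation)
        return node

    def visit_FunctionDef(self, node):
        self.generic_visit(node)
        if node.returns is not None:
            node.returns = self._parse(node.returns)
        return node

    def visit_AnnAssign(self, node):
        self.generic_visit(node)
        node.annotation = self._parse(node.annotation)
        return node


def parse_stub(source):
    """Parse source and immediately apply the annotation transformation."""
    tree = StringAnnotationTransformer().visit(ast.parse(source))
    return ast.fix_missing_locations(tree)
```

After `parse_stub('def f(x: "Foo") -> "Bar": ...')`, the annotations in the tree are `ast.Name` nodes rather than string constants, so later stages can treat quoted and unquoted annotations uniformly.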
While we're here, make generate_dwarf_constants.py use the bundled dwarf.h, generate code that black is happy with, and use the keyword list from the standard library.
We only lazily evaluate compound type members and function type parameters, which are never void.
The plain variant is a trivial wrapper around the internal variant, so get rid of the wrapper and use the internal variant directly everywhere.
This way, languages can be identified by an index, which will be useful for adding Python bindings for drgn_language and for adding a language field to drgn_type.
For types obtained from DWARF, we determine it from the language of the CU. For other types, it can be specified manually or fall back to the default (C). Then, we can use the language for operations where the type is available.
For operations where we don't have a type available, we currently fall back to C. Instead, we should guess the language of the program and use that as the default. The heuristic implemented here gets the language of the CU containing "main" (except for the Linux kernel, which is always C). In the future, we should allow manually overriding the automatically determined language.
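The heuristic described above can be sketched in a few lines. This is only a model of the logic, not libdrgn code: `Language`, `CompilationUnit`, and `guess_program_language` are hypothetical names, and the second language entry exists purely so the example has something other than C to return.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import FrozenSet, Iterable, Optional


class Language(IntEnum):
    """Hypothetical language table; each language is identified by an index."""

    C = 0
    CPP = 1  # assumption for illustration only


DEFAULT_LANGUAGE = Language.C


@dataclass(frozen=True)
class CompilationUnit:
    """Hypothetical stand-in for a DWARF CU: its language and its functions."""

    language: Optional[Language]
    functions: FrozenSet[str]


def guess_program_language(
    cus: Iterable[CompilationUnit], is_linux_kernel: bool
) -> Language:
    # The Linux kernel is always treated as C.
    if is_linux_kernel:
        return Language.C
    # Otherwise, use the language of the CU that contains "main".
    for cu in cus:
        if "main" in cu.functions and cu.language is not None:
            return cu.language
    # No usable CU found: fall back to the default language.
    return DEFAULT_LANGUAGE
```

For example, a user-space program whose `main` lives in a CU tagged `Language.CPP` would default to that language, while a program with no matching CU would still fall back to C.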
Introduce bpf_inspect.py drgn script to list BPF programs and maps and
their properties unavailable to user space via kernel API.
The script was initially sent to kernel tree [1] but it was agreed that
drgn repo is a better place for it and it's a good idea to create
`tools/` directory in drgn to keep tools like this. See [2] for
details.
The main use-case bpf_inspect.py covers is to show BPF programs attached
to other BPF programs via freplace/fentry/fexit mechanisms introduced
recently. There is no user-space API to get this info and, for example,
bpftool can show all BPF programs but can't show if program A replaces a
function in program B.
Example:
% sudo tools/bpf_inspect.py p | grep test_pkt_access
650: BPF_PROG_TYPE_SCHED_CLS test_pkt_access
654: BPF_PROG_TYPE_TRACING test_main linked:[650->25: BPF_TRAMP_FEXIT test_pkt_access->test_pkt_access()]
655: BPF_PROG_TYPE_TRACING test_subprog1 linked:[650->29: BPF_TRAMP_FEXIT test_pkt_access->test_pkt_access_subprog1()]
656: BPF_PROG_TYPE_TRACING test_subprog2 linked:[650->31: BPF_TRAMP_FEXIT test_pkt_access->test_pkt_access_subprog2()]
657: BPF_PROG_TYPE_TRACING test_subprog3 linked:[650->21: BPF_TRAMP_FEXIT test_pkt_access->test_pkt_access_subprog3()]
658: BPF_PROG_TYPE_EXT new_get_skb_len linked:[650->16: BPF_TRAMP_REPLACE test_pkt_access->get_skb_len()]
659: BPF_PROG_TYPE_EXT new_get_skb_ifindex linked:[650->23: BPF_TRAMP_REPLACE test_pkt_access->get_skb_ifindex()]
660: BPF_PROG_TYPE_EXT new_get_constant linked:[650->19: BPF_TRAMP_REPLACE test_pkt_access->get_constant()]
It can be seen that there is a program test_pkt_access, id 650 and there
are multiple other tracing and ext programs attached to functions in
test_pkt_access.
For example the line:
658: BPF_PROG_TYPE_EXT new_get_skb_len linked:[650->16: BPF_TRAMP_REPLACE test_pkt_access->get_skb_len()]
means that BPF program new_get_skb_len, id 658, type BPF_PROG_TYPE_EXT
replaces (BPF_TRAMP_REPLACE) function get_skb_len() that has BTF id 16
in BPF program test_pkt_access, prog id 650.
Just very simple output is supported now but it can be extended in the
future if needed.
The script is extendable and currently implements two subcommands:
* prog (alias: p) to list all BPF programs;
* map (alias: m) to list all BPF maps;
A developer can simply tweak the script to print interesting pieces of
programs or maps.
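For readers who want to add a subcommand, the two-subcommand layout with aliases can be modeled with the standard `argparse` module roughly as follows. This is a sketch, not the contents of `tools/bpf_inspect.py`: `build_parser`, `list_progs`, and `list_maps` are hypothetical names, and the handlers are placeholders that do not touch drgn.

```python
import argparse


def list_progs(args):
    """Placeholder handler; the real script would walk kernel BPF programs."""
    return "prog"


def list_maps(args):
    """Placeholder handler; the real script would walk kernel BPF maps."""
    return "map"


def build_parser():
    parser = argparse.ArgumentParser(
        prog="bpf_inspect.py",
        description="drgn script to list BPF programs or maps and their "
        "properties unavailable via kernel API.",
    )
    subparsers = parser.add_subparsers(title="subcommands", dest="subcommand")

    # Each subcommand gets a short alias and records its handler in "func".
    prog = subparsers.add_parser("prog", aliases=["p"], help="list BPF programs")
    prog.set_defaults(func=list_progs)

    map_ = subparsers.add_parser("map", aliases=["m"], help="list BPF maps")
    map_.set_defaults(func=list_maps)

    return parser
```

A caller would then dispatch with `args = build_parser().parse_args(); args.func(args)`, so adding a new subcommand only requires another `add_parser` call and a handler.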
More examples of output:
% sudo tools/bpf_inspect.py p | shuf -n 3
81: BPF_PROG_TYPE_CGROUP_SOCK_ADDR tw_ipt_bind
94: BPF_PROG_TYPE_CGROUP_SOCK_ADDR tw_ipt_bind
43: BPF_PROG_TYPE_KPROBE kprobe__tcp_reno_cong_avoid
% sudo tools/bpf_inspect.py m | shuf -n 3
213: BPF_MAP_TYPE_HASH errors
30: BPF_MAP_TYPE_ARRAY sslwall_setting
41: BPF_MAP_TYPE_LRU_HASH flow_to_snd
Help:
% sudo tools/bpf_inspect.py
usage: bpf_inspect.py [-h] {prog,p,map,m} ...
drgn script to list BPF programs or maps and their properties
unavailable via kernel API.
See https://github.com/osandov/drgn/ for more details on drgn.
optional arguments:
-h, --help show this help message and exit
subcommands:
{prog,p,map,m}
prog (p) list BPF programs
map (m) list BPF maps
[1] https://lore.kernel.org/bpf/20200228201514.GB51456@rdna-mbp/T/
[2] https://lore.kernel.org/bpf/20200228201514.GB51456@rdna-mbp/T/#mefed65e8a98116bd5d07d09a570a3eac46724951
Signed-off-by: Andrey Ignatov <[email protected]>
`examples/linux/bpf.py` was superseded by `tools/bpf_inspect.py` so no reason to keep it around anymore. Remove it. Signed-off-by: Andrey Ignatov <[email protected]>
We should be looking at the kind of the previous token, not the kind of the unexpected token. Closes #52.
We need to keep the Program alive for its types to stay valid, not just the objects the Program has pinned. (I have no idea why I changed this in commit 565e034 ("libdrgn: make symbol index pluggable with callbacks").)
Instead, print a warning (unless in quiet mode).
The upcoming vmtest rework won't have any block devices, so let's add a loop device so that we always have a device to test with.
shartse
approved these changes
Mar 27, 2020
delphix-devops-bot
pushed a commit
that referenced
this pull request
Sep 27, 2025
The CI has intermittently been hitting the following test failures on
Python 3.8 with Clang:
======================================================================
ERROR: test_task_cpu (tests.linux_kernel.helpers.test_sched.TestSched)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/runner/work/drgn/drgn/tests/linux_kernel/helpers/test_sched.py", line 40, in test_task_cpu
with fork_and_stop(os.sched_setaffinity, 0, (cpu,)) as (pid, _):
File "/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/contextlib.py", line 113, in __enter__
return next(self.gen)
File "/home/runner/work/drgn/drgn/tests/linux_kernel/__init__.py", line 203, in fork_and_stop
ret = pickle.load(pipe_r)
EOFError: Ran out of input
The EOFError occurs because the forked process segfaults immediately:
python[132]: segfault at 7f8f87085014 ip 00007f8f891e9774 sp 00007ffccf7acf00 error 4 in ld-linux-x86-64.so.2[16774,7f8f891d5000+2a000] likely on CPU 0 (core 0, socket 0)
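The link between a dying child and the EOFError can be reproduced in isolation. The sketch below is not the test suite's `fork_and_stop`; `read_child_result` is a hypothetical helper, and the child simply exits before writing anything, standing in for the segfault. It assumes a POSIX system where `os.fork()` is available.

```python
import os
import pickle


def read_child_result():
    """Fork a child that dies before writing, then try to unpickle its output."""
    pipe_r_fd, pipe_w_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately without writing to the pipe
        # (a stand-in for the crash in the fork handler).
        os._exit(1)

    # Parent: close our copy of the write end so the read sees end-of-file.
    os.close(pipe_w_fd)
    with os.fdopen(pipe_r_fd, "rb") as pipe_r:
        try:
            return pickle.load(pipe_r)
        except EOFError as e:
            # The pipe was empty because the child never wrote a result.
            return e
        finally:
            os.waitpid(pid, 0)
```

Calling `read_child_result()` returns the caught `EOFError`, which is the parent-side symptom reported in the traceback above.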
The segfault is on dereferencing cache_new in _dl_load_cache_lookup()
in ld-linux here:
https://sourceware.org/git/?p=glibc.git;a=blob;f=elf/dl-cache.c;h=88bf78ad7c914b02109d6ddef7e08c0e8fd4574d;hb=f94f6d8a3572840d3ba42ab9ace3ea522c99c0c2#l489
Which is coming from a libomp fork handler:
#0 0x00007f5566f9d774 in _dl_load_cache_lookup (name=name@entry=0x7f55654afde6 "libmemkind.so")
at ./elf/dl-cache.c:498
#1 0x00007f5566f91982 in _dl_map_object (loader=loader@entry=0x55f8a170b670,
name=name@entry=0x7f55654afde6 "libmemkind.so", type=type@entry=2, trace_mode=trace_mode@entry=0,
mode=mode@entry=-1879048191, nsid=<optimized out>) at ./elf/dl-load.c:2193
#2 0x00007f5566f959a9 in dl_open_worker_begin (a=a@entry=0x7fffcf5851f0) at ./elf/dl-open.c:534
#3 0x00007f5566b4ab08 in __GI__dl_catch_exception (exception=exception@entry=0x7fffcf585050,
operate=operate@entry=0x7f5566f95900 <dl_open_worker_begin>, args=args@entry=0x7fffcf5851f0)
at ./elf/dl-error-skeleton.c:208
#4 0x00007f5566f94f9a in dl_open_worker (a=a@entry=0x7fffcf5851f0) at ./elf/dl-open.c:782
#5 0x00007f5566b4ab08 in __GI__dl_catch_exception (exception=exception@entry=0x7fffcf5851d0,
operate=operate@entry=0x7f5566f94f60 <dl_open_worker>, args=args@entry=0x7fffcf5851f0)
at ./elf/dl-error-skeleton.c:208
#6 0x00007f5566f9534e in _dl_open (file=<optimized out>, mode=-2147483647, caller_dlopen=0x7f55653fa882, nsid=-2,
argc=9, argv=<optimized out>, env=0x55f8a1477e10) at ./elf/dl-open.c:883
#7 0x00007f5566a6663c in dlopen_doit (a=a@entry=0x7fffcf585460) at ./dlfcn/dlopen.c:56
#8 0x00007f5566b4ab08 in __GI__dl_catch_exception (exception=exception@entry=0x7fffcf5853c0, operate=<optimized out>,
args=<optimized out>) at ./elf/dl-error-skeleton.c:208
#9 0x00007f5566b4abd3 in __GI__dl_catch_error (objname=0x7fffcf585418, errstring=0x7fffcf585420,
mallocedp=0x7fffcf585417, operate=<optimized out>, args=<optimized out>) at ./elf/dl-error-skeleton.c:227
#10 0x00007f5566a6612e in _dlerror_run (operate=operate@entry=0x7f5566a665e0 <dlopen_doit>,
args=args@entry=0x7fffcf585460) at ./dlfcn/dlerror.c:138
#11 0x00007f5566a666c8 in dlopen_implementation (dl_caller=<optimized out>, mode=<optimized out>, file=<optimized out>)
at ./dlfcn/dlopen.c:71
#12 ___dlopen (file=<optimized out>, mode=<optimized out>) at ./dlfcn/dlopen.c:81
#13 0x00007f55653fa882 in ?? () from /usr/lib/llvm-14/lib/libomp.so.5
#14 0x00007f5565413556 in ?? () from /usr/lib/llvm-14/lib/libomp.so.5
#15 0x00007f5565421d1a in ?? () from /usr/lib/llvm-14/lib/libomp.so.5
#16 0x00007f5566ac0fc1 in __run_fork_handlers (who=who@entry=atfork_run_child, do_locking=do_locking@entry=true)
at ./posix/register-atfork.c:130
#17 0x00007f5566ac08d3 in __libc_fork () at ./posix/fork.c:108
#18 0x00007f5566e108ad in os_fork_impl (module=<optimized out>) at ./Modules/posixmodule.c:6250
#19 os_fork (module=<optimized out>, _unused_ignored=<optimized out>) at ./Modules/clinic/posixmodule.c.h:2750
This doesn't happen in Python 3.9, which I bisected to CPython commit
45a78f906d2d ("bpo-44434: Don't call PyThread_exit_thread() explicitly
(GH-26758)") (in v3.11, backported to v3.9.6).
That commit describes a different symptom where the process aborts
because libgcc_s can't be loaded. I don't understand how that issue can
cause our crash, but the fix appears to be the same. The discussion also
suggests a workaround: linking to libgcc_s explicitly.
Apply the workaround, which appears to fix our problem. We only do this
for the CI and not for the general build for a few reasons:
1. I'm nervous about explicitly linking to this low-level library
unconditionally, and the logic to decide when it's necessary (only
for Python 3.8 and glibc) isn't worth the trouble.
2. The situation required to hit it (drgn + Python threading + fork) is
unlikely outside of our test suite.
3. Python 3.8 is EOL.
4. Builds with libkdumpfile already pull in libgcc_s via libkdumpfile ->
libsnappy -> libstdc++ -> libgcc_s.
Signed-off-by: Omar Sandoval <[email protected]>
No description provided.