@manoj-joseph manoj-joseph commented Sep 11, 2025

Problem

There is a small conflict in the auto-merge with the upstream drgn repo.
The file .pre-commit-config.yaml was deleted by us, but was modified
upstream.

Solution

Delete the file from the incoming merge.

Testing Done

https://selfservice-jenkins.eng-tools-prd.aws.delphixcloud.com/job/appliance-build-orchestrator-pre-push/12143/ (status: RUNNING)

osandov and others added 24 commits September 8, 2025 10:23
I'm getting "error: impossible constraint in 'asm'" build errors on
aarch64, which is apparently caused by compiling with -O0. We compile
drgn_test_kthread_fn* with -O0. Use more specific attributes and
barriers to achieve the same result instead.

Signed-off-by: Omar Sandoval <[email protected]>
Signed-off-by: Omar Sandoval <[email protected]>
It'd be better to not use --no-warn-return-any on vmtest, but I'd rather
not run mypy twice.

Signed-off-by: Omar Sandoval <[email protected]>
For parallel vmtests, we want a more flexible interface than the
queue-based API of download(). Refactor it into a class with methods for
specific downloads.

Signed-off-by: Omar Sandoval <[email protected]>
We don't need two synchronous APIs, so use Downloader everywhere and
fold download() into download_thread(). Some vestigial uses of
DownloadCompiler/DownloadKernel remain.

Signed-off-by: Omar Sandoval <[email protected]>
Otherwise, building the test kmod will fail if the compiler hasn't been
downloaded before.

Fixes: 033510a ("vmtest.vm: add --{build,insert}-test-kmod options")
Signed-off-by: Omar Sandoval <[email protected]>
…nd KernelFlavor

Various parts of the vmtest code go through some trouble to key
Architecture and KernelFlavor on name to avoid hashing and comparing the
other fields. Instead, we can use a dataclass with eq=False so that
hashing and comparison are all done by identity.

Signed-off-by: Omar Sandoval <[email protected]>
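
A minimal sketch of the identity semantics described in the commit above. The class and field names here are hypothetical and are not taken from the vmtest code. With eq=False, the dataclass decorator does not generate __eq__, so instances fall back to the default identity-based comparison and hash.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a vmtest-style record; the fields are
# illustrative assumptions, not the real Architecture definition.
@dataclass(eq=False)
class Arch:
    name: str
    kernel_arch: str


a = Arch("x86_64", "x86")
b = Arch("x86_64", "x86")  # same field values, different object

# No generated __eq__: comparison falls back to object identity.
same_object_equal = a == a
distinct_objects_equal = a == b

# The default identity hash is kept, so instances can be used as dict
# keys without hashing or comparing any of the fields.
per_arch = {a: "first", b: "second"}
```

Because lookups never touch the fields, there is no need to key a dict on the name to avoid the cost of hashing the whole record.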
Normal dicts are guaranteed to be ordered since Python 3.7.

Signed-off-by: Omar Sandoval <[email protected]>
This will be required for reliable parallel test runs. Even for serial
runs, the time to run tests is the same or slightly faster with fewer
CPUs, likely due to bottlenecking on 9pfs and less setup.

Signed-off-by: Stephen Brennan <[email protected]>
[Omar: expand commit message]
Signed-off-by: Omar Sandoval <[email protected]>
For running tests in parallel, we want to log to a file instead of
getting interleaved output.

Signed-off-by: Stephen Brennan <[email protected]>
[Omar: rebase, add rootfsbuild, remove main_thread argument superseded by pdeathsig]
Signed-off-by: Omar Sandoval <[email protected]>
The full test suite, including foreign architectures and alternative
kernel configurations, can take a long time to run. However, it's mostly
work that can happen in parallel. Add a -j option to do this. By
default, everything still happens serially.

Closes osandov#489.

Co-authored-by: Stephen Brennan <[email protected]>
Signed-off-by: Stephen Brennan <[email protected]>
[Omar: rework threading model, various cleanups]
Signed-off-by: Omar Sandoval <[email protected]>
There is a window between a process being flagged as stopped and it
actually descheduling. Various stack tracing tests have been flaky with
"cannot unwind stack of running task" errors due to catching the process
in this window.

Fix it by waiting for /proc/pid/syscall to not return "running" (which
is what we did before the fixes commit, but now we don't need to check
for a specific syscall number).

Fixes: bab4f43 ("tests: replace fork_and_sigwait() and fork_and_call() with fork_and_stop()")
Signed-off-by: Omar Sandoval <[email protected]>
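
The wait described in the commit above can be sketched roughly as follows. This is not the drgn test helper itself. The function names, the timeout, and the polling interval are assumptions made for illustration. It relies only on the documented behavior that /proc/pid/syscall contains the string "running" while the task is not blocked.

```python
import time
from pathlib import Path


def read_proc_syscall(pid):
    """Return the raw contents of /proc/<pid>/syscall (Linux only)."""
    return Path(f"/proc/{pid}/syscall").read_text()


def wait_until_not_running(pid, read_syscall=read_proc_syscall,
                           timeout=5.0, interval=0.01):
    """Poll until the task no longer reports "running".

    read_syscall is injectable so the loop can be exercised without a
    real process; timeout and interval are arbitrary illustrative values.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Any content other than "running" means the task is blocked,
        # regardless of which syscall number (if any) is reported.
        if read_syscall(pid).strip() != "running":
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"pid {pid} still running after {timeout}s")
        time.sleep(interval)
```

A caller would stop the child first and then call wait_until_not_running(pid) before attempting to unwind its stack.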
There are two issues with the error margin we allow for the counters in
these tests:

* VmRSS is the sum of three counters, so its error margin should also be
  tripled.
* Before the switch to per-CPU counters, the error margin was
  nr_threads * 64 * (fault_around_bytes / PAGE_SIZE).

Signed-off-by: Omar Sandoval <[email protected]>
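
As a worked illustration of the two points in the commit above, the sketch below computes the pre-per-CPU-counter margin from the stated formula and triples it for VmRSS. The function names and the sample values are assumptions. The commit message does not state the margin that applies after the switch to per-CPU counters, so that case is not modeled.

```python
def counter_error_margin(nr_threads, fault_around_bytes, page_size):
    """Margin for one counter before per-CPU counters, per the formula
    nr_threads * 64 * (fault_around_bytes / PAGE_SIZE)."""
    return nr_threads * 64 * (fault_around_bytes // page_size)


def vmrss_error_margin(nr_threads, fault_around_bytes, page_size):
    """VmRSS is the sum of three counters, so its margin is tripled."""
    return 3 * counter_error_margin(nr_threads, fault_around_bytes, page_size)


# Illustrative values only: 4 threads, 64 KiB fault-around, 4 KiB pages.
single = counter_error_margin(4, 65536, 4096)  # 4 * 64 * 16 pages
total = vmrss_error_margin(4, 65536, 4096)     # three times the above
```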
Use typing.Deque instead.

Signed-off-by: Omar Sandoval <[email protected]>
…ntation

The semantics of this helper are really fuzzy because the underlying
timestamps are updated lazily, so let's do our best to explain it.

Signed-off-by: Omar Sandoval <[email protected]>
Especially when running vmtest in parallel, this test sometimes fails
because the rq clock hasn't been updated. Force it to update by forcing
the process to migrate CPUs.

Signed-off-by: Omar Sandoval <[email protected]>
Fixes: 91da9ac ("Migrate runq related helpers from drgn-tools")
Signed-off-by: Omar Sandoval <[email protected]>
It can be nice to modify or play around in the chroots after creation,
for example, to install new packages without rebuilding. There is a
tool for running commands in a vmtest VM, but the VMs have no network
access, so they're less flexible. The necessary command isn't really
all that complicated, but it's nice to not have to think of it. Add a
new script to enter the rootfs.

Signed-off-by: Stephen Brennan <[email protected]>
The tests.linux_kernel.test_stack_trace.TestStackTrace.test_local_variable
test is failing on Arm on Linux 5.4 and 4.19. This was apparently fixed
by removing -fno-var-tracking-assignments from the compiler flags in
v5.10. Backport the patch.

Signed-off-by: Omar Sandoval <[email protected]>
In osandov#537 it was pointed out that the ability to pipe output produced by
executing Python statements would be very useful. Unfortunately the
shell redirection operators are part of the Python grammar as well, and
there are many cases of ambiguity, where a command could be split into
Python code and shell pipeline in multiple valid ways.

However, these ambiguities may not be a dealbreaker. We can resolve them
by always splitting on the first shell operator which produces valid
Python code on the left hand side. In cases where you want to force a
different interpretation, you can wrap your Python code in parentheses.
These ensure that any shell operator within the parentheses doesn't
introduce a pipeline, because the code prior to them is incomplete
without the closing parenthesis.

Signed-off-by: Stephen Brennan <[email protected]>
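
The splitting rule described in the commit above can be sketched as follows. This is a simplified illustration, not drgn's implementation. The function name and the set of operator characters ("|" and ">") are assumptions, and validity is checked with ast.parse from the standard library.

```python
import ast


def split_python_and_shell(line):
    """Split at the first shell operator whose left side is valid Python.

    Returns (python_code, shell_part); shell_part is None when no
    split point is found. Only "|" and ">" are treated as operators here.
    """
    for i, ch in enumerate(line):
        if ch not in "|>":
            continue
        left = line[:i].strip()
        if not left:
            continue
        try:
            ast.parse(left)
        except SyntaxError:
            # Incomplete code (e.g. an unclosed parenthesis or string):
            # this operator belongs to the Python side, keep scanning.
            continue
        return left, line[i:].strip()
    return line.strip(), None


piped = split_python_and_shell("print(a | b) | grep foo")
ambiguous = split_python_and_shell("a | b")
forced = split_python_and_shell("(a | b)")
```

In the first call, the "|" inside print(...) is skipped because "print(a" does not parse. In the second, the rule picks the shell interpretation. Wrapping the expression in parentheses, as in the third call, keeps the whole line as Python because "(a" is incomplete.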
We need to append to KBUILD_CFLAGS, not reassign it. This was causing
weird build failures on every architecture.

Fixes: 27f069e ("vmtest.kbuild: add patch to fix missing debug info on old Arm kernels")
Signed-off-by: Omar Sandoval <[email protected]>
Add ptov command for drgn.

Signed-off-by: Ye Liu <[email protected]>
Signed-off-by: Song Hu <[email protected]>
Between this PR being tested and merged, commit 3f47e27
("vmtest.vm: Reduce smp to 2") was merged, which broke a test case that
hard-coded CPU 2. Change to getting the list of all CPUs instead.

Signed-off-by: Omar Sandoval <[email protected]>
@manoj-joseph manoj-joseph force-pushed the dlpx/pr/manoj-joseph/b0a767c6-fb18-486e-9367-70a6240931ae branch from f5991dd to 64991ef Compare September 11, 2025 17:30
@manoj-joseph manoj-joseph marked this pull request as ready for review September 11, 2025 17:32
@manoj-joseph
Author

#74
