Add benchmark utility to profile peak memory usage (#16814)
* add benchmark utility to profile memory usage
* get memory stats from mimalloc, not procfs
* support more benchmarks
* update benchmarks/README and refactor
* fix sort-tpch output format & taplo format
* add e2e test & comments
* update description to major page fault
---------
Co-authored-by: Yongting You <[email protected]>
The `dfbench` program contains subcommands to run the various
@@ -321,6 +322,66 @@ FLAGS:
...
```
# Profiling Memory Stats for each benchmark query
The `mem_profile` program wraps benchmark execution to measure memory usage statistics, such as peak RSS. It runs each benchmark query in a separate subprocess, capturing the child process’s stdout to print structured output.

Subcommands supported by `mem_profile` are a subset of those in `dfbench`.
Currently supported benchmarks include: Clickbench, H2o, Imdb, SortTpch, Tpch.

Before running benchmarks, `mem_profile` automatically compiles the benchmark binary (`dfbench`) using `cargo build`. Note that the build profile used for `dfbench` is independent of the profile used to build and run `mem_profile` itself. You can specify the desired build profile for `dfbench` with the `--bench-profile` option (e.g., `release-nonlto`). Prebuilding the binary and running each query in its own process keeps the memory statistics for each query isolated and accurate.
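The prebuild step can be pictured with a small Rust sketch. This is an illustration only, not the code in `mem_profile`: the function names are hypothetical, and selecting the binary with `--bin dfbench` is an assumption. It only constructs the `cargo build` command line for a given profile and prints it, without running it.

```rust
use std::process::Command;

/// Construct (but do not run) a `cargo build` invocation for `dfbench`.
/// Assumption: the binary target is selected with `--bin dfbench`.
fn dfbench_build_command(bench_profile: &str) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.args(["build", "--profile", bench_profile, "--bin", "dfbench"]);
    cmd
}

/// Collect the arguments of a command as owned strings for display.
fn command_args(cmd: &Command) -> Vec<String> {
    cmd.get_args()
        .map(|arg| arg.to_string_lossy().into_owned())
        .collect()
}

fn main() {
    let cmd = dfbench_build_command("release-nonlto");
    // Print the command line that would be executed.
    println!("cargo {}", command_args(&cmd).join(" "));
}
```

Because the profile is an ordinary argument here, the wrapper can be built with one profile while `dfbench` is built with another, which matches the independence described above.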

Currently, `mem_profile` only supports `mimalloc` as the memory allocator, since it relies on `mimalloc`'s API to collect memory statistics.

Because it runs the compiled binary directly from the target directory, make sure your working directory is the top-level `datafusion/` directory, where `target/` is also located.
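To see why the working directory matters, here is a hedged Rust sketch of how a path to the compiled binary could be derived from a profile name. The helper names are hypothetical and this is not necessarily how `mem_profile` resolves the path. It assumes Cargo's default layout (no `CARGO_TARGET_DIR` override), where the built-in `dev` and `test` profiles are written to `target/debug`, `bench` to `target/release`, and any other profile to a directory named after it.

```rust
use std::env::consts::EXE_SUFFIX;
use std::path::{Path, PathBuf};

/// Map a Cargo profile name to its output directory under `target/`.
fn profile_dir(profile: &str) -> &str {
    match profile {
        "dev" | "test" => "debug",
        "bench" => "release",
        other => other,
    }
}

/// Hypothetical helper: expected location of the `dfbench` binary,
/// relative to the directory that contains `target/`.
fn dfbench_path(workspace_root: &Path, bench_profile: &str) -> PathBuf {
    workspace_root
        .join("target")
        .join(profile_dir(bench_profile))
        .join(format!("dfbench{}", EXE_SUFFIX))
}

fn main() {
    // Assumes the current directory is the top-level datafusion/ directory.
    let path = dfbench_path(Path::new("."), "release-nonlto");
    println!("{} (exists: {})", path.display(), path.exists());
}
```

A relative path like this resolves only when the process starts in the directory that holds `target/`, which is the reason for the working-directory requirement.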

The benchmark subcommand (e.g., `tpch`) and all following arguments are passed directly to `dfbench`. Be sure to specify `--bench-profile` before the benchmark subcommand.
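The ordering rule can be illustrated with a minimal Rust sketch of the argument split. `SplitArgs` and `split_args` are hypothetical names, and the real tool may parse its arguments differently. The sketch treats the first argument that does not start with `-` as the benchmark subcommand and forwards it, together with everything after it, untouched. The arguments following `tpch` in `main` are illustrative only.

```rust
/// Wrapper options plus the command line forwarded to `dfbench`.
#[derive(Debug, PartialEq)]
struct SplitArgs {
    bench_profile: Option<String>,
    dfbench_args: Vec<String>,
}

/// Split arguments at the benchmark subcommand (hypothetical helper).
fn split_args(args: &[String]) -> Result<SplitArgs, String> {
    let mut bench_profile = None;
    let mut i = 0;
    while i < args.len() {
        match args[i].as_str() {
            "--bench-profile" => {
                let value = args
                    .get(i + 1)
                    .ok_or("--bench-profile requires a value")?;
                bench_profile = Some(value.clone());
                i += 2;
            }
            // First non-option argument: the subcommand. Forward the rest as-is.
            arg if !arg.starts_with('-') => {
                return Ok(SplitArgs {
                    bench_profile,
                    dfbench_args: args[i..].to_vec(),
                });
            }
            other => {
                return Err(format!("unknown option before subcommand: {}", other));
            }
        }
    }
    Err("missing benchmark subcommand".to_string())
}

fn main() {
    let args: Vec<String> = ["--bench-profile", "release-nonlto", "tpch", "--path", "data"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    println!("{:?}", split_args(&args));
}
```

Under this scheme, a `--bench-profile` placed after the subcommand is never seen by the wrapper; it ends up in the forwarded arguments instead.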