Overnight benchmarks #4583
Merged
52 commits
* dfd19f5 - Benchmarks Nox session improvements. (trexfeathers)
* 0b640fb - Benchmarks Nox session improvements. (trexfeathers)
* 43fa726 - GHA overnight benchmark action. (trexfeathers)
* cf4273f - Merge remote-tracking branch 'upstream/main' into overnight_benchmarks (trexfeathers)
* befee3e - GHA testing. (trexfeathers)
* 579a3e5 - GHA testing (trexfeathers)
* 46f4297 - Fix benchmark GHA Nox invocation. (trexfeathers)
* a1064ea - GHA testing. (trexfeathers)
* d4609f8 - GHA testing. (trexfeathers)
* c50480b - Benchmark GHA fetch entire Git history. (trexfeathers)
* eb2f201 - GHA testing. (trexfeathers)
* ac781f3 - GHA testing. (trexfeathers)
* 56e13e1 - GHA benchmarks correct directory. (trexfeathers)
* 9245cb3 - Benchmark GHA authenticate GH CLI. (trexfeathers)
* 06165e9 - GHA testing. (trexfeathers)
* 3b4003a - GHA testing. (trexfeathers)
* f4bd66e - Benchmark GHA cd into performance-shifts. (trexfeathers)
* 5cd35c8 - GHA testing. (trexfeathers)
* 51fd201 - Benchmarks GHA correct file iteration. (trexfeathers)
* a22da0d - GHA testing. (trexfeathers)
* 98472b5 - GHA testing. (trexfeathers)
* aee7507 - GHA testing. (trexfeathers)
* 906d5f1 - GHA testing. (trexfeathers)
* c14f748 - GHA testing. (trexfeathers)
* d695484 - GHA testing. (trexfeathers)
* 616f4f5 - GHA testing. (trexfeathers)
* ed28ffc - GHA benchmark escape backticks. (trexfeathers)
* 5dc7b70 - Benchmark GHA restore behaviour after testing. (trexfeathers)
* f18673a - Benchmark GHA correct string wrapping. (trexfeathers)
* 3104a3c - Benchmark GHA better string formatting. (trexfeathers)
* b631eed - GHA benchmark more issue formatting. (trexfeathers)
* 8683bad - Benchmark GHA restore correct settings after testing. (trexfeathers)
* b82b846 - Benchmark GHA minor improvements. (trexfeathers)
* f35f15a - Nox correct posargs syntax. (trexfeathers)
* 807c35c - Remove Nox benchmarks session outdated note about environments. (trexfeathers)
* 4d856f9 - Nox correct posargs syntax. (trexfeathers)
* 54cee4a - Benchmarks README. (trexfeathers)
* 3861678 - Nox benchmarks session note about accuracy and runtime. (trexfeathers)
* 340b975 - Benchmarks Nox improved posargs handling. (trexfeathers)
* 41fac3e - Benchmark GHA cope with there being no commits since yesterday. (trexfeathers)
* 84cbea7 - Merge remote-tracking branch 'upstream/main' into overnight_benchmarks (trexfeathers)
* 759e82a - Update to benchmarks README. (trexfeathers)
* 339ce63 - Minor noxfile docstring correction. (trexfeathers)
* 74b5de5 - Remove need for Nox benchmarks session to use Conda venv. (trexfeathers)
* d0fce15 - Benchmark GHA get the FINAL PR from the commit string. (trexfeathers)
* a017b83 - ASV show-stderr in Nox benchmark session. (trexfeathers)
* 0a3fb73 - String quote benchmark GHA CRON. (trexfeathers)
* 08ec07f - Benchmark GHA more sensible title comment. (trexfeathers)
* b1fd505 - Benchmark Nox asv_command_type > asv_subcommand. (trexfeathers)
* 6a4fcf9 - Benchmark Nox check correct run_type. (trexfeathers)
* f69ba64 - Benchmark Nox clarify shifts_dir. (trexfeathers)
* 9808a9b - Benchmark README clarify BENCHMARK_DATA. (trexfeathers)

@@ -0,0 +1,80 @@
# Iris Performance Benchmarking

Iris uses an [Airspeed Velocity](https://github.com/airspeed-velocity/asv)
(ASV) setup to benchmark performance. This is primarily designed to check for
performance shifts between commits using statistical analysis, but can also
be easily repurposed for manual comparative and scalability analyses.

The benchmarks are automatically run overnight
[by a GitHub Action](../.github/workflows/benchmark.yml), with any notable
shifts in performance being flagged in a new GitHub issue.

## Running benchmarks

`asv ...` commands must be run from this directory. You will need to have ASV
installed, as well as Nox (see
[Benchmark environments](#benchmark-environments)).

[Iris' noxfile](../noxfile.py) includes a `benchmarks` session that provides
conveniences for setting up before benchmarking, and can also replicate the
automated overnight run locally. See the session docstring for detail.

### Environment variables

* ``DATA_GEN_PYTHON`` - required - path to a Python executable that can be
used to generate benchmark test objects/files; see
[Data generation](#data-generation). The Nox session sets this automatically,
but will defer to any value already set in the shell.
* ``BENCHMARK_DATA`` - optional - path to a directory for benchmark synthetic
test data, which the benchmark scripts will create if it doesn't already
exist. Defaults to ``<root>/benchmarks/.data/`` if not set.

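As a rough illustration of the behaviour described above, the sketch below
resolves the two variables in Python. The helper names
(`resolve_benchmark_data`, `resolve_data_gen_python`) and the `root` argument
are hypothetical and are not taken from the Iris code base; only the variable
names and the documented default come from this README.

```python
import os
from pathlib import Path


def resolve_benchmark_data(root: Path) -> Path:
    """Return the benchmark data directory, creating it if necessary.

    Hypothetical helper: uses BENCHMARK_DATA when set, otherwise falls back
    to <root>/benchmarks/.data/ as documented above.
    """
    default = root / "benchmarks" / ".data"
    data_dir = Path(os.environ.get("BENCHMARK_DATA", default))
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir


def resolve_data_gen_python() -> Path:
    """Return the DATA_GEN_PYTHON executable path, failing early if unset."""
    try:
        return Path(os.environ["DATA_GEN_PYTHON"])
    except KeyError:
        raise RuntimeError(
            "DATA_GEN_PYTHON must point to a Python executable."
        ) from None
```

Failing early on a missing `DATA_GEN_PYTHON` is a design choice of this
sketch; it mirrors the "required" status given in the list above.
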
## Writing benchmarks

[See the ASV docs](https://asv.readthedocs.io/) for full detail.

### Data generation

**Important:** be sure not to use the benchmarking environment to generate any
test objects/files, as this environment changes with each commit being
benchmarked, creating inconsistent benchmark 'conditions'. The
[generate_data](./benchmarks/generate_data/__init__.py) module offers a
solution; read more detail there.

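One way to keep data generation out of the benchmarking environment is to run
the generating code in a separate interpreter. The sketch below assumes that
approach; `run_in_data_gen_python` is a hypothetical helper and is not
necessarily how the `generate_data` module works. It falls back to the current
interpreter only so that the example runs without `DATA_GEN_PYTHON` set.

```python
import os
import subprocess
import sys


def run_in_data_gen_python(code: str) -> str:
    """Run a snippet of Python in the DATA_GEN_PYTHON interpreter.

    Hypothetical helper. Falls back to sys.executable purely so this sketch
    is runnable; a real benchmark should require DATA_GEN_PYTHON.
    """
    python = os.environ.get("DATA_GEN_PYTHON", sys.executable)
    result = subprocess.run(
        [python, "-c", code],
        capture_output=True,
        text=True,
        check=True,  # Raise CalledProcessError if the snippet fails.
    )
    return result.stdout.strip()


# The snippet runs in the external interpreter, so what it creates does not
# depend on the packages installed in the benchmarking environment.
print(run_in_data_gen_python("print(sum(range(10)))"))
```
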
### ASV re-run behaviour

Note that ASV re-runs a benchmark multiple times between calls to its
`setup()` routine. This is a problem for benchmarking certain Iris operations
such as data realisation, since the data will no longer be lazy after the
first run. Consider writing extra steps to restore objects' original state
_within_ the benchmark itself.

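A minimal sketch of restoring state within the benchmark follows. `LazyData`
is a hypothetical stand-in for a lazy Iris object, used so that the example
runs without Iris installed; the `setup()` and `time_*` naming follows ASV's
conventions.

```python
import copy


class LazyData:
    """Hypothetical stand-in for an object whose data is loaded on demand."""

    def __init__(self, size):
        self.size = size
        self._values = None  # Nothing computed yet, i.e. still lazy.

    @property
    def is_lazy(self):
        return self._values is None

    def realise(self):
        if self._values is None:
            self._values = list(range(self.size))
        return self._values


class Realisation:
    def setup(self):
        # Called once; ASV may then call the benchmark several times.
        self.lazy_original = LazyData(1000)

    def time_realise(self):
        # Work on a fresh copy so that every run starts from a lazy object.
        # Note that the cost of the copy is included in the timing.
        data = copy.copy(self.lazy_original)
        data.realise()
```
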
If adding steps to the benchmark will skew the result too much, then
re-running can be disabled by setting an attribute on the benchmark:
`number = 1`. To maintain result accuracy this should be accompanied by
increasing the number of repeats _between_ `setup()` calls using the `repeat`
attribute. `warmup_time = 0` is also advisable since ASV performs independent
re-runs to estimate run-time, and these will still be subject to the original
problem.

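The attributes mentioned above are set on the benchmark class. In this sketch
the value chosen for `repeat` and the `LazyThing` stand-in are illustrative
assumptions, not recommendations from the Iris project.

```python
class LazyThing:
    """Hypothetical stand-in for a lazy object that can be realised once."""

    def __init__(self):
        self.realised = False

    def realise(self):
        self.realised = True


class RealisationSingleRun:
    # Run the benchmark exactly once after each setup() call.
    number = 1
    # Compensate for the single run by collecting more samples.
    # The value 10 is an arbitrary choice for illustration.
    repeat = 10
    # Skip the warm-up runs, which would be subject to the same problem.
    warmup_time = 0

    def setup(self):
        self.thing = LazyThing()

    def time_realise(self):
        self.thing.realise()
```
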
### Scaling / non-scaling performance differences

When comparing performance between commits, file types or other variables, it
can be helpful to know if the differences exist in scaling or non-scaling parts
of the Iris functionality in question. This can be done using a size parameter,
setting one value to be as small as possible (e.g. a scalar `Cube`), and the
other to be significantly larger (e.g. a 1000x1000 `Cube`). Performance
differences might only be seen for the larger value, or the smaller, or both,
getting you closer to the root cause.

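ASV's `params` and `param_names` attributes can express such a size parameter.
The sketch below uses plain nested lists as a stand-in for a `Cube` so that it
runs without Iris; the `SizeScaling` class, the parameter name and the summing
operation are illustrative assumptions.

```python
class SizeScaling:
    # One very small and one much larger case, as suggested above.
    params = [1, 1000]
    param_names = ["side_length"]

    def setup(self, side_length):
        # Stand-in for building a Cube of shape (side_length, side_length).
        self.grid = [[1.0] * side_length for _ in range(side_length)]

    def time_total(self, side_length):
        # The operation under test; ASV reports a timing per parameter value.
        sum(sum(row) for row in self.grid)
```

A difference that appears only for `side_length = 1000` points towards the
scaling part of the operation, while one that appears for both values points
towards fixed overhead.
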
## Benchmark environments

We have disabled ASV's standard environment management, instead using an
environment built using the same Nox scripts as Iris' test environments. This
is done using ASV's plugin architecture - see
[asv_delegated_conda.py](asv_delegated_conda.py) and the extra config items in
[asv.conf.json](asv.conf.json).

(ASV is written to control the environment(s) that benchmarks are run in -
minimising external factors and also allowing it to compare between a matrix
of dependencies (each in a separate environment). We have chosen to sacrifice
these features in favour of testing each commit with its intended dependencies,
controlled by Nox + lock-files.)