Commit b32fc71

Wumpf authored and lucasmerlin committed
Add guidelines for image comparison tests (#5714)
Guidelines & why images may differ

Based on (but slightly altered):
* rerun-io/rerun#8989

(cherry picked from commit 40f002f)
1 parent c58aa8f commit b32fc71

1 file changed: +59 -0 lines changed

crates/egui_kittest/README.md

Lines changed: 59 additions & 0 deletions
@@ -55,3 +55,62 @@ You should add the following to your `.gitignore`:
**/tests/snapshots/**/*.diff.png
**/tests/snapshots/**/*.new.png
```

### Guidelines for writing snapshot tests

* Whenever **possible**, prefer regular Rust tests or `insta` snapshot tests over image comparison tests because…
  * …compared to regular Rust tests, they can be relatively slow to run
  * …they are brittle since unrelated side effects (like a change in color) can cause the test to fail
  * …images take up repo space
* images should…
  * …be checked in or otherwise be available (egui uses [git LFS](https://git-lfs.com/) files for this purpose)
  * …depict exactly what's tested and nothing else
  * …have a low resolution to avoid growth in repo size
  * …have a low comparison threshold to avoid the test passing despite unwanted differences (the default threshold should be fine for most use cases!)

### What to do when CI / another computer produces a different image?

The default tolerance settings should be fine for almost all GUI comparison tests.
However, especially when you're using custom rendering, you may observe image differences between setups, leading to unexpected test failures.

First check whether the difference is due to a change in enabled rendering features, potentially caused by differences in hardware (or software renderer) capabilities.
Generally you should carefully enforce the same set of features for all test runs, but this may happen nonetheless.

Once you have validated that the differences are minuscule and hard to avoid, you can try to _carefully_ adjust the comparison tolerance setting (`SnapshotOptions::threshold`, TODO([#5683](https://github.com/emilk/egui/issues/5683)): as well as number of pixels allowed to differ) for the specific test.

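To make the trade-off concrete, here is a minimal, self-contained sketch of a tolerance-based comparison. It is not the metric `egui_kittest` uses: `count_differing_pixels` and the pixel data are hypothetical, and the per-channel difference is a simplified stand-in. It only shows that a small threshold absorbs off-by-one noise while a large one also hides a real change.

```rust
/// Hypothetical stand-in for an image comparison: counts the pixels whose
/// largest per-channel difference exceeds `threshold`.
/// This is NOT the algorithm egui_kittest uses; it only illustrates the trade-off.
fn count_differing_pixels(a: &[[u8; 4]], b: &[[u8; 4]], threshold: u8) -> usize {
    assert_eq!(a.len(), b.len(), "images must have the same number of pixels");
    a.iter()
        .zip(b)
        .filter(|(pa, pb)| {
            pa.iter()
                .zip(pb.iter())
                .any(|(ca, cb)| ca.abs_diff(*cb) > threshold)
        })
        .count()
}

#[test]
fn tolerance_trade_off() {
    // Made-up RGBA data: a flat grey reference image of four pixels.
    let reference = [[200u8, 200, 200, 255]; 4];

    // Off-by-one noise, as a different driver might produce.
    let noisy = [[201u8, 200, 199, 255]; 4];

    // A real regression: one pixel is visibly darker.
    let mut regressed = reference;
    regressed[2] = [180, 200, 200, 255];

    // A small threshold ignores the noise but still catches the regression.
    assert_eq!(count_differing_pixels(&reference, &noisy, 2), 0);
    assert_eq!(count_differing_pixels(&reference, &regressed, 2), 1);

    // A threshold that is too high lets the regression pass unnoticed.
    assert_eq!(count_differing_pixels(&reference, &regressed, 32), 0);
}
```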
⚠️ **WARNING** ⚠️
Picking tolerances that are too high may mean that you miss actual test failures.
It is recommended to manually verify that the tests still break under the right circumstances as expected after adjusting the tolerances.

---

In order to avoid image differences, it can be useful to form an understanding of how they occur in the first place.

Discrepancies can be caused by a variety of implementation details that depend on the concrete GPU, OS, rendering backend (Metal/Vulkan/DX12 etc.) or graphics driver (even between different versions of the same driver).

Common issues include:
* multi-sample anti-aliasing
  * sample placement and sample resolve steps are implementation defined
  * alpha-to-coverage algorithm/pattern can vary wildly between implementations
* texture filtering
  * different implementations may apply different optimizations *even* for simple linear texture filtering
* out of bounds texture access (via `textureLoad`)
  * implementations are free to return indeterminate values instead of clamping
* floating point evaluation, for details see [WGSL spec § 15.7. Floating Point Evaluation](https://www.w3.org/TR/WGSL/#floating-point-evaluation). Notably:
  * rounding mode may be inconsistent
  * floating point math "optimizations" may occur
    * depending on output shading language, different arithmetic optimizations may be performed upon floating point operations even if they change the result
  * floating point denormal flush
    * even on modern implementations, denormal float values may be flushed to zero
  * `NaN`/`Inf` handling
    * whenever the result of a function should yield `NaN`/`Inf`, implementations are free to yield an indeterminate value instead
  * builtin function precision & error handling (trigonometric functions and others)
* [partial derivatives (dpdx/dpdy)](https://www.w3.org/TR/WGSL/#dpdx-builtin)
  * implementations are free to use either `dpdxFine` or `dpdxCoarse`
* [...]

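
Some of the floating point items above can be reproduced on the CPU. The sketch below is plain Rust, not shader code, and the helper names (`separate`, `fused`, `flush_denormal`) are hypothetical. It assumes default IEEE 754 behavior on the host and shows three effects: a fused multiply-add keeps a term that two separate roundings discard, reassociating a sum changes its result, and a subnormal value disappears once an implementation flushes denormals to zero (emulated here, since Rust does not do this by default).

```rust
/// Computes `a * a - c` with two roundings: one after the multiply, one after the subtract.
fn separate(a: f32, c: f32) -> f32 {
    a * a - c
}

/// Computes `a * a - c` with a single rounding (fused multiply-add).
fn fused(a: f32, c: f32) -> f32 {
    a.mul_add(a, -c)
}

/// Emulates a flush-to-zero mode: subnormal ("denormal") values become zero.
fn flush_denormal(x: f32) -> f32 {
    if x.is_subnormal() {
        0.0
    } else {
        x
    }
}

#[test]
fn float_evaluation_is_not_unique() {
    // a = 1 + 2^-12 and c = 1 + 2^-11; the exact value of a * a is c + 2^-24.
    let a = 1.0_f32 + 2048.0 * f32::EPSILON;
    let c = 1.0_f32 + 4096.0 * f32::EPSILON;
    assert_eq!(separate(a, c), 0.0); // the 2^-24 term is rounded away
    assert_eq!(fused(a, c), f32::EPSILON / 2.0); // the 2^-24 term survives

    // Reassociating a sum changes the result.
    let big = 1.0e20_f32;
    assert_eq!((big + -big) + 1.0, 1.0);
    assert_eq!(big + (-big + 1.0), 0.0);

    // A subnormal value is nonzero here, but zero once flushed.
    let tiny = f32::MIN_POSITIVE / 2.0;
    assert!(tiny > 0.0);
    assert_eq!(flush_denormal(tiny), 0.0);
}
```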
From this follow a few simple recommendations (these may or may not apply as they may impose unwanted restrictions on your rendering setup):
* avoid enabling multi-sample anti-aliasing whenever it's not explicitly tested or needed
* do not rely on NaN, Inf and denormal float values
* consider dedicated test paths for texture sampling
* prefer explicit partial derivative functions
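
As an illustration of the "do not rely on NaN" recommendation, the following sketch contrasts a naive vector normalization with a guarded one. It is a CPU-side Rust stand-in for what would normally be shader code; both function names and the `1e-6` cutoff are assumptions made for this example, not part of egui or `egui_kittest`.

```rust
/// Naive normalization: a zero-length input divides 0 by 0 and yields NaN,
/// which a GPU implementation may replace with an indeterminate value.
fn normalize_naive(v: [f32; 2]) -> [f32; 2] {
    let len = (v[0] * v[0] + v[1] * v[1]).sqrt();
    [v[0] / len, v[1] / len]
}

/// Guarded normalization: returns a well-defined zero vector for inputs that
/// are too short, so the result never depends on NaN/Inf handling.
/// The cutoff of 1e-6 is an arbitrary choice for this sketch.
fn normalize_guarded(v: [f32; 2]) -> [f32; 2] {
    let len = (v[0] * v[0] + v[1] * v[1]).sqrt();
    if len > 1e-6 {
        [v[0] / len, v[1] / len]
    } else {
        [0.0, 0.0]
    }
}

#[test]
fn guarded_normalization_avoids_nan() {
    // The naive version produces NaN for a zero vector.
    let naive = normalize_naive([0.0, 0.0]);
    assert!(naive[0].is_nan() && naive[1].is_nan());

    // The guarded version stays well defined.
    assert_eq!(normalize_guarded([0.0, 0.0]), [0.0, 0.0]);

    // For regular inputs both versions agree.
    assert_eq!(normalize_guarded([3.0, 4.0]), normalize_naive([3.0, 4.0]));
}
```

The guard makes the degenerate case an explicit part of the code instead of leaving it to the implementation, so the output for that case should no longer vary between backends.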
