proposal: improved compiler test suite (compiler errors, run and compare, translate-c, etc.) #11288

Requirement for #89 

# Proposal

An improved test harness for the compiler that supports the following:

- Utilizes a directory structure of real source files (Zig, C, and assembly) to define compiler test cases.
- Test settings and assertions are configured using a manifest file. The manifest can be embedded in a comment of a Zig or C file, or it can exist alongside the test file (see Manifest Location).
- Multiple test types:
    - `run` - Build and run to completion and compare output
    - `error` - Build and expect specific error output
    - `translate-c` - Build a “.c” file and assert an expected “zig” file is produced. The Zig file can further optionally define a subsequent test to run (i.e. `run`) so you can translate C then optionally assert the proper behavior also happens.
    - `incremental` - Build and incrementally update a program. Each incremental step is itself a test case so that we can support incremental updates introducing errors, and a subsequent update fixing the errors. [stage2-only]
    - `cli` - Run a specific `zig` CLI command (typically, `zig build test`) to test the Zig CLI or Zig build machinery.
- Concurrency. Test cases will run on a thread pool similar to compilation.
- Test filtering by name, test type, or backend.
- Test dependencies. Tests can note that they are dependent on another test in order to optimize parallelism or build order.
- Adding or modifying tests will not trigger a full rebuild of the compiler.

This will be built on the existing `test-stage2` harness (`src/test.zig`) and won’t be a totally new implementation. It is almost purely an iterative improvement of the existing harness rather than anything new.

The tests will continue to be invoked using `zig build test-stage2` or similar targets.

**NOTE:** Some of these are not new features, but are features that are currently unimplemented in the stage2 test harness or just need slight improvement in the stage2 test harness.

## Backwards Compatibility

A limited amount of backwards compatibility with the existing compiler test API will be retained so we can incrementally introduce these test cases. In some cases, certain test types such as stack trace tests and certain native backend tests will continue to use the Zig API and *will not* support the file-based definition format. Therefore, we will continue to support the necessary subset of the Zig API.

Note that changes to these tests will not gain the benefits of the above harness. For example, changes will trigger a full rebuild. They won’t be parallelized as well, etc. However, these tests tend to have very few cases and/or are updated very infrequently.

# Test Definitions

Tests are now defined using files within a directory structure rather than a Zig API. This has benefits:

1. Adding, changing, or removing a test case does not cause the full test harness (including the compiler) to rebuild.
2. Adding, changing, or removing test cases is a lot more contributor friendly.

## Directory Structure

The directory structure has no meaning to the test definitions. Any test can go into any directory, and multiple tests can go into a single directory. The test harness determines test mode and settings using the manifest (defined below). **Why?** Originally, Andrew and I felt the directory structure _should_ imply some stuff such as backend (stage1 vs stage2). But we pulled back on this initial plan because many tests should behave the same on stage1 AND stage2, and incremental tests want to test multiple test modes.

However, we may wish to enforce directory idioms, such as all stage1-exclusive tests going into a `tests/stage1` directory. The test harness itself will not care about these details.

The name is determined by the filename. It is an error to have duplicate test names. Therefore, two tests named `foo.zig` and `foo.S` would produce an error since they are both named “foo” despite having different extensions.
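
As a rough illustration of the naming rule above, the following sketch derives test names from file stems and reports collisions. The function name `find_duplicate_names` is hypothetical, and the sketch assumes the caller has already excluded adjacent `.manifest` files from the list.

```python
from pathlib import PurePath


def find_duplicate_names(paths):
    """Map each duplicated test name to the list of files that claim it."""
    seen = {}        # test name -> first path that used it
    duplicates = {}  # test name -> every path that used it
    for path in paths:
        name = PurePath(path).stem  # "dir/foo.zig" -> "foo"
        if name in seen:
            duplicates.setdefault(name, [seen[name]]).append(path)
        else:
            seen[name] = path
    return duplicates
```

A non-empty result would be reported as a harness error before any test runs.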

## Test Manifests

### Manifest Syntax

The first line of the manifest is the test type.

Subsequent non-empty lines are `key=value` configuration for that test type.

An empty line followed by additional data is “trailing” configuration that is dependent on the test type. For example, for `error` tests, it defines the expected error output to match.

```
error
backend=stage1,stage2
output_mode=exe

:3:19: error: foo
```

```
run

I am expected stdout! Hello!
```

```
cli

build test
```
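
To make the syntax concrete, here is a minimal parsing sketch. It is not the harness implementation (the harness is written in Zig); the `Manifest` class and `parse_manifest` function are hypothetical names, and rejecting a config line without `=` is an assumption the proposal does not spell out.

```python
from dataclasses import dataclass, field


@dataclass
class Manifest:
    test_type: str                              # first line, e.g. "run" or "error"
    config: dict = field(default_factory=dict)  # key=value lines
    trailing: str = ""                          # type-specific data after the blank line


def parse_manifest(text: str) -> Manifest:
    lines = text.splitlines()
    if not lines or not lines[0].strip():
        raise ValueError("manifest must start with a test type")
    test_type = lines[0].strip()

    config = {}
    i = 1
    # key=value lines run until the first empty line (or the end of input).
    while i < len(lines) and lines[i].strip():
        key, sep, value = lines[i].partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {lines[i]!r}")
        config[key.strip()] = value.strip()
        i += 1

    # Everything after the blank separator line is trailing data.
    trailing = "\n".join(lines[i + 1:])
    return Manifest(test_type, config, trailing)
```

Values are kept as raw strings here; splitting comma-separated options is left to whichever test type consumes them.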

### Manifest Location

The test manifest can be either **embedded** or **adjacent**. Only Zig and C files support **embedded** manifests. 

A manifest must have a test. However, a test does not require a manifest. There are a handful of implicit manifest conditions defined in a later section.

For embedded manifests, the manifest is the _last_ comment in a Zig or C file. The comment is not stripped for test types that compile the source. Example, filename `my_test.zig`:

```
// other comments, not the manifest

export fn main() void {
  // code
}

// error
// output_mode=lib
//
// 7:2: error: bar
```
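
One way to locate an embedded manifest is to walk backwards from the end of the file while the lines are line comments. The sketch below assumes plain `//` comments only and a hypothetical function name `extract_embedded_manifest`; it does not treat `//!` doc comments specially.

```python
def extract_embedded_manifest(source: str):
    """Return the manifest text from the trailing `//` comment block, or None."""
    lines = source.rstrip().splitlines()
    block = []
    # Walk backwards while lines are line comments.
    for line in reversed(lines):
        stripped = line.strip()
        if not stripped.startswith("//"):
            break
        body = stripped[2:]
        # Drop one optional space after the slashes; "//" alone is an empty line.
        block.append(body[1:] if body.startswith(" ") else body)
    if not block:
        return None
    block.reverse()
    return "\n".join(block)
```

For the `my_test.zig` example above this yields the four manifest lines (`error`, `output_mode=lib`, an empty line, and the expected error), leaving the leading comment untouched.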

Manifests can also be defined adjacent to a test. In this case, the manifest file must share the test file's base name and end in the `.manifest` suffix. Adjacent manifests are supported specifically for non-Zig tests and multi-file or incremental tests.

Example:

`my_test.zig`:

```
export fn main() void {
  // code
}
```

`my_test.manifest`:

```
error
output_mode=lib

7:2: error: bar
```

## Implicit Manifests

Whilst a manifest requires a test, a test does not require a manifest. If a manifest is not found for a test file, a default `run` manifest is assumed (must compile and exit with exit code 0).

## Incremental Tests

Tests that verify the stage2 compiler's incremental compilation have an additional special filename format: each incremental compilation step is suffixed numerically to denote ordering.

For example, `foo_0.zig`, `foo_1.zig` would denote that the test named “foo” is incremental and will run test case 0 followed by test case 1 within the same stage2 compilation context. Each individual test case (`foo_0`, `foo_1`) can define its own test mode. This enables us to test that the incremental compiler handles errors after successes and so on.

For naming, the following test names will be created and filterable:

- “foo” - This will run the full `foo` incremental sequence and all cases within foo.
- “foo_1” - This will run all the incremental steps *up to and including* step 1, but *no further*.

The following conditions are errors:

- Incremental test ordering cannot have holes. You cannot have `foo_0` and `foo_2` without a `foo_1`. Likewise, you cannot have a `foo_1` without a `foo_0`.
- Incremental tests cannot conflict with non-incremental tests. You cannot have a `foo_0.zig` and a `foo.zig`. This is an error since it would create a duplicate test name “foo” whilst also being unclear to contributors.
- Incremental tests must start with a `_0` suffix. The `_0` is not implied. For example, `foo.zig` and `foo_1.zig` do not comprise an incremental test case; the harness will see `_1` as an incremental case with a hole (a missing `_0`) and report an error.
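
The naming and validation rules above can be sketched as follows. The function names are hypothetical, and the sketch assumes that any test name ending in `_<digits>` is an incremental step, which the proposal implies but does not state outright.

```python
import re
from collections import defaultdict

# Assumption: a trailing "_<digits>" always marks an incremental step.
_STEP = re.compile(r"^(?P<base>.+)_(?P<step>\d+)$")


def group_incremental(names):
    """Map base name -> ordered step names; raise ValueError on invalid layouts."""
    steps = defaultdict(dict)
    plain = set()
    for name in names:
        m = _STEP.match(name)
        if m:
            steps[m["base"]][int(m["step"])] = name
        else:
            plain.add(name)

    groups = {}
    for base, found in steps.items():
        if base in plain:
            raise ValueError(f"{base!r} conflicts with incremental test {base}_N")
        expected = list(range(len(found)))
        if sorted(found) != expected:  # must be exactly 0, 1, ..., n-1
            raise ValueError(f"incremental test {base!r} has holes: {sorted(found)}")
        groups[base] = [found[i] for i in expected]
    return groups


def steps_for_filter(groups, name):
    """'foo' selects every step; 'foo_1' selects steps 0 through 1."""
    if name in groups:
        return groups[name]
    m = _STEP.match(name)
    if m and m["base"] in groups:
        return groups[m["base"]][: int(m["step"]) + 1]
    return []
```

Filtering on `foo_1` returns `foo_0` and `foo_1` because later steps depend on the compilation state left by earlier ones.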

# Test Execution and Ordering

The test harness will do the following at a very high level:

1. Collect and queue all tests by recursively traversing the test case directory (filtering happens here).
2. Execute the tests using a threadpool in the order they were traversed.

There is no guarantee on test ordering.

Test failures are output to the console immediately upon happening. Failures are not buffered in memory except for the minimal amount of memory needed to build the error message. In the case of OOM, a message will be outputted with the test name that failed but no further information (if it reached this point, the filename is already in-memory).
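
The execution model can be sketched with a thread pool that reports each failure as soon as it completes. This is only an illustration in Python; `run_all` and the `run_one` callback are hypothetical, and `run_one` is assumed to return `None` on success or an error message on failure.

```python
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_all(tests, run_one, max_workers=4):
    """Run every test on a thread pool; return the names of failed tests."""
    failed = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Tests are submitted in traversal order, but may finish in any order.
        futures = {pool.submit(run_one, name): name for name in tests}
        for future in as_completed(futures):
            name = futures[future]
            try:
                message = future.result()
            except Exception as exc:
                message = f"harness error: {exc}"
            if message is not None:
                # Report immediately instead of buffering until the end.
                print(f"FAIL {name}: {message}", file=sys.stderr, flush=True)
                failed.append(name)
    return failed
```

Only the failed test names are retained; the messages themselves are written out and dropped.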

## Backends

Backends are specified using the `backends` manifest configuration option that is available for all test types. This is a comma-separated list of the backends to test against. The backend can be prefixed with a `!` to test all backends but exclude that specific backend.

Supported backends:

- `stage1`
- `stage2` - Stage2 using the default backend (LLVM or native).
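
A sketch of how the comma-separated value with `!` exclusions might be resolved is shown below. `select_backends` is a hypothetical name; treating an empty value as "all backends" and rejecting unknown names are assumptions, since the proposal does not define either case.

```python
ALL_BACKENDS = ("stage1", "stage2")


def select_backends(spec, all_backends=ALL_BACKENDS):
    """Resolve a comma-separated backend list with optional '!' exclusions."""
    entries = [e.strip() for e in spec.split(",") if e.strip()]
    for entry in entries:
        if entry.lstrip("!") not in all_backends:
            raise ValueError(f"unknown backend: {entry!r}")

    included = [e for e in entries if not e.startswith("!")]
    excluded = {e[1:] for e in entries if e.startswith("!")}
    # With no explicit inclusions, start from every backend (assumption).
    base = included if included else list(all_backends)
    return [b for b in base if b not in excluded]
```

For example, `!stage1` resolves to `stage2` only, while `stage1,stage2` selects both.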

### Native Backends

There are a handful of tests for the native backends. These are primarily testing things that behavior tests test. According to Andrew, they’re historically only there for backends before they are able to execute behavior tests using at least the simplified test runner. Therefore, the plan for now is to remove these since they’re tested via behavior tests, or keep them using the old Zig API.

## Test Filtering

The `-Dtest-filter` build option will be used to filter tests by name.

Different test targets from the `zig build` command will be used to filter by test types, i.e. error tests vs run tests.

# Test Types

## Run Tests

Run tests build an exe and run it to completion. Any exit code other than zero is a test failure. Additionally, output can be specified and it will be asserted to match the output on stdout byte for byte.

The following type-specific manifest options are supported:

- `translate-c` (default: false) - If non-empty, C files will be processed through `translate-c` prior to being run. Without this, any C files will use the Zig compiler frontend for C (clang at the time of writing). 

Manifest trailing data is the stdout data to assert. stderr is not matched.

Run tests use the suffix of the associated test file to perform the proper build. For example, `.zig` files go through the normal Zig compilation pipeline. `.S` files are assembled and linked. 

Example, a run test that just asserts exit code 0 and ignores all output:

```
//! run
```

Example, a run test that asserts output in addition to the exit code being 0:

```
//! run
//!
//! Hello, World!
//!
```
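
The pass/fail rule for a run test can be sketched as below. `check_run` is a hypothetical helper that takes the command line of an already-built executable; whether the expected stdout carries a trailing newline is left to the caller, since the proposal does not say.

```python
import subprocess
import sys


def check_run(argv, expected_stdout=None):
    """Run argv to completion; return None on success or a failure message."""
    # stderr is discarded because run tests do not match it.
    result = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    if result.returncode != 0:
        return f"exit code {result.returncode}"
    # Byte-for-byte comparison, only when the manifest supplied trailing data.
    if expected_stdout is not None and result.stdout != expected_stdout:
        return f"stdout mismatch: expected {expected_stdout!r}, got {result.stdout!r}"
    return None


if __name__ == "__main__":
    # Stand-in for a built test executable: the Python interpreter itself.
    demo = [sys.executable, "-c", "import sys; sys.stdout.buffer.write(b'Hello, World!\\n')"]
    print(check_run(demo, b"Hello, World!\n"))  # None means the test passed
```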

## Error Tests

Error tests build an exe, obj, or lib and assert that specific errors exist. This only tests errors that happen during compilation and _not_ during runtime. The built exe (if the output mode is exe) is not executed. Runtime errors should be checked using the `run` test type with a panic handler.

The compiler is expected to error. If the compiler is subprocessed (i.e. testing stage1), then a non-zero exit code will be expected.

The following type-specific manifest options are supported:

- `output_mode` (default: `obj`) - one of `exe`, `lib`, `obj`. This is the compilation output mode.
- `is_test` (default: false) - non-empty specifies this has `is_test` set on the compilation mode.

The manifest trailing data is the error output to assert. One error is expected per line, and will be matched using a string subset match. The subset match lets you omit things like filename. Non-empty trailing data is required.

Example:

```
// error
//
// :1:11: error: use of undeclared identifier 'B'
```

Example that specifies an output mode:

```
// error
// output_mode=lib
//
// :1:11: error: use of undeclared identifier 'B'
```

Example that tests errors in test mode:

```
// error
// is_test=1
//
// :1:11: error: use of undeclared identifier 'B'
```
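
The matching rule for expected errors can be sketched as follows. The sketch interprets "string subset match" as a plain substring test against each line of compiler output, which is an assumption; `check_errors` is a hypothetical name.

```python
def check_errors(expected_trailing, actual_output):
    """Return the expected error lines that were not found in the compiler output."""
    expected = [line.strip() for line in expected_trailing.splitlines() if line.strip()]
    if not expected:
        raise ValueError("error tests require non-empty trailing data")
    actual = actual_output.splitlines()
    # An expected line matches if it is a substring of any actual line,
    # which is what allows the filename prefix to be omitted.
    return [e for e in expected if not any(e in a for a in actual)]
```

An empty result means every expected error was present; the test would fail otherwise.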

## Translate C Tests

`translate-c` tests test that a C file is translated into a specific Zig file by testing one or more substring matches against the resulting Zig source after translation.

This test has no type-specific configuration.

The manifest trailing data is one or more sets of strings to compare against. Sets are separated by a line that contains only a comma.

Example of a single match:

```
// translate-c
//
//pub const struct_Color = extern struct {
//    r: u8,
//    g: u8,
//    b: u8,
//};
```

Example of multiple matches, in which case every match must be found:

```
// translate-c
//
//pub const struct_Color = extern struct {
//    r: u8,
//    g: u8,
//    b: u8,
//};
//,
//const struct_unnamed_2 = extern struct {};
```
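
Splitting the trailing data into match sets and checking them against the translated output might look like the sketch below. The function names are hypothetical, and the input is assumed to be the trailing data with the comment prefixes already removed.

```python
def split_match_sets(trailing):
    """Split trailing data into match sets on lines that contain only a comma."""
    sets, current = [], []
    for line in trailing.splitlines():
        if line.strip() == ",":
            sets.append("\n".join(current))
            current = []
        else:
            current.append(line)
    sets.append("\n".join(current))
    return [s for s in sets if s.strip()]


def missing_matches(trailing, translated):
    """Return every match set that is not a substring of the translated Zig source."""
    return [s for s in split_match_sets(trailing) if s not in translated]
```

Lines such as `r: u8,` end with a comma but are not separators, because only a line consisting of a single comma splits sets.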

## Deprecated or Ignored Test Types

The following test types will be deprecated since they are just a special case of the file-definable test types. In all cases, these tests are simply migrated, not lost.

- `assemble_and_link` - This is just the `run` test case above where the run input supports assembly files.
- `runtime_safety` - This is a special case of `run`.

The following test types are **ignored and kept as-is**. They are tests that aren’t frequently updated and don’t have many cases:

- `stack_traces`
- `standalone` - This looks like something we can convert to this new test suite later, but is ignored for now due to some complexities and details that we should probably iron out in a dedicated issue.
- native-backend compiler error and run tests, more details on this in the "backends" section in this proposal

# Abandoned Ideas

The ideas below were initially discussed or existed but were ultimately abandoned:

### Explicit Test Dependencies

One configuration setting that can be specified in the manifest for all test types is an `after` setting. This is a comma-separated list of tests that you want to be executed first before your test.

This is optional. Tests should not have side effects and therefore the ordering should not matter. However, this feature can be used to point to other tests that test functionality that is used but not tested by the current test, therefore optimizing test suite speed.

### CLI Tests

CLI tests execute specific `zig` CLI commands and assert exit code zero. 

CLI tests have no type-specific manifest options.

The manifest trailing data specifies one command to run per line (all assumed to be subcommands of `zig`). This lets you test multiple targets separately. They are run in order. If no trailing data is specified, `zig build test` is run.

CLI tests can run against any Zig binary, so these are a good way to test consistent behavior against both a stage1 and stage2 binary build.

Example, with a custom target:

```
cli

build test-custom-target
```

Example, running a file:

```
cli

run hello.zig
```

Example, running and building:

```
cli

run hello.zig
build
```
