This repository was archived by the owner on Feb 25, 2025. It is now read-only.
[Impeller] Implement framebuffer-fetch via subpasses in Vulkan without extensions.
* Subpasses are not exposed in the HAL and the need for subpasses in Vulkan can
be determined based on the presence and use of input-attachments in the
shaders. This information is already reflected by the compiler. Because of
this, all references to subpasses have been removed from APIs above the HAL.
* `RenderPassBuilderVK` is a lightweight object used to generate render passes
to use either with the pipelines (compat, base, or per-subpass) or during
rendering along with the framebuffer. Using the builder also sets up the
right subpass dependencies. As long as the builder contains compatible
attachments and subpass counts, different render passes stamped by the builder
(via the `Build` method) are guaranteed to be compatible per the rules in the
spec.
* Pass attachments are now in the `eGeneral` layout. Manually inserting the right
transitions made no observable difference in performance, but it required
inserting a lot of transitions. If we need them, we can add them back in short
order. I wouldn't be averse to adding them if reviewers insist.
* Depending on where the subpass cursor is for each command, a different
pipeline variant is necessary. For instance, if a command uses a pipeline at
subpass 1 of 10 and that same pipeline is reused later in say subpass 6, the
variant for subpass 1 is not suitable for subpass 6
(`VkGraphicsPipelineCreateInfo::subpass(uint32_t)` is part of the compat
rules). Creation of these subpass variants from the lone-pass (subpass index
0 of count 1) prototype is done via a preload operation with jobs submitted
to a concurrent worker. The preload can only happen once the number of passes
needed can be determined. On the mobile and desktop devices at hand, obtaining
variants from a prototype already in `PipelineCacheVK` was observed to be
extremely fast. Even so, once the variant is obtained,
it is cached in `PipelineVK`. Notably, this is not present in
`PipelineCacheVK`. That top-level cache only contains prototypes for the
lone-pass pipeline configuration. This allows for purging of subpass variants
of which there can theoretically be an unbounded number, and also a single
point where subpass prototype creation can be elided completely if the
`rasterization_order_attachment_access` extension is present.
* Speaking of the `rasterization_order_attachment_access` extension, its use has
been removed in this patch. I am prototyping adding it back to measure the
overhead introduced by manual subpass management. If the overhead is
measurable, we can use the extension on devices that have it as an added
optimization.
* The complexity of command encoding remains linear in the number of commands
per pass.
* This patch only supports a single color attachment used as an input
attachment. While this is sufficient for current use cases, the Metal
implementation is significantly more capable since multiple attachments and
attachment types (depth) are already supported there. Rounding out support for
this is in progress.
* This patch contains some test harness updates for MoltenVK that will be backed
out and submitted separately.
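
The per-pipeline variant caching described above can be sketched roughly as
follows. This is a minimal illustration, not the code in this patch:
`SketchPipeline`, `PipelineVariant`, `GetVariant`, and `PurgeVariants` are
hypothetical names, and creating a variant is reduced to recording the subpass
index and count that a real implementation would pass along when deriving the
variant from the lone-pass prototype.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for a compiled pipeline variant. A real variant would
// wrap a Vulkan pipeline created for one specific subpass position.
struct PipelineVariant {
  uint32_t subpass_index;
  uint32_t subpass_count;
};

// Hypothetical sketch of a pipeline object that owns its subpass variants.
// The variants live here rather than in a top-level cache, so they can be
// purged without touching the lone-pass prototypes.
class SketchPipeline {
 public:
  // Returns the variant for the given subpass position, creating and caching
  // it on first use. Safe to call from concurrent workers.
  std::shared_ptr<const PipelineVariant> GetVariant(uint32_t subpass_index,
                                                    uint32_t subpass_count) {
    std::lock_guard<std::mutex> lock(mutex_);
    const auto key = std::make_pair(subpass_index, subpass_count);
    auto found = variants_.find(key);
    if (found != variants_.end()) {
      return found->second;
    }
    // Assumption: a real implementation would derive the variant from the
    // prototype at this point.
    ++creation_count_;
    std::shared_ptr<const PipelineVariant> variant =
        std::make_shared<PipelineVariant>(
            PipelineVariant{subpass_index, subpass_count});
    variants_.emplace(key, variant);
    return variant;
  }

  // Drops all cached variants. Variants still held by callers stay alive
  // until those references are released.
  void PurgeVariants() {
    std::lock_guard<std::mutex> lock(mutex_);
    variants_.clear();
  }

  // Number of variants currently cached.
  std::size_t VariantCount() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return variants_.size();
  }

  // Number of variants created so far, including purged ones.
  std::size_t CreationCount() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return creation_count_;
  }

 private:
  mutable std::mutex mutex_;
  std::map<std::pair<uint32_t, uint32_t>,
           std::shared_ptr<const PipelineVariant>>
      variants_;
  std::size_t creation_count_ = 0;
};
```

In this sketch, requesting the variant for subpass 1 of 10 and later for
subpass 6 of 10 yields two distinct variants, while a repeated request for
subpass 1 of 10 returns the cached one.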