Vulkan/Qualcomm: temporary resolve textures from AlwaysResolveIntoZeroLevelAndLayer can stay alive too long and trigger OOM#65

Open
cabanier wants to merge 1 commit into google:main from cabanier:qualcomm-patch

cabanier commented May 2, 2026

On Android Qualcomm Vulkan, Dawn enables AlwaysResolveIntoZeroLevelAndLayer because resolving into a non-zero mip level or array layer of an array texture is buggy on these devices. In workloads that hit this path every frame, the temporary resolve textures can outlive the submit that used them, and enough Vulkan image memory can accumulate to cause OOM.

Relevant existing comments in Dawn

  • src/dawn/native/vulkan/PhysicalDeviceVk.cpp:988-996
    • Qualcomm devices have a bug resolving into a non-zero level/layer of an array texture.
    • Qualcomm devices also have a separate bug where an empty render pass with a resolve target may not perform the resolve unless some work is injected.
  • src/dawn/native/Toggles.cpp:60-68
    • AlwaysResolveIntoZeroLevelAndLayer resolves into a temporary single-layer 2D texture first, then copies into the true resolve target.
  • src/dawn/native/Toggles.cpp:705-708
    • VulkanAddWorkToEmptyResolvePass exists to force empty resolve passes to execute on Qualcomm.

Problem

RenderPassWorkaroundsHelper::Initialize() allocates a temporary resolve texture whenever the resolve target is a non-zero mip or array layer:

  • src/dawn/native/RenderPassWorkaroundsHelper.cpp

That texture is then used as the actual resolve target for the render pass, and copied into the intended destination in the pass end callback.
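The trigger condition described above can be sketched as a small predicate. This is a simplified illustration, not the code in RenderPassWorkaroundsHelper.cpp: `ResolveTargetView` and `NeedsTemporaryResolveTexture` are hypothetical names, and the real helper works on Dawn's own view and toggle types.

```cpp
#include <cstdint>

// Hypothetical, simplified description of a resolve target view.
struct ResolveTargetView {
    std::uint32_t baseMipLevel;
    std::uint32_t baseArrayLayer;
};

// Returns true when the workaround would need a temporary single-layer
// resolve texture: the toggle is on and the view does not start at
// mip level 0 and array layer 0.
constexpr bool NeedsTemporaryResolveTexture(const ResolveTargetView& view,
                                            bool alwaysResolveIntoZeroLevelAndLayer) {
    return alwaysResolveIntoZeroLevelAndLayer &&
           (view.baseMipLevel != 0 || view.baseArrayLayer != 0);
}
```

Under this sketch, a view with `baseArrayLayer = 1` takes the temporary-texture path on every pass, while a view at level 0 and layer 0 resolves directly.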

The issue is lifetime management:

  • there is no explicit submit-tied early destroy path for these temporary textures
  • they can remain referenced by the CommandBufferBase object after submission
  • in WebGPU, command buffer wrapper lifetime can extend until JS GC, so these temp textures can live across many frames
  • on Qualcomm Vulkan, these temp textures are often large direct allocations, so repeated per-frame creation can grow memory quickly
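To make the accumulation concrete, here is a toy model of the lifetime problem. It assumes each frame records one command buffer holding a strong reference to one temporary texture; `TempResolveTexture`, `CommandBufferWrapper`, `SimulateFrames`, and `CountAlive` are hypothetical stand-ins, not Dawn types.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins; these are not Dawn's real types.
struct TempResolveTexture {};

struct CommandBufferWrapper {
    // Strong references recorded while encoding the pass.
    std::vector<std::shared_ptr<TempResolveTexture>> tempResolveTextures;
};

// Records one command buffer per frame, each needing one temporary texture.
// Wrappers are appended to `uncollected` to model GC not having run yet.
// Returns weak observers so callers can check which textures are still alive.
std::vector<std::weak_ptr<TempResolveTexture>> SimulateFrames(
    std::size_t frames, std::vector<CommandBufferWrapper>& uncollected) {
    std::vector<std::weak_ptr<TempResolveTexture>> observers;
    for (std::size_t i = 0; i < frames; ++i) {
        auto temp = std::make_shared<TempResolveTexture>();
        observers.push_back(temp);
        CommandBufferWrapper wrapper;
        wrapper.tempResolveTextures.push_back(std::move(temp));
        // The frame has been "submitted", but the wrapper is still reachable.
        uncollected.push_back(std::move(wrapper));
    }
    assert(observers.size() == frames);
    return observers;
}

// Counts how many observed textures are still alive.
std::size_t CountAlive(const std::vector<std::weak_ptr<TempResolveTexture>>& observers) {
    std::size_t alive = 0;
    for (const auto& observer : observers) {
        if (!observer.expired()) {
            ++alive;
        }
    }
    return alive;
}
```

In this model every texture stays alive until the wrappers are dropped, so the live count grows by one per frame for as long as collection is delayed.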

In my repro, this shows up in a WebXR/WebGPU workload on Quest:

  • the XR color target is a 2-layer array texture
  • each eye uses a 2D view into one array layer
  • the right eye resolve goes to baseArrayLayer = 1
  • that reliably triggers AlwaysResolveIntoZeroLevelAndLayer
  • repeating this every frame eventually OOMs

This is separate from the empty-resolve-pass bug. VulkanAddWorkToEmptyResolvePass is needed to make the resolve happen at all on Qualcomm, but the memory growth is caused by the lifetime of the temporary resolve texture used by AlwaysResolveIntoZeroLevelAndLayer.

Expected behavior

Temporary resolve textures created solely for the workaround should be released as soon as the submit that uses them has been queued, independent of command buffer wrapper lifetime.

Proposed fix

Track these temporary textures from CommandEncoder into CommandBufferBase, then in Vulkan:

  1. extract them during Queue::SubmitImpl()
  2. call Destroy() on them immediately after vkQueueSubmit
  3. do this before the queue serial advances, so fenced deletion is associated with the submission that actually used the image

This makes their Vulkan image/memory eligible for release on the correct serial and avoids depending on command buffer destruction / GC timing.
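The three steps can be modeled with a minimal serial-keyed deletion queue. This is a sketch under stated assumptions, not the patch itself: `Queue`, `CommandBuffer`, `TakeTempResolveTextures`, and `OnGpuCompleted` are hypothetical names, and the real `vkQueueSubmit` call is reduced to a comment.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical model; none of these are Dawn's real classes or signatures.
using Serial = std::uint64_t;

struct TempResolveTexture {
    bool destroyed = false;    // Destroy() was called
    bool memoryFreed = false;  // backing image/memory actually released
};

struct CommandBuffer {
    std::vector<std::shared_ptr<TempResolveTexture>> tempResolveTextures;

    // Hands ownership of the temporaries to the caller, leaving none behind.
    std::vector<std::shared_ptr<TempResolveTexture>> TakeTempResolveTextures() {
        std::vector<std::shared_ptr<TempResolveTexture>> taken = std::move(tempResolveTextures);
        tempResolveTextures.clear();
        return taken;
    }
};

class Queue {
  public:
    void Submit(CommandBuffer& commands) {
        // Step 1: extract the temporaries from the command buffer.
        auto temps = commands.TakeTempResolveTextures();
        // (The real vkQueueSubmit call would happen here.)
        // Step 2: destroy them; release is deferred until the pending serial completes.
        const Serial pending = mLastSubmitted + 1;
        for (auto& temp : temps) {
            assert(!temp->destroyed);
            temp->destroyed = true;
            mFencedDeletes[pending].push_back(std::move(temp));
        }
        // Step 3: only now does the serial advance.
        mLastSubmitted = pending;
    }

    // Called when the GPU has finished all work up to and including `completed`.
    void OnGpuCompleted(Serial completed) {
        auto end = mFencedDeletes.upper_bound(completed);
        for (auto it = mFencedDeletes.begin(); it != end; ++it) {
            for (auto& temp : it->second) {
                temp->memoryFreed = true;
            }
        }
        mFencedDeletes.erase(mFencedDeletes.begin(), end);
    }

    Serial LastSubmittedSerial() const { return mLastSubmitted; }

  private:
    Serial mLastSubmitted = 0;
    std::map<Serial, std::vector<std::shared_ptr<TempResolveTexture>>> mFencedDeletes;
};
```

In this model the command buffer object can stay alive indefinitely without holding the texture: the temporary is marked destroyed at submit time and its memory is released once the serial of that submit completes.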

…elAndLayer` can accumulate and OOM

On Qualcomm Vulkan, Dawn's non-zero-array-layer resolve workaround allocates per-pass temporary resolve textures, and their lifetime can extend beyond the submit that used them, causing avoidable memory growth and eventual OOM in workloads like WebXR.

`AlwaysResolveIntoZeroLevelAndLayer` exists because Qualcomm Vulkan has a bug resolving into a non-zero mip/layer of an array texture:

- `src/dawn/native/vulkan/PhysicalDeviceVk.cpp`
- `src/dawn/native/Toggles.cpp`

In that path, `RenderPassWorkaroundsHelper` allocates a temporary single-layer 2D resolve texture, resolves into it, and then copies into the real destination.

The issue is that these temporary resolve textures can live longer than the submit that actually used them. In workloads that hit this path every frame, this causes unnecessary Vulkan image memory growth and can eventually OOM. I hit this with a WebXR/WebGPU workload on Quest where the XR color target is a 2-layer array texture and the right eye resolve targets `baseArrayLayer = 1`, which reliably triggers this workaround every frame.

There is also a separate Qualcomm workaround for empty resolve passes (`VulkanAddWorkToEmptyResolvePass`), but that is not the memory issue here. The memory issue is the lifetime of the temporary resolve texture used by `AlwaysResolveIntoZeroLevelAndLayer`.

Proposed fix: track these workaround textures on the command buffer and explicitly `Destroy()` them immediately after `vkQueueSubmit`, before the queue serial advances, so their fenced deletion is tied to the submit that used them rather than to later command-buffer/object teardown timing.

github-actions Bot commented May 2, 2026

👋 Thanks for your contribution! Your PR has been imported to Gerrit.
Please visit https://dawn-review.googlesource.com/c/dawn/+/306335 to see it and CC yourself on the change.
After iterating on feedback, please comment on the Gerrit review to notify reviewers.
All reviews are handled within Gerrit, any comments on the GitHub PR may be missed.
You can continue to upload commits to this PR, and they will be automatically imported
into Gerrit.
