Add depth resolve to the mobile renderer#108636
Conversation
d861bbf to
38358d1
Compare
BastiaanOlij
left a comment
There was a problem hiding this comment.
This looks good to me. Not entirely sure why CI is complaining about the resolve_effects variable hiding a class member as it looks like you're removing the class member, but might be something defined somewhere else still.
Will be nice to add in support for hardware resolve but I think this covers the most pressing use case.
I suspect hardware resolve is faster on supported hardware. But even it may be emulated on TBDR devices. See the ARM documentation linked above. Mali devices can't merge subpasses that change the depth/stencil buffer which means that they will have to resolve the depth buffer in a separate pass anyway. |
Indeed, and right now with the primary use case for which I did the original work now no longer being an issue (as everyone in XR land who actually uses this is switching to the ASW/frame synthesis approach for which a lower res depth buffer is rendered), and the remaining use cases all target devices where its questionable the API is supported and would even be faster, I don't think this is top of the priority list. In due time, if we have a spare minute, it would be interesting to bring up my PR and do comparisons, but I don't think there is a pressing need right now. |
38358d1 to
a8efa55
Compare
I think there's still potential use cases for it. For example, allowing |
Ansraer
left a comment
There was a problem hiding this comment.
Two minor nitpicks, otherwise this looks great.
| void _render_buffers_copy_screen_texture(const RenderDataRD *p_render_data); | ||
| void _render_buffers_ensure_depth_texture(const RenderDataRD *p_render_data); | ||
| void _render_buffers_copy_depth_texture(const RenderDataRD *p_render_data); | ||
| void _render_buffers_copy_depth_texture(const RenderDataRD *p_render_data, bool p_use_msaa = false); |
There was a problem hiding this comment.
| void _render_buffers_copy_depth_texture(const RenderDataRD *p_render_data, bool p_use_msaa = false); | |
| void _render_buffers_copy_depth_texture(const RenderDataRD *p_render_data, bool p_used_msaa = false); |
There was a problem hiding this comment.
So right now this shader can only resolve the depth texture. Are there any plans to support manually resolving other stuff in the future?
Because otherwise I would argue that "depth" should be mentioned in the name of this shader.
There was a problem hiding this comment.
We may need to extend it to resolve other formats in the future. I'm indifferent on the naming though. If you think it is a big deal I am fine with changing it.
There was a problem hiding this comment.
In that case I would appreciate it if you could rename it to explicilt mention depth.
|
Thanks! |
Add depth resolve to the mobile renderer



Fixes: #80991
Fixes: #103425
This is needed for #78598 to move ahead.
Resolving depth textures on mobile devices is a huge pain. In general, its best to never resolve the depth texture as mobile devices do a lot of special optimizations with it so they don't want you to touch it at all.
In a perfect world, we could use a resolve attachment and have the driver resolve the depth texture for us before the memory is copied out of tile memory (which is what currently happens for color). However, doing so is only supported with an extension (which #78598 implements).
Earlier I suggested adding a subpass to the main opaque pass that would read from the MSAA depth texture and manually resolve it into an R32 buffer. This would have the benefit of keeping the MSAA texture in tile memory only and saving a lot of bandwidth. Unfortunately, doing isn't supported widely either as it is optional in Vulkan for depth textures to support usage as input attachments. Further, as it turns out, it wouldn't even help on Mali, which would fall back to doing two passes anyway https://developer.arm.com/documentation/101897/0304/Fragment-shading/Multipass-rendering.
Therefore, the only option that is supported everywhere is to do a manual resolve of the depth texture in its own pass. Since we don't use subpasses anyway when reading from the depth texture, this didn't require much of a change in architecture.
Right now the code is structured to do the following:
With this PR I combine steps 2 and 3 and copy directly into the backbuffer. Doing so has the effect of offsetting the vast majority of the cost of doing the resolve manually.
Once this is merged, we can add #78598 as an optional path that is more optimized for devices that support the depth resolve extension.