Add shader baker to project exporter. #102552
|
This is great! Will you merge to the main branch, and then I can follow up with a PR to implement Metal support? If the user is on Windows or macOS, we can utilise the Metal compiler toolchain to generate Metal libraries, reducing load times even more, as that compiles the Metal source into a platform-independent, intermediate format. I notice that Unreal Engine has an option to do this. |
I think it's pretty far from being merged to main at the moment, since 4.4 is going into RC soon. It'd be best to just PR to this branch, as I don't think it'll take too long to adapt what we have to it.
Yes, this would be great. There's a scheme for adding "Platforms" and you can definitely do a Windows-specific version that loads the toolchain if you're under Windows to produce the MIL instead. Under the new Shader Container design, you won't need to handle anything about serialization of the Shader reflection. All you need is to just convert to the shader binary and you can insert whatever extra bytes you wish to serialize that the platform might need. |
|
Will this feature bake shaders for all backends by default? If so, can users exclude certain backends from the export process? For example, developers might decide to support only Vulkan on a platform that supports both Vulkan and D3D12. |
It bakes the shaders for the driver selected for the platform. At the moment, it doesn't cover the case where the user offers options for multiple backends. |
|
One concern I have is for users exporting to Windows from Linux (which is a common scenario on CI). While it should be possible to export SPIR-V already for projects using Vulkan, exporting DXIL for Direct3D doesn't sound feasible right now. None of the D3D12 code is compiled in the Linux editor which is used for exporting on CI. This also applies to users exporting for macOS from other platforms. Of course, you can sidestep this by using a Windows CI runner, but these are generally slower to perform a full CI run due to slower I/O (and may have higher demand too, leading to increased queues). More generally, I don't know if this shader compilation process will work in headless anyway (since no GPU is initialized, and none is available on GitHub Actions unless you pay for it). I suppose we'd need a way to build the NIR stuff regardless of whether Direct3D 12 is enabled in the current build, as long as it's an editor build. |
The only D3D12 code that is required at the moment is root signature serialization to a binary blob. If that can be worked around (CC @RandomShaper), then D3D12 is not a requirement for building D3D12 shaders.
The shader classes aren't tied to a particular driver running. No GPU is required for the process, as that was part of most of the refactoring that was done to take it out of the drivers and into their own classes that can be used independently. |
|
@Calinou Just brought this PR to my attention! I am super excited to test this out! Please feel free to @ me when this is ready to be tested :) |
|
Would it be possible to schedule this to 4.5? What would be required to do so? |
Metal's the only component missing as far as I can tell. I can get around to it by the time we enter 4.5 but I'd like to give Stuart time to see if he can manage it as he's more familiar with the driver than I am. |
|
@DarioSamo do you think you could merge to the main branch once 4.4 is released, so I can work from my fork with my build configuration? I will be able to implement it fairly easily from there. |
I'm not sure it's possible as I can't figure out a way that isn't very cumbersome to have the current scheme and the new scheme working in tandem without, in the process, just adapting the Metal backend to use the new shader container format and basically ending up with a working shader baker most of the way there already. |
|
I'm actually confused by both the question and the answer.
Aren't all PRs merged into the main branch?
Same reason for confusion. Why wouldn't it be possible? Aren't all PRs merged into the main branch? In fact, isn't this PR explicitly requesting to merge into master? Thank you! :) |
|
Oh I think I see what you are asking now. This branch has merge conflicts. Are you asking if these can be resolved? |
|
@TCROC It's not the merge conflicts, it's the fact that Metal does not build at the moment on this PR. It can't be merged as it breaks the platform. I don't have an easy way to not break it as the changes are fundamental to how the shader methods work. The amount of work to make it build as a bandaid fix would be roughly equivalent to the amount of work to implement the shader container in Metal that is necessary for shader baking to work. |
|
Ah I see. Thank you for the explanation! :) |
|
👋🏻 @kisg
Overview: Metal
Currently, we use SPIRV-Cross to generate Metal Shader Language (MSL) from the SPIR-V and serialise this source to the binary data. We want to be able to support using the offline Metal compiler toolchain so that we can generate a
Solution Sketch: Metal
To support MSL and .metallib, we should extend and a
enum LibraryType {
    METAL_SHADER_LANGUAGE,
    METAL_LIBRARY,
}
Note: Adding a field will require that the version be updated:
The remainder of the work is just implementing the container, as @DarioSamo has done for Vulkan and D3D12. Don't worry about implementing offline compilation for your initial PR.
Offline compilation
Offline compilation takes the MSL and creates a
Future work will add support to spawn the Metal compiler toolchain, which is available for macOS and Windows platforms, and generate
godot/drivers/metal/metal_objects.mm Lines 2028 to 2044 in 9fc39ae
which results in background compilation, we can use the |
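The idea of tagging each serialized Metal shader with its library type, and bumping a format version when a field is added, can be sketched roughly as follows. This is only an illustration in Python, not the Godot container format: the header layout, `FORMAT_VERSION`, `pack_shader` and `unpack_shader` are hypothetical names chosen for the example, and only the `LibraryType` values come from the comment above.

```python
import struct
from enum import IntEnum


class LibraryType(IntEnum):
    # Values mirror the enum proposed in the comment above.
    METAL_SHADER_LANGUAGE = 0
    METAL_LIBRARY = 1


# Hypothetical version number; it must change whenever the header layout changes.
FORMAT_VERSION = 2

# Hypothetical header: version, library type, payload length (little-endian u32 each).
HEADER = struct.Struct("<III")


def pack_shader(library_type: LibraryType, payload: bytes) -> bytes:
    """Prefix the shader payload with a small versioned header."""
    return HEADER.pack(FORMAT_VERSION, int(library_type), len(payload)) + payload


def unpack_shader(blob: bytes):
    """Read the header back and reject data written with another version."""
    version, raw_type, length = HEADER.unpack_from(blob, 0)
    if version != FORMAT_VERSION:
        raise ValueError(f"unsupported container version {version}")
    payload = blob[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated shader payload")
    return LibraryType(raw_type), payload
```

A reader built this way can decide at load time whether the payload is MSL source that still needs compiling or a precompiled library, and stale caches written with an older layout are rejected instead of being misread.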
|
@DarioSamo when we're baking shaders, do you think it might be possible to provide the parameters required to generate a pipeline state descriptor? @kisg I suggest you watch this Apple developer video, as it is possible we could provide a 3rd level of compilation, to completely remove runtime compilation. We would need the pipeline descriptor state to achieve this deeper level of customisation, but that would have to come from Godot so we could generate the appropriate JSON descriptor. |
I'm more interested in compilations that happen in the exported project because they were missed in shader baking |
|
Verbose mode currently prints cache misses, so you could definitely add to that spot and add it to a monitor. |
|
Will the shader baker be added to web exports in the future? I can't find it in the web export properties and I couldn't find this limitation mentioned anywhere. |
|
It's for Forward+/Mobile renderers and web exports are Compatibility only. |
|
I've opened a documentation PR, feel free to take a look: |
Would it make sense to have "Feature" definitions built-in as part of the Shader resource to be used by the preprocessor? For example, let's say I have a shader with 3 features. The first two are enable/disable features, the third is an integer feature ranging from 0-2. Feature two depends on feature one (can't be enabled without feature one enabled).
In the shader resource (simple example only, can be done better than this):
Dictionary<string, string> BooleanFeatures;
Dictionary<string, int> IntegerFeatures;
In the inspector for the shader resource:
In the shader code:
#ifdef FEATURE_ONE
// shader code
#endif
#ifdef FEATURE_TWO
// shader code
#endif
#ifdef INTEGER_FEATURE
// shader code
#endif
In the inspector for the ShaderMaterial instance: Set Shader to example shader.
Now, every possible permutation of the shader (at least from the gdshader perspective) is technically known, and a hash can be generated through a combination of the shader's name or resource path (probably better to have a dedicated name property to support embedded subresource shaders), as well as each of the shader's features. The hash could probably even be precalculated and stored on the ShaderMaterial itself as metadata or a dedicated property. This may also allow for easier pre-compilation of all gdshader shaders, as individual ShaderMaterial instances would not be required, since all permutations of the shader are already known and can be iteratively cycled through and compiled to SPIR-V for each valid engine shader variant on export.
Whether or not this is a viable solution, not including the source code for game shaders is definitely possible. I'm not aware of any other engines or non-Godot games that ship with the raw shader code, especially not AAA games. |
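As a rough illustration of the proposal above, the following Python sketch enumerates every valid permutation of the three example features and derives a stable hash per permutation. It is not engine code: `valid_permutations`, `permutation_hash` and the use of SHA-256 are assumptions made for the example, as is treating the integer feature as a value from 0 to 2.

```python
import hashlib
from itertools import product


def valid_permutations():
    """Yield every valid feature combination for the three-feature example."""
    for one, two, level in product((False, True), (False, True), range(3)):
        # Feature two depends on feature one, so skip combinations that break this.
        if two and not one:
            continue
        yield {"FEATURE_ONE": one, "FEATURE_TWO": two, "INTEGER_FEATURE": level}


def permutation_hash(shader_name: str, features: dict) -> str:
    """Hash the shader name together with its feature values in a stable order."""
    parts = [shader_name] + [f"{key}={features[key]}" for key in sorted(features)]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
```

With the dependency rule applied, the example shader has 3 valid boolean combinations times 3 integer values, so 9 permutations that an exporter could iterate over without needing any ShaderMaterial instance.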
|
@MarioBossReal See godotengine/godot-proposals#8076 for a proposal that appears to do the same thing you are suggesting. |
|
Is there some way to confirm that the shader baker is working? Because for my project the exporter doesn't take very long to bake, and startup times in the exported game are as slow as they are in editor with the shader cache cleared. |
Ya, just run the executable with verbose mode |
|
Well, I'm getting a bunch of messages about "Loading cache for shader" and I don't see any cache misses. So I guess the benefits just aren't that noticeable on Vulkan. Which is unfortunate, but at least the baker seems to be working. |
Did you profile your load times and validate that loading shaders is the slow part? Depending on how many textures/scripts you have, it can be very common to be bottlenecked on filesystem access |
|
I have a custom shader warmer at the start so I know when shaders need to be compiled. After I clear the shader cache in the graphics drivers and game's app data folder, it takes about 8 seconds for the Godot splash to appear, and 6 seconds for the shader warmer to go through all the necessary shaders. Otherwise each takes less than a quarter of a second. It's not a huge issue, but it is a bit annoying, and for future projects it might increase if I ever write additional custom shaders. I also doubt filesystem access has a role here, at least in this case. The game's pack file is small enough to fit on an N64 cartridge so it's probably I/O cached by the OS. |
|
As reported in #106757, look for MTLCompilerService utilizing all cores on the system while you wait... (Vulkan still uses Metal on macOS, but as a lower layer.)
|
|
On macOS, yes. In my case I'm talking about native Vulkan on Windows (and by extension, Linux), where shader compilation is done in the driver rather than a centralized system service. Again, I may have simply misjudged the gains from the shader baker on native Vulkan. #111452 mentions that |
Godot's shader compilation for Vulkan in particular is tuned to be very quick because it doesn't even feature a shader optimizer. Glslang operates as a very quick GLSL -> SPIR-V translator. Backends such as D3D12 and Metal benefit heavily from the shader baker because they can skip an expensive conversion process from SPIR-V to their own formats. The Vulkan driver doesn't need to do this, so the gains are pretty small. But as you found out from what #111452 claims, we could indeed improve the loading times at the driver level if we had said optimizer, which would in turn make shader compilation no longer as quick (think going from around a couple of milliseconds to several hundred). The shader baker is basically our scaffolding to be able to now include that in Vulkan without sacrificing large amounts of runtime performance to shader compilation. |

Overview
Based mostly on the work done by @RandomShaper, this PR adds a new Editor Export Plugin that will scan resources and scenes for shaders and pre-compile them into the format used by the driver on the target platform.
Shaders in SPIR-V, DXIL and MIL formats are interchangeable between systems and can be shipped to end users to skip the long startup times that result from having to compile them on the target platform. While pipeline compilation is still unavoidable and a requirement, Godot is currently doing unnecessary work on the end user's system that can be done ahead of time in the Editor and shipped as part of the final project.
This PR required a large amount of work on refactoring the Shader classes and decoupling the Shader Compilers from the Rendering Device Drivers we currently have. A new generic Shader Container class has been introduced and allows for heavy customization of the exported shader if required by the target platform. A significant amount of work has also gone into removing any platform-specific definitions that were being added to shaders and that may differ on the end user's system. In cases where this is unavoidable for optimization reasons, shader variants have been created instead.
When using this PR, Shader Baking is an optional step that will increase the export time of a project with the major benefit that the end user who plays the game will be able to skip shader compilation entirely.
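The relationship between the selected driver and the baked format can be summarized with a small lookup. This is a Python sketch of the behaviour described in this thread, not the exporter's code: `BAKED_FORMAT_BY_DRIVER` and `baked_format` are hypothetical names, and the driver keys are illustrative strings.

```python
from typing import Optional

# Driver-to-format pairs as described in this thread; keys are illustrative.
BAKED_FORMAT_BY_DRIVER = {
    "vulkan": "SPIR-V",
    "d3d12": "DXIL",
    "metal": "MIL",
}


def baked_format(driver: str, baking_enabled: bool = True) -> Optional[str]:
    """Return the format to bake for a platform's selected driver, if any."""
    if not baking_enabled:
        # Baking is optional; without it shaders are compiled on the end user's system.
        return None
    # Drivers outside the table (for example the Compatibility renderer) are not baked.
    return BAKED_FORMAT_BY_DRIVER.get(driver)
```

Only one format is produced per export because the baker targets the single driver selected for the platform.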
Another important change is the ability for the Shader class to use a multi-level shader cache: one level that it reads from and writes to as usual, and a read-only level that it can use as a fallback. The fallback is populated from the exported project's embedded shader cache directory in the .pck.
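A minimal Python sketch of such a two-level lookup is shown below. It only illustrates the idea and is not the engine's implementation: the class name, the `.cache` file extension, and the lookup order (writable cache first, then the baked fallback) are assumptions made for the example.

```python
from pathlib import Path
from typing import Callable, Optional


class TwoLevelShaderCache:
    """Hypothetical two-level cache: a writable primary plus a read-only fallback."""

    def __init__(self, primary_dir: Path, fallback_dir: Optional[Path] = None) -> None:
        self.primary_dir = primary_dir
        self.fallback_dir = fallback_dir
        self.primary_dir.mkdir(parents=True, exist_ok=True)

    def load(self, key: str) -> Optional[bytes]:
        # Check the writable cache first, then the read-only baked cache.
        for directory in (self.primary_dir, self.fallback_dir):
            if directory is None:
                continue
            path = directory / f"{key}.cache"
            if path.is_file():
                return path.read_bytes()
        return None

    def get_or_compile(self, key: str, compile_fn: Callable[[], bytes]) -> bytes:
        cached = self.load(key)
        if cached is not None:
            return cached
        # Miss on both levels: compile, then store in the writable level only.
        binary = compile_fn()
        (self.primary_dir / f"{key}.cache").write_bytes(binary)
        return binary
```

The fallback directory is never written to, which matches the read-only nature of a cache embedded in an exported package; anything the baker missed ends up in the writable level on first use.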
This feature is intended to be finished for Godot 4.5 if possible.
Results
The results speak for themselves when dealing with backends that have very long shader conversion times. In Vulkan the improvement is there, but it is not as noticeable on a system with many threads. The difference is astounding with a backend like D3D12, which has very long conversion times due to the NIR transpilation process.
Even on a system with 32 threads, a D3D12 project goes from taking over a minute to load to just ~2 seconds.
TPS demo using D3D12 backend without and with using the shader baker functionality.
master: WITHOUT_SHADER_CACHE_D3D12.mp4
shader-baker: WITH_SHADER_CACHE_D3D12.mp4
The results are reproducible but not as drastic on Vulkan, although the fewer CPU threads you have at your disposal, the more you'll gain from this feature.
Note that to test this effectively, you must delete the shader_cache directory in the user directory of the project you're testing, because Godot caches compiled shader binaries there between runs. On Windows, this directory can be found in
%AppData%/Godot/app_userdata/<Project Name>/shader_cache.
TODO
Bugsquad edit: Should fix: #94734
Contributed by W4 Games. 🍀