Add shader baker to project exporter. #102552

Merged. Repiteo merged 1 commit into godotengine:master from DarioSamo:shader-baker on May 28, 2025.

@DarioSamo
Contributor

DarioSamo commented Feb 7, 2025

Overview

Based mostly on the work done by @RandomShaper, this PR adds a new Editor Export Plugin that scans resources and scenes for shaders and pre-compiles them into the format used by the driver on the target platform.

Shaders in the SPIR-V, DXIL and MIL formats are portable between systems and can be shipped to end users, letting them skip the long startup times that result from compiling shaders on the target platform. While pipeline compilation is still unavoidable, Godot currently does work on the end user's system that could instead be done ahead of time in the Editor and shipped as part of the final project.

This PR required a large amount of refactoring of the Shader classes and decoupling of the Shader Compilers from the Rendering Device Drivers we currently have. A new generic Shader Container class has been introduced, which allows heavy customization of the exported shader if the target platform requires it. A significant amount of work has also gone into removing platform-specific definitions that were being added to shaders and that may differ on the end user's system. Where such definitions are unavoidable for optimization reasons, shader variants have been created instead.

When using this PR, Shader Baking is an optional step that will increase the export time of a project with the major benefit that the end user who plays the game will be able to skip shader compilation entirely.


Another important change is the ability for the Shader class to use a multi-level shader cache: one that it reads from and writes to as usual, and a read-only one that it can use as a fallback. The fallback is populated from the shader cache directory embedded in the exported project's .pck.
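
The two-level lookup can be pictured with a small sketch. This is not the engine's code: the class name, the file-per-key layout, and the order in which the two levels are consulted are all assumptions made for illustration. The only properties taken from the description above are that one level is read/write and the other is a read-only fallback.

```python
from pathlib import Path
from typing import Optional


class TwoLevelShaderCache:
    """Hypothetical model: a writable user cache plus a read-only fallback."""

    def __init__(self, user_dir: Path, fallback_dir: Optional[Path] = None):
        self.user_dir = user_dir
        self.fallback_dir = fallback_dir
        self.user_dir.mkdir(parents=True, exist_ok=True)

    def load(self, key: str) -> Optional[bytes]:
        # Assumed order: try the regular read/write cache first.
        user_path = self.user_dir / key
        if user_path.is_file():
            return user_path.read_bytes()
        # Then try the read-only fallback (e.g. the baked cache).
        if self.fallback_dir is not None:
            fallback_path = self.fallback_dir / key
            if fallback_path.is_file():
                return fallback_path.read_bytes()
        # Miss on both levels: the caller has to compile the shader.
        return None

    def store(self, key: str, binary: bytes) -> None:
        # Writes only ever go to the user cache; the fallback is never modified.
        (self.user_dir / key).write_bytes(binary)
```

In this model a baked shader is served from the fallback directory without compilation, while anything the baker missed is compiled once and then stored in the writable level.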

This feature is intended to be finished for Godot 4.5 if possible.

Results

The results speak for themselves on backends with very long shader conversion times. On Vulkan the improvement is there, but it is not as noticeable on a system with many threads. The difference is astounding on a backend like D3D12, which has very long conversion times due to the NIR transpilation process.

Even on a system with 32 threads, a D3D12 project goes from taking over a minute to load to just ~2 seconds.

TPS demo using the D3D12 backend, without and with the shader baker functionality:

master: WITHOUT_SHADER_CACHE_D3D12.mp4

shader-baker: WITH_SHADER_CACHE_D3D12.mp4

The results are reproducible but not as drastic on Vulkan. The fewer CPU threads you have at your disposal, the more you will benefit from this feature.

Note that to test this effectively, you must delete the shader_cache directory in the user directory of the project you're testing, because Godot caches compiled shader binaries there between runs. On Windows, this directory can be found at %AppData%/Godot/app_userdata/<Project Name>/shader_cache.
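
For repeated test runs, clearing the cache can be scripted. The following is a minimal sketch that assumes the default Windows location quoted above; the helper names are made up for this example, and a project that overrides its user data directory would need a different path.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def shader_cache_dir(project_name: str, appdata: Optional[str] = None) -> Path:
    """Build the default Windows shader cache path for a project.

    `appdata` defaults to the APPDATA environment variable; passing it
    explicitly is only a convenience for testing this helper.
    """
    base = appdata if appdata is not None else os.environ.get("APPDATA", "")
    return Path(base) / "Godot" / "app_userdata" / project_name / "shader_cache"


def clear_shader_cache(cache_dir: Path) -> bool:
    """Delete the cache directory. Returns True if something was removed."""
    if cache_dir.is_dir():
        shutil.rmtree(cache_dir)
        return True
    return False
```

Calling `clear_shader_cache(shader_cache_dir("My Project"))` before each timed launch keeps runs comparable. Driver-level shader caches are separate and are not touched by this sketch.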

TODO

  • Metal support (@stuartcarnie has shown interest in tackling this).
  • Verify how this can interact with imported GLSL files.
  • Find and account for more edge cases the shader baker is not catching currently by testing on a wider variety of projects.
  • Account for the cases where the renderer must be set to the matching renderer of the exported platform for embedded shaders to be baked. Warn appropriately on the editor.
  • Verify there's no regression in BaseMaterial3D being updated automatically in the viewport from a user editing it.
  • Find the remaining global shader defines that might be around the codebase from querying the current rendering device's capabilities.
  • Check this project (Add shader baker to project exporter. #102552 (comment)).
  • Check multiview support.
  • Add renderer as part of the hash.
  • See if it's possible to improve the progress bar feedback.
  • Add project export warning when renderer doesn't match the renderer forced for the platform being exported.
  • Make IntegrateDfgShaderRD not delete itself after creation.

Bugsquad edit: Should fix: #94734


Contributed by W4 Games. 🍀

@stuartcarnie
Contributor

stuartcarnie commented Feb 7, 2025

This is great!

Will you merge to the main branch, and then I can follow up with a PR to implement Metal support?

If the user is on Windows or macOS, we can utilise the Metal compiler toolchain to generate Metal libraries, reducing load times even more, as that compiles the Metal source into a platform-independent, intermediate format. I notice that Unreal Engine has an option to do this.

@DarioSamo
Contributor Author

DarioSamo commented Feb 7, 2025

Will you merge to the main branch, and then I can follow up with a PR to implement Metal support?

I think it's pretty far from being merged to main at the moment, since 4.4 is going into RC soon. I think it'd be best to just open a PR against this branch, as I don't think it'll take too long to adapt what we have to it.

If the user is on Windows or macOS, we can utilise the Metal compiler toolchain to generate Metal libraries, reducing load times even more, as that compiles the Metal source into a platform-independent, intermediate format. I notice that Unreal Engine has an option to do this.

Yes, this would be great. There's a scheme for adding "Platforms" and you can definitely do a Windows-specific version that loads the toolchain if you're under Windows to produce the MIL instead.

Under the new Shader Container design, you won't need to handle anything about serialization of the Shader reflection. All you need is to just convert to the shader binary and you can insert whatever extra bytes you wish to serialize that the platform might need.

@warriormaster12
Contributor

Will this feature bake shaders for all backends by default? If yes, can users filter out certain backends out of the export process? Say if developers decide to support Vulkan only on a platform that supports Vulkan and Dx12.

@DarioSamo
Contributor Author

Will this feature bake shaders for all backends by default? If yes, can users filter out certain backends out of the export process? Say if developers decide to support Vulkan only on a platform that supports Vulkan and Dx12.

It bakes the shaders for the driver selected for the platform. It doesn't cover the case at the moment of the user offering options for multiple backends.

@Calinou
Member

Calinou commented Feb 10, 2025

One concern I have is for users exporting to Windows from Linux (which is a common scenario on CI). While it should be possible to export SPIR-V already for projects using Vulkan, exporting DXIL for Direct3D doesn't sound feasible right now. None of the D3D12 code is compiled in the Linux editor which is used for exporting on CI. This also applies to users exporting for macOS from other platforms.

Of course, you can sidestep this by using a Windows CI runner, but these are generally slower to perform a full CI run due to slower I/O (and may have higher demand too, leading to increased queues).

More generally, I don't know if this shader compilation process will work in headless anyway (since no GPU is initialized, and none is available on GitHub Actions unless you pay for it).

I suppose we'd need a way to build the NIR stuff regardless of whether Direct3D 12 is enabled in the current build, as long as it's an editor build.

@DarioSamo
Contributor Author

DarioSamo commented Feb 10, 2025

One concern I have is for users exporting to Windows from Linux (which is a common scenario on CI). While it should be possible to export SPIR-V already for projects using Vulkan, exporting DXIL for Direct3D doesn't sound feasible right now. None of the D3D12 code is compiled in the Linux editor which is used for exporting on CI. This also applies to users exporting for macOS from other platforms.

The only D3D12 code that is required at the moment is root signature serialization to a binary blob. If that can be worked around (CC @RandomShaper), then D3D12 is not a requirement for building D3D12 shaders.

More generally, I don't know if this shader compilation process will work in headless anyway (since no GPU is initialized, and none is available on GitHub Actions unless you pay for it).

The shader classes aren't tied to a particular driver running. No GPU is required for the process, as that was part of most of the refactoring that was done to take it out of the drivers and into their own classes that can be used independently.

@TCROC
Contributor

TCROC commented Feb 14, 2025

@Calinou Just brought this PR to my attention! I am super excited to test this out! Please feel free to @ me when this is ready to be tested :)

@kisg
Contributor

kisg commented Feb 21, 2025

Would it be possible to schedule this to 4.5? What would be required to do so?

@DarioSamo
Contributor Author

DarioSamo commented Feb 21, 2025

Would it be possible to schedule this to 4.5? What would be required to do so?

Metal's the only component missing as far as I can tell. I can get around to it by the time we enter 4.5 but I'd like to give Stuart time to see if he can manage it as he's more familiar with the driver than I am.

@stuartcarnie
Contributor

@DarioSamo do you think you could merge to the main branch once 4.4 is released, so I can work from my fork with my build configuration? I will be able to implement it fairly easily from there.

@DarioSamo
Contributor Author

DarioSamo commented Feb 24, 2025

@DarioSamo do you think you could merge to the main branch once 4.4 is released, so I can work from my fork with my build configuration? I will be able to implement it fairly easily from there.

I'm not sure it's possible as I can't figure out a way that isn't very cumbersome to have the current scheme and the new scheme working in tandem without, in the process, just adapting the Metal backend to use the new shader container format and basically ending up with a working shader baker most of the way there already.

@TCROC
Contributor

TCROC commented Feb 24, 2025

@stuartcarnie @DarioSamo

I'm actually confused by both the question and the answer.

do you think you could merge to the main branch once 4.4 is released, so I can work from my fork with my build configuration? I will be able to implement it fairly easily from there.

Aren't all PRs merged into the main branch?

I'm not sure it's possible as I can't figure out a way that isn't very cumbersome to have the current scheme and the new scheme working in tandem without, in the process, just adapting the Metal backend to use the new shader container format and basically ending up with a working shader baker most of the way there already.

Same reason for confusion. Why wouldn't it be possible? Aren't all PRs merged into the main branch? In fact, isn't this PR explicitly requesting to merge into master?

Thank you! :)

@TCROC
Contributor

TCROC commented Feb 24, 2025

Oh I think I see what you are asking now. This branch has merge conflicts. Are you asking if these can be resolved?

@DarioSamo
Contributor Author

DarioSamo commented Feb 24, 2025

@TCROC It's not the merge conflicts, it's the fact that Metal does not build at the moment on this PR. It can't be merged as it breaks the platform. I don't have an easy way to not break it as the changes are fundamental to how the shader methods work.

The amount of work to make it build as a bandaid fix would be roughly equivalent to the amount of work to implement the shader container in Metal that is necessary for shader baking to work.

@TCROC
Contributor

TCROC commented Feb 24, 2025

Ah I see. Thank you for the explanation! :)

@stuartcarnie
Contributor

👋🏻 @kisg

Overview: Metal

Currently, we use SPIRV-Cross to generate Metal Shader Language (MSL) from the SPIR-V and serialise this source to the binary data. We want to be able to support using the offline Metal compiler toolchain so that we can generate a .metallib file, when the toolchain is available. It isn't required, but will further reduce startup time, as devices such as iOS won't have to execute the Metal Compiler background task to compile the MSL first.

Solution Sketch: Metal

To support MSL and .metallib, we should extend ShaderBinaryData:

struct API_AVAILABLE(macos(11.0), ios(14.0), tvos(14.0)) ShaderBinaryData {

and add a library_type field, which is an enumeration:

enum LibraryType {
	METAL_SHADER_LANGUAGE, // Serialised MSL source.
	METAL_LIBRARY, // Precompiled .metallib.
};

Note

Adding a field requires updating the version:

const uint32_t SHADER_BINARY_VERSION = 4;
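
To illustrate why the version matters, here is a small model of a versioned binary that carries a library type. It is written in Python purely as a sketch: the two-field header layout and the function names are hypothetical and do not reflect the real ShaderBinaryData serialisation. Only the enumeration names and the constant come from the snippets above, and the thread does not say whether 4 is the value before or after the bump.

```python
import struct
from enum import IntEnum
from typing import Tuple

# Constant quoted in the note above; used here only as a placeholder value.
SHADER_BINARY_VERSION = 4


class LibraryType(IntEnum):
    METAL_SHADER_LANGUAGE = 0  # payload is MSL source
    METAL_LIBRARY = 1          # payload is a compiled .metallib


# Hypothetical header: little-endian uint32 version, uint32 library type.
_HEADER = struct.Struct("<II")


def pack_shader(library_type: LibraryType, payload: bytes) -> bytes:
    """Prefix the payload with the version and the library type."""
    return _HEADER.pack(SHADER_BINARY_VERSION, int(library_type)) + payload


def unpack_shader(blob: bytes) -> Tuple[LibraryType, bytes]:
    """Reject binaries written with another version, then split the payload."""
    if len(blob) < _HEADER.size:
        raise ValueError("shader binary is too short")
    version, raw_type = _HEADER.unpack_from(blob, 0)
    if version != SHADER_BINARY_VERSION:
        # A binary from an older layout would be misread, so it must be rebuilt.
        raise ValueError(f"unsupported shader binary version {version}")
    return LibraryType(raw_type), blob[_HEADER.size:]
```

The version check is what protects a reader from interpreting a cached binary that was written before the new field existed.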

The remainder of the work is just implementing the container, as @DarioSamo has done for Vulkan and D3D12. Don't worry about implementing offline compilation for your initial PR.

Offline compilation

Offline compilation takes the MSL and creates a .metallib. See this page for more information.

Future work will add support for spawning the Metal compiler toolchain, which is available for macOS and Windows platforms, to generate .metallib files. We can serialise these instead of the raw MSL. Instead of creating a MTLLibrary from source:

[device newLibraryWithSource:source
                     options:options
           completionHandler:^(id<MTLLibrary> library, NSError *error) {
    os_signpost_interval_end(LOG_INTERVALS, compile_id, "shader_compile");
    self->_library = library;
    self->_error = error;
    if (error) {
        ERR_PRINT(vformat(U"Error compiling shader %s: %s", entry->name.get_data(), error.localizedDescription.UTF8String));
    }
    {
        std::lock_guard<std::mutex> lock(self->_cv_mutex);
        _ready = true;
    }
    _cv.notify_all();
    _complete = true;
}];

which results in background compilation, we can use the newLibraryWithData:error: API to load a compiled Metal library.

@stuartcarnie
Contributor

@DarioSamo when we're baking shaders, do you think it might be possible to provide the parameters required to generate a pipeline state descriptor?

@kisg I suggest you watch this Apple developer video, as it is possible we could provide a 3rd level of compilation, to completely remove runtime compilation. We would need the pipeline descriptor state to achieve this deeper level of customisation, but that would have to come from Godot so we could generate the appropriate JSON descriptor.

@jamie-pate
Contributor

We can print a message after export

I'm more interested in compilations that happen in the exported project because they were missed in shader baking

@DarioSamo
Contributor Author

Verbose mode currently prints cache misses, so you could definitely add to that spot and add it to a monitor.

@nubels
Contributor

nubels commented Aug 14, 2025

Will the shader baker be added to web exports in the future? I can't find it in the web export properties and I couldn't find this limitation mentioned anywhere.

@bruvzg
Member

bruvzg commented Aug 14, 2025

It's for Forward+/Mobile renderers and web exports are Compatibility only.

@Calinou
Member

Calinou commented Aug 14, 2025

I've opened a documentation PR, feel free to take a look:

@MarioBossReal

Does this feature pave the way for .gdshader source code being omitted from exported games?

The source code resulting from gdshader is pretty crucial to generating the correct hash, so you'd need to modify that part (changing the input of the hash) to be able to omit it.

Would it make sense to have "Feature" definitions built-in as part of the Shader resource to be used by the preprocessor?

For example, let's say I have a shader with 3 features. The first two are enable/disable features, the third is an integer feature ranging from 0-2. Feature two depends on feature one (can't be enabled without feature one enabled).

In the shader resource (simple example only, can be done better than this):

Dictionary<string, string> BooleanFeatures;
Dictionary<string, int> IntegerFeatures;

In the inspector for the shader resource:
Add boolean feature ["FEATURE_ONE", null] // Add feature with no dependencies
Add boolean feature ["FEATURE_TWO", "FEATURE_ONE"] // Add feature that depends on another feature
Add integer feature ["INTEGER_FEATURE", 2] // Add integer feature that accepts values 0-2

In the shader code:

#ifdef FEATURE_ONE
// shader code
#endif

#ifdef FEATURE_TWO
// shader code
#endif

#ifdef INTEGER_FEATURE
// shader code
#endif

In the inspector for the ShaderMaterial instance:

Set Shader to example shader.
Inspector is populated with checkboxes for Boolean features and fields for integer features.

Now, every possible permutation of the shader (at least from the gdshader perspective) is technically known, and a hash can be generated from a combination of the shader's name or resource path (probably better to have a dedicated name property to support embedded subresource shaders) and each of the shader's features. The hash could probably even be precalculated and stored on the ShaderMaterial itself as metadata or a dedicated property.

This may also even allow for easier pre-compilation of all gdshader shaders, as individual ShaderMaterial instances would not be required, since all permutations of the shader are already known and can be iteratively cycled through and compiled to SPIRV for each valid engine shader variant on export.
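
The enumeration described above can be sketched in a few lines. This only illustrates the proposal and is not an existing Godot API: the function names, the dictionary shapes, and the choice of SHA-256 over the shader name plus sorted feature values are all assumptions.

```python
import hashlib
from itertools import product
from typing import Dict, Iterator, Optional


def enumerate_variants(
    bool_features: Dict[str, Optional[str]],  # name -> required feature or None
    int_features: Dict[str, int],             # name -> maximum value (inclusive)
) -> Iterator[Dict[str, int]]:
    """Yield every feature combination that respects the dependencies."""
    bool_names = list(bool_features)
    int_names = list(int_features)
    choices = [(False, True)] * len(bool_names)
    choices += [range(int_features[name] + 1) for name in int_names]
    for combo in product(*choices):
        variant = dict(zip(bool_names + int_names, combo))
        # An enabled feature is only valid if its dependency is enabled too.
        deps_ok = all(
            not variant[name] or dep is None or variant.get(dep, False)
            for name, dep in bool_features.items()
        )
        if deps_ok:
            yield variant


def variant_hash(shader_name: str, variant: Dict[str, int]) -> str:
    """Hash the shader name together with its sorted feature values."""
    parts = [shader_name] + [f"{k}={int(v)}" for k, v in sorted(variant.items())]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()
```

For the example in this comment (`{"FEATURE_ONE": None, "FEATURE_TWO": "FEATURE_ONE"}` and `{"INTEGER_FEATURE": 2}`), the generator yields 9 variants: three valid boolean combinations times three integer values.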

Whether or not this is a viable solution, not including the source code for game shaders is definitely possible. I'm not aware of any other engines or non-Godot games that ship with the raw shader code, especially not AAA games.

@clayjohn
Member

@MarioBossReal See godotengine/godot-proposals#8076 for a proposal that appears to do the same thing you are suggesting.

@KeyboardDanni
Contributor

KeyboardDanni commented Nov 10, 2025

Is there some way to confirm that the shader baker is working? Because for my project the exporter doesn't take very long to bake, and startup times in the exported game are as slow as they are in editor with the shader cache cleared.

@clayjohn
Member

Is there some way to confirm that the shader baker is working? Because for my project the exporter doesn't take very long to bake, and startup times in the exported game are as slow as they are in editor with the shader cache cleared.

Ya, just run the executable in verbose mode (--verbose). Shader cache misses will be printed out to the console.
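
If the verbose output is long, it can help to save it to a file and filter it. The helper below is a generic sketch, not part of Godot. The exact wording of the cache-miss message is not quoted in this thread, so the marker string is left as a parameter.

```python
from typing import Iterable, List


def lines_containing(log_lines: Iterable[str], marker: str) -> List[str]:
    """Return the log lines that contain `marker`, without trailing newlines."""
    return [line.rstrip("\n") for line in log_lines if marker in line]
```

For example, after saving the console output to a text file, `lines_containing(open("verbose.log"), "Loading cache for shader")` lists the cache loads reported in verbose mode, and the same call with the miss message as the marker lists the misses.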

@KeyboardDanni
Contributor

Well, I'm getting a bunch of messages about "Loading cache for shader" and I don't see any cache misses. So I guess the benefits just aren't that noticeable on Vulkan. Which is unfortunate, but at least the baker seems to be working.

@clayjohn
Member

Well, I'm getting a bunch of messages about "Loading cache for shader" and I don't see any cache misses. So I guess the benefits just aren't that noticeable on Vulkan. Which is unfortunate, but at least the baker seems to be working.

Did you profile your load times and validate that loading shaders is the slow part? Depending on how many textures/scripts you have, it can be very common to be bottlenecked on filesystem access.

@KeyboardDanni
Contributor

KeyboardDanni commented Nov 11, 2025

I have a custom shader warmer at the start so I know when shaders need to be compiled. After I clear the shader cache in the graphics drivers and game's app data folder, it takes about 8 seconds for the Godot splash to appear, and 6 seconds for the shader warmer to go through all the necessary shaders. Otherwise each takes less than a quarter of a second.

It's not a huge issue, but it is a bit annoying, and for future projects it might increase if I ever write additional custom shaders.

I also doubt filesystem access has a role here, at least in this case. The game's pack file is small enough to fit on an N64 cartridge so it's probably I/O cached by the OS.

@jamie-pate
Contributor

jamie-pate commented Nov 11, 2025

As reported in #106757, look for MTLCompilerService utilizing all cores on the system while you wait... (Vulkan is still using Metal on macOS, but it's a lower layer.)


@KeyboardDanni
Contributor

On macOS, yes. In my case I'm talking about native Vulkan on Windows (and by extension, Linux), where shader compilation is done in the driver rather than a centralized system service.

Again, I may have simply misjudged the gains from the shader baker on native Vulkan. #111452 mentions that spirv-opt could be useful for the baker if it can output an IR that takes less time for drivers to optimize. But that seems like a whole 'nother can of worms. Maybe best implemented as a separate tool download to avoid bloating the main editor binary?

@DarioSamo
Contributor Author

DarioSamo commented Nov 17, 2025

Well, I'm getting a bunch of messages about "Loading cache for shader" and I don't see any cache misses. So I guess the benefits just aren't that noticeable on Vulkan. Which is unfortunate, but at least the baker seems to be working.

Godot's shader compilation for Vulkan in particular is tuned to be very quick because it doesn't even feature a shader optimizer. Glslang operates as a very quick GLSL -> SPIR-V translator. Backends such as D3D12 and Metal benefit heavily from the shader baker because they can skip an expensive conversion process from SPIR-V to their own formats. The Vulkan driver doesn't need to do this, so the gains are pretty small.

But as you found out from what #111452 claims, we could indeed improve the loading times at the driver level if we had said optimizer, which would in turn make shader compilation no longer as quick (think going from around a couple of milliseconds to several hundred). The shader baker is basically our scaffolding to be able to include that in Vulkan now without sacrificing large amounts of runtime performance to shader compilation.

Development

Successfully merging this pull request may close these issues.

Can't import GLSL shaders in headless mode