X Tutup
Skip to content

Check for NUL characters in string parsing functions.#110556

Merged
Repiteo merged 1 commit intogodotengine:masterfrom
Ivorforce:safe-nul-parse
Sep 17, 2025
Merged

Check for NUL characters in string parsing functions.#110556
Repiteo merged 1 commit intogodotengine:masterfrom
Ivorforce:safe-nul-parse

Conversation

@Ivorforce
Copy link
Member

@Ivorforce Ivorforce commented Sep 16, 2025

Fixes #110545

Before my recent 4.5 changes, these functions used to stop parsing when they encountered a NUL in the given span.
This is incorrect behavior, because NUL is a valid unicode character, and we should be handling it gracefully.

However, a large part of our APIs do not currently support NUL in strings (see godotengine/godot-proposals#11249). Therefore, we have to make do for the moment and replace them with the dedicated replacement char. Notably, append_utf8 and append_utf16 APIs still terminate early on NUL characters. This should be changed in a future PR.

This behavior is more correct than terminating the string early, which may lose information and return strings of incorrect length. In the future, we should strive towards support for NUL characters, to comply with the unicode standard correctly.

@Ivorforce Ivorforce added this to the 4.6 milestone Sep 16, 2025
@Ivorforce Ivorforce requested a review from a team as a code owner September 16, 2025 08:54
@Ivorforce Ivorforce added bug topic:core cherrypick:4.5 Considered for cherry-picking into a future 4.5.x release labels Sep 16, 2025
Copy link
Contributor

@Repiteo Repiteo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proper NUL support remains a great long-term goal, but this is an appropriate stopgap until then

@Repiteo Repiteo merged commit cc7397c into godotengine:master Sep 17, 2025
20 checks passed
@Repiteo
Copy link
Contributor

Repiteo commented Sep 17, 2025

Thanks!

@Repiteo
Copy link
Contributor

Repiteo commented Sep 22, 2025

Cherry-picked to 4.5

@Repiteo Repiteo removed the cherrypick:4.5 Considered for cherry-picking into a future 4.5.x release label Sep 22, 2025
ProfessorOctopus pushed a commit to ProfessorOctopus/godot that referenced this pull request Oct 5, 2025
Check for `NUL` characters in string parsing functions.
@hhyyrylainen
Copy link

After updating from 4.5 to 4.5.1 I started seeing a bunch of errors like this:

  ERROR: Unicode parsing error, some characters were replaced with � (U+FFFD): Unexpected NUL character
  ERROR: Unicode parsing error, some characters were replaced with � (U+FFFD): Unexpected NUL character
  ERROR: Unicode parsing error, some characters were replaced with � (U+FFFD): Unexpected NUL character
  ERROR: Unicode parsing error, some characters were replaced with � (U+FFFD): Unexpected NUL character
  ERROR: Unicode parsing error, some characters were replaced with � (U+FFFD): Unexpected NUL character

Is this change related? I ask because I can't see anything else in the release notes that might be relevant.

@Ivorforce
Copy link
Member Author

Ivorforce commented Nov 10, 2025

Yes, that's this check triggering. Can you pinpoint when the error messages are printed? What's causing it?

@hhyyrylainen
Copy link

hhyyrylainen commented Nov 10, 2025

As far as I can tell, there's no error callstack associated with the print, so I cannot easily tell where it is coming from. However, after a bit more searching I discovered this: #111792

And in our game we do have quite extensive key handling and those error prints begin after another print related to our custom key rebinding system doing its thing. So it may be triggered by trying to convert keycodes and apply input settings. Though we don't have an absolute ton of startup logging, so it could also be triggered later.

Edit: I should have looked a tiny bit further, as going to our keybinding menu sure enough I see where the character replacements happened:

kuva

Edit 2: for reference here is how it looks with Godot 4.5:

kuva

(those ö, and å letters do show up in 4.5.1 but I cropped the screenshot differently as I was doing it quickly)

@Ivorforce
Copy link
Member Author

Ivorforce commented Nov 11, 2025

As far as I can tell, there's no error callstack associated with the print, so I cannot easily tell where it is coming from. However, after a bit more searching I discovered this: #111792

And in our game we do have quite extensive key handling and those error prints begin after another print related to our custom key rebinding system doing its thing. So it may be triggered by trying to convert keycodes and apply input settings. Though we don't have an absolute ton of startup logging, so it could also be triggered later.

Edit: I should have looked a tiny bit further, as going to our keybinding menu sure enough I see where the character replacements happened:

kuva Edit 2: for reference here is how it looks with Godot 4.5: kuva (those ö, and å letters do show up in 4.5.1 but I cropped the screenshot differently as I was doing it quickly)

Yea, that would be the kind of bug introduced by these changes I made to string parsing. I'm very sorry about that!

In your custom key handling code, do you happen to call String::chr with keycodes? Keycodes being converted through String::chr has been the primary cause of this error message being printed in Godot's own codebase (hopefully, all fixed in 4.5.1?).
String::chr used to (in 4.4) silently return an empty string. In 4.5, it returned a string with the \0 character in it, which was unintended, because it can cause all kinds of string handling bugs. In 4.5.1, it errors and returns a string with the replacement character instead.

@hhyyrylainen
Copy link

hhyyrylainen commented Nov 11, 2025

I did a bit of debugging and I do not think our code calls keycode to character conversion (like I initially thought). However, I found out from where the incorrect text was coming from.

It looks like DisplayServer.KeyboardGetLabelFromPhysical does not work correctly in 4.5.1

In 4.5 if I run the following code:

DisplayServer.KeyboardGetLabelFromPhysical(Key.Left)

The result is Key.Left as expected.

But in 4.5.1 I get the result from the same line of code that it is: 65533 which is not a valid value in the Key enum. So somehow that call to the engine is scrambling up the key codes.

I put this code in my game's startup and it makes it quit in 4.5.1 whereas 4.5 works normally:

        if (DisplayServer.KeyboardGetLabelFromPhysical(Key.Left) != Key.Left)
        {
            GD.PrintErr("Key code changed!");
            GetTree().Quit();
            return;
        }

That code in 4.5.1 also triggers the print I saw:

Unicode parsing error, some characters were replaced with � (U+FFFD): Unexpected NUL character

So I think that is the thing our game does that triggers the error print from the engine as well.

Should I open a new issue for this?

Edit: I checked the documentation just in case but it doesn't seem like the return value of the function should have changed: https://docs.godotengine.org/en/stable/classes/class_displayserver.html#class-displayserver-method-keyboard-get-label-from-physical

@Ivorforce
Copy link
Member Author

Ivorforce commented Nov 11, 2025

Thank you for digging.
I checked the code of the keyboard_get_label_from_physical function implementations.

Looks like I was mistaken, the x11 fix for the keyboard issue hasn't been released yet in 4.5.1, it will only be part of 4.5.2 (if there will be a 4.5.2 release), or 4.6: #111795

Looking at macOS and Windows, I don't think they have the same issue.

@hhyyrylainen
Copy link

Thanks. I'll keep an eye out for the next Godot release then and not try further with 4.5.1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

String.chr and char have unreported changes in godot 4.5 + related breaking print bug

3 participants

X Tutup