Use Grisu2 algorithm in String::num_scientific to fix serializing#98750
Use Grisu2 algorithm in String::num_scientific to fix serializing#98750Repiteo merged 1 commit intogodotengine:masterfrom
String::num_scientific to fix serializing#98750Conversation
|
Is it worth modifying json to native and json from native? |
|
@fire What do you mean? |
|
I was curious why you renamed |
|
@fire I can undo the name change if it's not desired, but I think this is a clearer name. |
|
I have no opinion on the name change. It's not that important. |
bruvzg
left a comment
There was a problem hiding this comment.
I'm definitely in favor of using the same code for float serialization/print, and the implementation looks good. sprintf is too implementation dependent and unreliable.
I was curious why you renamed rtos_fixed to serialize_real.
It's internal method, so doesn't matter. But I like serialize_real more.
dc17bc5 to
850a082
Compare
Should we try to remove |
|
@arkology This PR only affects |
|
This also fixes #99103 |
String::num_scientific to fix serializing
|
Discussed in the core meeting today. Attendees were generally happy with the implementation and agree that more accurate serialization is worth pursuing. Overall, we are happy to move forward with this PR Before merging it would be good to have @akien-mga's opinion on the third party code and the diff noise. |
akien-mga
left a comment
There was a problem hiding this comment.
Looks good to me overall.
A bit concerned about the impact this may have on user projects when it comes to git noise on upgrade. We'll want to make sure users get a suggestion to run the tool we have to re-save all scenes to apply those changes all at once. (CC @KoBeWi )
|
Thanks! |
We could recommend it in migration guide (I think there is one?). |
Are you getting the diffs only once, or every time that you save? We expect that you would get a diff in many scenes the first time that you save after updating. But you should only get the diff once. |
Multiple times, but looking at the commit log, it looks like I'm "fighting" with other people on the project as some of these digits go back and forth between us. |
Are you both on the same version of Godot? Are you using different systems? |
|
Same version of Godot (custom build) - Windows 10 and Windows 11. Surprisingly my Windows 10 and MacOS builds come up with the same number. Is it possible this is only happening (sometimes) for AABB because they are getting recomputed as part of the load? |
It's hard to say. The Grisu2 algorithm is very stable and shouldn't be introducing any floating point issues. But it is possible that the increased precision is exposing some existing floating point precision/determinism issues in your game. |
| } | ||
| const double value = p_variant.operator double(); | ||
| String s; | ||
| // Hack to avoid garbage digits when the underlying float is 32-bit. |
There was a problem hiding this comment.
So I found that this "Hack" is also necessary on all the Vectors, AABBs, etc. when compiling with precision=double. That or the current grisu2 implementation isn't fully stable for double precision.
There was a problem hiding this comment.
This hack works by checking if it's equivalent to the closest 32-bit float value. If you are seeing places where this is needed for vectors with precision=double then this means the underlying value isn't double somewhere, it's not a bug in Grisu2 itself. We could look for those places to try and fix issues, but it's also possible that the underlying value you're encountering is intended to be a 32-bit float, in which case we'd indeed need to apply this hack there if we want to avoid unnecessary digits being written to scene files and verison control.
There was a problem hiding this comment.
The two places I see constant git changes are when saving a mesh to a file on import, the AABB is changing between computers, even without a re-import. I haven't been able to follow that code path but it's possible that is getting downcast at some point.
The second place is surprisingly on transforms on a saved scene. Many of the node transforms will sometimes change in the scene file despite not being moved. Usually just one digit of change.
Given we use double precision for runtime physics and rendering and our individual scenes and meshes don't need the full precision, I'll likely just always cast down to single precision (when it matches) in our variant writer.
There was a problem hiding this comment.
I opened this PR to apply this to all float serialization: #110616

Supersedes PR #96676, PR #86951, and fixes #78204, fixes #99103, fixes #99763, see also #100414.
This PR replaces the algorithm in
String::num_scientificwith Grisu2 to serialize numbers with more precision. The implementation was copied from simdjson here: https://github.com/simdjson/simdjson/blob/master/src/to_chars.cpp and adjusted slightly to match the existing behavior ofString::num_scientific.What: Grisu2 is an algorithm for serializing floats in scientific notation, with enough precision to ensure they can be read back exactly, while also having the minimum amount of digits, ensuring compactness and human readability. It uses integer operations and a table of pre-computed powers of ten, so it is extremely fast.
Why: We need to serialize with more precision to ensure that a serialized number can be deserialized into the same number. For example, for the number
123456789, the closest 32-bit float is123456792. In master this is serialized as1.23457e8, which becomes123457000, over 200 off from the closest 32-bit float. With this PR, if a 32-bit float, it will be serialized with 9 digits as123456790, which can be read back as exactly123456792. 32-bit floats have 6 reliable digits, but up to 9 are needed to serialize to decimal in order to read back with full precision.For an example with 64-bit floats, I have
1.234567898765432123456789e30included in the test cases. The closest 64-bit float is1.23456789876543218850569440461e30(differs at the 8 which used to be a 2). This gets serialized as1.2345678987654322e+30which is deserialized to exactly1.23456789876543218850569440461e30. 64-bit floats have 14 reliable digits, but up to 17 are needed to serialize to decimal in order to read back with full precision.Note that the code in Variant writer for Vector2/Vector3/etc has been adjusted to work with both 32-bit and 64-bit floats, so it will correctly serialize the numbers for builds with either precision level.
Note that the docs have special code that always use the 32-bit version, since we don't need high precision in the docs.
Note that I kept the existing behavior where
num_scientificdoes not have a trailing.0, but the code I grabbed from simdjson included that, so I removed it. It would be easy to add that back in. However I also separately re-added the trailing.0for the documentation to ensure the docs are generated with.0like before.