GDScript: Elide unnecessary copies in CONSTRUCT_TYPED_* opcodes#110717
GDScript: Elide unnecessary copies in CONSTRUCT_TYPED_* opcodes#110717Repiteo merged 1 commit intogodotengine:masterfrom
CONSTRUCT_TYPED_* opcodes#110717Conversation
CONSTRUCT_TYPED_* opcodesCONSTRUCT_TYPED_* opcodes
dalexeev
left a comment
There was a problem hiding this comment.
I haven't tested this, but it looks right to me. Good catch! This might change some error messages and possibly cause tests to fail, but that's okay.
|
The error messages in the tests that appear to be impacted from this change are surprisingly unhelpful. It seems like Going to have to fiddle around to see why, I guess. |
|
Okay, had to find/replace uses of One of the failing tests is this hunk from This fails because of this deviation: Note the missing Or is this some kind of floating-point printing error |
|
Well, I guess that answers my question, lol Guessing the case is the same for |
|
Well, at least for the dictionary case, using From the benchmark in #110709 (but without the |
0badca0 to
b18beb2
Compare
| const StringName native_type = _global_names_ptr[native_type_idx]; | ||
|
|
||
| Array array; | ||
| array.set_typed(builtin_type, native_type, *script_type); |
There was a problem hiding this comment.
Note that typed array resize needs to construct all elements as empty elements of the needed type. And then .set needs to replace them with the actual values.
It might still be faster than repeated push_back calls, but I'm guessing not by as much as might be expected. This should probably be tested.
Once Array gets reserve (scheduled for early 4.6 dev), I'm pretty sure the reserve + push_back approach should be faster than resize + set.
There was a problem hiding this comment.
I ran a similar test to the dictionary one with an array of length 40; the time difference between the approaches for Array is negligible. In both cases the typed array is about 50% slower than the untyped array.
Prior to this change:
time untyped: 2396ms
time typed: 3691ms
After this change:
time untyped: 2400ms
time typed: 3542ms
So it's ever so slightly faster; enough to mean we don't need to explicitly hold this change for reserve, but it would be better if reserve was present.
There was a problem hiding this comment.
Ok, thanks for testing!
Personally, since both approaches are the same speed, I'd take this as an indication to use push_back instead, since it's simpler than resize and set. Plus, we'll be able to insert reserve when it's here. But either would be fine for me right now, since we normally don't plan for future code changes.
|
Not sure what happened with the |
|
Thanks! Congratulations on your first merged contribution! 🎉 |
This PR fixes an issue that was noticed by @AdriaandeJongh while looking at the benchmark results used in #110709
In those benchmark results, the time it took to use initializer-list syntax to create a typed dictionary with 64 entries was 3x as long as the time it took to create an untyped dictionary.
This is because instead of using
set_typed, bothOPCODE_CONSTRUCT_TYPED_ARRAYandOPCODE_CONSTRUCT_TYPED_DICTIONARYfirst construct an untyped dictionary, fill it, and use the copy constructor that accepts type parameters to create a typed copy of the dictionary.By migrating these to use
set_typedinstead, and re-running the benchmark from #110709, the time for typed and untyped dictionaries is effectively identical.