Use `LocalVector` on `GDScriptInstance::members` by DeeJayLSP · Pull Request #105555 · godotengine/godot

DeeJayLSP · 2025-04-19T04:46:54Z

Modifies GDScriptInstance::members to use a LocalVector instead of Vector. This should be slightly faster and go easier on memory due to the lack of CoW.

Benchmarks

Updated as of December 3, 2025

Project used for benchmarking: bunnymark_custom.zip (modify the line in _ready() that adds bunnies as intended).

It should start and immediately spawn the desired amount of bunnies. After 30 seconds, prints the FPS amount of the last 15 seconds (the first 15 are discarded as an anti-fluctuation measure) and exits.

The command was something like for i in {1..3}; do gamemoderun ./before.x86_64 --no-header && gamemoderun ./after.x86_64 --no-header; done > bunny_count.txt on a machine with almost zero programs running in the background. More than one run is necessary due to GDScript being too unstable and sensible to fluctuations.

Unlike some previous benchmarks, I opted to keep the numbers as low as possible to keep closer to more common cases. There is virtually no difference if the amount is lower than this.

500 bunnies

`master`	PR
2514	2511
2531	2547
2537	2549

Average increase of 0.31%

1000 bunnies

`master`	PR
1677	1682
1668	1683
1670	1673

Average increase of 0.45%

5000 bunnies

`master`	PR
360	371
364	361
361	367

Average increase of 1.29%

Other considerations

On a very high amount of instances (probably near 50k bunnies), the improvement would certainly reach around 10%, it's just that it's nowhere near a common use case.

While the improvement numbers aren't that high, template release production builds on Linux had a decrease of 8 KiB. When running both before and after cases and looking at a system monitor, most of the times the after one will use a few KBs less memory.

DeeJayLSP · 2025-04-21T17:03:39Z

It seems upstream made this slower (probably #100944).

However, changing the members to a TightLocalVector instead made the performance improvement come back. GDScriptInstance::members is resized only once when an instance is created, so maybe nothing is lost by doing this. Force-pushed.

clayjohn · 2025-04-21T17:10:12Z

CC @Ivorforce @Nazarwadim

Ivorforce · 2025-04-21T18:04:54Z

Some regressions are to be expected from resize changes by pure chance. If this is significantly impacted by it it definitely needs a better solution in general.

Not sure how TighLocalVector could even help unless it's being resized every single frame. It has no effect if there's no resize. Do you have a repro for us to toy with?

DeeJayLSP · 2025-04-21T18:23:17Z

Do you have a repro for us to toy with?

I use this stress test for every PR that affects GDScript performance: https://github.com/GaijinEntertainment/godot-das/tree/master/examples/bunnymark

After further testing, I noticed there is a performance improvement only when there are too much instances:

6000 bunnies:

branch	FPS
master	122-125
PR	114-116

20000 bunnies:

branch	FPS
master	34-36
PR	36-37

Didn't bother measuring with regular LocalVector as all results were lower performance.

It seems a bit situational. I'll report after testing on production builds which matter the most.

DeeJayLSP · 2025-04-21T20:06:27Z

I'll report after testing on production builds which matter the most.

The proportions are the same, with or without production=yes

I admit that back then I was pairing the changes of this PR with #90082, which already caused a massive boost to GDScript.

I decided to test everything again with both changes from #90082 and this PR over the current master.

At 5000 bunnies:

branch	avg. FPS
#90082	163-166
#90082 + LocalVector members	163-171
#90082 + TightLocalVector members	161-171

At 10000 bunnies:

branch	avg. FPS
#90082	79-81
#90082 + LocalVector members	79-82
#90082 + TightLocalVector members	78-81

At 20000 bunnies:

branch	FPS
#90082	37-40
#90082 + LocalVector members	39-41
#90082 + TightLocalVector members	38-39

At 50000 bunnies:

branch	FPS
#90082	15-16
#90082 + LocalVector members	16-17
#90082 + TightLocalVector members	15-17

The results are too unstable, but I think maybe we should wait for #90082. A regular LocalVector seems more stable than a Tight one.

DeeJayLSP · 2025-04-23T18:22:40Z

While experimenting how fast it would be with a std::vector, I realized GDScriptInstance::reload_members() would overwrite GDScriptInstance::members by giving it a regular Vector, and casting that to LocalVector has a bit of overhead.

I have modified so it makes and assigns a LocalVector instead.

It seems whatever made the FPS drop is now gone (not sure if it was this missing change as it was fine with TightLocalVector, probably because both it and CowData are tight, or probably because of #104985 being merged), but the FPS gain is minimal (around 1.5%).

Calinou

Tested locally, it works as expected. I'm getting mixed performance results though, so it's hard to say if this is really a win.

Benchmark

PC specifications

CPU: Intel Core i9-13900K
GPU: NVIDIA GeForce RTX 4090
RAM: 64 GB (2×32 GB DDR5-5800 C30)
SSD: Solidigm P44 Pro 2 TB
OS: Linux (Fedora 42)

Using a release export template binary (production=yes lto=full) on https://github.com/GaijinEntertainment/godot-das/tree/master/examples/bunnymark's GDScript version.

Count	`master`	This PR
6,000 bunnies	566 FPS	550 FPS
20,000 bunnies	159 FPS	159 FPS
50,000 bunnies	59 FPS	60 FPS

DeeJayLSP · 2025-04-25T19:10:35Z

I'm getting mixed performance results though.

That's why I say we should wait for #90082 or something similar.

Every time some PR touches GDScript or LocalVector it changes the outcome of this one. But in all cases combined with #90082 the performance was better.

Out of curiosity I overlapped this PR over some changes of #90082 in a production build (6000 bunnies as that was the one affected negatively):

#90082	#90082 + PR
216 FPS	224 FPS

The same improvement ratio from above can be seen with 10000 or 50000 bunnies

DeeJayLSP · 2025-05-13T03:37:21Z

Rebased so I could check the benefits from #106020

While the profiler shows that the overhead of a LocalVector is significantly lower than that of CowData, there is a Variant::operator== that skyrockets. This also happens in the screenshots at the top of this PR page.

Before:

After:

At 10000 bunnies, I can measure a 9% performance improvement with a production template build, but at 1000 bunnies it varies between a 0.3% improvement and a 2.3% decrease. Performance decrease disaappears after 2000 bunnies.

I hope the issue with Variant isn't some LocalVector to Vector conversion.

EDIT: turns out most of CowData::get() overhead is simply being inherited by Variant::operator[].

I have tried to keep a regular Vector, but make GDScriptInstance store members's ptrw() solely on instantiation (GDScriptFunction::call() calls ptrw() on every iteration), but iterating over it revealed where the overhead came from. And was still slower.

Using LocalVector still results in both vector get and Variant::operator== being in their least overhead forms.

DeeJayLSP · 2025-05-14T12:52:30Z

At 1000 bunnies, master vs PR

Looks like the problem was never inside Variant::get_named or Variant::set_named, which are actually benefited by this PR. VariantSetGet_Vector2_x/y::validated_get seems to get a massive increase in cycles.

Does anyone know a possible solution for this? Or is it too reliant on CoW?

Ivorforce

I'm not sure if I agree that's a massive difference. With such small percentages, i'd be inclined to say you may be measuring CPU fluctuations. Perhaps you can coax more extreme differences with a different, more tailored benchmark?

modules/gdscript/gdscript.cpp

DeeJayLSP · 2025-05-14T19:58:57Z

It seems like whatever overhead in GDScriptInstance::set/get() created by this change can be mitigated by using std::move() when setting/getting the variant.

I don't know if doing this would create unwanted behavior or possibly cause crashes, but so far it recovers some of the performance loss (the increase in Variant::operator= mentioned here goes away and the FPS with 1000 bunnies is higher), tests pass and TPS Demo runs just fine.

I think the lack of CoW is causing Variant constructors to be called in the VM, resulting in overhead. It was fixed on set/get's side by using std::move(), and I don't know how to do it on the VM's side, or if this is even possible or viable at this point.

DeeJayLSP · 2025-12-03T02:07:25Z

Rebased after a very long time.

By the time I opened this PR I thought I knew how to read perf outputs, but #112129 taught me a lesson (didn't know callback cycles around the change increasing could potentially indicate a perfomance gain)

Here are a few changes I did:

Keeping it as Tight, as the vector never gets resized except when the GDScript instance is made or when reloading the script.
All std::move() workarounds removed (except the script reload one) as they turned out to be doing more harm than good.

On very recent tests, when it comes to Bunnymark, there should be no performance difference at 500 bunnies, maybe a 0.06% gain (from many runs with the least amount programs running on background + Feral GameMode, as GDScript is extremely sensible to fluctuations when not performing too heavy tasks).

Lastly, when running before and after at the same time, it's possible to see sometimes that memory usage isn't stable. Sometimes before consumes many more MBs, sometimes after does. But when they match, after is always one MB lower than before. Also the build is 4KiB smaller.

From a recent discussion in the Contributors Chat, the conclusion is that members here shouldn't need CoW at all (members uses a CoW vector since before the engine was open-sourced). Maybe #107002 could be dismissed.

Will update the PR description later.

DeeJayLSP · 2025-12-03T15:25:33Z

PR description updated with newer benchmarks and more considerations that are not the wrong profiler reading.

DeeJayLSP · 2026-03-09T20:55:07Z

Switched back to regular LocalVector. While the performance change is around the same and the vector is never resized, members being a LocalVector decreased build sizes by a total of 8KiB.

DeeJayLSP requested a review from a team as a code owner April 19, 2025 04:46

AThousandShips added enhancement topic:gdscript performance labels Apr 19, 2025

AThousandShips added this to the 4.x milestone Apr 19, 2025

DeeJayLSP force-pushed the gds-local-members branch from 2af5907 to 265ec22 Compare April 21, 2025 17:02

DeeJayLSP force-pushed the gds-local-members branch from 265ec22 to ebc059f Compare April 23, 2025 18:16

DeeJayLSP requested a review from a team as a code owner April 23, 2025 18:16

DeeJayLSP force-pushed the gds-local-members branch 2 times, most recently from 0ecb418 to 6a11011 Compare April 24, 2025 22:57

Calinou reviewed Apr 25, 2025

View reviewed changes

DeeJayLSP mentioned this pull request May 2, 2025

GDScript: Reserve local function Vectors with known final size #105998

Open

DeeJayLSP force-pushed the gds-local-members branch from 6a11011 to e2bee63 Compare May 13, 2025 02:28

Ivorforce reviewed May 14, 2025

View reviewed changes

modules/gdscript/gdscript.cpp Outdated Show resolved Hide resolved

DeeJayLSP force-pushed the gds-local-members branch from e2bee63 to 1e88bf9 Compare May 14, 2025 18:40

DeeJayLSP force-pushed the gds-local-members branch from 1e88bf9 to 71f6bc5 Compare May 17, 2025 05:50

DeeJayLSP mentioned this pull request May 31, 2025

GDScriptInstance: use write pointer for members #107002

Open

DeeJayLSP force-pushed the gds-local-members branch from 71f6bc5 to a64422f Compare December 3, 2025 01:54

DeeJayLSP requested review from Calinou and Ivorforce December 3, 2025 15:25

Use LocalVector on GDScriptInstance::members

f77f6ab

DeeJayLSP force-pushed the gds-local-members branch from a64422f to f77f6ab Compare March 9, 2026 20:47

Uh oh!

Conversation

DeeJayLSP commented Apr 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmarks

500 bunnies

1000 bunnies

5000 bunnies

Other considerations

Uh oh!

DeeJayLSP commented Apr 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

clayjohn commented Apr 21, 2025

Uh oh!

Ivorforce commented Apr 21, 2025

Uh oh!

DeeJayLSP commented Apr 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DeeJayLSP commented Apr 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DeeJayLSP commented Apr 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Calinou left a comment

Choose a reason for hiding this comment

Benchmark

Uh oh!

DeeJayLSP commented Apr 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DeeJayLSP commented May 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DeeJayLSP commented May 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Ivorforce left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

DeeJayLSP commented May 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DeeJayLSP commented Dec 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DeeJayLSP commented Dec 3, 2025

Uh oh!

DeeJayLSP commented Mar 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

DeeJayLSP commented Apr 19, 2025 •

edited

Loading

DeeJayLSP commented Apr 21, 2025 •

edited

Loading

DeeJayLSP commented Apr 21, 2025 •

edited

Loading

DeeJayLSP commented Apr 21, 2025 •

edited

Loading

DeeJayLSP commented Apr 23, 2025 •

edited

Loading

DeeJayLSP commented Apr 25, 2025 •

edited

Loading

DeeJayLSP commented May 13, 2025 •

edited

Loading

DeeJayLSP commented May 14, 2025 •

edited

Loading

Ivorforce left a comment •

edited

Loading

DeeJayLSP commented May 14, 2025 •

edited

Loading

DeeJayLSP commented Dec 3, 2025 •

edited

Loading

DeeJayLSP commented Mar 9, 2026 •

edited

Loading