X Tutup
Skip to content

Reuse/optimize common OperatorEvaluator*::evaluate logic#113132

Merged
akien-mga merged 1 commit intogodotengine:masterfrom
aaronp64:variant_op_eval
Dec 2, 2025
Merged

Reuse/optimize common OperatorEvaluator*::evaluate logic#113132
akien-mga merged 1 commit intogodotengine:masterfrom
aaronp64:variant_op_eval

Conversation

@aaronp64
Copy link
Contributor

Added CommonEvaluate class to be used by OperatorEvaluator* classes that have the same logic for evaluate and validated_evaluate. CommonEvaluate initializes the return Variant to the correct type, then passes through to validated_evaluate, reducing code duplication between evaluate/validated_evaluate and avoiding the overhead of creating/destroying temporary Variants from previous evaluate implementations.

The main performance benefit from this change is for methods that call Variant::evaluate directly, such as sort and min/max methods. GDScript handling of operators already has some optimizations to use VariantInternal::initialize/validated_evaluate instead of evaluate, as long as the same operator expression continues to be called with the same Variant types. If the types change, it will fall back to evaluate, which would be faster with this change. The sum_int and sum_float benchmarks below are meant to show this case - sum_int uses ints when calculating the sum in sum_array, then sum_float calls sum_array using floats.

Compared with gdscript below:

func _ready():
	run_test("sort_int")
	run_test("sort_str")
	run_test("min_max")
	run_test("sum_int")
	run_test("sum_float")
	run_test("sort_dict")

func run_test(test_name: String):
	var start := Time.get_ticks_msec()
	call(test_name)
	var end := Time.get_ticks_msec()
	print("%s: %dms" % [test_name.rpad(10), end - start])

func sort_int():
	var a := range(10000)
	for i in 100:
		a.shuffle()
		a.sort()

func sort_str():
	var a := range(10000).map(str)
	for i in 100:
		a.shuffle()
		a.sort()

func min_max():
	var a := range(10000)
	for i in 1000:
		a.min()
		a.max()

func sum_int():
	var a := range(10000)
	for i in 1000:
		sum_array(a)

func sum_float():
	var a := range(10000).map(func(i): return float(i))
	for i in 1000:
		sum_array(a)

func sum_array(a):
	var s = 0
	for n in a:
		s += n
	return s

func sort_dict():
	var d := {}
	for i in 10000:
		d[str(i)] = i
	for i in 1000:
		d.sort()

old:

sort_int  : 205ms
sort_str  : 283ms
min_max   : 224ms
sum_int   : 284ms
sum_float : 363ms
sort_dict : 106ms

new:

sort_int  : 126ms
sort_str  : 231ms
min_max   : 111ms
sum_int   : 286ms
sum_float : 326ms
sort_dict : 80ms

Copy link
Member

@Ivorforce Ivorforce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I was really hoping this would be refactored some time, so I'm grateful you are taking care of this.

The code makes sense and looks good to me. I'm actually surprised it makes this much of a difference!
I've got just one comment where I think we should be able to squeeze out some more performance.

Added CommonEvaluate class to be used by OperatorEvaluator* classes that have the same logic for `evaluate` and `validated_evaluate`. CommonEvaluate initializes the return Variant to the correct type, then passes through to validated_evaluate, reducing code duplication between evaluate/validated_evaluate and avoiding the overhead of creating/destroying temporary Variants from previous evaluate implementations.
Copy link
Member

@Ivorforce Ivorforce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me!

I benchmarked with your tests as well.
The improvement is not nearly as good for me as it is for you (did you benchmark an unoptimized build perhaps?), but the improvement is consistent, and what I'm measuring is more along the lines of what I would expect from this kind of change too. (I hope my latest proposed change isn't what broke it though! 😅 Edit: Seems not to be the case :) )

master (235a32ad11f40ecba26d6d9ceea8ab245c13adb0)
sort_int  : 161ms
sort_str  : 265ms
min_max   : 183ms
sum_int   : 299ms
sum_float : 357ms
sort_dict : 94ms

this pr (5f7566a72ec265335b006a337287b976fd77e3fd)
sort_int  : 130ms
sort_str  : 242ms
min_max   : 143ms
sum_int   : 301ms
sum_float : 340ms
sort_dict : 84ms

this pr, earlier state (d4a55a2206a2c8721d49794f2d1124f419d3286f)
sort_int  : 128ms
sort_str  : 242ms
min_max   : 142ms
sum_int   : 300ms
sum_float : 340ms
sort_dict : 86ms

@Ivorforce Ivorforce modified the milestones: 4.x, 4.6 Nov 28, 2025
@aaronp64
Copy link
Contributor Author

The improvement is not nearly as good for me as it is for you (did you benchmark an unoptimized build perhaps?), but the improvement is consistent, and what I'm measuring is more along the lines of what I would expect from this kind of change too. (I hope my latest proposed change isn't what broke it though! 😅 Edit: Seems not to be the case :) )

Sorry, should have mentioned in my last comment, I compared with initialize/change and got pretty much the same numbers, so I don't think that slowed it down. :)

I built with scons platform=linuxbsd production=yes debug_symbols=yes tests=yes, I'm actually not sure if there's more optimization options, or if debug_symbols limits what it can do.

@Ivorforce
Copy link
Member

Ivorforce commented Nov 28, 2025

I built with scons platform=linuxbsd production=yes debug_symbols=yes tests=yes, I'm actually not sure if there's more optimization options, or if debug_symbols limits what it can do.

The most prominent optimization option would be optimize=. It defaults to speed (-O3) for template_release builds, and to speed_trace (-O2) for template_debug builds (which is what you are building by default).
Testing with -O2 is appropriate, so I have no idea why the improvement is so much better for you than it is for me. I guess it could come down to compiler and hardware specifics.

@akien-mga akien-mga merged commit 37cfae8 into godotengine:master Dec 2, 2025
20 checks passed
@akien-mga
Copy link
Member

Thanks!

@aaronp64 aaronp64 deleted the variant_op_eval branch December 2, 2025 15:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

X Tutup