Clean up `String::find` and similar functions to remove duplicate code, and speed up comparison. by Ivorforce · Pull Request #101247 · godotengine/godot

Ivorforce · 2025-01-07T21:40:26Z

I cleaned up all String find-like functions. They were very inconsistent implementation wise, unnecessarily long, and had one or two opportunities for optimization.

Changes

Abstracted ranged string compare into string_equal_lower, to make implementations simpler.
Made variable names consistent.
Shortened implementation where possible.
Moved bounds checks out of the loop (-> 10% speed up).
Used memcmp where possible (-> 2x speed up).
I missed that one of the two implementations had if (src_len == 1) already. Due to my check added in Optimize string single char contains calls. #100024, this branch is no longer necessary.
Added a few edge case tests.

I'm pretty sure the new code doesn't change any behavior, except print no errors if p_from was too large on find.

Benchmark

My changes can improve string-string case sensitive finds (find, rfind, findmk) by 2x:

	String s1 = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.";
	String s2 = "Ut enim ad minim veniam";

	{
		auto t0 = std::chrono::high_resolution_clock::now();
		for (int i = 0; i < 10000000; i ++) {
			// Test
			int r = s1.find(s2);
		}
		auto t1 = std::chrono::high_resolution_clock::now();
		std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count() << "ms\n";
	}

2131ms on master (5b52b4b)
1054ms on this PR (3e000ed43dd6905f77c4840d1ab49bc443ab6cb7)

All other find functions will experience a ~10% speed up (tested earlier).

Ivorforce · 2025-01-13T14:55:22Z

When implementing #101493 I realized I should have used memcmp in the first place! This simple change improved performance by about 2x, see above.

core/string/ustring.cpp

Ivorforce · 2025-01-16T12:08:00Z

I included the other find-like function in the refactor (thanks to @ColinSORourke for the suggestion).
The functions were very inconsistent and need the clean up. This also means that rfind and findmk can benefit from the memcmp optimization.
@kiroxas I invalidated your approval since the change is quite large.

core/string/ustring.cpp

Ivorforce · 2025-09-22T12:48:19Z

Since #104332 was merged, a good bit of this PR was superseded.
I ended up rewriting it (almost) from scratch. But there is still significant simplification to be had in these functions.

core/string/ustring.cpp

kiroxas · 2025-09-22T14:23:37Z

core/templates/span.h

-constexpr int64_t Span<T>::rfind_sequence(const Span<T> &p_span, uint64_t p_from) const {
+template <typename T1>
+constexpr int64_t Span<T>::rfind_sequence(const Span<T1> &p_span, uint64_t p_from) const {
 	for (int64_t i = p_from; i >= 0; i--) {


should probably be i = Minimum(p_from, size() - p_span.size()) or spans_equal might search out of bounds

Span is designed to be a low-level API, much like C++ low level APIs. In that, it assumes the caller knows what they're doing, and passes valid arguments.
If Span was defensive, and checked for the size against the parameter, it could measurably slow down rfind_sequence calls in a loop (which is something we actually do). The reason being that, on every iteration, it would make an unnecessary size check, even though it is known that the passed index is correct.

If that's intentionnal then it's all good. It's just that you cannot shoot yourself in the foot with find_sequence, but you can with rfind_sequence, just need (a little) extra care when reviewing code where it's implied.

Is it worth DEV_ASSERT()ing this?

Good idea! Note that the function wasn't introduced in this PR, but I guess it's a good opportunity to foolproof it a bit.

lawnjelly · 2025-11-18T17:41:50Z

This looks fine barring query about the DEV_ASSERT(). Once cleared up am happy to approve. 👍

core/templates/span.h

…on with `memcmp` where possible.

lawnjelly

Looks good.

(I haven't tested, buy I trust the author has tested, and from discussions in RC it seems like the string stuff would crash immediately if problems.)

Ivorforce · 2025-11-20T09:19:30Z

Looks good.

(I haven't tested, buy I trust the author has tested, and from discussions in RC it seems like the string stuff would crash immediately if problems.)

I added the new unit tests for this reason. They should cover the changes' correctness.

Repiteo · 2025-11-20T17:18:38Z

Thanks!

Ivorforce requested a review from a team as a code owner January 7, 2025 21:40

Ivorforce changed the title ~~Clean up String::find to remove duplicate code, and speed up comparison slightly.~~ Clean up String::find to remove duplicate code, and speed up comparison slightly. Jan 7, 2025

clayjohn added enhancement topic:core performance labels Jan 7, 2025

clayjohn added this to the 4.x milestone Jan 7, 2025

Ivorforce force-pushed the string-find-cleanup branch from 22cebbb to 3e000ed Compare January 13, 2025 14:52

Ivorforce changed the title ~~Clean up String::find to remove duplicate code, and speed up comparison slightly.~~ Clean up String::find to remove duplicate code, and speed up comparison. Jan 13, 2025

kiroxas approved these changes Jan 13, 2025

View reviewed changes

core/string/ustring.cpp Outdated Show resolved Hide resolved

core/string/ustring.cpp Outdated Show resolved Hide resolved

Ivorforce force-pushed the string-find-cleanup branch 2 times, most recently from 4205232 to 88bedca Compare January 16, 2025 12:05

Ivorforce requested a review from a team as a code owner January 16, 2025 12:05

Ivorforce requested a review from kiroxas January 16, 2025 12:05

Ivorforce force-pushed the string-find-cleanup branch 4 times, most recently from effe962 to e4b39cf Compare January 16, 2025 12:55

Ivorforce changed the title ~~Clean up String::find to remove duplicate code, and speed up comparison.~~ Clean up String::find and similar functions to remove duplicate code, and speed up comparison. Jan 16, 2025

kiroxas approved these changes Jan 16, 2025

View reviewed changes

core/string/ustring.cpp Outdated Show resolved Hide resolved

core/string/ustring.cpp Outdated Show resolved Hide resolved

core/string/ustring.cpp Outdated Show resolved Hide resolved

core/string/ustring.cpp Outdated Show resolved Hide resolved

Ivorforce force-pushed the string-find-cleanup branch from e4b39cf to 348433a Compare January 16, 2025 13:33

Ivorforce force-pushed the string-find-cleanup branch from 348433a to 90468e2 Compare May 12, 2025 19:57

Ivorforce force-pushed the string-find-cleanup branch from 90468e2 to b61ed1a Compare September 22, 2025 12:47

Ivorforce force-pushed the string-find-cleanup branch 2 times, most recently from 0c12dc1 to 8206228 Compare September 22, 2025 13:05

kiroxas reviewed Sep 22, 2025

View reviewed changes

Ivorforce force-pushed the string-find-cleanup branch from 8206228 to e0046ca Compare November 17, 2025 17:41

Ivorforce requested a review from lawnjelly November 17, 2025 17:46

Ivorforce force-pushed the string-find-cleanup branch from e0046ca to 750efec Compare November 19, 2025 13:09

lawnjelly reviewed Nov 19, 2025

View reviewed changes

core/templates/span.h Outdated Show resolved Hide resolved

Clean up String::find to remove duplicate code, and speed up comparis…

43a8009

…on with `memcmp` where possible.

Ivorforce force-pushed the string-find-cleanup branch from 750efec to 43a8009 Compare November 19, 2025 21:39

lawnjelly approved these changes Nov 20, 2025

View reviewed changes

Repiteo modified the milestones: 4.x, 4.6 Nov 20, 2025

Repiteo merged commit 8deb221 into godotengine:master Nov 20, 2025
20 checks passed

Ivorforce deleted the string-find-cleanup branch November 20, 2025 17:52

Ivorforce mentioned this pull request Nov 22, 2025

Fix String::rfindn for strings with only one character. #113044

Merged

Uh oh!

Conversation

Ivorforce commented Jan 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Benchmark

Uh oh!

Ivorforce commented Jan 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Ivorforce commented Jan 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Ivorforce commented Sep 22, 2025

Uh oh!

Uh oh!

kiroxas Sep 22, 2025

Choose a reason for hiding this comment

Uh oh!

Ivorforce Sep 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kiroxas Sep 22, 2025

Choose a reason for hiding this comment

Uh oh!

lawnjelly Nov 18, 2025

Choose a reason for hiding this comment

Uh oh!

Ivorforce Nov 19, 2025

Choose a reason for hiding this comment

Uh oh!

lawnjelly commented Nov 18, 2025

Uh oh!

Uh oh!

lawnjelly left a comment

Choose a reason for hiding this comment

Uh oh!

Ivorforce commented Nov 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Repiteo commented Nov 20, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Ivorforce commented Jan 7, 2025 •

edited

Loading

Ivorforce commented Jan 13, 2025 •

edited

Loading

Ivorforce commented Jan 16, 2025 •

edited

Loading

Ivorforce Sep 22, 2025 •

edited

Loading

Ivorforce commented Nov 20, 2025 •

edited

Loading