X Tutup
Skip to content

ResourceLoader: Fix potential infinite recursion in progress reporting#113114

Merged
akien-mga merged 1 commit intogodotengine:masterfrom
RandomShaper:fix_load_progress_cycles
Dec 2, 2025
Merged

ResourceLoader: Fix potential infinite recursion in progress reporting#113114
akien-mga merged 1 commit intogodotengine:masterfrom
RandomShaper:fix_load_progress_cycles

Conversation

@RandomShaper
Copy link
Member

The issue has been found by @DarioSamo.

The problem is that the boundaries between multiple dependency trees are trespassed for progress reporting, which is not bad per se, but that can lead to cycles, which in turn lead to infinite recursion.

@RandomShaper RandomShaper added this to the 4.6 milestone Nov 24, 2025
@RandomShaper RandomShaper requested a review from a team as a code owner November 24, 2025 16:34
@RandomShaper RandomShaper added bug topic:core needs testing cherrypick:4.4 Considered for cherry-picking into a future 4.4.x release cherrypick:4.5 Considered for cherry-picking into a future 4.5.x release labels Nov 24, 2025
@RandomShaper RandomShaper force-pushed the fix_load_progress_cycles branch from 46dcc5e to 17ac9b3 Compare November 24, 2025 16:35
Copy link
Contributor

@DarioSamo DarioSamo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes the issue for me, although I think there's potentially a better way to fix it by addressing the root cause. However, this should address a critical issue present at the moment that makes the threaded background loader not viable to ship otherwise.

@RandomShaper
Copy link
Member Author

Elaborating, this fix is natural to the current approach to progress computation. The current algorithm doesn't isolate different load trees so a given dependency being loaded, present in multiple ones, can contribute to the progress report of both trees. This implies that cycles can happen, something we hadn't add a countermeasure against yet.

The "root cause," as I see this, is precisely the whole current approach to progress reporting. There may be a better way to do it, which is more direct or freer or pathological cases.

@RandomShaper RandomShaper force-pushed the fix_load_progress_cycles branch from 17ac9b3 to 84dc324 Compare November 26, 2025 13:58
@MarianoGnu
Copy link
Contributor

CICD complains
'awaited': default member initializers for bit-fields requires at least '/std:c++20'
default member initializer for bit-field is a C++20 extension

@RandomShaper RandomShaper force-pushed the fix_load_progress_cycles branch from 84dc324 to 8480dc2 Compare December 1, 2025 06:22
Copy link
Member

@Ivorforce Ivorforce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand what's causing cycles here, but the addition of the bool is cheap, so it's a simple fix.
I assume you have made sure that the function cannot be called / isn't called on multiple threads at once.


bool awaited : 1; // If it's in the pool, this helps not awaiting from more than one dependent thread.
bool need_wait : 1;
bool in_progress_check : 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add a comment like // Measure against recursion cycles in progress reporting. Cycles are not expected, but can happen due to how it's currently implemented.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm adding an explanation, not exactly that one.

@RandomShaper RandomShaper force-pushed the fix_load_progress_cycles branch from 8480dc2 to 3596d75 Compare December 1, 2025 11:21
@RandomShaper RandomShaper force-pushed the fix_load_progress_cycles branch from 3596d75 to 5806e3c Compare December 1, 2025 11:33
@RandomShaper
Copy link
Member Author

I assume you have made sure that the function cannot be called / isn't called on multiple threads at once.

This function is only called with the lock held.

@DarioSamo
Copy link
Contributor

DarioSamo commented Dec 1, 2025

I'm not sure I understand what's causing cycles here, but the addition of the bool is cheap, so it's a simple fix. I assume you have made sure that the function cannot be called / isn't called on multiple threads at once.

It is thread-safe as far as data races go, but there's a random factor from said threads affecting the "stack" in the relevant code that means there's a good chance child resources will add a parent resource as a subtask, leading to an infinite loop. For example, a .mesh ends up adding a sub-task of a .tscn being considered a sub-task, and said .tscn has the subtask of loading the same mesh. This is what led to this being very random and hard to track when I investigated it and why it's not easy to make a consistent MRP for it.

Pedro claims fixing this scenario would require a redesign, and I'm not really sure I follow along the logic of the code too well to understand how it works yet, but fixing it this way is certainly better than leaving it unattended. Since the function is used only for cosmetic purposes (computing the progress), it should be a better outcome than it causing a crash from stack overflow.

@akien-mga akien-mga merged commit a184801 into godotengine:master Dec 2, 2025
20 checks passed
@akien-mga
Copy link
Member

Thanks!

@akien-mga akien-mga removed cherrypick:4.4 Considered for cherry-picking into a future 4.4.x release cherrypick:4.5 Considered for cherry-picking into a future 4.5.x release labels Jan 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants

X Tutup