X Tutup
Skip to content

gh-145678: Fix use-after-free in itertools.groupby _grouper_next()#145679

Open
sampsonc wants to merge 1 commit intopython:mainfrom
sampsonc:gh-145678-grouper-uaf
Open

gh-145678: Fix use-after-free in itertools.groupby _grouper_next()#145679
sampsonc wants to merge 1 commit intopython:mainfrom
sampsonc:gh-145678-grouper-uaf

Conversation

@sampsonc
Copy link

@sampsonc sampsonc commented Mar 9, 2026

Summary

Fixes a use-after-free (UAF) in _grouper_next() in Modules/itertoolsmodule.c.

Root Cause

_grouper_next() passed igo->tgtkey and gbo->currkey directly to
PyObject_RichCompareBool() without first holding strong references.
A user-defined __eq__ can re-enter the parent groupby iterator during
the comparison. That re-entry calls groupby_step(), which executes:

Py_XSETREF(gbo->currkey, newkey);

This frees gbo->currkey while it is still under comparison — a use-after-free.

Fix

Take INCREF'd local snapshots before calling PyObject_RichCompareBool(),
mirroring the protection added to groupby_next() in gh-143543:

PyObject *tgtkey = igo->tgtkey;
PyObject *currkey = gbo->currkey;
Py_INCREF(tgtkey);
Py_INCREF(currkey);
rcmp = PyObject_RichCompareBool(tgtkey, currkey, Py_EQ);
Py_DECREF(tgtkey);
Py_DECREF(currkey);

Test plan

  • ./python -m test test_itertools -k test_grouper_next_reentrant_eq_does_not_crash
  • Run with AddressSanitizer (./configure --with-pydebug && make with ASAN enabled) to confirm no UAF

Closes gh-145678.

🤖 Generated with Claude Code

@sampsonc sampsonc requested a review from rhettinger as a code owner March 9, 2026 14:42
@python-cla-bot
Copy link

python-cla-bot bot commented Mar 9, 2026

All commit authors signed the Contributor License Agreement.

CLA signed

_grouper_next() passed igo->tgtkey and gbo->currkey directly to
PyObject_RichCompareBool() without first holding strong references.
A re-entrant __eq__ that advanced the parent groupby iterator would
call groupby_step(), which executes Py_XSETREF(gbo->currkey, newkey),
freeing currkey while it was still under comparison.

Fix by taking INCREF'd local snapshots before the comparison, mirroring
the protection added to groupby_next() in pythongh-143543.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@sampsonc sampsonc force-pushed the gh-145678-grouper-uaf branch from eb0bb61 to 79bcea0 Compare March 9, 2026 15:12
@sampsonc
Copy link
Author

sampsonc commented Mar 9, 2026

The two CI failures (Windows free-threading arm64 and Docs) are pre-existing issues in main unrelated to this PR. The docs check-warnings.py script confirmed zero new warnings from our changes, and the Windows failure is ENV_CHANGED from test_multiprocessing_spawn.test_threads.

Comment on lines +681 to +688
/* A user-defined __eq__ can re-enter groupby (via the parent iterator)
and call groupby_step(), which frees gbo->currkey via Py_XSETREF while
we are still comparing it. Take local snapshots with strong references
so INCREF/DECREF apply to the same objects even under re-entrancy. */
PyObject *tgtkey = igo->tgtkey;
PyObject *currkey = gbo->currkey;
Py_INCREF(tgtkey);
Py_INCREF(currkey);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to record historical issues in the source. General reasons for holding refs while calling user code are clear -- the trick is finding cases where this was forgotten.

Suggested change
/* A user-defined __eq__ can re-enter groupby (via the parent iterator)
and call groupby_step(), which frees gbo->currkey via Py_XSETREF while
we are still comparing it. Take local snapshots with strong references
so INCREF/DECREF apply to the same objects even under re-entrancy. */
PyObject *tgtkey = igo->tgtkey;
PyObject *currkey = gbo->currkey;
Py_INCREF(tgtkey);
Py_INCREF(currkey);
PyObject *tgtkey = Py_NewRef(igo->tgtkey);
PyObject *currkey = Py_NewRef(gbo->currkey);

(cc @picnixz, reviewer of the earlier PR)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to record historical issues in the source

Victor sometimes asks me to do that, while Serhiy usually prefers not to, and I personally avoid but I did so in the past so I'm not that consistent either. I think I did add one such comment in OrderedDict so I don't mind.

+1 for Py_NewRef, it slipped through my review my bad.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, comments are good when it's not clear why the code does something. Here, it's relatively clear that we're calling user code. The exact way user code can clear references isn't that important -- and may change in the future.


Py_NewRef: no worries, that's a style nitpick.

values = [1, 1, 2]
keys_iter = iter([Key(1, True), Key(1, False), Key(2, False)])
g = itertools.groupby(values, lambda _: next(keys_iter))
outer_grouper = g
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just call it outer_grouper, or g. No need for two names.

Comment on lines +758 to +761
# regression test for gh-145678: _grouper_next() did not protect
# gbo->currkey / igo->tgtkey before calling PyObject_RichCompareBool,
# so a reentrant __eq__ that advanced the parent groupby could free
# those objects while they were still being compared (use-after-free).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reference is enough.

Suggested change
# regression test for gh-145678: _grouper_next() did not protect
# gbo->currkey / igo->tgtkey before calling PyObject_RichCompareBool,
# so a reentrant __eq__ that advanced the parent groupby could free
# those objects while they were still being compared (use-after-free).
# regression test for gh-145678

Comment on lines +772 to +777
# Advance the parent groupby iterator from inside __eq__,
# which calls groupby_step() and frees the old currkey.
try:
next(outer_grouper)
except StopIteration:
pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The try/except is not necessary; we don't get here with an exhausted iterator.

Lose the comment.

Comment on lines +1 to +7
Fix a use-after-free in :func:`itertools.groupby`: the internal
``_grouper_next()`` function did not hold strong references to
``gbo->currkey`` and ``igo->tgtkey`` before calling
:c:func:`PyObject_RichCompareBool`, so a re-entrant ``__eq__`` that advanced
the parent iterator (triggering ``groupby_step()`` and ``Py_XSETREF`` on
``currkey``) could free those objects while they were still being compared.
The fix mirrors the protection added in :gh:`143543` for ``groupby_next()``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Fix a use-after-free in :func:`itertools.groupby`: the internal
``_grouper_next()`` function did not hold strong references to
``gbo->currkey`` and ``igo->tgtkey`` before calling
:c:func:`PyObject_RichCompareBool`, so a re-entrant ``__eq__`` that advanced
the parent iterator (triggering ``groupby_step()`` and ``Py_XSETREF`` on
``currkey``) could free those objects while they were still being compared.
The fix mirrors the protection added in :gh:`143543` for ``groupby_next()``.
Fix a use-after-free crash in :func:`itertools.groupby` when a user-defined
``__eq__`` advanced the parent iterator while the iterator of groups was advanced.
The fix mirrors the protection added in :gh:`143543` for ``groupby_next()``.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

itertools.groupby: use-after-free in _grouper_next() when __eq__ re-enters the iterator

3 participants

X Tutup