gh-145678: Fix use-after-free in itertools.groupby _grouper_next()#145679
gh-145678: Fix use-after-free in itertools.groupby _grouper_next()#145679sampsonc wants to merge 1 commit intopython:mainfrom
Conversation
_grouper_next() passed igo->tgtkey and gbo->currkey directly to PyObject_RichCompareBool() without first holding strong references. A re-entrant __eq__ that advanced the parent groupby iterator would call groupby_step(), which executes Py_XSETREF(gbo->currkey, newkey), freeing currkey while it was still under comparison. Fix by taking INCREF'd local snapshots before the comparison, mirroring the protection added to groupby_next() in pythongh-143543. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
eb0bb61 to
79bcea0
Compare
|
The two CI failures (Windows free-threading arm64 and Docs) are pre-existing issues in main unrelated to this PR. The docs check-warnings.py script confirmed zero new warnings from our changes, and the Windows failure is ENV_CHANGED from test_multiprocessing_spawn.test_threads. |
| /* A user-defined __eq__ can re-enter groupby (via the parent iterator) | ||
| and call groupby_step(), which frees gbo->currkey via Py_XSETREF while | ||
| we are still comparing it. Take local snapshots with strong references | ||
| so INCREF/DECREF apply to the same objects even under re-entrancy. */ | ||
| PyObject *tgtkey = igo->tgtkey; | ||
| PyObject *currkey = gbo->currkey; | ||
| Py_INCREF(tgtkey); | ||
| Py_INCREF(currkey); |
There was a problem hiding this comment.
I don't think we need to record historical issues in the source. General reasons for holding refs while calling user code are clear -- the trick is finding cases where this was forgotten.
| /* A user-defined __eq__ can re-enter groupby (via the parent iterator) | |
| and call groupby_step(), which frees gbo->currkey via Py_XSETREF while | |
| we are still comparing it. Take local snapshots with strong references | |
| so INCREF/DECREF apply to the same objects even under re-entrancy. */ | |
| PyObject *tgtkey = igo->tgtkey; | |
| PyObject *currkey = gbo->currkey; | |
| Py_INCREF(tgtkey); | |
| Py_INCREF(currkey); | |
| PyObject *tgtkey = Py_NewRef(igo->tgtkey); | |
| PyObject *currkey = Py_NewRef(gbo->currkey); |
(cc @picnixz, reviewer of the earlier PR)
There was a problem hiding this comment.
I don't think we need to record historical issues in the source
Victor sometimes asks me to do that, while Serhiy usually prefers not to, and I personally avoid but I did so in the past so I'm not that consistent either. I think I did add one such comment in OrderedDict so I don't mind.
+1 for Py_NewRef, it slipped through my review my bad.
There was a problem hiding this comment.
IMO, comments are good when it's not clear why the code does something. Here, it's relatively clear that we're calling user code. The exact way user code can clear references isn't that important -- and may change in the future.
Py_NewRef: no worries, that's a style nitpick.
| values = [1, 1, 2] | ||
| keys_iter = iter([Key(1, True), Key(1, False), Key(2, False)]) | ||
| g = itertools.groupby(values, lambda _: next(keys_iter)) | ||
| outer_grouper = g |
There was a problem hiding this comment.
Just call it outer_grouper, or g. No need for two names.
| # regression test for gh-145678: _grouper_next() did not protect | ||
| # gbo->currkey / igo->tgtkey before calling PyObject_RichCompareBool, | ||
| # so a reentrant __eq__ that advanced the parent groupby could free | ||
| # those objects while they were still being compared (use-after-free). |
There was a problem hiding this comment.
The reference is enough.
| # regression test for gh-145678: _grouper_next() did not protect | |
| # gbo->currkey / igo->tgtkey before calling PyObject_RichCompareBool, | |
| # so a reentrant __eq__ that advanced the parent groupby could free | |
| # those objects while they were still being compared (use-after-free). | |
| # regression test for gh-145678 |
| # Advance the parent groupby iterator from inside __eq__, | ||
| # which calls groupby_step() and frees the old currkey. | ||
| try: | ||
| next(outer_grouper) | ||
| except StopIteration: | ||
| pass |
There was a problem hiding this comment.
The try/except is not necessary; we don't get here with an exhausted iterator.
Lose the comment.
| Fix a use-after-free in :func:`itertools.groupby`: the internal | ||
| ``_grouper_next()`` function did not hold strong references to | ||
| ``gbo->currkey`` and ``igo->tgtkey`` before calling | ||
| :c:func:`PyObject_RichCompareBool`, so a re-entrant ``__eq__`` that advanced | ||
| the parent iterator (triggering ``groupby_step()`` and ``Py_XSETREF`` on | ||
| ``currkey``) could free those objects while they were still being compared. | ||
| The fix mirrors the protection added in :gh:`143543` for ``groupby_next()``. |
There was a problem hiding this comment.
| Fix a use-after-free in :func:`itertools.groupby`: the internal | |
| ``_grouper_next()`` function did not hold strong references to | |
| ``gbo->currkey`` and ``igo->tgtkey`` before calling | |
| :c:func:`PyObject_RichCompareBool`, so a re-entrant ``__eq__`` that advanced | |
| the parent iterator (triggering ``groupby_step()`` and ``Py_XSETREF`` on | |
| ``currkey``) could free those objects while they were still being compared. | |
| The fix mirrors the protection added in :gh:`143543` for ``groupby_next()``. | |
| Fix a use-after-free crash in :func:`itertools.groupby` when a user-defined | |
| ``__eq__`` advanced the parent iterator while the iterator of groups was advanced. | |
| The fix mirrors the protection added in :gh:`143543` for ``groupby_next()``. |
Summary
Fixes a use-after-free (UAF) in
_grouper_next()inModules/itertoolsmodule.c.Root Cause
_grouper_next()passedigo->tgtkeyandgbo->currkeydirectly toPyObject_RichCompareBool()without first holding strong references.A user-defined
__eq__can re-enter the parentgroupbyiterator duringthe comparison. That re-entry calls
groupby_step(), which executes:This frees
gbo->currkeywhile it is still under comparison — a use-after-free.Fix
Take INCREF'd local snapshots before calling
PyObject_RichCompareBool(),mirroring the protection added to
groupby_next()in gh-143543:Test plan
./python -m test test_itertools -k test_grouper_next_reentrant_eq_does_not_crash./configure --with-pydebug && makewith ASAN enabled) to confirm no UAFCloses gh-145678.
🤖 Generated with Claude Code