Merged
12 changes: 6 additions & 6 deletions Doc/howto/regex.rst
@@ -230,13 +230,13 @@ while ``+`` requires at least *one* occurrence. To use a similar example,
``ca+t`` will match ``'cat'`` (1 ``'a'``), ``'caaat'`` (3 ``'a'``\ s), but won't
match ``'ct'``.

-There are two more repeating qualifiers. The question mark character, ``?``,
+There are two more repeating operators or quantifiers. The question mark character, ``?``,

Contributor

Wrap at 80 chars.

Member Author

I think that in this case it is better to keep a line slightly longer than 80 characters than to split it into two very short lines or reformat the whole paragraph. PEP 8 allows some liberty.

Contributor

I wrongly assumed that in Python docs (as we do in Django docs) re-wrapping is stricter. TIL. Thanks 👍

Member Author

Small compromises are possible. If it were new text, it would be stricter.

matches either once or zero times; you can think of it as marking something as
being optional. For example, ``home-?brew`` matches either ``'homebrew'`` or
``'home-brew'``.

-The most complicated repeated qualifier is ``{m,n}``, where *m* and *n* are
-decimal integers. This qualifier means there must be at least *m* repetitions,
+The most complicated quantifier is ``{m,n}``, where *m* and *n* are
+decimal integers. This quantifier means there must be at least *m* repetitions,
and at most *n*. For example, ``a/{1,3}b`` will match ``'a/b'``, ``'a//b'``, and
``'a///b'``. It won't match ``'ab'``, which has no slashes, or ``'a////b'``, which
has four.
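
The bounds described in this hunk can be checked directly. This is a minimal sketch, assuming `re.fullmatch` so that the whole string has to match; the names `pattern`, `samples` and `results` are illustrative and not part of the patch.

```python
import re

# a/{1,3}b: an 'a', then between one and three slashes, then a 'b'.
pattern = re.compile(r"a/{1,3}b")

samples = ["a/b", "a//b", "a///b", "ab", "a////b"]
results = {text: pattern.fullmatch(text) is not None for text in samples}

for text, matched in results.items():
    print(f"{text!r}: {'match' if matched else 'no match'}")
```

The first three strings should be accepted, while `'ab'` (no slashes) and `'a////b'` (four slashes) should be rejected.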
@@ -245,7 +245,7 @@ You can omit either *m* or *n*; in that case, a reasonable value is assumed for
the missing value. Omitting *m* is interpreted as a lower limit of 0, while
omitting *n* results in an upper bound of infinity.

-Readers of a reductionist bent may notice that the three other qualifiers can
+Readers of a reductionist bent may notice that the three other quantifiers can
all be expressed using this notation. ``{0,}`` is the same as ``*``, ``{1,}``
is equivalent to ``+``, and ``{0,1}`` is the same as ``?``. It's better to use
``*``, ``+``, or ``?`` when you can, simply because they're shorter and easier
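
The equivalences listed in this hunk can be spot-checked on a few strings. The sketch below is only a sanity check over a hand-picked sample list, not a proof; `EQUIVALENT`, `SAMPLES` and `agree` are hypothetical names introduced for illustration.

```python
import re

# Each pair: a shorthand quantifier and its {m,n} spelling.
EQUIVALENT = [
    (r"ca*t", r"ca{0,}t"),
    (r"ca+t", r"ca{1,}t"),
    (r"ca?t", r"ca{0,1}t"),
]
SAMPLES = ["ct", "cat", "caat", "caaat", "dog"]


def agree(short, braces, samples=SAMPLES):
    """Return True if both patterns accept exactly the same sample strings."""
    return all(
        bool(re.fullmatch(short, s)) == bool(re.fullmatch(braces, s))
        for s in samples
    )


for short, braces in EQUIVALENT:
    print(short, braces, agree(short, braces))
```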
@@ -803,7 +803,7 @@ which matches the header's value.
Groups are marked by the ``'('``, ``')'`` metacharacters. ``'('`` and ``')'``
have much the same meaning as they do in mathematical expressions; they group
together the expressions contained inside them, and you can repeat the contents
-of a group with a repeating qualifier, such as ``*``, ``+``, ``?``, or
+of a group with a quantifier, such as ``*``, ``+``, ``?``, or
``{m,n}``. For example, ``(ab)*`` will match zero or more repetitions of
``ab``. ::
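
The hunk ends before the literal block that follows this sentence. As a rough illustration of a quantifier applied to a group, assuming a sample string made of five repetitions of `ab` (the string and variable names are chosen here for illustration):

```python
import re

p = re.compile(r"(ab)*")
m = p.match("ababababab")

print(m.span())    # span of the whole match
print(m.group(1))  # text captured by the last repetition of the group
```

Because `*` also allows zero repetitions, the same pattern matches the empty string at the start of input that does not begin with `ab`.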

@@ -1326,7 +1326,7 @@ backtrack character by character until it finds a match for the ``>``. The
final match extends from the ``'<'`` in ``'<html>'`` to the ``'>'`` in
``'</title>'``, which isn't what you want.

-In this case, the solution is to use the non-greedy qualifiers ``*?``, ``+?``,
+In this case, the solution is to use the non-greedy quantifiers ``*?``, ``+?``,
``??``, or ``{m,n}?``, which match as *little* text as possible. In the above
example, the ``'>'`` is tried immediately after the first ``'<'`` matches, and
when it fails, the engine advances a character at a time, retrying the ``'>'``
16 changes: 8 additions & 8 deletions Doc/library/re.rst
@@ -87,7 +87,7 @@ Some characters, like ``'|'`` or ``'('``, are special. Special
characters either stand for classes of ordinary characters, or affect
how the regular expressions around them are interpreted.

-Repetition qualifiers (``*``, ``+``, ``?``, ``{m,n}``, etc) cannot be
+Repetition operators or quantifiers (``*``, ``+``, ``?``, ``{m,n}``, etc) cannot be

Contributor

Wrap at 80 chars.

Member Author

It is not PEP 8, but some other PEP for documentation. In any case, common sense is the main principle.

directly nested. This avoids ambiguity with the non-greedy modifier suffix
``?``, and with other modifiers in other implementations. To apply a second
repetition to an inner repetition, parentheses may be used. For example,
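
The hunk cuts off before the example in the original text. As a hedged sketch of the rule stated above, the helper below (a hypothetical `compiles` function, not part of `re`) checks whether a pattern is accepted; the two patterns are chosen here for illustration.

```python
import re


def compiles(pattern):
    """Return True if `pattern` is accepted by re.compile, False otherwise."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


print(compiles(r"a{6}*"))      # repetition applied directly to a repetition
print(compiles(r"(?:a{6})*"))  # inner repetition wrapped in a non-capturing group
```

With the group in place, the second pattern should match any multiple of six `'a'` characters.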
@@ -146,10 +146,10 @@ The special characters are:
single: ??; in regular expressions

``*?``, ``+?``, ``??``
-The ``'*'``, ``'+'``, and ``'?'`` qualifiers are all :dfn:`greedy`; they match
+The ``'*'``, ``'+'``, and ``'?'`` quantifiers are all :dfn:`greedy`; they match

Contributor

Wrap at 80 chars.

as much text as possible. Sometimes this behaviour isn't desired; if the RE
``<.*>`` is matched against ``'<a> b <c>'``, it will match the entire
-string, and not just ``'<a>'``. Adding ``?`` after the qualifier makes it
+string, and not just ``'<a>'``. Adding ``?`` after the quantifier makes it
perform the match in :dfn:`non-greedy` or :dfn:`minimal` fashion; as *few*
characters as possible will be matched. Using the RE ``<.*?>`` will match
only ``'<a>'``.
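
The greedy and non-greedy behaviour described in this entry can be reproduced with the string from the text. A minimal sketch; the variable names are illustrative:

```python
import re

text = "<a> b <c>"

greedy = re.match(r"<.*>", text)   # .* runs to the last '>' it can reach
lazy = re.match(r"<.*?>", text)    # .*? stops at the first '>' that works

print(greedy.group())
print(lazy.group())
```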
@@ -160,11 +160,11 @@ The special characters are:
single: ?+; in regular expressions

``*+``, ``++``, ``?+``
-Like the ``'*'``, ``'+'``, and ``'?'`` qualifiers, those where ``'+'`` is
+Like the ``'*'``, ``'+'``, and ``'?'`` quantifiers, those where ``'+'`` is
appended also match as many times as possible.
-However, unlike the true greedy qualifiers, these do not allow
+However, unlike the true greedy quantifiers, these do not allow
back-tracking when the expression following it fails to match.
-These are known as :dfn:`possessive` qualifiers.
+These are known as :dfn:`possessive` quantifiers.
For example, ``a*a`` will match ``'aaaa'`` because the ``a*`` will match
all 4 ``'a'``s, but, when the final ``'a'`` is encountered, the
expression is backtracked so that in the end the ``a*`` ends up matching
@@ -198,15 +198,15 @@ The special characters are:
``{m,n}?``
Causes the resulting RE to match from *m* to *n* repetitions of the preceding
RE, attempting to match as *few* repetitions as possible. This is the
-non-greedy version of the previous qualifier. For example, on the
+non-greedy version of the previous quantifier. For example, on the
6-character string ``'aaaaaa'``, ``a{3,5}`` will match 5 ``'a'`` characters,
while ``a{3,5}?`` will only match 3 characters.

``{m,n}+``
Causes the resulting RE to match from *m* to *n* repetitions of the
preceding RE, attempting to match as many repetitions as possible
*without* establishing any backtracking points.
-This is the possessive version of the qualifier above.
+This is the possessive version of the quantifier above.
For example, on the 6-character string ``'aaaaaa'``, ``a{3,5}+aa``
attempt to match 5 ``'a'`` characters, then, requiring 2 more ``'a'``s,
will need more characters than available and thus fail, while
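
The greedy, non-greedy and possessive forms of `{m,n}` in this hunk can be compared side by side on the 6-character string from the text. This is a sketch under one assumption: the possessive form is only tried on Python 3.11 or later, since older versions reject the pattern, so it is guarded with a version check. The variable names are illustrative.

```python
import re
import sys

text = "aaaaaa"  # the 6-character string used in the entries above

greedy = re.match(r"a{3,5}", text)           # takes as many 'a's as allowed
lazy = re.match(r"a{3,5}?", text)            # takes as few 'a's as allowed
backtracking = re.match(r"a{3,5}aa", text)   # gives one 'a' back so 'aa' fits

print(len(greedy.group()), len(lazy.group()), len(backtracking.group()))

# Possessive quantifiers are guarded: before Python 3.11 this pattern
# does not compile.
if sys.version_info >= (3, 11):
    possessive = re.match(r"a{3,5}+aa", text)  # keeps all 5, so 'aa' cannot fit
    print(possessive)
```

The possessive attempt is expected to find no match, because the five characters it consumed are never released for the trailing `aa`.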
2 changes: 1 addition & 1 deletion Doc/whatsnew/3.11.rst
@@ -298,7 +298,7 @@ os
re
--

-* Atomic grouping (``(?>...)``) and possessive qualifiers (``*+``, ``++``,
+* Atomic grouping (``(?>...)``) and possessive quantifiers (``*+``, ``++``,
``?+``, ``{m,n}+``) are now supported in regular expressions.
(Contributed by Jeffrey C. Jacobs and Serhiy Storchaka in :issue:`433030`.)

10 changes: 5 additions & 5 deletions Lib/test/test_re.py
@@ -2038,9 +2038,9 @@ def test_bug_40736(self):
with self.assertRaisesRegex(TypeError, "got 'type'"):
re.search("x*", type)

-def test_possessive_qualifiers(self):
-"""Test Possessive Qualifiers
-Test qualifiers of the form @+ for some repetition operator @,
+def test_possessive_quantifiers(self):
+"""Test Possessive Quantifiers
+Test quantifiers of the form @+ for some repetition operator @,
e.g. x{3,5}+ meaning match from 3 to 5 greadily and proceed
without creating a stack frame for rolling the stack back and
trying 1 or more fewer matches."""
@@ -2077,7 +2077,7 @@ def test_possessive_qualifiers(self):
self.assertIsNone(re.match("^x{}+$", "xxx"))
self.assertTrue(re.match("^x{}+$", "x{}"))

-def test_fullmatch_possessive_qualifiers(self):
+def test_fullmatch_possessive_quantifiers(self):
self.assertTrue(re.fullmatch(r'a++', 'a'))
self.assertTrue(re.fullmatch(r'a*+', 'a'))
self.assertTrue(re.fullmatch(r'a?+', 'a'))
@@ -2096,7 +2096,7 @@ def test_fullmatch_possessive_qualifiers(self):
self.assertIsNone(re.fullmatch(r'(?:ab)?+', 'abc'))
self.assertIsNone(re.fullmatch(r'(?:ab){1,3}+', 'abc'))

-def test_findall_possessive_qualifiers(self):
+def test_findall_possessive_quantifiers(self):
self.assertEqual(re.findall(r'a++', 'aab'), ['aa'])
self.assertEqual(re.findall(r'a*+', 'aab'), ['aa', '', ''])
self.assertEqual(re.findall(r'a?+', 'aab'), ['a', 'a', '', ''])
@@ -1,2 +1,2 @@
-Add support of atomic grouping (``(?>...)``) and possessive qualifiers
+Add support of atomic grouping (``(?>...)``) and possessive quantifiers
(``*+``, ``++``, ``?+``, ``{m,n}+``) in :mod:`regular expressions <re>`.