@@ -5,7 +5,6 @@
 ****************************

 :Author: A.M. Kuchling <amk@amk.ca>
-:Release: 0.05

 .. TODO:
    Document lookbehind assertions
@@ -24,11 +23,6 @@
 Introduction
 ============

-The :mod:`re` module was added in Python 1.5, and provides Perl-style regular
-expression patterns. Earlier versions of Python came with the :mod:`regex`
-module, which provided Emacs-style patterns. The :mod:`regex` module was
-removed completely in Python 2.5.
-
 Regular expressions (called REs, or regexes, or regex patterns) are essentially
 a tiny, highly specialized programming language embedded inside Python and made
 available through the :mod:`re` module. Using this little language, you specify
@@ -264,7 +258,7 @@ performing string substitutions. ::
    >>> import re
    >>> p = re.compile('ab*')
    >>> p
-   <_sre.SRE_Pattern object at 80b4150>
+   <_sre.SRE_Pattern object at 0x...>

 :func:`re.compile` also accepts an optional *flags* argument, used to enable
 various special features and syntax variations. We'll go over the available
@@ -362,8 +356,8 @@ information about the match: where it starts and ends, the substring it matched,
 and more.

 You can learn about this by interactively experimenting with the :mod:`re`
-module. If you have Tkinter available, you may also want to look at
-:file:`Tools/scripts/redemo.py`, a demonstration program included with the
+module. If you have :mod:`tkinter` available, you may also want to look at
+:file:`Tools/demo/redemo.py`, a demonstration program included with the
 Python distribution. It allows you to enter REs and strings, and displays
 whether the RE matches or fails. :file:`redemo.py` can be quite useful when
 trying to debug a complicated RE. Phil Schwartz's `Kodos
@@ -373,11 +367,10 @@ testing RE patterns.
 This HOWTO uses the standard Python interpreter for its examples. First, run the
 Python interpreter, import the :mod:`re` module, and compile a RE::

-   Python 2.2.2 (#1, Feb 10 2003, 12:57:01)
    >>> import re
    >>> p = re.compile('[a-z]+')
    >>> p
-   <_sre.SRE_Pattern object at 80c3c28>
+   <_sre.SRE_Pattern object at 0x...>

 Now, you can try matching various strings against the RE ``[a-z]+``. An empty
 string shouldn't match at all, since ``+`` means 'one or more repetitions'.
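As a minimal sketch of the point above, the following checks that ``[a-z]+`` rejects the empty string. The second pattern, ``[a-z]*``, is an illustrative addition that does not appear in this part of the HOWTO; it is included only to contrast ``+`` with ``*``.

```python
import re

# Compile the pattern used in the text: one or more lowercase letters.
p = re.compile('[a-z]+')

# '+' requires at least one repetition, so the empty string cannot match.
empty_result = p.match('')

# Illustrative contrast (not from the HOWTO text here): '*' allows zero
# repetitions, so it matches the empty string with an empty match.
star_result = re.compile('[a-z]*').match('')

print(empty_result)               # None
print(repr(star_result.group()))  # ''
```

Because ``match()`` returns ``None`` on failure, the result can be tested directly in an ``if`` statement before calling any match methods.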
@@ -395,7 +388,7 @@ result in a variable for later use. ::

    >>> m = p.match('tempo')
    >>> m
-   <_sre.SRE_Match object at 80c4f68>
+   <_sre.SRE_Match object at 0x...>

 Now you can query the :class:`MatchObject` for information about the matching
 string. :class:`MatchObject` instances also have several methods and
@@ -434,7 +427,7 @@ case. ::
    >>> print(p.match('::: message'))
    None
    >>> m = p.search('::: message') ; print(m)
-   <re.MatchObject instance at 80c9650>
+   <_sre.SRE_Match object at 0x...>
    >>> m.group()
    'message'
    >>> m.span()
@@ -459,11 +452,11 @@ Two pattern methods return all of the matches for a pattern.

 :meth:`findall` has to create the entire list before it can be returned as the
 result. The :meth:`finditer` method returns a sequence of :class:`MatchObject`
-instances as an :term:`iterator`. [#]_ ::
+instances as an :term:`iterator`::

    >>> iterator = p.finditer('12 drummers drumming, 11 ... 10 ...')
    >>> iterator
-   <callable-iterator object at 0x401833ac>
+   <callable_iterator object at 0x...>
    >>> for match in iterator:
    ...     print(match.span())
    ...
@@ -485,7 +478,7 @@ the RE string added as the first argument, and still return either ``None`` or a
    >>> print(re.match(r'From\s+', 'Fromage amk'))
    None
    >>> re.match(r'From\s+', 'From amk Thu May 14 19:12:10 1998')
-   <re.MatchObject instance at 80c5978>
+   <_sre.SRE_Match object at 0x...>

 Under the hood, these functions simply create a pattern object for you
 and call the appropriate method on it. They also store the compiled object in a
@@ -687,7 +680,7 @@ given location, they can obviously be matched an infinite number of times.
    line, the RE to use is ``^From``. ::

       >>> print(re.search('^From', 'From Here to Eternity'))
-      <re.MatchObject instance at 80c1520>
+      <_sre.SRE_Match object at 0x...>
       >>> print(re.search('^From', 'Reciting From Memory'))
       None

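A short sketch of how ``^`` anchors the match, reusing the two strings from the example above. The ``re.MULTILINE`` variant and the sample text with an embedded newline are illustrative additions rather than part of the original example.

```python
import re

# '^From' only matches when 'From' sits at the very start of the string.
at_start = re.search('^From', 'From Here to Eternity')
in_middle = re.search('^From', 'Reciting From Memory')

# Illustrative addition: with re.MULTILINE, '^' also matches immediately
# after each newline, so 'From' at the start of the second line is found.
text = 'Reciting\nFrom Memory'
default_mode = re.search('^From', text)
multiline_mode = re.search('^From', text, re.MULTILINE)

print(at_start.span())        # (0, 4)
print(in_middle)              # None
print(default_mode)           # None
print(multiline_mode.span())  # (9, 13)
```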
@@ -699,11 +692,11 @@ given location, they can obviously be matched an infinite number of times.
    or any location followed by a newline character. ::

       >>> print(re.search('}$', '{block}'))
-      <re.MatchObject instance at 80adfa8>
+      <_sre.SRE_Match object at 0x...>
       >>> print(re.search('}$', '{block} '))
       None
       >>> print(re.search('}$', '{block}\n'))
-      <re.MatchObject instance at 80adfa8>
+      <_sre.SRE_Match object at 0x...>

    To match a literal ``'$'``, use ``\$`` or enclose it inside a character class,
    as in ``[$]``.
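The two spellings of a literal dollar sign can be compared with a small sketch. The sample string and the variable names are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical sample text containing a literal dollar sign.
text = 'Total: $42'

# Both spellings match a literal '$' followed by one or more digits.
escaped = re.search(r'\$\d+', text)
in_class = re.search(r'[$]\d+', text)

# Unescaped, '$' is an end-of-string anchor, so this pattern asks for
# digits after the end of the string and finds nothing here.
unescaped = re.search(r'$\d+', text)

print(escaped.group())   # $42
print(in_class.group())  # $42
print(unescaped)         # None
```

A raw string (``r'...'``) is used so that the backslash in ``\$`` reaches the :mod:`re` module unchanged.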
@@ -728,7 +721,7 @@ given location, they can obviously be matched an infinite number of times.

       >>> p = re.compile(r'\bclass\b')
       >>> print(p.search('no class at all'))
-      <re.MatchObject instance at 80c8f28>
+      <_sre.SRE_Match object at 0x...>
       >>> print(p.search('the declassified algorithm'))
       None
       >>> print(p.search('one subclass is'))
@@ -746,7 +739,7 @@ given location, they can obviously be matched an infinite number of times.
       >>> print(p.search('no class at all'))
       None
       >>> print(p.search('\b' + 'class' + '\b'))
-      <re.MatchObject instance at 80c3ee0>
+      <_sre.SRE_Match object at 0x...>

    Second, inside a character class, where there's no use for this assertion,
    ``\b`` represents the backspace character, for compatibility with Python's
@@ -1316,8 +1309,8 @@ a regular expression that handles all of the possible cases, the patterns will
 be *very* complicated. Use an HTML or XML parser module for such tasks.)


-Not Using re.VERBOSE
---------------------
+Using re.VERBOSE
+----------------

 By now you've probably noticed that regular expressions are a very compact
 notation, but they're not terribly readable. REs of moderate complexity can
@@ -1366,8 +1359,3 @@ reference for programming in Python. (The first edition covered Python's
 now-removed :mod:`regex` module, which won't help you much.) Consider checking
 it out from your library.

-
-.. rubric:: Footnotes
-
-.. [#] Introduced in Python 2.2.2.
-