Chapter 1: Text - re: Regular Expressions - Pattern Syntax (4)

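1.3.4.4 Anchoring
In addition to describing the content of the pattern to match, anchoring instructions can be used to specify where in the input text the pattern should appear. Table 1-2 lists the valid anchoring codes.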
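Table 1-2 Regular Expression Anchoring Codes

Anchoring code	Meaning
^	Start of string, or of each line in re.MULTILINE mode
$	End of string, or of each line in re.MULTILINE mode
\A	Start of string
\Z	End of string
\b	Empty string at the beginning or end of a word
\B	Empty string not at the beginning or end of a word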
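
The distinctions in the table can be checked with a short sketch. The sample strings and variable names below are made up for illustration; re.MULTILINE is the standard flag that lets ^ and $ match at line boundaries, while \A and \Z are unaffected by it.

```python
import re

# Hypothetical two-line sample used only for this illustration.
text = 'first line\nsecond line'

# Without re.MULTILINE, ^ matches only at the start of the string.
start_default = re.findall(r'^\w+', text)

# With re.MULTILINE, ^ also matches right after each newline.
start_multiline = re.findall(r'^\w+', text, re.MULTILINE)

# \A ignores re.MULTILINE and matches only at the start of the string.
start_string = re.findall(r'\A\w+', text, re.MULTILINE)

# $ with re.MULTILINE matches at the end of every line,
# while \Z matches only at the very end of the string.
end_multiline = re.findall(r'\w+$', text, re.MULTILINE)
end_string = re.findall(r'\w+\Z', text, re.MULTILINE)

# \b requires a word boundary; \B requires the absence of one.
words = 'text about testing'
t_at_word_start = re.findall(r'\bt\w+', words)
t_inside_word = re.findall(r'\Bt\B', words)

print(start_default)     # ['first']
print(start_multiline)   # ['first', 'second']
print(start_string)      # ['first']
print(end_multiline)     # ['line', 'line']
print(end_string)        # ['line']
print(t_at_word_start)   # ['text', 'testing']
print(t_inside_word)     # ['t']
```

The only inner t in the second sample is the one in the middle of 'testing', which is why \Bt\B produces a single match.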

# re_test_patterns.py
import re


def test_patterns(text, patterns):
    """Given source text and a list of patterns, look for
    matches for each pattern within the text and print
    them to stdout.
    """
    # Look for each pattern in the text and print the results.
    for pattern, desc in patterns:
        print("'{}' ({})\n".format(pattern, desc))
        print(" '{}'".format(text))
        for match in re.finditer(pattern, text):
            s = match.start()
            e = match.end()
            substr = text[s:e]
            # Pad with dots so the match lines up under the source text.
            n_backslashes = text[:s].count('\\')
            prefix = '.' * (s + n_backslashes)
            print(" {}'{}'".format(prefix, substr))
        print()


if __name__ == '__main__':
    test_patterns('abbaaabbbbaaaaa', [('ab', "'a' followed by 'b'")])

# Separate script: exercises each anchoring code with test_patterns().
from re_test_patterns import test_patterns

test_patterns(
    'This is some text -- with punctuation.',
    [(r'^\w+', 'word at start of string'),
     (r'\A\w+', 'word at start of string'),
     (r'\w+\S*$', 'word near end of string'),
     (r'\w+\S*\Z', 'word near end of string'),
     (r'\w*t\w*', 'word containing t'),
     (r'\bt\w+', 't at start of word'),
     (r'\w+t\b', 't at end of word'),
     (r'\Bt\B', 't, not start or end of word')],
)

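In this example, the patterns that match the words at the start and at the end of the string differ, because the word at the end of the string is followed by the punctuation that terminates the sentence. The pattern \w+$ would not match, because . is not considered an alphanumeric character.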
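
The effect of the trailing period can be verified directly. The following is a minimal sketch that applies re.search to the same sample sentence; the variable names are chosen here for illustration only.

```python
import re

text = 'This is some text -- with punctuation.'

# \w+$ needs a word character immediately before the end of the
# string, but the last character is '.', so nothing matches.
strict = re.search(r'\w+$', text)

# Allowing trailing non-whitespace characters with \S* lets the
# pattern extend past the period to the end of the string.
relaxed = re.search(r'\w+\S*$', text)

print(strict)            # None
print(relaxed.group(0))  # punctuation.
```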
