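What is a regular expression
A pattern, written as a compact formula, that describes a set of strings and is used to match, search, and replace text
Where to practice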
tool.oschina.NET/regex/
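The re module
Basic matching
- import re
- content = 'akjnankc123ajckncs'
- # Raw string; \w{5} and \w{4} cover the letters on either side of the three digits
- re_text = r'akj\w{5}\d{3}\w{4}ncs'
- result = re.match(re_text, content)  # returns None if nothing matches at the start
- print(result)
- print(result.group())  # the matched text
- print(result.span())   # (start, end) indices of the match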
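Because re.match returns None when the pattern does not match at the start of the string, calling .group() on the result directly can raise AttributeError. A minimal sketch of guarding against that follows; the helper name first_match is hypothetical and not part of the original notes.

```python
import re

def first_match(pattern, text):
    """Return the matched text, or None when the pattern does not match at position 0."""
    m = re.match(pattern, text)
    return m.group() if m is not None else None

content = 'akjnankc123ajckncs'
matched = first_match(r'akj\w{5}\d{3}', content)  # matches from the start of the string
missed = first_match(r'\d{3}', content)           # the digits are not at position 0
print(matched)
print(missed)
```

re.match only ever tries position 0, so the second call fails even though three digits do appear later in the string.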
Generic matching with .*
- import re
- content = 'Hello this is Aaron'
- re_text = '^Hello.*Aaron$'
- result = re.match(re_text, content)
- print(result)
- print(result.group())
- print(result.span())
Capturing a target
- import re
- content = 'Hello 123131this is Aaron'
- re_text = r'^Hello (\d+).*Aaron$'  # parentheses capture the digits as group 1
- result = re.match(re_text, content)
- print(result)
- print(result.group(1))
- print(result.span())
Greedy matching
- import re
- content = 'Hello 123131this is Aaron'
- re_text = r'^Hello.*(\d+).*Aaron$'
- result = re.match(re_text, content)
- print(result)
- print(result.group(1))  # '1': the greedy .* consumes all but the last digit
- print(result.span())
Non-greedy matching
- import re
- content = 'Hello 123131this is Aaron'
- re_text = r'^Hello.*?(\d+).*Aaron$'
- result = re.match(re_text, content)
- print(result)
- print(result.group(1))  # '123131': the non-greedy .*? consumes as little as possible
- print(result.span())
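The difference between the two quantifiers is easiest to see side by side. The sketch below reuses the sample string from these notes; the variable names greedy and lazy are illustrative only.

```python
import re

content = 'Hello 123131this is Aaron'

# Greedy: .* takes as many characters as it can, leaving one digit for the group.
greedy = re.match(r'^Hello.*(\d+).*Aaron$', content)
# Non-greedy: .*? takes as few characters as it can, so the group gets every digit.
lazy = re.match(r'^Hello.*?(\d+).*Aaron$', content)

print(greedy.group(1))
print(lazy.group(1))
```

Both patterns match the whole string; only the contents of the captured group differ.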
Match flags
By default, . does not match a newline; pass re.S (re.DOTALL) so that it does
- import re
- content = 'Hello 123131\nthis is Aaron'  # the string spans two lines
- re_text = r'^Hello.*?(\d+).*Aaron$'
- result = re.match(re_text, content, re.S)
- print(result)
- print(result.group(1))
- print(result.span())
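To illustrate what the flag changes, the following sketch runs the same pattern on the same two-line string with and without re.S. The variable names are illustrative.

```python
import re

content = 'Hello 123131\nthis is Aaron'
pattern = r'^Hello.*?(\d+).*Aaron$'

# Without re.S, '.' stops at the newline, so the pattern cannot reach 'Aaron'.
without_flag = re.match(pattern, content)
# With re.S (alias of re.DOTALL), '.' also matches '\n'.
with_flag = re.match(pattern, content, re.S)

print(without_flag)
print(with_flag.group(1))
```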
Escaping
Use a backslash (\) to escape special characters so that they match literally
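The notes give no example for escaping, so here is a small sketch under assumed input: the sample sentence is invented for illustration. It shows escaping by hand and, as an alternative, the standard-library helper re.escape for text that should be matched literally.

```python
import re

content = 'The price is $5.00 (approx.)'

# '$' and '.' are metacharacters, so each is escaped with a backslash.
manual = re.search(r'\$5\.00', content)

# re.escape adds the backslashes for an arbitrary literal string.
literal = '(approx.)'
auto = re.search(re.escape(literal), content)

print(manual.group())
print(auto.group())
```

Writing patterns as raw strings (the r prefix) keeps Python itself from interpreting the backslashes before the re module sees them.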
re.search
Scans the whole string and returns the first match, so the pattern need not start at position 0
- import re
- content = 'extra strings Hello 123131this is Aaron extra strings'
- re_text = r'Hello.*?(\d+).*Aaron'
- result = re.search(re_text, content)
- print(result)
- print(result.group(1))
- print(result.span())
re.findall
Searches the string and returns every matching substring as a list
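The notes do not include code for re.findall, so the following is a minimal sketch with invented sample strings. It also shows that when the pattern contains more than one group, each list item is a tuple of the groups.

```python
import re

content = 'Hello 123 this is Aaron 456, id 7'

# No groups: each item is the full matched substring.
numbers = re.findall(r'\d+', content)

# Two groups: each item is a tuple with one entry per group.
pairs = re.findall(r'(\w+)=(\d+)', 'a=1, b=22')

print(numbers)
print(pairs)
```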
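re.sub
Replaces every matching substring and returns the resulting string
- import re
- content = "asfawacw Hello 123123123this is Aaronawdawcwc"
- # Raw strings are needed here: in a plain string '\1' is the control character chr(1), not a backreference
- result = re.sub(r'(\d+)', r'\1 2123', content)
- print(result)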
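Two other common uses of re.sub can be sketched as follows: deleting matches by substituting an empty string, and computing each replacement with a function. The lambda and the second sample string are illustrative assumptions, not taken from the original notes.

```python
import re

content = 'asfawacw Hello 123123123this is Aaronawdawcwc'

# An empty replacement string removes every match.
removed = re.sub(r'\d+', '', content)

# The replacement may be a function that receives the match object
# and returns the replacement text.
doubled = re.sub(r'\d+', lambda m: str(int(m.group()) * 2), 'a1 b20')

print(removed)
print(doubled)
```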
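re.compile
Compiles a regular expression string into a pattern object so that the pattern can be reused
PS: every re_text above should be used this way
- import re
- content = "Hello 123123123this is Aaron"
- pattern = re.compile(r'Hello.*Aaron', re.S)
- result = pattern.match(content)  # equivalent to re.match(pattern, content)
- print(result)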
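The benefit of compiling shows up when one pattern is applied to many strings. A minimal sketch, assuming an invented list of inputs:

```python
import re

# Compile once; the flags are stored on the pattern object.
pattern = re.compile(r'Hello.*?(\d+).*Aaron', re.S)

lines = ['Hello 1 Aaron', 'no match here', 'Hello 42\nthis is Aaron']

# Reuse the compiled pattern for each input and keep only successful matches.
captured = [m.group(1) for m in map(pattern.search, lines) if m]

print(captured)
```

The pattern object exposes the same operations as the module-level functions (match, search, findall, sub), minus the pattern argument.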
Source: http://www.jianshu.com/p/099b276e2998