To show how powerful regular expressions are, let's start with a small example:
# Check whether the input is a mobile phone number, using only what we have learned so far
def isPhoneNumber(s):
    if len(s) != 11:
        return False
    elif s[0] != "1":
        return False
    # Only a few common mobile number prefixes are listed here
    elif s[1:3] != "31" and s[1:3] != "32" and s[1:3] != "38" and s[1:3] != "39" and s[1:3] != "47" and s[1:3] != "51" and s[1:3] != "57" and s[1:3] != "78" and s[1:3] != "86" and s[1:3] != "88":
        return False
    # The remaining eight characters must all be digits
    for i in range(3, 11):
        if s[i] < "0" or s[i] > "9":
            return False
    return True

phoneNumber = input("Please enter your phone number: ")
print(isPhoneNumber(phoneNumber))
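# The same check written with a regular expression
import re

def isPhoneNumber(phoneNumber):
    pat = r"^1(([3578]\d)|(47))\d{8}$"
    # Prints a match object on success, None otherwise
    print(re.match(pat, phoneNumber))

phoneNumber = input("Please enter your phone number: ")
isPhoneNumber(phoneNumber)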
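The regex version prints a match object rather than True or False. As a minimal sketch, the same pattern can be wrapped so that it returns a boolean like the hand-written check; the names `PHONE_PATTERN` and `is_phone_number` are illustrative and not part of the original code.

```python
import re

# Same pattern as in the regex version, compiled once so it can be reused.
PHONE_PATTERN = re.compile(r"^1(([3578]\d)|(47))\d{8}$")

def is_phone_number(text):
    """Return True if text matches the phone number pattern, False otherwise."""
    return PHONE_PATTERN.match(text) is not None

for sample in ["13812345678", "14712345678", "12345678901", "1381234567"]:
    print(sample, is_phone_number(sample))
```

Note that this pattern is slightly more permissive than the hand-written check: it accepts any number starting with 13, 15, 17 or 18 (plus 147), not only the ten prefixes listed there.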
By now you have probably felt how powerful it is, so let's start learning regular expressions. We begin with the re module.
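import re
# re.match only tries to match at the start of the string; it does not scan the rest
print(re.match("To", "To be a better man !")) # <re.Match object; span=(0, 2), match='To'> (shown as _sre.SRE_Match before Python 3.7)
print(re.match("To", "be To a better man !")) # None
# flags=re.I makes the match case-insensitive
print(re.match("to", "To be a better man !", flags=re.I)) # <re.Match object; span=(0, 2), match='To'>
# re.search scans the whole string and returns the first successful match
print(re.search("To", "be To a better man !")) # <re.Match object; span=(3, 5), match='To'>
# re.findall scans the whole string and returns all matches as a list
print(re.findall("To", "To be a better man and to make right decision !", flags=re.I)) # ['To', 'to']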
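The match objects printed above can also be inspected directly. The following short sketch shows the `group()`, `span()`, `start()` and `end()` methods of a match object, plus `re.fullmatch`, which succeeds only when the entire string matches; the variable names are chosen for illustration.

```python
import re

m = re.search("To", "be To a better man !")
if m is not None:
    print(m.group())           # the text that matched
    print(m.span())            # (start, end) indexes of the match
    print(m.start(), m.end())  # the same indexes, separately

# re.fullmatch requires the pattern to cover the whole string
partial = re.fullmatch("To", "To be")
whole = re.fullmatch("To", "To")
print(partial)
print(whole)
```

Because `re.match` and `re.search` return None on failure, it is safest to check the result before calling methods on it.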
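import re
# "." matches any single character except a newline
print(re.findall(".", "To be a \n better man !"))
# [better] matches any one of the characters b, e, t, r
print(re.findall("[better]", "To be a \n better man !"))
# [^To be a] matches any character that is not T, o, b, e, a or a space
print(re.findall("[^To be a]", "To be a \n better man !"))
# \d matches a digit, \w a word character (letter, digit or underscore), \s a whitespace character
print(re.findall(r"\d", "95 To be a better man ! 0831")) # ['9', '5', '0', '8', '3', '1']
print(re.findall(r"\w", "95 To be a better man ! 0831"))
print(re.findall(r"\s", "95 To be a better man ! 0831"))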
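A few closely related features are worth trying alongside the character classes above. This is a small sketch with made-up sample strings: the `re.S` flag, the uppercase complement classes `\D`, `\W`, `\S`, and character ranges inside brackets.

```python
import re

text = "a\nb 1_"

# By default "." skips the newline; with re.S (re.DOTALL) it matches it too
without_dotall = re.findall(r".", text)
with_dotall = re.findall(r".", text, flags=re.S)

# Uppercase classes are the complements of the lowercase ones
non_digits = re.findall(r"\D", "a1 b2")  # anything that is not a digit
non_word = re.findall(r"\W", "a_1 !")    # anything that is not a letter, digit or underscore
non_space = re.findall(r"\S", "a b")     # anything that is not whitespace

# A hyphen inside brackets denotes a range of characters
in_range = re.findall(r"[a-c]", "abcd")

print(without_dotall)
print(with_dotall)
print(non_digits, non_word, non_space, in_range)
```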
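import re
# ^ matches at the start of the string, $ at the end
print(re.search("^To", "To be a better man !"))
print(re.search("!$", "To be a better man !"))
# \A matches only at the very start of the string, even with re.M
print(re.findall(r"\ATo", "To be a better man !\nTo be a better man !", re.M)) # ['To']
# With re.M, ^ also matches at the start of every line
print(re.findall(r"^To", "To be a better man !\nTo be a better man !", re.M)) # ['To', 'To']
# \b matches a word boundary, here the end of the word
print(re.search(r"er\b", "better"))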
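The end-of-string anchors behave symmetrically to the start anchors. The sketch below, reusing the two-line sample string, compares `$` with and without `re.M`, shows that `\Z` ignores `re.M`, and contrasts `\b` with its complement `\B`; the variable names are illustrative.

```python
import re

text = "To be a better man !\nTo be a better man !"

end_default = re.findall(r"!$", text)          # only the end of the string
end_multiline = re.findall(r"!$", text, re.M)  # the end of every line
end_z = re.findall(r"!\Z", text, re.M)         # \Z is always the very end

# \b requires a word boundary, \B requires that there is none
at_boundary = re.search(r"er\b", "better error")
not_at_boundary = re.search(r"er\B", "better error")

print(end_default, end_multiline, end_z)
print(at_boundary.span(), not_at_boundary.span())
```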
Note: in what follows, x, y and z stand for ordinary characters rather than regex metacharacters, and m and n stand for non-negative integers.
import re
# Greedy matching takes as much as possible; non-greedy matching takes as little as possible
print(re.findall(r"(better)", "To be a better man !")) # ['better']
print(re.findall(r"a?", "aaaaaa")) # ['a', 'a', 'a', 'a', 'a', 'a', '']
print(re.findall(r"a*", "aaaaaa")) # ['aaaaaa', '']
print(re.findall(r"a+", "aaaaaaba")) # ['aaaaaa', 'a']
print(re.findall(r"a{3}", "aaaaaa")) # ['aaa', 'aaa']
print(re.findall(r"a{4,}", "aaaaaabaaaacaaa")) # ['aaaaaa', 'aaaa']
print(re.findall(r"a{4,5}", "aaaaaabaaaacaaa")) # ['aaaaa', 'aaaa']
print(re.findall(r"((M|m)ark)", "Mark mark")) # [('Mark', 'M'), ('mark', 'm')]
# Escape * with a backslash and use the non-greedy .*? so that each comment is matched separately
print(re.findall(r"/\*.*?\*/", r"/* one */ /* two */ ")) # ['/* one */', '/* two */']
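To make the difference between greedy and non-greedy matching concrete, here is a small side-by-side sketch on the same comment string. It assumes the literal `*` is escaped as `\*`; the variable names are illustrative.

```python
import re

text = "/* one */ /* two */"

# Greedy: .* runs to the last "*/" it can find, swallowing both comments
greedy = re.findall(r"/\*.*\*/", text)
# Non-greedy: .*? stops at the first "*/", so each comment is matched on its own
lazy = re.findall(r"/\*.*?\*/", text)

# Appending ? makes any quantifier non-greedy
greedy_range = re.findall(r"a{4,5}", "aaaaaa")
lazy_range = re.findall(r"a{4,5}?", "aaaaaa")

print(greedy)        # ['/* one */ /* two */']
print(lazy)          # ['/* one */', '/* two */']
print(greedy_range)  # ['aaaaa']
print(lazy_range)    # ['aaaa']
```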
For practice, try writing regular expressions for QQ numbers, email addresses, phone numbers, usernames, passwords, IP addresses and URLs.
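As a starting point, here is one possible answer for the IP address exercise. It is only a sketch under stated assumptions: it covers dotted-decimal IPv4 only, requires each part to be in the range 0 to 255, and rejects leading zeros such as "01". The names `OCTET`, `IPV4_PATTERN` and `is_ipv4` are made up for this example.

```python
import re

# One number from 0 to 255 with no leading zero (an assumption of this sketch)
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"
# Four such numbers separated by literal dots
IPV4_PATTERN = re.compile(rf"{OCTET}(?:\.{OCTET}){{3}}")

def is_ipv4(text):
    """Return True if text is a dotted-decimal IPv4 address under the assumptions above."""
    return IPV4_PATTERN.fullmatch(text) is not None

for sample in ["192.168.0.1", "255.255.255.255", "256.1.1.1", "1.2.3", "01.2.3.4"]:
    print(sample, is_ipv4(sample))
```

Using `fullmatch` avoids having to write `^` and `$` in the pattern. The other exercises can be approached the same way: write down the rules first, then translate each rule into a piece of the pattern.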