python正则表达式

最后更新时间:2012-08-28

注:本wiki不涉及正则知识

1.编译正则表达式

正则表达式被编译成 RegexObject 实例,可以为不同的操作提供方法

compile(pattern, flags=0)

编译正则表达式,返回一个RegexObject.

其中:flags是编译时用的匹配模式.

code sample:

import re
print re.compile(r'^hello')

从字符串首开始进行匹配,匹配成功则返回一个match对象(MatchObject),否则返回None

match(pattern, string, flags=0)

使用编译了的pattern object:

import re
p=re.compile(r'^hello')
if p.match('hello world'):
    print 1
else:
    print 0

还可以这样:(python会把pattern编译成pattern对象再进行匹配,实质与上一种code一样)

import re
if re.match('^hello','hello world'):
    print 1
else:
    print 0

search(string[, pos[, endpos]]) | re.search(pattern, string[, flags]):

search() 将扫描整个字符串,并报告它找到的第一个匹配(返回MatchObject).找不到则返回None

code sample:

import re
p=re.search('abc','abcedfabcef')
if p:
    print p.group()
else:
    print 0

原型:

findall(pattern, string, flags=0)

sample code:

import re
print re.findall('\d','abcd1240sa')#['1','2','4','0']

原型:

finditer(pattern, string, flags=0)

返回一个match对象

sample code:

import re
for i in re.finditer('\d+','12hello23world45'):
    print i.span()

假如我们不用正则:

print "hello".replace('l','+')

如果我们使用正则做替换,比如下面的:

#--coding:utf-8--
import re
p=r'_(.+?)_'
pc=re.compile(p)
print pc.sub(r'<em>\1</em>',"_hello_enas_world_")

用到了sub函数:

sub(replacement, string[, count = 0])

找到所有模式匹配的字符串并用不同的字符串来替换它们.找到了返回替换后结果,否则返回原结果

注,这里使用的是原始字符串(去掉了反斜线转义机制)