Extracting URLs from a Web Page with Python Regular Expressions
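- import re
- import urllib.request
- url = "http://www.open-open.com"
- # Download the page; urlopen() returns bytes in Python 3, so decode to text (UTF-8 is assumed here)
- s = urllib.request.urlopen(url).read().decode("utf-8", errors="replace")
- # Remove all spaces before matching (note: this also removes spaces inside the link text)
- ss = s.replace(" ", "")
- # Collect every whole <a ... href=... </a> element, ignoring case
- urls = re.findall(r"<a.*?href=.*?</a>", ss, re.I)
- for i in urls:
-     print(i)
- # The loop has no break, so this line always runs once the loop finishes
- print('this is over')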
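The snippet above prints whole `<a>` elements rather than the addresses themselves. If only the link targets are wanted, a capturing group around the `href` value may be enough. The following is a minimal sketch, not part of the original snippet: the helper name `extract_hrefs`, the pattern `HREF_RE`, and the sample HTML are all assumptions made for illustration, and it only handles quoted `href` values.

```python
import re

# Hypothetical pattern (an assumption, not from the original snippet):
# match an <a ...> tag and capture the quoted value of its href attribute.
HREF_RE = re.compile(r"""<a\b[^>]*?\bhref\s*=\s*["']([^"']*)["']""", re.I)


def extract_hrefs(html):
    """Return the quoted href value of every <a> tag in html, in document order."""
    return HREF_RE.findall(html)


# Made-up sample input so the sketch runs without network access.
sample = '<p><a href="http://example.com/a">A</a> <A class="x" HREF = \'/b\'>B</A></p>'
print(extract_hrefs(sample))
```

Because the pattern tolerates whitespace around `=`, the page text does not need to have its spaces removed first. Regular expressions remain a rough tool for HTML; for irregular markup, a real parser such as the standard library's `html.parser` is usually more robust.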
Source: http://www.phpxs.com/code/1005015/