Modifying a simple web crawler

 
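import urllib.request

# Download the page and decode it to text (UTF-8 is assumed here).
content = urllib.request.urlopen('http://www.CodeSnippet.cn/list.html').read().decode('utf-8', 'replace')

s1 = 0
while s1 >= 0:
    begin = content.find('<a', s1)       # next anchor tag
    if begin < 0:                        # find() returns -1 when nothing is left
        break
    m1 = content.find('href=', begin)    # start of the href attribute
    m2 = content.find('>', m1)           # end of the opening tag
    if m1 < 0 or m2 < 0:
        break
    s1 = m2                              # resume searching after this tag
    # The +6 / -1 offsets skip 'href="' and the closing quote,
    # so the value is assumed to be quoted.
    space = content[m1:m2].find(' ')
    if space != -1:
        # More attributes follow href: cut the value at the first space.
        url = content[m1 + 6:m1 + space - 1]
    else:
        url = content[m1 + 6:m2 - 1]
    print(url)
print("end.")
# This snippet comes from http://www.codesnippet.cn/detail/020120148350.html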
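
The snippet above finds links by string offsets, which only works when each href value is quoted and a `>` closes the tag right after the attributes. As a possible alternative, and not part of the original snippet, here is a minimal sketch of the same link extraction built on the standard library's `html.parser` module. The names `LinkCollector`, `extract_links` and `sample` are made up for this illustration, and the sample markup stands in for a downloaded page.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; quotes are already removed.
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value is not None:
                    self.links.append(value)


def extract_links(html):
    """Return all href values found in the anchor tags of the given HTML text."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Hypothetical sample markup, used here instead of a live download.
sample = '<p><a href="/a.html" class="x">A</a> <a href=\'/b.html\'>B</a> <abbr>c</abbr></p>'
for url in extract_links(sample):
    print(url)
```

Because the parser handles quoting and attribute order itself, tags such as `<abbr>` are not mistaken for links and single-quoted values come out intact.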

Source: http://www.codesnippet.cn/detail/020120148350.html
