Using BeautifulSoup for crawling: the script walks the site's paginated listing pages, parses each page, and appends the hot-topic titles to a text file:
Result:

Source code:
    from bs4 import BeautifulSoup
    from urllib.request import urlopen

    # Append the scraped titles to a UTF-8 text file
    with open("热门标题.txt", "a", encoding="utf-8") as f:
        for i in range(2):
            url = "http://www.ltaaa.com/wtfy-{}.html".format(i)
            html = urlopen(url).read()
            soup = BeautifulSoup(html, "html.parser")
            titles = soup.select("div[class='dtop'] a")  # CSS selector
            for title in titles:
                print(title.get_text(), title.get('href'))  # tag text, tag attribute
                f.write("标题:{}\n".format(title.get_text()))
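A minimal variant of the same idea, sketched here with the requests library instead of urllib (an assumption, not part of the original post): it adds a timeout and basic error handling, and writes the link next to each title. The `dtop` class name and the two-page range are carried over from the snippet above.

    import requests
    from bs4 import BeautifulSoup

    # Sketch only: swaps urllib for requests and adds minimal error handling.
    # The selector "div.dtop a" and the 2-page range come from the original snippet.
    with open("热门标题.txt", "a", encoding="utf-8") as f:
        for i in range(2):
            url = "http://www.ltaaa.com/wtfy-{}.html".format(i)
            try:
                resp = requests.get(url, timeout=10)
                resp.raise_for_status()
            except requests.RequestException as err:
                print("skipping", url, err)  # skip pages that fail to load
                continue
            soup = BeautifulSoup(resp.text, "html.parser")
            for a in soup.select("div.dtop a"):
                # Record both the title text and the href it points to
                f.write("标题:{}\t{}\n".format(a.get_text(strip=True), a.get("href")))

Either version works the same way; the requests-based one simply keeps going when a single page fails instead of raising and stopping the whole crawl.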
Source: http://www.bubuko.com/infodetail-2948564.html