- from urllib.robotparser import RobotFileParser
- import ssl
- from urllib.request import urlopen
- ssl._create_default_https_context = ssl._create_unverified_context
- rp = RobotFileParser()
- rp.set_url('http://www.jianshu.com/robots.txt')
- rp.read()
- print(rp.can_fetch('*', 'http://www.jianshu.com/p/b6755402d7d'))
- print(rp.can_fetch('*', 'http://www.jianshu.com/search?q=python&page=1&type=note'))
Reading and parsing with parse()
- rp = RobotFileParser()
- rp.parse(urlopen('http://www.jianshu.com/robots.txt').read().decode('utf-8').splitlines())
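To see how `parse()` and `can_fetch()` behave without touching the network, the following minimal sketch feeds the parser a robots.txt held in a string. The rules in `ROBOTS_TXT`, the `www.example.com` URLs, and the helper name `build_parser` are all assumptions made up for illustration; they are not the real rules served by jianshu.com.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Allow: /p/
"""


def build_parser(robots_text):
    """Return a RobotFileParser loaded from a robots.txt string."""
    rp = RobotFileParser()
    # parse() expects an iterable of lines, not one big string.
    rp.parse(robots_text.splitlines())
    return rp


rp = build_parser(ROBOTS_TXT)

# Article-style path: matches the "Allow: /p/" rule.
allowed_article = rp.can_fetch('*', 'http://www.example.com/p/some-article')

# Search path: matches the "Disallow: /search" rule.
allowed_search = rp.can_fetch('*', 'http://www.example.com/search?q=python')

print(allowed_article, allowed_search)
```

Because the rules come from a string, the same approach may be handy for unit-testing crawler logic, or for cases where the robots.txt has already been downloaded by other means.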
Source: http://www.bubuko.com/infodetail-2911567.html