Python: the requests module

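The requests module is one I picked up while learning web scraping. Its API is simple and pleasant to use, so here is a brief introduction to how it works.

It really is easy to use. A few lines of code are enough to fetch the content of a web page: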

import requests
url = 'http://www.juzimi.com/ju/252304'
response = requests.get(url)  # send a GET request
print(response.text)          # response body decoded to text

It supports many HTTP request types: GET, POST, PUT, DELETE, HEAD, and OPTIONS.
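Each of these methods has a helper function of the same name (requests.get, requests.post, and so on). As a minimal offline sketch, the snippet below builds one request per method without sending anything. The httpbin URL is only a placeholder assumed for illustration.

```python
import requests

URL = "http://httpbin.org/"  # placeholder URL, assumed for illustration

# Build (but do not send) one request per HTTP method.
prepared = [
    requests.Request(method, URL).prepare()
    for method in ("get", "post", "put", "delete", "head", "options")
]

# The prepared request stores the method name in upper case.
methods = [p.method for p in prepared]
print(methods)
```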

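The response body can be read in two ways:

.content returns the body as raw bytes, so Chinese text shows up as escaped byte sequences rather than readable characters.

.text returns the body as decoded text. Two screenshots make the difference clear:

[Screenshot: output of .content]

[Screenshot: output of .text]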
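The same difference can be reproduced without a network call. In this sketch a hand-made byte string stands in for the response body, so the variable names only play the role of response.content and response.text; they are not real response attributes.

```python
# Stand-in for a response body: the UTF-8 bytes a server might send.
content = "橘子".encode("utf-8")  # plays the role of response.content
text = content.decode("utf-8")    # plays the role of response.text

print(type(content), content)  # bytes, printed as \x.. escapes
print(type(text), text)        # str, printed as readable characters
```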

Requests automatically decodes content from the server. Most Unicode charsets are decoded seamlessly.

You can check which encoding requests used through the .encoding attribute.

You can also change the encoding manually, for example r.encoding = 'gb2312'.
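Why the encoding matters can be shown with plain Python. The helper decode_body below is hypothetical (it is not part of requests); it only approximates what reading .text does once an encoding has been chosen, under the assumption that undecodable bytes are replaced rather than raising an error.

```python
def decode_body(content: bytes, encoding: str) -> str:
    """Decode raw bytes, replacing anything that does not fit the encoding."""
    return content.decode(encoding, errors="replace")


# A body as a GB2312 server might send it.
gb_bytes = "中文".encode("gb2312")

right = decode_body(gb_bytes, "gb2312")  # matching encoding: readable text
wrong = decode_body(gb_bytes, "utf-8")   # mismatched encoding: garbled text

print(right)
print(wrong)
```

When a page comes back garbled, setting the encoding to the one the site actually uses is usually the first thing to try.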

GET requests can carry parameters:

import requests
url = 'http://www.juzimi.com/article/33125'
payload = {'page': '1'}  # query-string parameters
response = requests.get(url, params=payload)
print(response.text)

You can print response.url to see the URL that was built.
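The constructed URL can also be inspected before anything is sent. This sketch prepares the same request offline through requests.Request and its prepare() method, which is a different route from the requests.get call used in this post.

```python
import requests

url = "http://www.juzimi.com/article/33125"
payload = {"page": "1"}

# Prepare the request without sending it, then look at the final URL.
prepared = requests.Request("GET", url, params=payload).prepare()
print(prepared.url)
```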

Customizing request headers

Pass a dict to the headers parameter:

headers = {
    'user-agent': 'my-app/0.0.1'
}
r = requests.get(url, headers=headers)
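To check which headers would go out, the request can be prepared without being sent. A small sketch, reusing the URL from the first example; header names are matched case-insensitively, so the lower-case key from the dict can be read back as User-Agent.

```python
import requests

url = "http://www.juzimi.com/ju/252304"
headers = {"user-agent": "my-app/0.0.1"}

# Prepare the request offline and read the header back.
prepared = requests.Request("GET", url, headers=headers).prepare()
print(prepared.headers["User-Agent"])
```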

Sending a POST request

payload = {'key1': 'value1', 'key2': 'value2'}
r = requests.post("http://httpbin.org/post", data=payload)
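A dict passed as data is sent form-encoded. The sketch below prepares the same POST offline to show the body and the Content-Type header that would be sent; it assumes Python 3.7+ so that the dict keeps its insertion order.

```python
import requests

payload = {"key1": "value1", "key2": "value2"}

# Prepare the POST without sending it.
prepared = requests.Request(
    "POST", "http://httpbin.org/post", data=payload
).prepare()

print(prepared.body)                     # form-encoded body
print(prepared.headers["Content-Type"])  # set automatically for dict data
```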

The get method also accepts a cookies parameter and a timeout parameter.

Going through a proxy

proxies = {
    "http": "http://10.10.10.10:8888",
    "https": "http://10.10.10.100:4444",
}
r = requests.get('http://m.ctrip.com', proxies = proxies)
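The cookies, timeout, and proxies parameters can be combined in a single call. The wrapper below is a rough sketch rather than a recommended pattern: fetch_page is a hypothetical helper name, and the 5-second default timeout is an arbitrary assumption.

```python
import requests


def fetch_page(url, cookies=None, proxies=None, timeout=5.0):
    """Return the page text, or None if the request fails for any reason."""
    try:
        response = requests.get(
            url, cookies=cookies, proxies=proxies, timeout=timeout
        )
        response.raise_for_status()  # treat 4xx/5xx responses as failures
    except requests.exceptions.RequestException as exc:
        # Covers timeouts, connection errors, bad URLs and HTTP errors.
        print(f"request failed: {exc}")
        return None
    return response.text
```

With the proxies dict above, a call could look like fetch_page('http://m.ctrip.com', cookies={'session_id': 'abc'}, proxies=proxies), where the cookie name and value are made up for illustration.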

Source: http://www.bubuko.com/infodetail-2465432.html
