requests is a very practical Python HTTP client library, often used when writing crawlers and testing server responses. It is fair to say that Requests fully meets the needs of today's web.
This article is based entirely on the official documentation: http://docs.python-requests.org/en/master/
The usual installation method is $ pip install requests; see the official documentation for other options.
import requests
GET requests
r = requests.get('http://httpbin.org/get')
Passing parameters
>>> payload = {'key1': 'value1', 'key2': 'value2', 'key3': None}
>>> r = requests.get('http://httpbin.org/get', params=payload)
>>> print(r.url)
http://httpbin.org/get?key2=value2&key1=value1
Note that any dictionary key whose value is None will not be added to the URL's query string.
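One way to see this encoding without touching the network is to build a prepared request by hand (a sketch using only requests' public Request/prepare API; nothing is actually sent):

```python
import requests

# .prepare() encodes params into the URL locally, without sending anything.
payload = {'key1': 'value1', 'key2': 'value2', 'key3': None}
req = requests.Request('GET', 'http://httpbin.org/get', params=payload).prepare()

# key1 and key2 appear in the query string; the None-valued key3 is dropped.
print(req.url)
```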
A parameter can also be passed as a list:
>>> payload = {'key1': 'value1', 'key2': ['value2', 'value3']}
>>> r = requests.get('http://httpbin.org/get', params=payload)
>>> print(r.url)
http://httpbin.org/get?key1=value1&key2=value2&key2=value3
r.text returns the body decoded with the encoding taken from the response headers; the decoding can be changed via r.encoding = 'gbk'
r.content returns the raw binary body
r.json() parses the body as JSON and may raise an exception if the body is not valid JSON
r.status_code returns the HTTP status code
r.raw returns the raw socket response; the request must be made with stream=True
>>> r = requests.get('https://api.github.com/events', stream=True)
>>> r.raw
<requests.packages.urllib3.response.HTTPResponse object at 0x101194810>
>>> r.raw.read(10)
'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03'
To save the result to a file, use r.iter_content():

with open(filename, 'wb') as fd:
    for chunk in r.iter_content(chunk_size):
        fd.write(chunk)
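As a network-free illustration of the chunking behaviour, a Response object can be given an in-memory stream as its raw body (an assumption made purely for demonstration; real code gets the Response from requests.get):

```python
import io
import requests

# Build a Response by hand so iter_content can be shown without a server.
r = requests.Response()
r.status_code = 200
r.raw = io.BytesIO(b'0123456789abcdef')  # stands in for the socket stream

# The body is yielded in chunk_size-byte pieces.
chunks = list(r.iter_content(chunk_size=4))
print(chunks)
```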
Passing headers

>>> headers = {'user-agent': 'my-app/0.0.1'}
>>> r = requests.get(url, headers=headers)
Passing cookies

>>> url = 'http://httpbin.org/cookies'
>>> r = requests.get(url, cookies=dict(cookies_are='working'))
>>> r.text
'{"cookies": {"cookies_are": "working"}}'
POST requests

Submitting form data

r = requests.post('http://httpbin.org/post', data={'key': 'value'})
Typically, you want to send some form-encoded data, much like an HTML form. To do this, simply pass a dictionary to the data argument; the dictionary is automatically form-encoded when the request is made:
>>> payload = {'key1': 'value1', 'key2': 'value2'}
>>> r = requests.post("http://httpbin.org/post", data=payload)
>>> print(r.text)
{
  ...
  "form": {
    "key2": "value2",
    "key1": "value1"
  },
  ...
}
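The form encoding itself can be inspected locally on a prepared request (a sketch; nothing is sent over the network):

```python
import requests

payload = {'key1': 'value1', 'key2': 'value2'}
req = requests.Request('POST', 'http://httpbin.org/post', data=payload).prepare()

print(req.body)                     # urlencoded key=value pairs joined by '&'
print(req.headers['Content-Type'])  # application/x-www-form-urlencoded
```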
Often the data you want to send is not form-encoded. If you pass a string instead of a dict, the data will be posted directly:
>>> import json
>>> url = 'https://api.github.com/some/endpoint'
>>> payload = {'some': 'data'}
>>> r = requests.post(url, data=json.dumps(payload))
Or:

>>> r = requests.post(url, json=payload)
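The difference between the two forms shows up in the prepared request: the json parameter serializes the payload and also sets the Content-Type header (a local sketch, nothing is sent):

```python
import requests

payload = {'some': 'data'}
req = requests.Request('POST', 'https://api.github.com/some/endpoint',
                       json=payload).prepare()

print(req.body)                     # the JSON-serialized payload
print(req.headers['Content-Type'])  # application/json
```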
Uploading files

>>> url = 'http://httpbin.org/post'
>>> files = {'file': open('report.xls', 'rb')}
>>> r = requests.post(url, files=files)
You can explicitly set the filename, content type and extra headers of a file, or send a string as the file content:

files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}

files = {'file': ('report.csv', 'some,data,to,send\nanother,row,to,send\n')}
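The multipart body these tuples produce can also be inspected offline; this sketch uses an in-memory string as the file content (an assumption, so it runs without report.csv existing):

```python
import requests

files = {'file': ('report.csv',
                  'some,data,to,send\nanother,row,to,send\n',
                  'text/csv',
                  {'Expires': '0'})}
req = requests.Request('POST', 'http://httpbin.org/post', files=files).prepare()

# The Content-Type header carries the multipart boundary; the encoded body
# contains the filename, the per-part headers and the file data.
print(req.headers['Content-Type'])
```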
Responses

- r.status_code
- r.headers
- r.cookies
Redirection

By default Requests will perform location redirection for all verbs except HEAD.
>>> r = requests.get('http://httpbin.org/cookies/set?k2=v2&k1=v1')
>>> r.url
'http://httpbin.org/cookies'
>>> r.status_code
200
>>> r.history
[<Response [302]>]
If you're using HEAD, you can enable redirection as well:

r = requests.head('http://httpbin.org/cookies/set?k2=v2&k1=v1', allow_redirects=True)

You can tell Requests to stop waiting for a response after a given number of seconds with the timeout parameter:

requests.get('http://github.com', timeout=0.001)
Advanced features

A Session object persists cookies and settings across requests:
s = requests.Session()
s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')
r = s.get('http://httpbin.org/cookies')
print(r.text)  # '{"cookies": {"sessioncookie": "123456789"}}'
s = requests.Session()
s.auth = ('user', 'pass')  # authentication credentials for every request
s.headers.update({'x-test': 'true'})
# both 'x-test' and 'x-test2' are sent
s.get('http://httpbin.org/headers', headers={'x-test2': 'true'})
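That session-level and per-request headers are merged can be checked locally with Session.prepare_request (a sketch; no request is sent):

```python
import requests

s = requests.Session()
s.headers.update({'x-test': 'true'})

# prepare_request merges the session defaults with the per-request headers.
req = s.prepare_request(
    requests.Request('GET', 'http://httpbin.org/headers',
                     headers={'x-test2': 'true'}))

print(req.headers['x-test'], req.headers['x-test2'])  # both are present
```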
r = requests.get('http://en.wikipedia.org/wiki/Monty_Python')
r.headers          # headers the server sent back
r.request.headers  # headers that were sent with the request
import requests
from contextlib import closing

tarball_url = 'https://github.com/kennethreitz/requests/tarball/master'
file = r'D:\Documents\WorkSpace\Python\Test\Python34Test\test.tar.gz'
with closing(requests.get(tarball_url, stream=True)) as r:
    with open(file, 'wb') as f:
        for data in r.iter_content(1024):
            f.write(data)
Keep-Alive

Keep-alive is automatic within a session (courtesy of urllib3). To send a chunked request, pass a generator as data:

def gen():
    yield b'hi'
    yield b'there'

requests.post('http://some.url/chunked', data=gen())
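That a generator body switches the request to chunked transfer encoding can be verified on a prepared request (a sketch; 'http://some.url/chunked' stays hypothetical and nothing is sent):

```python
import requests

def gen():
    yield b'hi'
    yield b'there'

req = requests.Request('POST', 'http://some.url/chunked', data=gen()).prepare()

# With no known body length, requests marks the body as chunked
# instead of setting Content-Length.
print(req.headers.get('Transfer-Encoding'))
```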
For chunked encoded responses, it's best to iterate over the data using Response.iter_content(). In an ideal situation you'll have set stream=True on the request, in which case you can iterate chunk-by-chunk by calling iter_content with a chunk size parameter of None. If you want to set a maximum size of the chunk, you can set a chunk size parameter to any integer.
POST Multiple Multipart-Encoded Files

<input type="file" name="images" multiple="true" required="true" />

To do that, just set files to a list of tuples of (form_field_name, file_info):

>>> url = 'http://httpbin.org/post'
>>> multiple_files = [
...     ('images', ('foo.png', open('foo.png', 'rb'), 'image/png')),
...     ('images', ('bar.png', open('bar.png', 'rb'), 'image/png'))]
>>> r = requests.post(url, files=multiple_files)
>>> r.text
{
  ...
  'files': {'images': 'data:image/png;base64,iVBORw ....'}
  'Content-Type': 'multipart/form-data; boundary=3131623adb2043caaeb5538cc
  ...
}
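Again the encoding can be checked without touching the network; this sketch substitutes in-memory bytes for the real foo.png/bar.png files (an assumption so it runs standalone):

```python
import io
import requests

multiple_files = [
    ('images', ('foo.png', io.BytesIO(b'\x89PNG fake-1'), 'image/png')),
    ('images', ('bar.png', io.BytesIO(b'\x89PNG fake-2'), 'image/png')),
]
req = requests.Request('POST', 'http://httpbin.org/post',
                       files=multiple_files).prepare()

# Both parts share the field name 'images' within one multipart body.
print(req.headers['Content-Type'])
```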
Source: http://www.bubuko.com/infodetail-2890092.html