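This article shows how to issue concurrent requests from multiple threads with Java's HttpClient. Readers who need this can use it as a reference.
Note: the code below is based on httpclient 4.5.2.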
Fetching a web page with a GET request through HttpClient is fairly straightforward:

    public static String get(String url) {
        CloseableHttpResponse response = null;
        BufferedReader in = null;
        String result = "";
        try {
            // A new client is created on every call.
            CloseableHttpClient httpclient = HttpClients.createDefault();
            HttpGet httpGet = new HttpGet(url);
            response = httpclient.execute(httpGet);
            // Note: InputStreamReader without a charset uses the platform default.
            in = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));
            StringBuilder sb = new StringBuilder();
            String line;
            String NL = System.getProperty("line.separator");
            // Read the body line by line, normalizing line endings.
            while ((line = in.readLine()) != null) {
                sb.append(line).append(NL);
            }
            in.close();
            result = sb.toString();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // Closing the response releases the underlying connection.
            try {
                if (null != response) response.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        return result;
    }

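The method above also works when GET requests are issued from multiple threads. However, it only works because every call to get creates its own HttpClient instance, which is used once and then discarded. That is clearly not an optimal implementation.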
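To make the multithreaded scenario concrete, here is a minimal sketch of fanning GET calls out over a fixed thread pool using only the JDK. The names `ConcurrentFetch` and `fetchAll` are hypothetical, and the fetcher is passed in as a function so that any `get(String)` style method, such as the one above, can be plugged in.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper, not part of HttpClient.
class ConcurrentFetch {

    // Runs fetcher once per URL on a pool of `threads` workers and
    // returns the results keyed by URL, in the order the URLs were given.
    static Map<String, String> fetchAll(List<String> urls,
                                        Function<String, String> fetcher,
                                        int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String url : urls) {
                tasks.add(() -> fetcher.apply(url));
            }
            // invokeAll blocks until every task has finished and returns
            // the futures in the same order as the tasks.
            List<Future<String>> futures = pool.invokeAll(tasks);
            Map<String, String> results = new LinkedHashMap<>();
            for (int i = 0; i < urls.size(); i++) {
                results.put(urls.get(i), futures.get(i).get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while fetching", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a fetch task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

If the `get` method lives in a class of your own, say `MyHttp`, a call might look like `ConcurrentFetch.fetchAll(urls, MyHttp::get, 8)`; the thread count of 8 is an arbitrary illustration.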
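HttpClient offers its own solution for multithreaded requests; see the "Pooling connection manager" section of the official documentation. Multithreaded requests in HttpClient are built on its built-in connection pool. The key class is PoolingHttpClientConnectionManager, which manages that pool. It provides two key methods, setMaxTotal and setDefaultMaxPerRoute: setMaxTotal sets the maximum number of connections in the pool, and setDefaultMaxPerRoute sets the default maximum number of connections per route. A third method, setMaxPerRoute, sets the maximum number of connections for one specific site, like this:

    // cm is a PoolingHttpClientConnectionManager
    HttpHost host = new HttpHost("localhost", 80);
    cm.setMaxPerRoute(new HttpRoute(host), 50);
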
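How the three limits interact can be illustrated with a toy model. This is not HttpClient's implementation: `ToyPoolLimits` and its methods are hypothetical, routes are plain strings, and a refused lease returns `false`, whereas the real pool makes the request wait for a free connection.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a total limit combined with per-route limits.
// Hypothetical class for illustration only; not HttpClient code.
class ToyPoolLimits {
    private final int maxTotal;
    private final int defaultMaxPerRoute;
    private final Map<String, Integer> maxPerRoute = new HashMap<>();
    private final Map<String, Integer> leasedPerRoute = new HashMap<>();
    private int leasedTotal = 0;

    ToyPoolLimits(int maxTotal, int defaultMaxPerRoute) {
        this.maxTotal = maxTotal;
        this.defaultMaxPerRoute = defaultMaxPerRoute;
    }

    // Overrides the default limit for one route.
    synchronized void setMaxPerRoute(String route, int max) {
        maxPerRoute.put(route, max);
    }

    // A lease succeeds only if both the route limit and the total limit allow it.
    synchronized boolean tryLease(String route) {
        int routeMax = maxPerRoute.getOrDefault(route, defaultMaxPerRoute);
        int routeLeased = leasedPerRoute.getOrDefault(route, 0);
        if (routeLeased >= routeMax || leasedTotal >= maxTotal) {
            return false;
        }
        leasedPerRoute.put(route, routeLeased + 1);
        leasedTotal++;
        return true;
    }

    // Returns one connection for the route, if any is leased.
    synchronized void release(String route) {
        int routeLeased = leasedPerRoute.getOrDefault(route, 0);
        if (routeLeased == 0) {
            return;
        }
        leasedPerRoute.put(route, routeLeased - 1);
        leasedTotal--;
    }
}
```

In this model a route with a generous per-route limit can still be refused once the total is exhausted, which is why the two values are usually chosen together.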
Following the documentation, we adjust our GET implementation slightly:

    package com.zhyea.robin;

    import org.apache.http.client.methods.CloseableHttpResponse;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;
    import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;

    public class HttpUtil {

        // One client, backed by a connection pool, shared by all threads.
        private static CloseableHttpClient httpClient;

        static {
            PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
            cm.setMaxTotal(200);
            // The original called setDefaultMaxPerRoute twice (20, then 50);
            // only the last call takes effect, so a single call is kept.
            cm.setDefaultMaxPerRoute(50);
            httpClient = HttpClients.custom().setConnectionManager(cm).build();
        }

        public static String get(String url) {
            CloseableHttpResponse response = null;
            BufferedReader in = null;
            String result = "";
            try {
                HttpGet httpGet = new HttpGet(url);
                response = httpClient.execute(httpGet);
                in = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));
                StringBuilder sb = new StringBuilder();
                String line;
                String NL = System.getProperty("line.separator");
                while ((line = in.readLine()) != null) {
                    sb.append(line).append(NL);
                }
                in.close();
                result = sb.toString();
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                // Closing the response returns the connection to the pool.
                try {
                    if (null != response) response.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            return result;
        }

        public static void main(String[] args) {
            System.out.println(get("https://www.baidu.com/"));
        }
    }

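One detail of the read loop in `get` is worth noting: it decodes the body with the platform default charset and rewrites line endings. As an optional refinement, a helper along the following lines decodes with an explicit charset and keeps the body as it was sent. `BodyReader` and `readAll` are hypothetical names, and which charset to pass is an assumption the caller has to make (normally it comes from the response's Content-Type header).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical helper, not part of HttpClient.
class BodyReader {

    // Reads the whole stream into a String using the given charset,
    // closing the stream when done.
    static String readAll(InputStream stream, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader in = new InputStreamReader(stream, charset)) {
            char[] buf = new char[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

Inside `get`, the loop could then be replaced by something like `result = BodyReader.readAll(response.getEntity().getContent(), StandardCharsets.UTF_8);`, assuming the page is served as UTF-8.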
That is about it. Personally, though, I prefer HttpClient's fluent API. The HTTP GET request we just implemented can be written as simply as this:

    package com.zhyea.robin;

    import org.apache.http.client.fluent.Request;

    import java.io.IOException;

    public class HttpUtil {

        public static String get(String url) {
            String result = "";
            try {
                result = Request.Get(url)
                        .connectTimeout(1000)   // milliseconds
                        .socketTimeout(1000)    // milliseconds
                        .execute().returnContent().asString();
            } catch (IOException e) {
                e.printStackTrace();
            }
            return result;
        }

        public static void main(String[] args) {
            System.out.println(get("https://www.baidu.com/"));
        }
    }

All we have to do is replace the earlier httpclient dependency with the fluent-hc dependency:

    <dependency>
        <groupId>org.apache.httpcomponents</groupId>
        <artifactId>fluent-hc</artifactId>
        <version>4.5.2</version>
    </dependency>

Moreover, the fluent implementation uses PoolingHttpClientConnectionManager out of the box. It sets maxTotal and defaultMaxPerRoute to 200 and 100 respectively:

    CONNMGR = new PoolingHttpClientConnectionManager(sfr);
    CONNMGR.setDefaultMaxPerRoute(100);
    CONNMGR.setMaxTotal(200);

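The one annoyance is that Executor provides no method for adjusting these two values. In most cases the defaults are more than enough. If they really are not, you can consider writing your own version of Executor and then executing the GET request through the Executor directly:

    Executor.newInstance().execute(Request.Get(url))
            .returnContent().asString();
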
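A lighter alternative may be worth trying before rewriting anything: fluent-hc's `Executor` also has a `newInstance(HttpClient)` overload, so a client built on your own connection manager can be handed to it. The sketch below assumes fluent-hc and httpclient 4.5.x on the classpath; the class name `FluentWithCustomPool` and the limits 400 and 50 are illustrative assumptions, not values from the library.

```java
import org.apache.http.client.fluent.Executor;
import org.apache.http.client.fluent.Request;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

import java.io.IOException;

// Hypothetical wrapper class for illustration.
class FluentWithCustomPool {

    private static final Executor EXECUTOR;

    static {
        PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
        cm.setMaxTotal(400);            // illustrative value
        cm.setDefaultMaxPerRoute(50);   // illustrative value
        CloseableHttpClient client = HttpClients.custom()
                .setConnectionManager(cm)
                .build();
        // Requests executed through this Executor use the client above
        // instead of the default one created inside fluent-hc.
        EXECUTOR = Executor.newInstance(client);
    }

    static String get(String url) throws IOException {
        return EXECUTOR.execute(Request.Get(url))
                .returnContent().asString();
    }
}
```

With this arrangement the fluent request syntax stays the same, and the pool limits are set in one place.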
That's it!
Source: http://www.phperz.com/article/17/1113/359980.html