- private static String ClusterName = "nsstargate";
- private static final String HADOOP_URL = "hdfs://"+ClusterName;
- public static Configuration conf;
- static {
- conf = new Configuration();
- conf.set("fs.defaultFS", HADOOP_URL);
- conf.set("dfs.nameservices", ClusterName);
- conf.set("dfs.ha.namenodes."+ClusterName, "nn1,nn2");
- conf.set("dfs.namenode.rpc-address."+ClusterName+".nn1", "172.16.50.24:8020");
- conf.set("dfs.namenode.rpc-address."+ClusterName+".nn2", "172.16.50.21:8020");
- //conf.setBoolean(name, value);
- conf.set("dfs.client.failover.proxy.provider."+ClusterName,
- "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
- }
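When connecting to a Hadoop cluster through the Java API, if the cluster runs in HA mode, the configuration above lets the client switch automatically to the active master node. ClusterName is a logical name chosen on the client side: it can be anything and does not have to match the cluster's own configuration. The IDs listed under dfs.ha.namenodes.ClusterName (nn1 and nn2 here) can also be named freely. List one ID per master, then add a dfs.namenode.rpc-address entry with the address of each master.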
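To make the "one ID per master" rule concrete, here is a minimal sketch that builds the same property names for any number of masters. The class `HaClientProperties` and its `build` method are hypothetical helpers, not part of Hadoop; only the property keys and the proxy provider class name come from the configuration above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HaClientProperties {
    /**
     * Builds the client-side HA properties for one logical nameservice.
     * namenodes maps a NameNode ID (for example "nn1") to its RPC address "host:port".
     * This helper is illustrative only; it does not touch Hadoop itself.
     */
    public static Map<String, String> build(String nameservice, Map<String, String> namenodes) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("fs.defaultFS", "hdfs://" + nameservice);
        props.put("dfs.nameservices", nameservice);
        // Comma-separated list of NameNode IDs, one per master
        props.put("dfs.ha.namenodes." + nameservice, String.join(",", namenodes.keySet()));
        // One rpc-address entry per NameNode ID
        for (Map.Entry<String, String> e : namenodes.entrySet()) {
            props.put("dfs.namenode.rpc-address." + nameservice + "." + e.getKey(), e.getValue());
        }
        props.put("dfs.client.failover.proxy.provider." + nameservice,
                "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> namenodes = new LinkedHashMap<>();
        namenodes.put("nn1", "172.16.50.24:8020");
        namenodes.put("nn2", "172.16.50.21:8020");
        // Print each key/value pair that would be set on the client configuration
        build("nsstargate", namenodes).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Each resulting entry can then be applied with `conf.set(key, value)`, which yields the same settings as the static block above.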
The code for uploading a file to HDFS is shown below. For reading and other operations, refer to other articles online.
- /**
- * Upload a file to HDFS
- */
- private static void uploadToHdfs() throws IOException {
- String localSrc = "E:\\test\\article01.txt";
- String dst = "/user/test/article04.txt";
- FileSystem fs = FileSystem.get(URI.create(HADOOP_URL), conf);
- long start = new Date().getTime();
- /* InputStream in = new FileInputStream(localSrc);
- InputStreamReader isr = new InputStreamReader(in, "GBK");
- OutputStream out = fs.create(new Path(HADOOP_URL+dst), true);
- IOUtils.copy(isr, out, "UTF8");*/
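- // This approach is faster
- FSDataOutputStream outputStream=fs.create(new Path(dst));
- String fileContent = FileUtils.readFileToString(new File(localSrc), "GBK");
- // Encode explicitly as UTF-8 instead of relying on the platform default charset
- outputStream.write(fileContent.getBytes("UTF-8"));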
- outputStream.close();
- long end = new Date().getTime();
- System.out.println("use:"+(end-start));
- }
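The upload method loads the whole file into memory before writing it. For large files, a streaming conversion may be preferable. The following is a minimal sketch of that idea using only the JDK; the class `Transcode` and its `copy` method are hypothetical names, not something the original code uses.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;

public class Transcode {
    /**
     * Copies text from in to out, decoding with the source charset and
     * re-encoding with the target charset, one buffer at a time.
     * The caller remains responsible for closing both streams.
     */
    public static void copy(InputStream in, Charset from, OutputStream out, Charset to)
            throws IOException {
        Reader reader = new InputStreamReader(in, from);
        Writer writer = new OutputStreamWriter(out, to);
        char[] buf = new char[8192];
        int n;
        while ((n = reader.read(buf)) != -1) {
            writer.write(buf, 0, n);
        }
        // Push any buffered characters through to the underlying stream
        writer.flush();
    }
}
```

Under these assumptions, the upload could call `Transcode.copy(new FileInputStream(localSrc), Charset.forName("GBK"), outputStream, StandardCharsets.UTF_8)`, since `FSDataOutputStream` is an `OutputStream`.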
Source: http://www.bubuko.com/infodetail-3028656.html