Connecting to HDFS in HA Mode with the Java API

import org.apache.hadoop.conf.Configuration;

// The members below belong inside the client class.

// Logical nameservice ID the client uses to address the HA NameNodes.
private static final String ClusterName = "nsstargate";
private static final String HADOOP_URL = "hdfs://" + ClusterName;
public static Configuration conf;

static {
    conf = new Configuration();
    conf.set("fs.defaultFS", HADOOP_URL);
    conf.set("dfs.nameservices", ClusterName);
    // NameNode IDs within the nameservice; one RPC address entry per ID follows.
    conf.set("dfs.ha.namenodes." + ClusterName, "nn1,nn2");
    conf.set("dfs.namenode.rpc-address." + ClusterName + ".nn1", "172.16.50.24:8020");
    conf.set("dfs.namenode.rpc-address." + ClusterName + ".nn2", "172.16.50.21:8020");
    // Proxy provider that tries the configured NameNodes until it reaches the active one.
    conf.set("dfs.client.failover.proxy.provider." + ClusterName,
            "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
}

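When connecting to a Hadoop cluster through the Java API, if the cluster runs in HA mode, the settings above let the client switch automatically to the active master node (NameNode). ClusterName here is a logical name resolved on the client side, so it can be chosen freely and does not have to match the cluster's own configuration. The NameNode IDs listed under dfs.ha.namenodes.ClusterName (nn1,nn2 above) can also be named freely: list one ID per master, then add a dfs.namenode.rpc-address.ClusterName.<id> entry with the address of each master.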
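To make the "one ID per master" rule concrete, here is a minimal sketch that derives the same property keys for any number of NameNodes. The class HaClientSettings and its build method are hypothetical names introduced for illustration, not part of Hadoop, and the nn1, nn2, ... naming scheme is just one possible choice. The sketch only assembles key/value pairs and does not touch the Hadoop API.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: builds client-side HA properties for N NameNodes. */
final class HaClientSettings {

    private HaClientSettings() {}

    /**
     * @param nameservice  logical nameservice ID, e.g. "nsstargate"
     * @param rpcAddresses host:port of each NameNode
     */
    static Map<String, String> build(String nameservice, List<String> rpcAddresses) {
        if (rpcAddresses.isEmpty()) {
            throw new IllegalArgumentException("at least one NameNode address is required");
        }
        Map<String, String> props = new LinkedHashMap<>();
        props.put("fs.defaultFS", "hdfs://" + nameservice);
        props.put("dfs.nameservices", nameservice);

        StringBuilder ids = new StringBuilder();
        for (int i = 0; i < rpcAddresses.size(); i++) {
            String id = "nn" + (i + 1); // assumed naming scheme: nn1, nn2, ...
            if (i > 0) {
                ids.append(',');
            }
            ids.append(id);
            // One rpc-address entry per NameNode ID.
            props.put("dfs.namenode.rpc-address." + nameservice + "." + id,
                    rpcAddresses.get(i));
        }
        props.put("dfs.ha.namenodes." + nameservice, ids.toString());
        props.put("dfs.client.failover.proxy.provider." + nameservice,
                "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> props = build("nsstargate",
                Arrays.asList("172.16.50.24:8020", "172.16.50.21:8020"));
        props.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Each entry of the returned map could then be copied into a Hadoop Configuration with conf.set(key, value).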

The code for uploading a file to HDFS is shown below. For reading and other operations, see other articles online.

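import java.io.File;
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import org.apache.commons.io.FileUtils;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Upload a local file to HDFS.
 */
private static void uploadToHdfs() throws IOException {
    String localSrc = "E:\\test\\article01.txt";
    String dst = "/user/test/article04.txt";
    FileSystem fs = FileSystem.get(URI.create(HADOOP_URL), conf);
    long start = System.currentTimeMillis();
    /* Alternative: stream the file instead of loading it into memory.
    InputStream in = new FileInputStream(localSrc);
    InputStreamReader isr = new InputStreamReader(in, "GBK");
    OutputStream out = fs.create(new Path(HADOOP_URL + dst), true);
    IOUtils.copy(isr, out, "UTF8");*/
    // The author found this approach faster:
    // read the local GBK file into memory, then write it to HDFS.
    String fileContent = FileUtils.readFileToString(new File(localSrc), "GBK");
    // try-with-resources closes the stream even if write() throws.
    try (FSDataOutputStream outputStream = fs.create(new Path(dst))) {
        // Name the charset explicitly; a bare getBytes() uses the platform default.
        outputStream.write(fileContent.getBytes(StandardCharsets.UTF_8));
    }
    long end = System.currentTimeMillis();
    System.out.println("use:" + (end - start));
}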
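The upload converts a GBK-encoded local file into bytes for HDFS, so the charsets on both sides matter. As a sketch of that conversion step using only the JDK, the hypothetical Transcoder class below (not part of Hadoop or commons-io) decodes from one charset and encodes to another in fixed-size chunks, so the file never has to fit in memory. It assumes the JDK in use ships the GBK charset, which standard JDK distributions normally do.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: re-encodes a stream from one charset to another. */
final class Transcoder {

    private Transcoder() {}

    static void transcode(InputStream in, Charset from, OutputStream out, Charset to)
            throws IOException {
        Reader reader = new InputStreamReader(in, from);
        Writer writer = new OutputStreamWriter(out, to);
        char[] buf = new char[8192];
        int n;
        // Decode a chunk of characters, then encode it to the target charset.
        while ((n = reader.read(buf)) != -1) {
            writer.write(buf, 0, n);
        }
        // Flush the encoder; closing the underlying streams is left to the caller.
        writer.flush();
    }

    public static void main(String[] args) throws IOException {
        Charset gbk = Charset.forName("GBK");
        // "\u6587\u7ae0" is the word for "article", written with escapes so the
        // example does not depend on the source file's encoding.
        byte[] input = "\u6587\u7ae0".getBytes(gbk);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transcode(new ByteArrayInputStream(input), gbk, out, StandardCharsets.UTF_8);
        System.out.println(input.length + " GBK bytes -> " + out.size() + " UTF-8 bytes");
    }
}
```

Because FSDataOutputStream is an OutputStream, the stream returned by fs.create(...) could be passed as the out argument, with a FileInputStream over the local file as in.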

Source: http://www.bubuko.com/infodetail-3028656.html
