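1. HA architecture notes
Of the two NameNodes, only one may serve client requests at any given time, and that node must be in the active state.
The standby must be able to switch to active quickly and seamlessly, so the two NNs must keep their metadata consistent at all times.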
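The edits files are kept in qjournal (the Quorum Journal Manager, a distributed service made up of JournalNodes that manages the edits) rather than on the two NNs. QJM itself does not depend on ZooKeeper; in this setup ZooKeeper is used by zkfc to coordinate failover. If each NN kept its own edits, the two copies could only be synchronized over the network between the NNs, which would greatly reduce availability and safety.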
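Each NameNode has a monitoring process, zkfc (DFSZKFailoverController), which watches whether the NameNode is healthy.
To avoid split-brain during a state switch, a fencing step (sshfence or a custom script) kills the old NN process, ensuring that only one NN is in the active state.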
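The two NNs together serve a single HA nameservice (ns1 in this setup). Note that this is not Federation, which means running several independent nameservices side by side.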
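The "only one active NameNode" rule can be checked from the command line. Below is a minimal sketch, assuming the NameNode IDs nn1 and nn2 used in the hdfs-site.xml of this setup. `count_active` is a hypothetical helper name; `hdfs haadmin -getServiceState` is the stock Hadoop command that prints `active` or `standby`.

```shell
#!/bin/sh
# count_active: hypothetical helper. Counts how many of the given
# state strings are exactly "active". Pure text processing, no cluster needed.
count_active() {
  n=0
  for state in "$@"; do
    if [ "$state" = "active" ]; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}

# On a running cluster the states would come from haadmin, for example:
#   s1=$(hdfs haadmin -getServiceState nn1)
#   s2=$(hdfs haadmin -getServiceState nn2)
#   [ "$(count_active "$s1" "$s2")" -eq 1 ] || echo "expected exactly one active NameNode"
```

Anything other than a count of 1 suggests either that failover has not completed yet or that fencing did not work as intended.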
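2. Preparation
Prepare seven machines, CentOS7One through CentOS7Seven. Judging from the configuration and results later in this post, One and Two run the NameNodes and zkfc, Three and Four run the ResourceManagers, and Five, Six and Seven run ZooKeeper, JournalNode, DataNode and NodeManager.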
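3. Installation
Installing the JDK and Hadoop on CentOS7One is not repeated here; see this article.
First, configure passwordless SSH login.
CentOS7One needs passwordless SSH access to CentOS7Five, CentOS7Six and CentOS7Seven in order to start zookeeper and the datanodes.
CentOS7Three needs passwordless SSH access to CentOS7Five, CentOS7Six and CentOS7Seven in order to start the nodemanagers.
For the passwordless setup itself, see "Hadoop passwordless key configuration".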
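For readers without the linked post at hand, the usual approach looks roughly like the sketch below. It only prints the commands instead of running them. `print_ssh_setup` is a hypothetical helper name, and the root user and RSA key path are assumptions taken from the `/root/.ssh/id_rsa` key that the fencing configuration in this setup refers to.

```shell
#!/bin/sh
# print_ssh_setup: hypothetical helper that prints (does not run) the
# commands for passwordless SSH from the current host to each target host.
print_ssh_setup() {
  # Generate a key pair once, with an empty passphrase.
  echo "ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa"
  for host in "$@"; do
    # ssh-copy-id appends the public key to the target's authorized_keys.
    echo "ssh-copy-id -i ~/.ssh/id_rsa.pub root@$host"
  done
}

# Targets that CentOS7One (and CentOS7Three) must reach in this setup.
print_ssh_setup CentOS7Five CentOS7Six CentOS7Seven
```

Run it on CentOS7One and again on CentOS7Three, review the printed commands, then execute them by hand. Since sshfence is configured later, the two NameNode hosts probably also need passwordless access to each other.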
3.1 ZooKeeper configuration
Configure zoo.cfg on CentOS7Five
- # The number of milliseconds of each tick
- tickTime=2000
- # The number of ticks that the initial
- # synchronization phase can take
- initLimit=10
- # The number of ticks that can pass between
- # sending a request and getting an acknowledgement
- syncLimit=5
- # the directory where the snapshot is stored.
- # do not use /tmp for storage, /tmp here is just
- # example sakes.
- dataDir=/opt/zkdata
- dataLogDir=/opt/zkdatalog
- # the port at which the clients will connect
- clientPort=2181
- # the maximum number of client connections.
- # increase this if you need to handle more clients
- #maxClientCnxns=60
- #
- # Be sure to read the maintenance section of the
- # administrator guide before turning on autopurge.
- #
- # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
- #
- # The number of snapshots to retain in dataDir
- #autopurge.snapRetainCount=3
- # Purge task interval in hours
- # Set to "0" to disable auto purge feature
- #autopurge.purgeInterval=1
- server.4=192.168.94.142:2888:3888
- server.5=192.168.94.143:2888:3888
- server.6=192.168.94.144:2888:3888
Copy the configured ZooKeeper to CentOS7Six and CentOS7Seven (scp needs -r to copy a directory; the target is the parent directory so the tree is not nested)
- scp -r /opt/zookeeper/zookeeper-3.4.10 CentOS7Six:/opt/zookeeper/
- scp -r /opt/zookeeper/zookeeper-3.4.10 CentOS7Seven:/opt/zookeeper/
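One step the listing above does not show: ZooKeeper expects every server to have a `myid` file inside `dataDir` whose content is that server's number from the `server.N` lines in zoo.cfg. A minimal sketch follows, where `write_myid` is a hypothetical helper name and the assignment of ids 4, 5 and 6 to particular hosts is an assumption that has to be matched against the IP addresses in zoo.cfg.

```shell
#!/bin/sh
# write_myid: hypothetical helper. Writes a ZooKeeper server id into
# <dataDir>/myid, creating the directory first if it does not exist.
write_myid() {
  data_dir=$1
  server_id=$2
  mkdir -p "$data_dir" && echo "$server_id" > "$data_dir/myid"
}

# Example, run on each ZooKeeper host with its own id (the id-to-host
# mapping is an assumption; match it to the server.N lines in zoo.cfg):
#   write_myid /opt/zkdata 4
```

Without a matching `myid`, the server typically refuses to start or cannot join the ensemble.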
3.2 Hadoop configuration
Edit core-site.xml on CentOS7One
- <?xml version="1.0" encoding="UTF-8"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!--
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. See accompanying LICENSE file.
- -->
- <!-- Put site-specific property overrides in this file. -->
- <configuration>
- <property>
- <name>fs.defaultFS</name>
- <value>hdfs://ns1/</value>
- </property>
- <property>
- <name>hadoop.tmp.dir</name>
- <value>/usr/local/hadoop-2.6.5/data</value>
- </property>
- <property>
- <name>ha.zookeeper.quorum</name>
- <value>CentOS7Five:2181,CentOS7Six:2181,CentOS7Seven:2181</value>
- </property>
- </configuration>
Edit hdfs-site.xml on CentOS7One
- <?xml version="1.0" encoding="UTF-8"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!--
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. See accompanying LICENSE file.
- -->
- <!-- Put site-specific property overrides in this file. -->
- <configuration>
- <!-- Set the HDFS nameservice to ns1; it must match the one used in core-site.xml -->
- <property>
- <name>dfs.nameservices</name>
- <value>ns1</value>
- </property>
- <!-- ns1 has two NameNodes: nn1 and nn2 -->
- <property>
- <name>dfs.ha.namenodes.ns1</name>
- <value>nn1,nn2</value>
- </property>
- <!-- RPC address of nn1 -->
- <property>
- <name>dfs.namenode.rpc-address.ns1.nn1</name>
- <value>CentOS7One:9000</value>
- </property>
- <!-- HTTP address of nn1 -->
- <property>
- <name>dfs.namenode.http-address.ns1.nn1</name>
- <value>CentOS7One:50070</value>
- </property>
- <!-- RPC address of nn2 -->
- <property>
- <name>dfs.namenode.rpc-address.ns1.nn2</name>
- <value>CentOS7Two:9000</value>
- </property>
- <!-- HTTP address of nn2 -->
- <property>
- <name>dfs.namenode.http-address.ns1.nn2</name>
- <value>CentOS7Two:50070</value>
- </property>
- <!-- Where the NameNodes' shared edits are stored on the JournalNodes -->
- <property>
- <name>dfs.namenode.shared.edits.dir</name>
- <value>qjournal://CentOS7Five:8485;CentOS7Six:8485;CentOS7Seven:8485/ns1</value>
- </property>
- <!-- Local disk directory where each JournalNode stores its data -->
- <property>
- <name>dfs.journalnode.edits.dir</name>
- <value>/usr/local/hadoop-2.6.5/journaldata</value>
- </property>
- <!-- Enable automatic NameNode failover -->
- <property>
- <name>dfs.ha.automatic-failover.enabled</name>
- <value>true</value>
- </property>
- <!-- Failover proxy provider: how clients locate the active NameNode after a failover -->
- <property>
- <name>dfs.client.failover.proxy.provider.ns1</name>
- <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
- </property>
- <!-- Fencing methods; separate multiple methods with newlines, one method per line -->
- <property>
- <name>dfs.ha.fencing.methods</name>
- <value>
- sshfence
- shell(/bin/true)
- </value>
- </property>
- <!-- sshfence requires passwordless SSH login -->
- <property>
- <name>dfs.ha.fencing.ssh.private-key-files</name>
- <value>/root/.ssh/id_rsa</value>
- </property>
- <!-- Connection timeout for sshfence, in milliseconds -->
- <property>
- <name>dfs.ha.fencing.ssh.connect-timeout</name>
- <value>30000</value>
- </property>
- </configuration>
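The comment in hdfs-site.xml says the nameservice must agree with core-site.xml, and that is easy to get wrong when editing by hand. The sketch below checks it. `get_prop` and `check_nameservice` are hypothetical helper names, and the parser is deliberately naive: it assumes each `<name>` and `<value>` sits on a single line, as in the files above.

```shell
#!/bin/sh
# get_prop: hypothetical helper. Prints the <value> of the named property
# from a Hadoop *-site.xml file. Assumes <name> and <value> each sit on a
# single line; multi-line values (like the fencing methods) are not handled.
get_prop() {
  awk -v key="$2" '
    /<name>/  { n = $0; sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n) }
    /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                if (n == key) { print v; exit } }
  ' "$1"
}

# check_nameservice: succeeds when fs.defaultFS in the first file points at
# the nameservice declared by dfs.nameservices in the second file.
check_nameservice() {
  fs=$(get_prop "$1" fs.defaultFS)       # e.g. hdfs://ns1/
  ns=$(get_prop "$2" dfs.nameservices)   # e.g. ns1
  [ "$fs" = "hdfs://$ns/" ] || [ "$fs" = "hdfs://$ns" ]
}

# Usage, assuming the Hadoop 2.x default configuration directory:
#   conf=/usr/local/hadoop-2.6.5/etc/hadoop
#   check_nameservice "$conf/core-site.xml" "$conf/hdfs-site.xml" || echo "nameservice mismatch"
```

On a machine where Hadoop is already installed, `hdfs getconf -confKey fs.defaultFS` reads the effective value directly and is the more reliable check.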
Edit mapred-site.xml on CentOS7One
- <?xml version="1.0"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!--
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. See accompanying LICENSE file.
- -->
- <!-- Put site-specific property overrides in this file. -->
- <configuration>
- <property>
- <name>mapreduce.framework.name</name>
- <value>yarn</value>
- </property>
- </configuration>
Edit yarn-site.xml on CentOS7One
- <?xml version="1.0"?>
- <!--
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. See accompanying LICENSE file.
- -->
- <configuration>
- <!-- Enable ResourceManager high availability -->
- <property>
- <name>yarn.resourcemanager.ha.enabled</name>
- <value>true</value>
- </property>
- <!-- Cluster id of the RM pair -->
- <property>
- <name>yarn.resourcemanager.cluster-id</name>
- <value>yrc</value>
- </property>
- <!-- Logical ids of the RMs -->
- <property>
- <name>yarn.resourcemanager.ha.rm-ids</name>
- <value>rm1,rm2</value>
- </property>
- <!-- Hostname of each RM -->
- <property>
- <name>yarn.resourcemanager.hostname.rm1</name>
- <value>CentOS7Three</value>
- </property>
- <property>
- <name>yarn.resourcemanager.hostname.rm2</name>
- <value>CentOS7Four</value>
- </property>
- <!-- Address of the ZooKeeper ensemble -->
- <property>
- <name>yarn.resourcemanager.zk-address</name>
- <value>CentOS7Five:2181,CentOS7Six:2181,CentOS7Seven:2181</value>
- </property>
- <property>
- <name>yarn.nodemanager.aux-services</name>
- <value>mapreduce_shuffle</value>
- </property>
- </configuration>
Send the configured Hadoop to all the other machines (copying into the parent directory so the tree is not nested on the target)
- scp -r /usr/local/hadoop-2.6.5 CentOS7Two:/usr/local/
- scp -r /usr/local/hadoop-2.6.5 CentOS7Three:/usr/local/
- scp -r /usr/local/hadoop-2.6.5 CentOS7Four:/usr/local/
- scp -r /usr/local/hadoop-2.6.5 CentOS7Five:/usr/local/
- scp -r /usr/local/hadoop-2.6.5 CentOS7Six:/usr/local/
- scp -r /usr/local/hadoop-2.6.5 CentOS7Seven:/usr/local/
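Six nearly identical copy commands are easy to mistype, so a loop may be more convenient. A minimal sketch follows, where `print_scp_cmds` is a hypothetical helper name. It only prints the commands, so they can be reviewed before anything is copied.

```shell
#!/bin/sh
# print_scp_cmds: hypothetical helper that prints (does not run) one
# recursive copy command per target host.
print_scp_cmds() {
  src=$1
  shift
  for host in "$@"; do
    # Copy into the parent directory so the tree is not nested on the target.
    echo "scp -r $src $host:$(dirname "$src")/"
  done
}

print_scp_cmds /usr/local/hadoop-2.6.5 CentOS7Two CentOS7Three CentOS7Four \
  CentOS7Five CentOS7Six CentOS7Seven
```

Once the printed commands look right, they can be piped into `sh` to run them. This relies on the passwordless SSH configured earlier reaching every target host.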
With that, the configuration is complete.
4. Startup procedure
Start zookeeper on CentOS7Five, CentOS7Six and CentOS7Seven
sh zkServer.sh start
Start journalnode on CentOS7Five, CentOS7Six and CentOS7Seven
sh hadoop-daemon.sh start journalnode
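Format the namenode on CentOS7One (only needed for the first start). The standby on CentOS7Two also needs a copy of this metadata before its first start. The steps here do not show that, but it is typically done either by copying the formatted data directory to CentOS7Two or by running hdfs namenode -bootstrapStandby there once the NameNode on CentOS7One is up.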
hdfs namenode -format
Format ZKFC on CentOS7One (only needed for the first start)
hdfs zkfc -formatZK
Start hdfs on CentOS7One
sh start-dfs.sh
Start yarn on CentOS7Three and CentOS7Four
sh start-yarn.sh
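Everything after the first step depends on the ZooKeeper ensemble being up, so it is worth confirming that before moving on to the journalnodes. `zkServer.sh status` prints a `Mode:` line, and in a healthy three-node ensemble one server should report leader and the other two follower. A minimal sketch, where `zk_mode` is a hypothetical helper name:

```shell
#!/bin/sh
# zk_mode: hypothetical helper. Reads `zkServer.sh status` output on stdin
# and prints the value of its "Mode:" line (leader, follower or standalone).
# Prints nothing when no such line is present, e.g. the server is not running.
zk_mode() {
  sed -n 's/^Mode: *//p'
}

# Usage on each ZooKeeper host:
#   zkServer.sh status 2>&1 | zk_mode
```

If a host prints nothing, check its ZooKeeper log before formatting ZKFC, since `hdfs zkfc -formatZK` needs a working quorum.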
5. Results
After a successful start, CentOS7One should show the following processes
- [root@CentOS7One ~]# jps
- 17397 DFSZKFailoverController
- 17111 NameNode
- 17480 Jps
- CentOS7Two
- [root@CentOS7Two ~]# jps
- 2497 Jps
- 2398 DFSZKFailoverController
- 2335 NameNode
- CentOS7Three
- [root@CentOS7Three ~]# jps
- 2344 ResourceManager
- 2619 Jps
- CentOS7Four
- [root@CentOS7Four ~]# jps
- 2344 ResourceManager
- 2619 Jps
- CentOS7Five
- [root@CentOS7Five logs]# jps
- 2803 Jps
- 2310 QuorumPeerMain
- 2460 JournalNode
- 2668 NodeManager
- 2543 DataNode
- CentOS7Six
- [root@CentOS7Six ~]# jps
- 2400 JournalNode
- 2608 NodeManager
- 2483 DataNode
- 2743 Jps
- 2301 QuorumPeerMain
- CentOS7Seven
- [root@CentOS7Seven ~]# jps
- 2768 Jps
- 2313 QuorumPeerMain
- 2425 JournalNode
- 2650 NodeManager
- 2525 DataNode
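Comparing seven jps listings by eye gets tedious, so the check can be scripted. A minimal sketch follows, where `missing_procs` is a hypothetical helper name and the expected process names are simply those from the listings above.

```shell
#!/bin/sh
# missing_procs: hypothetical helper. Reads `jps` output on stdin and prints
# each expected process name (given as arguments) that does not appear in it.
missing_procs() {
  # Second column of jps output is the main class name.
  running=$(awk '{ print $2 }')
  for name in "$@"; do
    # -x matches the whole line, so NameNode does not match SecondaryNameNode.
    echo "$running" | grep -qx "$name" || echo "$name"
  done
}

# Usage on a worker node such as CentOS7Five:
#   jps | missing_procs QuorumPeerMain JournalNode DataNode NodeManager
# Usage on a NameNode host such as CentOS7One:
#   jps | missing_procs NameNode DFSZKFailoverController
```

Empty output means every expected daemon is running on that host. Any name printed is a daemon whose log is worth reading.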
Visit http://centos7one:50070, where we can see that we have three datanodes
Visit http://centos7three:8088/ to check the state of the cluster
6. Acknowledgements
- https://www.cnblogs.com/biehongli/p/7660310.html
- https://www.bilibili.com/video/av15390641/?p=44
Source: https://www.cnblogs.com/Java-Starter/p/10698217.html