Symptoms:
- 2019-03-11 12:30:52,174 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 7653ms
- GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=7692ms
- 2019-03-11 12:31:00,573 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 7899ms
- GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=7951ms
- 2019-03-11 12:31:08,952 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 7878ms
- GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=7937ms
- 2019-03-11 12:31:17,405 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 7951ms
- GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=8037ms
- 2019-03-11 12:31:26,611 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 8705ms
- GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=8835ms
- 2019-03-11 12:31:35,009 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 7897ms
- GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=8083ms
- 2019-03-11 12:31:43,806 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 8296ms
- GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=8416ms
- 2019-03-11 12:31:52,317 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 8010ms
- GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=8163ms
- 2019-03-11 12:32:00,680 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 7862ms
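Before tuning anything, it helps to quantify the pauses. Below is a minimal sketch that extracts the pause durations from the JvmPauseMonitor lines; the two-line sample is trimmed from the log above, and in practice you would pipe in the full NameNode log file instead.

```shell
# Two sample lines trimmed from the log above; in practice feed in the
# whole NameNode log (e.g. hadoop-hdfs-namenode-<host>.log).
log='2019-03-11 12:30:52,174 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 7653ms
2019-03-11 12:31:26,611 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 8705ms'

# Pull out the pause durations (ms) and report how many there were
# and the worst one.
printf '%s\n' "$log" \
  | grep -o 'approximately [0-9]*ms' \
  | grep -o '[0-9]*' \
  | awk '{ n++; if ($1 > max) max = $1 } END { printf "pauses=%d max=%dms\n", n, max }'
# → pauses=2 max=8705ms
```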
After the GC pauses continue for a while, the following error appears:
- 2019-03-11 12:27:15,820 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
- java.lang.OutOfMemoryError: Java heap space
- at java.lang.StringCoding$StringEncoder.encode(StringCoding.java:300)
- at java.lang.StringCoding.encode(StringCoding.java:344)
- at java.lang.String.getBytes(String.java:918)
- at java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
- at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:242)
- at java.io.File.exists(File.java:819)
- at sun.misc.URLClassPath$FileLoader.getResource(URLClassPath.java:1282)
- at sun.misc.URLClassPath.getResource(URLClassPath.java:239)
- at java.net.URLClassLoader$1.run(URLClassLoader.java:365)
- at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
- at java.security.AccessController.doPrivileged(Native Method)
- at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
- at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
- at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
- at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
- at org.apache.hadoop.hdfs.server.namenode.JournalSet.close(JournalSet.java:244)
- at org.apache.hadoop.hdfs.server.namenode.FSEditLog.close(FSEditLog.java:400)
- at org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.close(FSEditLogAsync.java:112)
- at org.apache.hadoop.hdfs.server.namenode.FSImage.close(FSImage.java:1408)
- at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1079)
- at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:681)
- at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:666)
- at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:728)
- at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:953)
- at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:932)
- at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1673)
- at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1741)
- 2019-03-11 12:27:15,827 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: java.lang.OutOfMemoryError: Java heap space
- 2019-03-11 12:27:15,830 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
Or the following error appears instead:
- 2019-03-11 11:09:16,124 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
- java.lang.OutOfMemoryError: GC overhead limit exceeded
- at com.google.protobuf.CodedInputStream.<init>(CodedInputStream.java:573)
- at com.google.protobuf.CodedInputStream.newInstance(CodedInputStream.java:55)
- at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:199)
- at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
- at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
- at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
- at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
- at org.apache.hadoop.hdfs.server.namenode.FsImageProto$INodeSection$INode.parseDelimitedFrom(FsImageProto.java:10867)
- at org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeSection(FSImageFormatPBINode.java:233)
- at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:250)
- at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:176)
- at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:226)
- at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:937)
- at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:921)
- at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:794)
- at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:724)
- at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:322)
- at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1052)
- at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:681)
- at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:666)
- at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:728)
- at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:953)
- at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:932)
- at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1673)
- at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1741)
- 2019-03-11 11:09:16,127 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: java.lang.OutOfMemoryError: GC overhead limit exceeded
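The second stack trace shows the OOM hitting while the fsimage's INode section is being parsed, so the size of the current fsimage is a useful clue for sizing the heap: the in-memory namespace is typically several times larger than the on-disk image. The sketch below uses a throwaway stand-in directory so it is self-contained; on a real cluster, point `NAME_DIR` at the value of `dfs.namenode.name.dir` from hdfs-site.xml instead.

```shell
# Throwaway stand-in for dfs.namenode.name.dir, just for illustration;
# on a real cluster set NAME_DIR to the directory from hdfs-site.xml.
NAME_DIR=$(mktemp -d)
mkdir -p "$NAME_DIR/current"
printf 'x'  > "$NAME_DIR/current/fsimage_0000000000000000001"
printf 'xx' > "$NAME_DIR/current/fsimage_0000000000000000002"

# The NameNode loads the newest fsimage_* file at startup; its on-disk
# size is a rough lower bound on the heap the namespace will need.
latest=$(ls "$NAME_DIR/current"/fsimage_* | sort | tail -n 1)
du -h "$latest"
```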
Solution:
Open hadoop-env.sh and find the HADOOP_HEAPSIZE= and HADOOP_NAMENODE_INIT_HEAPSIZE= lines. How far to raise them depends on your cluster; the default is 1000 MB, i.e. 1 GB. I adjusted them as follows:
export HADOOP_HEAPSIZE=32000
export HADOOP_NAMENODE_INIT_HEAPSIZE=16000
Remove the leading # from both lines so they take effect, and make the same change on both NameNode hosts.
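For context on what these numbers mean: in Hadoop 2.x the launcher script (hadoop-config.sh) translates HADOOP_HEAPSIZE, interpreted as megabytes, into a -Xmx flag for the JVM. A minimal sketch of that translation, using the value set above:

```shell
# Mirrors (approximately) what hadoop-config.sh does with HADOOP_HEAPSIZE:
# the number is taken as megabytes and becomes the JVM's max heap flag.
HADOOP_HEAPSIZE=32000
JAVA_HEAP_MAX="-Xmx${HADOOP_HEAPSIZE}m"
echo "$JAVA_HEAP_MAX"
# → -Xmx32000m
```

After restarting, you can confirm the flag actually reached the process with something like `ps -ef | grep NameNode` and look for -Xmx32000m in the command line.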
Then restart HDFS. If that is still not enough, open hadoop-env.sh again and find HADOOP_NAMENODE_OPTS. The system default is:
export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS"
Adjust it as follows:
export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} \
  -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} \
  -Xms6000m -Xmx6000m -XX:+UseCompressedOops \
  -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled \
  -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0 \
  -XX:+CMSParallelRemarkEnabled -XX:+DisableExplicitGC \
  -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=75 \
  -XX:SoftRefLRUPolicyMSPerMB=0 $HADOOP_NAMENODE_OPTS"
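One design note on the options above: -Xms and -Xmx are deliberately set to the same value so the heap never grows or shrinks at runtime, which avoids resize-related pauses. A quick sanity check of that invariant, assuming the flags sit in a shell string the way they do in hadoop-env.sh:

```shell
# Trimmed-down copy of the heap flags from the options string above;
# the check just confirms the initial and max heap sizes agree.
OPTS='-Xms6000m -Xmx6000m -XX:+UseConcMarkSweepGC'
xms=$(echo "$OPTS" | grep -o 'Xms[0-9]*' | grep -o '[0-9]*')
xmx=$(echo "$OPTS" | grep -o 'Xmx[0-9]*' | grep -o '[0-9]*')
[ "$xms" = "$xmx" ] && echo "heap pinned at ${xmx}m"
# → heap pinned at 6000m
```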
Then restart HDFS again. If the same errors still appear, keep increasing the values of HADOOP_HEAPSIZE and HADOOP_NAMENODE_INIT_HEAPSIZE.
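Rather than raising the heap blindly, a commonly cited rule of thumb (it appears in several vendor sizing guides) is roughly 1 GB of NameNode heap per million namespace objects (files + directories + blocks). The object count below is a made-up example; on a live cluster you would read it from the NameNode web UI or an `hdfs fsck /` summary.

```shell
# Hypothetical object count (files + directories + blocks) for illustration;
# substitute the real number from your NameNode web UI or fsck output.
objects=45000000

# Rule of thumb: ~1 GB of heap per million namespace objects.
awk -v n="$objects" 'BEGIN { printf "suggested heap: about %d GB\n", (n + 999999) / 1000000 }'
# → suggested heap: about 45 GB
```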
Source: http://www.bubuko.com/infodetail-2983822.html