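Because of a JournalNode (JN) upgrade, I needed to move a JN to another machine. The cluster has three JNs, and I migrated one of them.
I ran the role migration from the HDFS page, selecting the host that currently holds the role and the target host. It warned that the whole cluster would have to be restarted (so first make sure nobody is using it). After the restart, an error prevented the master NameNode of the HA pair from starting.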
Checking the log:
2019-01-15 13:24:55,058 INFO org.apache.hadoop.util.GSet: Computing capacity for map NameNodeRetryCache
2019-01-15 13:24:55,058 INFO org.apache.hadoop.util.GSet: VM type = 64-bit
2019-01-15 13:24:55,058 INFO org.apache.hadoop.util.GSet: 0.029999999329447746% max memory 4.9 GB = 1.5 MB
2019-01-15 13:24:55,058 INFO org.apache.hadoop.util.GSet: capacity = 2^18 = 262144 entries
2019-01-15 13:24:55,063 INFO org.apache.hadoop.hdfs.server.namenode.NNConf: ACLs enabled? false
2019-01-15 13:24:55,063 INFO org.apache.hadoop.hdfs.server.namenode.NNConf: XAttrs enabled? true
2019-01-15 13:24:55,063 INFO org.apache.hadoop.hdfs.server.namenode.NNConf: Maximum size of an xattr: 16384
2019-01-15 13:24:55,080 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /data1/dfs/nn/in_use.lock acquired by nodename 15050@cluster-master.gyyx.cn
2019-01-15 13:24:55,083 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
java.io.IOException: NameNode is not formatted.
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:212)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1061)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:765)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:609)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:666)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:838)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:817)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1538)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1606)
2019-01-15 13:24:55,096 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@cluster-master.gyyx.cn:50070
2019-01-15 13:24:55,196 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
2019-01-15 13:24:55,197 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2019-01-15 13:24:55,198 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2019-01-15 13:24:55,198 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.io.IOException: NameNode is not formatted.
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:212)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1061)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:765)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:609)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:666)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:838)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:817)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1538)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1606)
2019-01-15 13:24:55,202 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2019-01-15 13:24:55,205 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
The part to focus on:
2019-01-15 13:24:55,083 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
java.io.IOException: NameNode is not formatted.
Every result I found on Baidu and Google said to format the NameNode:
hadoop namenode -format
This is a production environment. Can I really format it at the drop of a hat?
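In a long NameNode log the INFO lines tend to bury the interesting ones. Here is a minimal sketch for filtering them, assuming the log4j-style layout seen in this log (timestamp first, then the level). The function name show_problems is made up for this example, and the log path is whatever your installation uses.

```shell
#!/usr/bin/env bash
# Print only WARN/ERROR/FATAL entries from a NameNode log, prefixed with
# their line numbers. The log path is supplied by the caller; no default
# location is assumed here.
show_problems() {
    local log_file="$1"
    # Match "YYYY-MM-DD HH:MM:SS,mmm LEVEL " at the start of a line.
    grep -nE '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9:,]+ (WARN|ERROR|FATAL) ' "$log_file"
}
```

Stack-trace lines do not start with a timestamp, so they are filtered out. Adding something like -A 3 to the grep call would keep the first lines of each exception as well.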
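My approach: the message says the fsimage could not be loaded,
so I went looking for where the fsimage lives, which is also where the edits files live.
Under /data1/dfs/nn there was only a root-owned current.bak, which means the system had renamed the current directory. With no current directory in place, the NameNode treats the storage directory as unformatted, which explains the error.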
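The check I did by eye can be sketched as a small function. It assumes, as a simplification, that a storage directory counts as formatted when current/VERSION exists. The name nn_dir_state is hypothetical, and the directory is passed in by the caller (/data1/dfs/nn in my case).

```shell
#!/usr/bin/env bash
# Report whether a NameNode storage directory has a usable current/ directory.
# Simplifying assumption: "formatted" means current/VERSION is present.
nn_dir_state() {
    local nn_dir="$1"
    # List names and owners on stderr so leftovers such as a root-owned
    # current.bak stand out without polluting the result on stdout.
    ls -la "$nn_dir" >&2
    if [ -f "$nn_dir/current/VERSION" ]; then
        echo "formatted"
    else
        echo "not formatted"
    fi
}
```

Running nn_dir_state /data1/dfs/nn on the broken host would have printed "not formatted" after a listing that showed only current.bak.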
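Because my NameNodes run in HA, I could copy the current directory over from the other NameNode. (Simply renaming current.bak back to current is not an option, because the data has changed since it was renamed.)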
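A rough sketch of that copy step, under several assumptions that are mine and not part of what I actually typed: the healthy NameNode's current directory has already been fetched to a local staging path (for example with scp or rsync, ideally while that NameNode is not writing), the function name restore_current is hypothetical, and the owner argument (something like hdfs:hdfs) depends on how your cluster runs the NameNode.

```shell
#!/usr/bin/env bash
# Put a copy of a healthy NameNode's current/ directory into this
# NameNode's storage directory, leaving current.bak untouched.
restore_current() {
    local src_current="$1"   # staged copy of the healthy NN's current/
    local nn_dir="$2"        # local storage directory, e.g. /data1/dfs/nn
    local owner="${3:-}"     # optional user:group (assumed, e.g. hdfs:hdfs)

    # Refuse to copy something that does not look like NameNode metadata.
    if [ ! -f "$src_current/VERSION" ]; then
        echo "no VERSION file in $src_current" >&2
        return 1
    fi
    # Never overwrite an existing current/ directory.
    if [ -e "$nn_dir/current" ]; then
        echo "$nn_dir/current already exists" >&2
        return 1
    fi
    # -a keeps timestamps and permissions of the copied files.
    cp -a "$src_current" "$nn_dir/current" || return 1
    # The NameNode user must own the files; a root-owned copy would not do.
    if [ -n "$owner" ]; then
        chown -R "$owner" "$nn_dir/current" || return 1
    fi
}
```

The function deliberately leaves current.bak where it is, so the old state is still around if the copy turns out to be wrong.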
Problem solved.