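After Hadoop starts normally, the usual processes are all running and port 50070 is accessible, but Live Nodes shows 0. Several things can cause this:

1. The IP mappings in /etc/hosts are wrong

2. The master and the slaves cannot reach each other

3. The Hadoop configuration files contain errors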
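The first two causes can be checked from a slave node with a small script. The following is a minimal sketch, not part of the original setup: the helper names `resolve` and `port_open` are made up for illustration, and it only tests name resolution and plain TCP reachability.

```python
import socket


def resolve(hostname):
    """Return the IPv4 addresses the local resolver gives for hostname."""
    try:
        return socket.gethostbyname_ex(hostname)[2]
    except OSError:
        # Resolution failed (unknown host, resolver error, ...)
        return []


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or the name did not resolve
        return False
```

Run on a slave, `resolve("hbase1")` shows which address the slave will use for the master, and `port_open("hbase1", 8031)` shows whether that address accepts connections on the port seen in the log.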

Check the log on a slave node:

2018-01-03 09:26:48,488 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hbase1/192.168.10.101:8031. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-01-03 09:26:49,489 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hbase1/192.168.10.101:8031. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-01-03 09:26:50,490 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hbase1/192.168.10.101:8031. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-01-03 09:26:51,494 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hbase1/192.168.10.101:8031. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-01-03 09:26:52,495 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hbase1/192.168.10.101:8031. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-01-03 09:26:53,496 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hbase1/192.168.10.101:8031. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-01-03 09:26:54,497 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hbase1/192.168.10.101:8031. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-01-03 09:26:55,498 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hbase1/192.168.10.101:8031. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-01-03 09:27:26,510 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hbase1/192.168.10.101:8031. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-01-03 09:27:27,511 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hbase1/192.168.10.101:8031. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
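When a log is long, it can help to summarise which server the client keeps retrying. This is a rough sketch that assumes the line format shown above; the function name `retry_targets` is hypothetical.

```python
import re
from collections import Counter

# Matches e.g. "Retrying connect to server: hbase1/192.168.10.101:8031."
RETRY_RE = re.compile(
    r"Retrying connect to server: (?P<host>[^/\s]+)/(?P<ip>[\d.]+):(?P<port>\d+)"
)


def retry_targets(log_text):
    """Count retry lines per (host, ip, port) target."""
    counts = Counter()
    for line in log_text.splitlines():
        m = RETRY_RE.search(line)
        if m:
            counts[(m["host"], m["ip"], int(m["port"]))] += 1
    return counts
```

For the excerpt above, every retry line points at the same target: host `hbase1`, IP `192.168.10.101`, port `8031`.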

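The log shows that the slave cannot reach the master. Port 8031 is the default port of yarn.resourcemanager.resource-tracker.address, so this is the NodeManager failing to register with the ResourceManager. Next, look at the /etc/hosts file: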

#127.0.0.1      localhost
172.17.168.96 hbase1
192.168.10.101 hbase1
192.168.10.102 hbase2
192.168.10.103 hbase3
192.168.10.104 hbase4
192.168.10.105 hbase5
192.168.10.106 hbase6
192.168.10.107 hbase7
#backup cluster
192.168.20.101 bhbase1
192.168.20.102 bhbase2
192.168.20.103 bhbase3
192.168.20.104 bhbase4
192.168.20.105 bhbase5
192.168.20.106 bhbase6
192.168.20.107 bhbase7

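Here hbase1 (the master) is mapped to two IPs. Linux normally uses only the first matching entry in hosts, so hbase1 resolves to 172.17.168.96. The slaves are configured with intranet IPs (192.168.10.x), so they cannot reach the master at that address.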
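A duplicate mapping like this is easy to miss in a long hosts file. The sketch below scans hosts-style text for names that appear with more than one IP; the function name `hosts_with_multiple_ips` is made up for illustration, and commented-out lines are ignored.

```python
from collections import defaultdict


def hosts_with_multiple_ips(hosts_text):
    """Map each host name listed with more than one IP to its IPs, in file order."""
    seen = defaultdict(list)
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            if ip not in seen[name]:
                seen[name].append(ip)
    return {name: ips for name, ips in seen.items() if len(ips) > 1}
```

Given the file above it reports hbase1 with 172.17.168.96 first and 192.168.10.101 second, which is the order the resolver sees them in.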

The fix is to comment out the external IP:

#127.0.0.1      localhost
#172.17.168.96 hbase1
192.168.10.101 hbase1
192.168.10.102 hbase2
192.168.10.103 hbase3
192.168.10.104 hbase4
192.168.10.105 hbase5
192.168.10.106 hbase6
192.168.10.107 hbase7
#backup cluster
192.168.20.101 bhbase1
192.168.20.102 bhbase2
192.168.20.103 bhbase3
192.168.20.104 bhbase4
192.168.20.105 bhbase5
192.168.20.106 bhbase6
192.168.20.107 bhbase7


Then restart Hadoop.

=======================

Another problem: after Hadoop starts normally, port 50070 is accessible but port 8088 is not. The fix:

In yarn-site.xml, find this property:

<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>hbase1:8088</value>
</property>

Change it to:

<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>172.17.168.96:8088</value>
</property>

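A host name does not work here; use the external IP instead. Presumably this is because hbase1 resolves to the intranet IP, which the machine running the browser cannot reach. Make the same change on every slave node, then restart Hadoop.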

Port 8088 is then accessible.
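Since the change has to be made on every node, it may be worth confirming what each copy of yarn-site.xml actually contains. This is a minimal sketch using Python's standard XML parser; the function name `get_property` is hypothetical, and it assumes the usual Hadoop layout of `<property>` elements with `<name>` and `<value>` children.

```python
import xml.etree.ElementTree as ET


def get_property(xml_text, name):
    """Return the <value> of the Hadoop property called name, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            return prop.findtext("value", default="").strip()
    return None
```

Passing the contents of a node's yarn-site.xml together with `"yarn.resourcemanager.webapp.address"` returns the configured address, so the nodes can be compared quickly.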