Hadoop datanode binds to the wrong IP address

Problem description:

I have a three-node Hadoop cluster up and running. For some reason, when the datanode slaves start up, they register an IP address that does not exist on my network. Here is my hostname and IP mapping:

nodes:
    - hostname: hadoop-master
      ip: 192.168.51.4
    - hostname: hadoop-data1
      ip: 192.168.52.4
    - hostname: hadoop-data2
      ip: 192.168.52.6

As you can see below, the Hadoop master node starts up fine, but of the other two nodes only one ever shows up as a live datanode, and whichever one it is always shows up with the IP 192.168.51.1, which, as you can see above, does not even exist on my network.

hadoop@hadoop-master:~$ hdfs dfsadmin -report 
Safe mode is ON 
Configured Capacity: 84482326528 (78.68 GB) 
Present Capacity: 75735965696 (70.53 GB) 
DFS Remaining: 75735281664 (70.53 GB) 
DFS Used: 684032 (668 KB) 
DFS Used%: 0.00% 
Under replicated blocks: 0 
Blocks with corrupt replicas: 0 
Missing blocks: 0 
Missing blocks (with replication factor 1): 0 

------------------------------------------------- 
Live datanodes (2): 

Name: 192.168.51.1:50010 (192.168.51.1) 
Hostname: hadoop-data2 
Decommission Status : Normal 
Configured Capacity: 42241163264 (39.34 GB) 
DFS Used: 303104 (296 KB) 
Non DFS Used: 4305530880 (4.01 GB) 
DFS Remaining: 37935329280 (35.33 GB) 
DFS Used%: 0.00% 
DFS Remaining%: 89.81% 
Configured Cache Capacity: 0 (0 B) 
Cache Used: 0 (0 B) 
Cache Remaining: 0 (0 B) 
Cache Used%: 100.00% 
Cache Remaining%: 0.00% 
Xceivers: 1 
Last contact: Fri Sep 25 13:54:23 UTC 2015 


Name: 192.168.51.4:50010 (hadoop-master) 
Hostname: hadoop-master 
Decommission Status : Normal 
Configured Capacity: 42241163264 (39.34 GB) 
DFS Used: 380928 (372 KB) 
Non DFS Used: 4440829952 (4.14 GB) 
DFS Remaining: 37799952384 (35.20 GB) 
DFS Used%: 0.00% 
DFS Remaining%: 89.49% 
Configured Cache Capacity: 0 (0 B) 
Cache Used: 0 (0 B) 
Cache Remaining: 0 (0 B) 
Cache Used%: 100.00% 
Cache Remaining%: 0.00% 
Xceivers: 1 
Last contact: Fri Sep 25 13:54:21 UTC 2015 

I did try explicitly adding dfs.datanode.address for each host, but in that case the node failed to even show up as a live node. Here is what my hdfs-site.xml looks like (note that I have run both with and without the dfs.datanode.address setting).

<?xml version="1.0"?> 
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?> 
<!-- 
    Licensed under the Apache License, Version 2.0 (the "License"); 
    you may not use this file except in compliance with the License. 
    You may obtain a copy of the License at 

    http://www.apache.org/licenses/LICENSE-2.0 

    Unless required by applicable law or agreed to in writing, software 
    distributed under the License is distributed on an "AS IS" BASIS, 
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
    See the License for the specific language governing permissions and 
    limitations under the License. See accompanying LICENSE file. 
--> 

<!-- Put site-specific property overrides in this file. --> 

<configuration> 
    <property> 
        <name>dfs.replication</name> 
        <value>2</value> 
        <description>Default block replication. 
        The actual number of replications can be specified when the file is created. 
        The default is used if replication is not specified at create time. 
        </description> 
    </property> 
    <property> 
        <name>dfs.namenode.rpc-bind-host</name> 
        <value>0.0.0.0</value> 
    </property> 
    <property> 
        <name>dfs.datanode.address</name> 
        <value>192.168.51.4:50010</value> 
    </property> 
    <property> 
        <name>dfs.namenode.datanode.registration.ip-hostname-check</name> 
        <value>false</value> 
    </property> 
    <property> 
        <name>dfs.namenode.name.dir</name> 
        <value>/home/hadoop/hadoop-data/hdfs/namenode</value> 
        <description>Determines where on the local filesystem the DFS name node should store the name table (fsimage). If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.</description> 
    </property> 
    <property> 
        <name>dfs.datanode.data.dir</name> 
        <value>/home/hadoop/hadoop-data/hdfs/datanode</value> 
        <description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description> 
    </property> 
</configuration> 
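
Note that dfs.datanode.address above is hard-coded to the master's IP, so if this same hdfs-site.xml is copied to every node, that value cannot be correct on the data nodes. A hedged alternative (assuming Hadoop 2.x, where 0.0.0.0:50010 is in fact the default) is to bind to all interfaces instead of a per-host address; this avoids a node-specific value, though it does not by itself control which IP the namenode records for the node:

    <property>
        <name>dfs.datanode.address</name>
        <value>0.0.0.0:50010</value>
    </property>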

Why would Hadoop associate each datanode with an IP address that does not even exist? Or, more importantly, how can I get the nodes to work properly?

UPDATE: The /etc/hosts file is identical on all nodes:

192.168.51.4 hadoop-master 
192.168.52.4 hadoop-data1 
192.168.52.6 hadoop-data2 

Here are the contents of my slaves file:

hadoop@hadoop-master:~$ cat /usr/local/hadoop/etc/hadoop/slaves 
hadoop-master 
hadoop-data1 
hadoop-data2 

Datanode logs:
https://gist.github.com/dwatrous/7241bb804a9be8f9303f
https://gist.github.com/dwatrous/bcd85cda23d6eca3a68b
https://gist.github.com/dwatrous/922c4f773aded0137fa3

Namenode log:
https://gist.github.com/dwatrous/dafaa7695698f36a5d93

Can you post your entire datanode logs? Try setting the following value to the name of the interface whose IP you want to bind to.

dfs.client.local.interfaces = eth0 
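
For reference, a sketch of how that suggestion would look in hdfs-site.xml (the property name comes from the comment above; eth0 as the interface name is an assumption about the VMs' adapters):

    <property>
        <name>dfs.client.local.interfaces</name>
        <value>eth0</value>
    </property>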


Here are the startup logs for all three nodes: https://gist.github.com/dwatrous/7241bb804a9be8f9303f https://gist.github.com/dwatrous/bcd85cda23d6eca3a68b https://gist.github.com/dwatrous/922c4f773aded0137fa3 –


@DanielWatrous It is better to add the information to the question itself, so that it is available to all users. Users may not read all the comments under different answers. – YoungHobbit


I have added the logs to the question –

After reviewing everything that could possibly be wrong, this turned out to be some combination of Vagrant and VirtualBox. I was trying to run the master node on one subnet and the datanodes on another subnet. It turns out that, with the way the network was configured, I could communicate between those subnets, but there was some kind of hidden gateway that caused the wrong IP address to be used.
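
One way to see this effect (a hypothetical diagnostic, not something from the original post) is to ask the kernel on a data node which source address it would pick for traffic to the namenode:

# From a data node, show the route and source address chosen for the master:
ip route get 192.168.51.4
# The "src" field in the output is the address the datanode appears to come
# from when it registers; a stray gateway on the route can make this
# 192.168.51.1 instead of the node's own address.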

The fix was to change my Vagrantfile to put all three hosts on the same subnet. After that, everything worked as expected.
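
A minimal sketch of that change (the box name and the exact data-node addresses are assumptions; only the master's IP and the "same subnet" fix come from this post):

# Hypothetical Vagrantfile: all three hosts on one private subnet.
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"        # assumed box
  {
    "hadoop-master" => "192.168.51.4",     # IP from the question
    "hadoop-data1"  => "192.168.51.5",     # assumed same-subnet addresses
    "hadoop-data2"  => "192.168.51.6",
  }.each do |name, ip|
    config.vm.define name do |node|
      node.vm.hostname = name
      node.vm.network "private_network", ip: ip
    end
  end
end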