Hadoop Cluster Setup
Install the virtual machine
Configure the network interface:
[root@hadoop01 hadoop]# vi /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
TYPE=Ethernet
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=static
IPADDR=192.168.199.141
NETMASK=255.255.255.0
GATEWAY=192.168.199.2
DNS1=114.114.114.114
DNS2=192.168.199.2
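After saving ifcfg-eth0, the settings take effect once networking is restarted. A quick sanity check (a sketch assuming CentOS 6, which the service/chkconfig commands in this guide suggest):

```shell
# Restart networking so the static IP above is applied (CentOS 6 style)
service network restart
# Confirm eth0 now holds 192.168.199.141
ifconfig eth0
# Confirm the gateway is reachable
ping -c 3 192.168.199.2
```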
Configure the hostname
[root@hadoop01 hadoop]# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=hadoop01
Disable the firewall
[root@hadoop01 hadoop]# service iptables stop
[root@hadoop01 hadoop]# chkconfig iptables off
Install the SSH client
[root@hadoop01 hadoop]# yum install -y openssh-clients
Clone the virtual machine
[root@hadoop02 hadoop]# vi /etc/udev/rules.d/70-persistent-net.rules
# PCI device 0x8086:0x100f (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:0c:29:ab:a6:61", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"
# PCI device 0x8086:0x100f (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:0c:29:ab:a6:61", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"
Delete the eth0 entry and rename the eth1 entry to eth0 (the eth1 line carries the clone's new MAC address)
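On the cloned machine, ifcfg-eth0 still contains hadoop01's settings, so it presumably needs the same kind of edit (addresses follow the plan above):

```shell
vi /etc/sysconfig/network-scripts/ifcfg-eth0
# change IPADDR to this node's address (192.168.199.142 for hadoop02),
# and remove or update any HWADDR/UUID lines copied over from hadoop01
```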
Change the hostname
[root@hadoop02 hadoop]# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=hadoop02
Reboot so the NIC changes take effect
[root@hadoop02 hadoop]# reboot
Repeat the steps above on the other machines (3-4)
hosts mapping
Edit the hosts file
Windows: C:\Windows\System32\drivers\etc
Linux: [root@hadoop01 hadoop]# vi /etc/hosts
Add:
192.168.199.141 hadoop01
192.168.199.142 hadoop02
192.168.199.143 hadoop03
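Once the mappings are in place on every node, each hostname should resolve; a quick check from any machine:

```shell
for h in hadoop01 hadoop02 hadoop03; do
  ping -c 1 "$h" > /dev/null && echo "$h resolves and responds"
done
```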
Passwordless SSH login
This can be done with a script:
#!/bin/bash
# install expect via yum
yum -y install expect
# PWD_1 is the login password; change it to your own
PWD_1=123456
ips=$(cat /etc/hosts | grep -v "::" | grep -v "127.0.0.1")  # note: each hosts line yields both the IP and the hostname
key_generate() {
expect -c "set timeout -1;
spawn ssh-keygen -t rsa;
expect {
{Enter file in which to save the key*} {send -- \r;exp_continue}
{Enter passphrase*} {send -- \r;exp_continue}
{Enter same passphrase again:} {send -- \r;exp_continue}
{Overwrite (y/n)*} {send -- n\r;exp_continue}
eof {exit 0;}
};"
}
auto_ssh_copy_id () {
expect -c "set timeout -1;
spawn ssh-copy-id -i $HOME/.ssh/id_rsa.pub root@$1;
expect {
{Are you sure you want to continue connecting *} {send -- yes\r;exp_continue;}
{*password:} {send -- $2\r;exp_continue;}
eof {exit 0;}
};"
}
rm -rf ~/.ssh
key_generate
for ip in $ips
do
auto_ssh_copy_id $ip $PWD_1
done
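After the script finishes, logging in to each node should no longer prompt for a password; for example:

```shell
ssh hadoop02 hostname   # should print hadoop02 with no password prompt
ssh hadoop03 hostname
```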
JDK installation
1. Upload the archive to Linux
2. Extract it to the install directory: [root@hadoop01 hadoop]# tar -zxvf /root/jdk-8u102-linux-x64.tar.gz -C /usr/local/
3. Configure environment variables: [root@hadoop01 hadoop]# vi /etc/profile
export JAVA_HOME=/usr/local/jdk1.8.0_102
export PATH=$PATH:$JAVA_HOME/bin
[root@hadoop01 hadoop]# source /etc/profile
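A quick check that the JDK and the environment variables took effect:

```shell
java -version      # should report java version "1.8.0_102"
echo $JAVA_HOME    # should print /usr/local/jdk1.8.0_102
```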
Hadoop installation
1. Upload the Hadoop package to /root
2. Plan the install directory: /usr/local/hadoop-2.7.3
3. Extract the package: [root@hadoop01 hadoop]# tar -zxvf /root/hadoop-2.7.3.tar.gz -C /usr/local/
4. Edit the configuration files under $HADOOP_HOME/etc/hadoop/ (cd into that directory)
A minimal configuration:
(1) [root@hadoop01 hadoop]# vi hadoop-env.sh
#The java implementation to use.
export JAVA_HOME=/usr/local/jdk1.8.0_102
(2) [root@hadoop01 hadoop]# vi core-site.xml
Specifies where the NameNode runs and where temporary files are stored
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop01:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop-2.7.3/tmp</value>
</property>
</configuration>
(3) [root@hadoop01 hadoop]# vi hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/local/hadoop-2.7.3/data/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/local/hadoop-2.7.3/data/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop01:50090</value>
</property>
</configuration>
(4) [root@hadoop01 hadoop]# cp mapred-site.xml.template mapred-site.xml
(5) [root@hadoop01 hadoop]# vi mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
(6) [root@hadoop01 hadoop]# vi yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop01</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
(7) [root@hadoop01 hadoop]# vi slaves
Delete the existing localhost entry and add:
hadoop02
hadoop03
Apply the PATH changes globally: source /etc/profile
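Step 4 above refers to $HADOOP_HOME, but the earlier /etc/profile edit only set JAVA_HOME. For `source /etc/profile` to put the Hadoop commands on the PATH, the profile presumably also needs lines like these (paths taken from the install step):

```shell
export HADOOP_HOME=/usr/local/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```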
Send the JDK, Hadoop, and configuration files installed on the first machine to the other two:
the hosts file
the extracted JDK directory
the extracted Hadoop directory
the /etc/profile file
[root@hadoop01 hadoop]# scp -r /usr/local/jdk1.8.0_102 hadoop02:/usr/local/
[root@hadoop01 hadoop]# scp -r /usr/local/hadoop-2.7.3/ hadoop02:/usr/local/
[root@hadoop01 hadoop]# scp /etc/hosts hadoop02:/etc/
[root@hadoop01 hadoop]# scp /etc/profile hadoop02:/etc/
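The scp commands above only cover hadoop02 and must be repeated for hadoop03; a loop handles both nodes. Shown as a dry run that echoes each command; drop the `echo` to actually copy:

```shell
# Target worker nodes (hostnames assumed from the hosts mapping above)
nodes="hadoop02 hadoop03"
for node in $nodes; do
  echo scp -r /usr/local/jdk1.8.0_102 "$node:/usr/local/"
  echo scp -r /usr/local/hadoop-2.7.3 "$node:/usr/local/"
  echo scp /etc/hosts /etc/profile "$node:/etc/"
done
```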
Start the cluster
Initialize HDFS (run on hadoop01, and only once), from the bin/ directory: [root@hadoop01 hadoop]# hadoop namenode -format
Start HDFS, from the sbin/ directory: [root@hadoop01 hadoop]# start-dfs.sh
Check the processes with jps: [root@hadoop01 hadoop]# jps
2820 Jps
2028 NameNode
2205 SecondaryNameNode
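jps on hadoop01 shows only the master daemons because the DataNodes run on the slave nodes; two ways to confirm they came up (hostnames as configured above):

```shell
ssh hadoop02 jps        # should list a DataNode process
hdfs dfsadmin -report   # should report the live datanodes
```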
Start YARN
From the sbin/ directory: [root@hadoop01 hadoop]# start-yarn.sh
[root@hadoop01 hadoop]# jps
Processes on hadoop01:
2820 Jps
2028 NameNode
2205 SecondaryNameNode
2350 ResourceManager
View in the browser
HDFS web UI: hadoop01:50070
YARN web UI: hadoop01:8088
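As a final smoke test, writing a file into HDFS and running the bundled MapReduce example exercises both HDFS and YARN (examples jar path assumed from the 2.7.3 install directory):

```shell
hdfs dfs -mkdir -p /test
hdfs dfs -put /etc/hosts /test/
hdfs dfs -ls /test
hadoop jar /usr/local/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 2 10
```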