Setting Up a Hadoop 2.7 Single-Node Environment on CentOS 7

This article uses VMware 12 + CentOS 7 + Hadoop 2.7.3.

I. Preparation

1. Set the hostname

# hostnamectl set-hostname develop01

# hostnamectl status

 

2. Configure Notepad++ to edit Linux files


Click "Show NppFTP" at the far right of the Notepad++ toolbar, then click Profile Settings and configure as shown below:

(screenshot: NppFTP profile settings)

Click Connect to establish the connection.

(screenshot: NppFTP connected to the server)

Edit /etc/hosts:

(screenshot: /etc/hosts)
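A minimal entry for this setup, assuming the VM's IP is 192.168.220.129 (the address used later in this article):

192.168.220.129 develop01    # hostname -> VM IP (IP assumed from later sections)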

Ping the hostname to check that the configuration works:

(screenshot: ping output)

3. Firewall and SELinux configuration

# firewall-cmd --state
running

To stop the firewall:

# systemctl stop firewalld.service
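To keep the firewall from coming back after a reboot, you can also disable the service:

# systemctl disable firewalld.service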

Disable SELinux:

# setenforce 0

Make it permanent by editing /etc/selinux/config:

(screenshot: /etc/selinux/config)
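The usual edit is to change the SELINUX line so the setting persists across reboots:

SELINUX=disabled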

4. Several important configuration files

/etc/profile (system-wide, read by login shells)

/etc/bashrc (system-wide, read by interactive bash shells)

~/.bash_profile (per-user login-shell settings)

5. Install java-1.8.0-openjdk-devel.x86_64

The CentOS 7 image I downloaded ships only a JRE in its Java directory, so install the devel package as well; otherwise the jps command is unavailable and setting $JAVA_HOME also runs into problems:

# yum install java-1.8.0-openjdk-devel.x86_64

 

6. Configure passwordless SSH login

Before setting up passwordless login, generate your own public/private key pair:

# ssh-keygen -t rsa

By default the generated key is stored at /root/.ssh/id_rsa:

# cd /root/.ssh/

[root@develop01 .ssh]# ls

id_rsa id_rsa.pub  known_hosts

 

Copy the key to the target host:

# ssh-copy-id 192.168.220.129

/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host '192.168.220.129 (192.168.220.129)' can't be established.
ECDSA key fingerprint is SHA256:XCqvJVXoOI4nSQrOkD38qZtK4YZUw6QRsuRjViRUdWw.
ECDSA key fingerprint is MD5:41:be:79:d5:a7:32:8f:42:9e:2c:14:c3:e3:08:31:9a.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@192.168.220.129's password:
Number of key(s) added: 1

 

At this point an authorized_keys file has been created in /root/.ssh/, with the same content as id_rsa.pub.

If you also need passwordless login to other machines:

# ssh-copy-id <hostname or IP of the other machine>
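To verify, an SSH login to the target should now succeed without a password prompt:

# ssh 192.168.220.129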

 

II. Deploying the vanilla Apache single-node version

# mkdir -p /opt/hadoop_singleNode

# tar -zxf hadoop-2.7.3.tar.gz -C /opt/hadoop_singleNode/
# tar -zxf hbase-1.3.0-bin.tar.gz -C /opt/hadoop_singleNode/

 

1. Configure Hadoop

Use Notepad++ to open /opt/hadoop_singleNode/hadoop-2.7.3/etc/hadoop.

(1) Configure hadoop-env.sh

The main thing to set in this file is JAVA_HOME.

On this machine, running # echo $JAVA_HOME prints nothing, which means it must be configured in /etc/profile. First find the Java installation path:

[root@develop01 hadoop]# which java
/usr/bin/java
[root@develop01 hadoop]# ls -lrt /usr/bin/java
lrwxrwxrwx. 1 root root 22 Jan 23 04:35 /usr/bin/java -> /etc/alternatives/java
[root@develop01 hadoop]# ls -lrt /etc/alternatives/java
lrwxrwxrwx. 1 root root 73 Jan 23 04:35 /etc/alternatives/java -> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/bin/java
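As a shortcut (standard coreutils, not part of the original steps), the whole symlink chain can be resolved in one command:

# readlink -f /usr/bin/java
/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/bin/java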

 

Configure /etc/profile:

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH

 

Save and exit, then apply the changes:

[root@develop01 hadoop]# source /etc/profile
[root@develop01 hadoop]# echo $JAVA_HOME
/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64

Then set it in hadoop-env.sh:

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/

 

(2) Configure core-site.xml

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop_singleNode/hadoop-2.7.3/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>

  <property>
    <!-- fs.default.name specifies the NameNode's address and port -->
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
    <description>The name of the default file system. A URI whose
    scheme and authority determine the FileSystem implementation. The
    uri's scheme determines the config property (fs.SCHEME.impl) naming
    the FileSystem implementation class. The uri's authority is used to
    determine the host, port, etc. for a filesystem.</description>
  </property>
</configuration>
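Note that fs.default.name is deprecated in Hadoop 2.x in favor of fs.defaultFS; the old name still works but triggers a deprecation warning. The modern equivalent would be:

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:54310</value>
</property>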

 

(3) Configure hdfs-site.xml

<configuration>
  <property>
    <!-- Number of replicas per block; the default is 3. Setting it to 1 keeps a single copy of each block. -->
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication.
    The actual number of replications can be specified when the file is created.
    The default is used if replication is not specified in create time.
    </description>
  </property>
</configuration>

 

(4) Configure mapred-site.xml

Rename mapred-site.xml.template to mapred-site.xml:
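On the command line:

# cd /opt/hadoop_singleNode/hadoop-2.7.3/etc/hadoop
# mv mapred-site.xml.template mapred-site.xml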

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
    <description>The host and port that the MapReduce job tracker runs
    at. If "local", then jobs are run in-process as a single map
    and reduce task.
    </description>
  </property>
</configuration>
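Note that mapred.job.tracker is a Hadoop 1.x (JobTracker) property and is ignored by YARN. To actually run MapReduce jobs on the YARN services started below, the usual Hadoop 2.x settings are mapreduce.framework.name in mapred-site.xml plus the shuffle auxiliary service in yarn-site.xml:

<!-- mapred-site.xml -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

<!-- yarn-site.xml -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>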

 

2. Start Hadoop

(1) The first time, HDFS must be formatted:

# /opt/hadoop_singleNode/hadoop-2.7.3/bin/hdfs namenode -format

 

(2) Start HDFS:

# /opt/hadoop_singleNode/hadoop-2.7.3/sbin/start-dfs.sh
# jps
48976 NameNode
49284 SecondaryNameNode
49109 DataNode
50012 Jps

 

(3) Check the status in a browser:

http://192.168.220.129:50070/ 

(screenshot: HDFS web UI)

(4) Try creating a directory:

# /opt/hadoop_singleNode/hadoop-2.7.3/bin/hadoop fs -mkdir /test

(screenshot: /test shown in the HDFS web UI)
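To double-check from the command line, list the HDFS root; /test should appear:

# /opt/hadoop_singleNode/hadoop-2.7.3/bin/hadoop fs -ls /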

(5) Start the MapReduce framework (YARN):

# /opt/hadoop_singleNode/hadoop-2.7.3/sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop_singleNode/hadoop-2.7.3/logs/yarn-root-resourcemanager-status.out
root@localhost's password:
localhost: starting nodemanager, logging to /opt/hadoop_singleNode/hadoop-2.7.3/logs/yarn-root-nodemanager-status.out
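The password prompt above means passwordless SSH to localhost is not yet set up (the start scripts log in over SSH even on a single node). To get rid of it, copy the key to localhost the same way as in step 6 of the preparation section:

# ssh-copy-id localhost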

# jps
48976 NameNode
49284 SecondaryNameNode
49109 DataNode
50534 Jps
50217 ResourceManager
50347 NodeManager
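The ResourceManager web UI should now be reachable at http://192.168.220.129:8088/. As a quick end-to-end test, you can run one of the example jobs shipped with the distribution, for instance the pi estimator (jar path as bundled with Hadoop 2.7.3):

# /opt/hadoop_singleNode/hadoop-2.7.3/bin/hadoop jar /opt/hadoop_singleNode/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 2 5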

(6) Start everything (HDFS, YARN, MapReduce):

# /opt/hadoop_singleNode/hadoop-2.7.3/sbin/start-all.sh
# jps
48976 NameNode
49284 SecondaryNameNode
49109 DataNode
51127 Jps
50217 ResourceManager
50347 NodeManager

(7) Stop everything:

# /opt/hadoop_singleNode/hadoop-2.7.3/sbin/stop-all.sh
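In Hadoop 2.x, start-all.sh and stop-all.sh are deprecated wrappers around the per-framework scripts, so the equivalent is:

# /opt/hadoop_singleNode/hadoop-2.7.3/sbin/stop-yarn.sh
# /opt/hadoop_singleNode/hadoop-2.7.3/sbin/stop-dfs.sh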