Setting Up a Hadoop 2.7.3 Single-Node Environment on CentOS 7
This article uses VMware 12 + CentOS 7 + Hadoop 2.7.3.
I. Preparation
1. Set the hostname
# hostnamectl set-hostname develop01
# hostnamectl status
2. Configure Notepad++ to edit Linux files remotely
Click "Show NPPFTP" at the far right of the Notepad++ toolbar, click Profile Settings, fill in the connection details (host, port, credentials),
then click Connect to establish the connection.
Edit /etc/hosts
Ping the hostname to check that the mapping works:
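For reference, a hosts entry matching this setup might look like the following (assuming the VM's IP is 192.168.220.129, the address used later in this article):

```
# /etc/hosts
192.168.220.129   develop01
```

After saving, # ping develop01 should get replies from 192.168.220.129.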
3. Firewall and SELinux configuration
# firewall-cmd --state
running
To stop the firewall:
# systemctl stop firewalld.service
(To keep it off after a reboot, also run # systemctl disable firewalld.service.)
Disable SELinux:
# setenforce 0
(this takes effect immediately, but only until the next reboot)
Edit the /etc/selinux/config configuration:
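The line to change in /etc/selinux/config (the stock default is enforcing):

```
# /etc/selinux/config
SELINUX=disabled
```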
4. A few important configuration files
/etc/profile – system-wide environment, sourced by login shells
/etc/bashrc – sourced by interactive non-login bash shells
~/.bash_profile – per-user login shell settings
5. Install java-1.8.0-openjdk-devel.x86_64
The OpenJDK package on my CentOS 7 install only contains the JRE, so the -devel package has to be installed as well; otherwise the jps command used later is unavailable and setting $JAVA_HOME also causes problems:
# yum install java-1.8.0-openjdk-devel.x86_64
6. Configure passwordless SSH login
Before setting up passwordless login, generate your own key pair.
# ssh-keygen -t rsa
By default the keys are stored in /root/.ssh/id_rsa
# cd /root/.ssh/
[root@develop01 .ssh]# ls
id_rsa id_rsa.pub known_hosts
Copy the key to the target host:
# ssh-copy-id 192.168.220.129
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host '192.168.220.129 (192.168.220.129)' can't be established.
ECDSA key fingerprint is SHA256:XCqvJVXoOI4nSQrOkD38qZtK4YZUw6QRsuRjViRUdWw.
ECDSA key fingerprint is MD5:41:be:79:d5:a7:32:8f:42:9e:2c:14:c3:e3:08:31:9a.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@192.168.220.129's password:
Number of key(s) added: 1
At this point an authorized_keys file, with the same contents as id_rsa.pub, has been generated in /root/.ssh/.
Verify with # ssh 192.168.220.129; it should log in without prompting for a password.
If you also need passwordless login to other machines:
# ssh-copy-id <hostname or IP of the other machine>
II. Deploying the plain Apache single-node version
# mkdir -p /opt/hadoop_singleNode
# tar -zxf hadoop-2.7.3.tar.gz -C /opt/hadoop_singleNode/
# tar -zxf hbase-1.3.0-bin.tar.gz -C /opt/hadoop_singleNode/
1. Configure Hadoop
Use Notepad++ to open /opt/hadoop_singleNode/hadoop-2.7.3/etc/hadoop
(1) Configure hadoop-env.sh
The main setting needed in this file is JAVA_HOME.
Check it on Linux: # echo $JAVA_HOME prints nothing, which means it has to be configured in /etc/profile; first locate the Java installation path:
[root@develop01 hadoop]# which java
/usr/bin/java
[root@develop01 hadoop]# ls -lrt /usr/bin/java
lrwxrwxrwx. 1 root root 22 Jan 23 04:35 /usr/bin/java -> /etc/alternatives/java
[root@develop01 hadoop]# ls -lrt /etc/alternatives/java
lrwxrwxrwx. 1 root root 73 Jan 23 04:35 /etc/alternatives/java -> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/jre/bin/java
Configure /etc/profile:
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
Save and exit, then make it take effect:
[root@develop01 hadoop]# source /etc/profile
[root@develop01 hadoop]# echo $JAVA_HOME
/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64
Then set the same path in hadoop-env.sh:
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.161-0.b14.el7_4.x86_64/
(2) Configure core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop_singleNode/hadoop-2.7.3/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<!-- fs.default.name specifies the NameNode's IP address and port -->
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>
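A note on the property name: fs.default.name is the old Hadoop 1.x key and is deprecated in Hadoop 2.x (it still works, with a warning). The current equivalent is fs.defaultFS:

```
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:54310</value>
</property>
```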
(3) Configure hdfs-site.xml
<configuration>
<property>
<!-- Number of replicas per block; the default is 3. Setting it to 1 keeps only one copy of each block. -->
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
</configuration>
(4) Configure mapred-site.xml
Rename mapred-site.xml.template to mapred-site.xml:
# mv mapred-site.xml.template mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
</configuration>
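mapred.job.tracker is likewise a Hadoop 1.x property; there is no JobTracker daemon in Hadoop 2.x. If you intend to run MapReduce jobs on the YARN daemons started below, the key that actually matters is mapreduce.framework.name:

```
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
```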
2. Start Hadoop
(1) Before first use, format HDFS (do this only once; reformatting wipes the NameNode metadata)
# /opt/hadoop_singleNode/hadoop-2.7.3/bin/hdfs namenode -format
(2) Start HDFS
# /opt/hadoop_singleNode/hadoop-2.7.3/sbin/start-dfs.sh
# jps
48976 NameNode
49284 SecondaryNameNode
49109 DataNode
50012 Jps
(3) View the status in a browser
http://192.168.220.129:50070/ (the NameNode web UI)
(4) Try creating a directory
# /opt/hadoop_singleNode/hadoop-2.7.3/bin/hadoop fs -mkdir /test
# /opt/hadoop_singleNode/hadoop-2.7.3/bin/hadoop fs -ls /
(5) Start the MapReduce computing framework (YARN)
# /opt/hadoop_singleNode/hadoop-2.7.3/sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop_singleNode/hadoop-2.7.3/logs/yarn-root-resourcemanager-status.out
root@localhost's password:
localhost: starting nodemanager, logging to /opt/hadoop_singleNode/hadoop-2.7.3/logs/yarn-root-nodemanager-status.out
(The password prompt appears because the key was only copied for 192.168.220.129; running # ssh-copy-id localhost avoids it.)
# jps
48976 NameNode
49284 SecondaryNameNode
49109 DataNode
50534 Jps
50217 ResourceManager
50347 NodeManager
(6) Start everything (YARN, HDFS, MapReduce)
# /opt/hadoop_singleNode/hadoop-2.7.3/sbin/start-all.sh
(In Hadoop 2.x this script is deprecated; it simply runs start-dfs.sh and start-yarn.sh.)
# jps
48976 NameNode
49284 SecondaryNameNode
49109 DataNode
51127 Jps
50217 ResourceManager
50347 NodeManager
(7) Stop everything
# /opt/hadoop_singleNode/hadoop-2.7.3/sbin/stop-all.sh