Spark on YARN Deployment Notes

Environment

CentOS 7.4, 2 cores / 4 GB RAM / 150 GB disk, 3 nodes:
master 10.0.43.241
slave1 10.0.43.242
slave2 10.0.43.243

VM Installation (one node)

Install one VM as the master first, with a few basic tools:
yum install -y net-tools
yum install -y vim
yum install -y wget
yum install -y openssh-clients

[root@master ~]# vi /etc/hosts
[root@master ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.0.43.241 master
10.0.43.242 slave1
10.0.43.243 slave2
[root@master ~]#
[root@master ~]# systemctl stop firewalld
[root@master ~]# systemctl disable firewalld
[root@master ~]# setenforce 0
[root@master ~]# sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
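
A quick way to verify that the firewall is stopped and SELinux is relaxed:
systemctl is-active firewalld    # expect: inactive
getenforce                       # expect: Permissive now, Disabled after reboot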

Prepare a script /root/sshUtil.sh for passwordless SSH setup (it will be run later, after cloning), with the following content:
#!/bin/bash
ssh-keygen -q -t rsa -N "" -f /root/.ssh/id_rsa
ssh-copy-id -i localhost
ssh-copy-id -i master
ssh-copy-id -i slave1
ssh-copy-id -i slave2

[root@master ~]# tar -zxvf jdk-8u221-linux-x64.tar.gz -C /opt
[root@master ~]# vi /etc/profile.d/custom.sh
[root@master ~]# cat /etc/profile.d/custom.sh
#!/bin/bash
#java path
export JAVA_HOME=/opt/jdk1.8.0_221
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib
[root@master ~]# source /etc/profile.d/custom.sh
[root@master ~]# java -version
java version "1.8.0_221"
Java(TM) SE Runtime Environment (build 1.8.0_221-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.221-b01, mixed mode)
[root@master ~]#

Hadoop Configuration

1. Prepare and unpack the following packages
(screenshot of the downloaded archives; per these notes: jdk-8u221-linux-x64, hadoop-2.10.0, spark-3.0.0-bin-hadoop2.7, scala-2.11.0)
2.hadoop-env.sh
[root@master hadoop]# pwd
/opt/hadoop-2.10.0/etc/hadoop
[root@master hadoop]# sed -i 's#export JAVA_HOME=${JAVA_HOME}#export JAVA_HOME=/opt/jdk1.8.0_221#' hadoop-env.sh

3.core-site.xml
[root@master hadoop-2.10.0]# vi etc/hadoop/core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/var/data/hadoop</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>65536</value>
    </property>
</configuration>

4.hdfs-site.xml
[root@master hadoop]# vi hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>slave2:50090</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.https-address</name>
        <value>slave2:50091</value>
    </property>
</configuration>

5.slaves
[root@master hadoop]# cat slaves
master
slave1
slave2

6.mapred-site.xml
[root@master hadoop-2.10.0]# vi etc/hadoop/mapred-site.xml
[root@master hadoop-2.10.0]# cat etc/hadoop/mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

7.yarn-site.xml

<?xml version="1.0" encoding="utf-8"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> yarn.resourcemanager.hostname master yarn.nodemanager.aux-services mapreduce_shuffle

8. Configure environment variables
Edit /etc/profile.d/custom.sh and append the following:
#hadoop path
export HADOOP_HOME=/opt/hadoop-2.10.0
export PATH=${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}
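
Reload the profile and confirm Hadoop is on the PATH:
source /etc/profile.d/custom.sh
hadoop version    # should report Hadoop 2.10.0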

Clone the VMs

Clone two more VMs from this one and update the IP address and hostname on each (IP change sketched below):
hostnamectl set-hostname slave1
hostnamectl set-hostname slave2
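
The IP can be changed by editing the NIC config file. A sketch for slave1, assuming the interface file is ifcfg-ens33 (the actual name may differ; check with ip addr):
sed -i 's/10.0.43.241/10.0.43.242/' /etc/sysconfig/network-scripts/ifcfg-ens33
systemctl restart network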

Passwordless SSH

Run the script on all three nodes, following the prompts:
[root@master ~]# sh sshUtil.sh
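
A quick check, run on each node, that passwordless login now works to every host:
for h in master slave1 slave2; do ssh $h hostname; done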

Start the Hadoop Cluster

1. Clear old data
[root@master ~]# rm -rf /tmp/*

2. Format the NameNode
[root@master ~]# hdfs namenode -format

3. Start HDFS
[root@master ~]# start-dfs.sh

[root@master ~]# jps
11024 DataNode
11319 Jps
10890 NameNode
11195 SecondaryNameNode
[root@master ~]#

[root@slave1 ~]# jps
1282 Jps
1203 DataNode
[root@slave1 ~]#

[root@slave2 ~]# jps
1027 Jps
1948 DataNode
[root@slave2 ~]#
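
HDFS can be smoke-tested at this point (the file and path below are arbitrary examples):
hdfs dfsadmin -report | grep 'Live datanodes'   # expect: Live datanodes (3)
hdfs dfs -put /etc/hosts /
hdfs dfs -ls /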

4. Start YARN
[root@slave1 ~]# start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop-2.10.0/logs/yarn-root-resourcemanager-slave1.out
slave1: starting nodemanager, logging to /opt/hadoop-2.10.0/logs/yarn-root-nodemanager-slave1.out
master: starting nodemanager, logging to /opt/hadoop-2.10.0/logs/yarn-root-nodemanager-master.out
slave2: starting nodemanager, logging to /opt/hadoop-2.10.0/logs/yarn-root-nodemanager-slave2.out
[root@slave1 ~]#

[root@slave1 ~]# jps
8948 DataNode
9079 ResourceManager
9482 Jps
9183 NodeManager
[root@slave1 ~]#

[root@slave2 ~]# jps
7203 DataNode
7433 Jps
7325 NodeManager
[root@slave2 ~]#

[root@master ~]# jps
21024 DataNode
21481 Jps
20890 NameNode
21195 SecondaryNameNode
21371 NodeManager
[root@master ~]#
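
YARN can also be checked from the command line (assuming the variables from custom.sh are loaded):
yarn node -list    # expect 3 nodes in RUNNING state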

5. Web UIs
The NameNode runs on master, the ResourceManager on slave1, and every node runs a NodeManager.

NameNode UI: http://10.0.43.241:50070/
ResourceManager UI: http://10.0.43.242:8088/
NodeManager UI: http://10.0.43.241:8042
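
Without a browser, reachability can be probed with curl:
curl -s -o /dev/null -w '%{http_code}\n' http://10.0.43.241:50070/   # expect 200
curl -s -o /dev/null -w '%{http_code}\n' http://10.0.43.242:8088/    # expect 200 or a 302 redirect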

Spark Configuration

1. Rename the directory
[root@master ~]# mv /opt/spark-3.0.0-bin-hadoop2.7/ /opt/spark-3.0.0

2. Configure environment variables
Edit /etc/profile.d/custom.sh and append the following:
#spark path
export SPARK_HOME=/opt/spark-3.0.0
export PATH=${SPARK_HOME}/bin:${SPARK_HOME}/sbin:$PATH
export SCALA_HOME=/opt/scala-2.11.0
export PATH=$PATH:$SCALA_HOME/bin

[root@master ~]# source /etc/profile.d/custom.sh

3.spark-env.sh
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native
export JAVA_HOME=/opt/jdk1.8.0_221               #Java environment
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop   #Hadoop configuration directory
export SPARK_CLASSPATH=/opt/spark-3.0.0/libext   #put the MySQL driver jar here
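
libext is not part of the stock Spark distribution, so create it and drop the driver jar in (the jar file name below is illustrative):
mkdir -p /opt/spark-3.0.0/libext
cp mysql-connector-java-5.1.48.jar /opt/spark-3.0.0/libext/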

4.slaves
slave1
slave2

5. Distribute to the other nodes
[root@master conf]# scp -r /opt/spark-3.0.0 root@slave1:/opt/
[root@master conf]# scp -r /opt/spark-3.0.0 root@slave2:/opt/
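
Note that the Spark and Scala entries in custom.sh were added after the clones were made, so if the slaves need spark on the PATH, copy the profile out as well (assuming the same path on every node):
scp /etc/profile.d/custom.sh root@slave1:/etc/profile.d/
scp /etc/profile.d/custom.sh root@slave2:/etc/profile.d/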

Start Spark

[root@master sbin]# pwd
/opt/spark-3.0.0/sbin
[root@master sbin]# ./start-all.sh

[root@master sbin]# jps
4611 Master
2630 NameNode
5321 Jps
2767 DataNode

[root@slave1 opt]# jps
2820 ResourceManager
4005 Worker
2311 DataNode
4679 Jps
[root@slave1 opt]#

[root@slave2 sbin]# jps
4563 Jps
2149 DataNode
2264 SecondaryNameNode
3737 Worker
[root@slave2 sbin]#
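
Finally, the whole stack can be verified by submitting the bundled SparkPi example to YARN (jar name as shipped in spark-3.0.0-bin-hadoop2.7):
spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode cluster \
  $SPARK_HOME/examples/jars/spark-examples_2.12-3.0.0.jar 100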
