Spark on YARN Deployment Notes
Environment
CentOS 7.4, 2 cores / 4 GB RAM / 150 GB disk × 3 nodes
master 10.0.43.241
slave1 10.0.43.242
slave2 10.0.43.243
VM installation (one machine)
Install a single master node first
yum install -y net-tools
yum install -y vim
yum install -y wget
yum install -y openssh-clients
[root@master ~]# vi /etc/hosts
[root@master ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.0.43.241 master
10.0.43.242 slave1
10.0.43.243 slave2
[root@master ~]#
[root@master ~]# systemctl stop firewalld
[root@master ~]# systemctl disable firewalld
[root@master ~]# setenforce 0
[root@master ~]# sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
In /root, prepare a script sshUtil.sh for configuring passwordless SSH (to be run later); its contents:
#!/bin/bash
ssh-keygen -q -t rsa -N "" -f /root/.ssh/id_rsa
ssh-copy-id -i localhost
ssh-copy-id -i master
ssh-copy-id -i slave1
ssh-copy-id -i slave2
[root@master ~]# tar -zxvf jdk-8u221-linux-x64.tar.gz -C /opt
[root@master ~]# vi /etc/profile.d/custom.sh
[root@master ~]# cat /etc/profile.d/custom.sh
#!/bin/bash
#java path
export JAVA_HOME=/opt/jdk1.8.0_221
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib
[root@master ~]# source /etc/profile.d/custom.sh
[root@master ~]# java -version
java version "1.8.0_221"
Java(TM) SE Runtime Environment (build 1.8.0_221-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.221-b01, mixed mode)
[root@master ~]#
Hadoop configuration
1. Prepare the packages below and extract them
2.hadoop-env.sh
[root@master hadoop]# pwd
/opt/hadoop-2.10.0/etc/hadoop
[root@master hadoop]# sed -i 's#export JAVA_HOME=${JAVA_HOME}#export JAVA_HOME=/opt/jdk1.8.0_221#' hadoop-env.sh
3.core-site.xml
[root@master hadoop-2.10.0]# vi etc/hadoop/core-site.xml
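The file contents were not captured in this log; a minimal core-site.xml sketch for this layout, assuming the NameNode listens on master at the conventional port 9000 (both values are assumptions, not taken from the original record):

```xml
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- assumption: default filesystem points at the master's NameNode -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>
```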
4.hdfs-site.xml
[root@master hadoop]# vi hdfs-site.xml
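The hdfs-site.xml contents were likewise not recorded; a minimal sketch, assuming only the replication factor is overridden (the value 2 is an illustrative choice for a 3-node cluster, not from the original log):

```xml
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- assumption: 2 replicas per block; the default is 3 -->
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
```

Note that no dfs.namenode.name.dir / dfs.datanode.data.dir is set here, which matches the log's later step of wiping /tmp/* before formatting: Hadoop's default data directories live under /tmp.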
5.slaves
[root@master hadoop]# cat slaves
master
slave1
slave2
6.mapred-site.xml
[root@master hadoop-2.10.0]# vi etc/hadoop/mapred-site.xml
[root@master hadoop-2.10.0]# cat etc/hadoop/mapred-site.xml
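The cat output was lost from the log; for MapReduce jobs to run on YARN the file at minimum sets the framework name, so a likely reconstruction is:

```xml
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- run MapReduce jobs on YARN instead of the local runner -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```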
7.yarn-site.xml
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
8. Configure environment variables
Edit /etc/profile.d/custom.sh and add the following:
#hadoop path
export HADOOP_HOME=/opt/hadoop-2.10.0
export PATH=${HADOOP_HOME}/sbin:${HADOOP_HOME}/bin:$PATH
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}
Clone the VMs
Clone two more VMs and change the IP address and hostname on each accordingly:
hostnamectl set-hostname slave1
hostnamectl set-hostname slave2
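Each clone's static IP also has to change; a sketch assuming a CentOS 7 ifcfg file and the interface name ens33 (the interface name is an assumption, check with `ip addr` first):

```shell
# on slave1 (use 10.0.43.243 on slave2); ens33 is an assumed interface name
sed -i 's/^IPADDR=.*/IPADDR=10.0.43.242/' /etc/sysconfig/network-scripts/ifcfg-ens33
systemctl restart network   # apply the new address
```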
Passwordless SSH
Run the script on all three nodes and follow the prompts:
[root@master ~]# sh sshUtil.sh
Start the Hadoop cluster
1. Clear old data
[root@master ~]# rm -rf /tmp/*
2. Format the NameNode
[root@master ~]# hdfs namenode -format
3. Start HDFS
[root@master ~]# start-dfs.sh
[root@master ~]# jps
11024 DataNode
11319 Jps
10890 NameNode
11195 SecondaryNameNode
[root@master ~]#
[root@slave1 ~]# jps
1282 Jps
1203 DataNode
[root@slave1 ~]#
[root@slave2 ~]# jps
1027 Jps
1948 DataNode
[root@slave2 ~]#
4. Start YARN (on slave1, where the ResourceManager is configured to run)
[root@slave1 ~]# start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop-2.10.0/logs/yarn-root-resourcemanager-slave1.out
slave1: starting nodemanager, logging to /opt/hadoop-2.10.0/logs/yarn-root-nodemanager-slave1.out
master: starting nodemanager, logging to /opt/hadoop-2.10.0/logs/yarn-root-nodemanager-master.out
slave2: starting nodemanager, logging to /opt/hadoop-2.10.0/logs/yarn-root-nodemanager-slave2.out
[root@slave1 ~]#
[root@slave1 ~]# jps
8948 DataNode
9079 ResourceManager
9482 Jps
9183 NodeManager
[root@slave1 ~]#
[root@slave2 ~]# jps
7203 DataNode
7433 Jps
7325 NodeManager
[root@slave2 ~]#
[root@master ~]# jps
21024 DataNode
21481 Jps
20890 NameNode
21195 SecondaryNameNode
21371 NodeManager
[root@master ~]#
5. Web UIs
The NameNode runs on master, the ResourceManager on slave1, and every node runs a NodeManager.
NameNode UI: http://10.0.43.241:50070/
ResourceManager UI: http://10.0.43.242:8088/
NodeManager UI: http://10.0.43.241:8042
Spark configuration
1. Rename the unpacked directory
[root@master ~]# mv /opt/spark-3.0.0-bin-hadoop2.7/ /opt/spark-3.0.0
2. Configure environment variables
Edit /etc/profile.d/custom.sh and add the following:
#spark path
export SPARK_HOME=/opt/spark-3.0.0
export PATH=${SPARK_HOME}/sbin:$PATH:$SCALA_HOME/bin
[root@master ~]# source /etc/profile.d/custom.sh
3.spark-env.sh
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native      # Hadoop native libraries
export JAVA_HOME=/opt/jdk1.8.0_221                  # Java home
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop      # Hadoop configuration directory
export SPARK_CLASSPATH=/opt/spark-3.0.0/libext      # put the MySQL driver jar here
4. slaves
slave1
slave2
5. Distribute to the other nodes
[root@master conf]# scp -r /opt/spark-3.0.0 root@slave1:/opt/
[root@master conf]# scp -r /opt/spark-3.0.0 root@slave2:/opt/
Start Spark
[root@master sbin]# pwd
/opt/spark-3.0.0/sbin
[root@master sbin]# ./start-all.sh
[root@master sbin]# jps
4611 Master
2630 NameNode
5321 Jps
2767 DataNode
[root@slave1 opt]# jps
2820 ResourceManager
4005 Worker
2311 DataNode
4679 Jps
[root@slave1 opt]#
[root@slave2 sbin]# jps
4563 Jps
2149 DataNode
2264 SecondaryNameNode
3737 Worker
[root@slave2 sbin]#
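With all daemons up, the cluster can be smoke-tested by submitting the SparkPi example that ships with Spark to YARN; the memory and executor sizes below are illustrative assumptions for these 4 GB nodes, not values from the original log:

```shell
# submit the bundled SparkPi example in yarn-client mode; sizes are assumptions
/opt/spark-3.0.0/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode client \
  --driver-memory 512m \
  --executor-memory 512m \
  --num-executors 2 \
  /opt/spark-3.0.0/examples/jars/spark-examples_2.12-3.0.0.jar 100
```

If the submission succeeds, the driver log should end with a line of the form "Pi is roughly 3.14...", and the finished application should appear in the ResourceManager UI at http://10.0.43.242:8088/.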