Setting up Spark on Ubuntu 18.04
Reference post:
https://blog.****.net/weixin_42001089/article/details/82346367
Hadoop
Before installing and configuring Spark, install and configure Hadoop first. For the installation steps, see my other post, "Setting up Hadoop on Ubuntu 18.04".
Install Scala
Download scala-2.11.8.tgz
After downloading, extract it:
$ tar -zxvf scala-2.11.8.tgz
Move the extracted directory (not the tarball) to /usr/local/scala:
$ sudo mkdir /usr/local/scala
$ sudo mv -v scala-2.11.8 /usr/local/scala
Configure the Scala environment variables:
$ sudo gedit /etc/profile
Add (the path must match the version you extracted, 2.11.8 here):
export SCALA_HOME=/usr/local/scala/scala-2.11.8
export PATH=$PATH:$SCALA_HOME/bin
Reload the environment variables:
$ source /etc/profile
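`source` only applies the new exports to the current shell session. A minimal sketch of what it does, using a temporary file instead of /etc/profile so nothing on the system is touched:

```shell
# Write an export line (the same one added above) to a temp file,
# source it, and confirm the variable is now visible in this shell.
tmp=$(mktemp)
echo 'export SCALA_HOME=/usr/local/scala/scala-2.11.8' > "$tmp"
. "$tmp"
echo "$SCALA_HOME"
rm -f "$tmp"
```

New terminals read /etc/profile on login, so they pick up the variables without an explicit `source`.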
Test:
$ scala -version
If the Scala version information is printed, Scala was installed successfully!
Install Spark
Download spark-2.4.0-bin-hadoop2.7.tgz from the Spark download page
After downloading, extract it and move the extracted directory to /usr/local/spark:
$ tar -zxvf spark-2.4.0-bin-hadoop2.7.tgz
$ sudo mkdir /usr/local/spark
$ sudo mv spark-2.4.0-bin-hadoop2.7 /usr/local/spark
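The extract-then-move pattern above can also be done in one step with tar's -C flag, which extracts straight into a target directory. A minimal self-contained demo of the same pattern (using a tiny throwaway tarball rather than the Spark one):

```shell
# Build a small demo tarball, extract it directly into a target
# directory with -C, and read a file back out to confirm it worked.
work=$(mktemp -d)
mkdir -p "$work/demo-1.0"
echo hello > "$work/demo-1.0/README"
tar -C "$work" -czf "$work/demo.tgz" demo-1.0
mkdir -p "$work/target"
tar -C "$work/target" -xzf "$work/demo.tgz"
result=$(cat "$work/target/demo-1.0/README")
echo "$result"
rm -rf "$work"
```

For Spark this would be `sudo tar -zxf spark-2.4.0-bin-hadoop2.7.tgz -C /usr/local/spark`, assuming the tarball is in the current directory.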
Configure the Spark environment variables:
$ sudo vi /etc/profile
Add:
export SPARK_HOME=/usr/local/spark/spark-2.4.0-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
Reload the environment variables:
$ source /etc/profile
Configure spark-env.sh
Go to /usr/local/spark/spark-2.4.0-bin-hadoop2.7/conf:
$ sudo cp -v spark-env.sh.template spark-env.sh
$ sudo vi spark-env.sh
Add the following:
export JAVA_HOME=/usr/local/java/jdk1.8.0_191
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.9.2
export HADOOP_CONF_DIR=/usr/local/hadoop/hadoop-2.9.2/etc/hadoop
export SCALA_HOME=/usr/local/scala/scala-2.11.8
export SPARK_HOME=/usr/local/spark/spark-2.4.0-bin-hadoop2.7
export SPARK_MASTER_IP=127.0.0.1
export SPARK_MASTER_PORT=7077
export SPARK_MASTER_WEBUI_PORT=8099
export SPARK_WORKER_CORES=3
export SPARK_WORKER_INSTANCES=1
export SPARK_WORKER_MEMORY=5G
export SPARK_WORKER_WEBUI_PORT=8081
export SPARK_EXECUTOR_CORES=1
export SPARK_EXECUTOR_MEMORY=1G
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:$HADOOP_HOME/lib/native
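A typo in any of these home directories will make the start scripts fail with confusing errors, so it is worth sanity-checking them first. A hedged sketch (the paths are the ones used in this guide; adjust them to match your own versions):

```shell
# Report whether each home directory referenced in spark-env.sh exists.
report=$(for d in /usr/local/java/jdk1.8.0_191 \
                  /usr/local/hadoop/hadoop-2.9.2 \
                  /usr/local/scala/scala-2.11.8 \
                  /usr/local/spark/spark-2.4.0-bin-hadoop2.7; do
  if [ -d "$d" ]; then echo "ok: $d"; else echo "missing: $d"; fi
done)
echo "$report"
```

Any line printed as "missing" means the corresponding export needs to be corrected before starting Spark.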
Configure the slaves file
$ cp slaves.template slaves
$ vi slaves
The default entry is localhost, which is fine for a single-machine setup.
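For a multi-node cluster, the slaves file instead lists one worker hostname per line. A hypothetical example (worker1 and worker2 are made-up hostnames; use your own machines' names):

```
worker1
worker2
```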
Start Spark
Run the start scripts from the sbin directory under SPARK_HOME:
$ sbin/start-master.sh
$ sbin/start-slaves.sh
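Once the start scripts have run, the master should be listening on port 7077 and its Web UI on 8099 (the ports set in spark-env.sh above). A hedged way to check, using bash's /dev/tcp redirection; "closed" simply means nothing is listening on that port on this machine yet:

```shell
# Probe a local TCP port; prints "open" if something is listening.
probe() {
  if (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null; then
    echo "port $1: open"
  else
    echo "port $1: closed"
  fi
}
report=$(probe 7077; probe 8099)
echo "$report"
```

`jps` (from the JDK) is an alternative check: it should list Master and Worker processes after a successful start.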
Then launch spark-shell from the bin directory:
$ bin/spark-shell
This drops you into the Scala REPL, where you can start writing Spark code.
Spark Web UI (available while spark-shell is running): http://127.0.0.1:4040
Spark was installed successfully!