5. Installing Spark 2.4.0 on Ubuntu
1. Download Spark
Download page: https://spark.apache.org/downloads.html
wget http://archive.apache.org/dist/spark/spark-2.4.0/spark-2.4.0-bin-hadoop2.6.tgz
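Optionally verify the download before extracting. Apache publishes a .sha512 checksum file next to each release archive; print the local digest and compare it by eye with the published one (the published file may use gpg --print-md formatting, which sha512sum -c cannot parse directly):
# print the SHA-512 digest of the downloaded archive
sha512sum spark-2.4.0-bin-hadoop2.6.tgz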
2. Extract the archive
tar zxf spark-2.4.0-bin-hadoop2.6.tgz
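A quick sanity check, assuming the archive unpacked into the current directory:
# the extracted directory should contain bin/, conf/, jars/, etc.
ls spark-2.4.0-bin-hadoop2.6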
3. Move it to /usr/local/spark
sudo mv spark-2.4.0-bin-hadoop2.6 /usr/local/spark/
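This renames the extracted directory to /usr/local/spark, assuming that path did not already exist; if it did exist, mv would nest the directory inside it and the paths used below would be wrong. Verify:
# bin/, conf/, etc. should sit directly under /usr/local/spark
ls /usr/local/spark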
4. Set environment variables (editing your own ~/.bashrc does not need sudo)
gedit ~/.bashrc
export SPARK_HOME=/usr/local/spark
export PATH=$PATH:$SPARK_HOME/bin
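If you prefer a non-interactive edit, the same two lines can be appended from the shell; the single quotes keep the variables unexpanded until ~/.bashrc is sourced:
echo 'export SPARK_HOME=/usr/local/spark' >> ~/.bashrc
echo 'export PATH=$PATH:$SPARK_HOME/bin' >> ~/.bashrc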
5. Apply the environment variables to the current shell
source ~/.bashrc
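Confirm the variables are now visible:
echo $SPARK_HOME     # expect /usr/local/spark
which pyspark        # expect /usr/local/spark/bin/pyspark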
6. Launch pyspark
pyspark
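You should see the Spark banner followed by a >>> Python prompt; quit with exit(). For a non-interactive smoke test, Spark ships a run-example launcher in its bin directory:
# run the bundled SparkPi example job; the argument is the number of partitions
$SPARK_HOME/bin/run-example SparkPi 10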
7. Lower Spark's default log level
(1) Change to the Spark configuration directory
cd /usr/local/spark/conf
(2) Copy the log4j template to log4j.properties
cp log4j.properties.template log4j.properties
(3) Edit log4j.properties and set the root logger to WARN
sudo gedit log4j.properties
log4j.rootCategory=WARN, console
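The same change can be scripted instead of made in an editor; this sed assumes the template's default line is "log4j.rootCategory=INFO, console":
# switch the root logger from INFO to WARN in place
sudo sed -i 's/^log4j.rootCategory=INFO, console/log4j.rootCategory=WARN, console/' log4j.properties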