5. Installing Spark 2.4.0 on Ubuntu

1. Download Spark

   https://spark.apache.org/downloads.html

  wget http://archive.apache.org/dist/spark/spark-2.4.0/spark-2.4.0-bin-hadoop2.6.tgz
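
To catch a corrupted download, you can compare the archive against the checksum Apache publishes next to it (the .sha512 URL below assumes the usual archive.apache.org layout):

  wget http://archive.apache.org/dist/spark/spark-2.4.0/spark-2.4.0-bin-hadoop2.6.tgz.sha512
  sha512sum spark-2.4.0-bin-hadoop2.6.tgz   # the digest must match the contents of the .sha512 file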

2. Extract the archive

  tar zxf spark-2.4.0-bin-hadoop2.6.tgz

3. Move it to /usr/local/spark

  sudo mv spark-2.4.0-bin-hadoop2.6 /usr/local/spark   # no trailing slash: rename the directory itself, matching SPARK_HOME below
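
A quick check that the files landed where SPARK_HOME will point. Because the archive was extracted by your own user, the moved tree normally remains owned by you, so later edits under /usr/local/spark should not need sudo:

  ls /usr/local/spark/bin   # should list pyspark, spark-shell, spark-submit, ...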


4. Set environment variables

  gedit ~/.bashrc   # ~/.bashrc is your own file, so sudo is not needed; append the lines below

  export SPARK_HOME=/usr/local/spark
  export PATH=$PATH:$SPARK_HOME/bin
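
Optionally, you can also pin the interpreter the pyspark shell uses. PYSPARK_PYTHON is Spark's standard variable for this; python3 below assumes a Python 3 install is present:

  export PYSPARK_PYTHON=python3   # optional: run pyspark on Python 3 explicitly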

5. Apply the environment variables

   source ~/.bashrc
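
To confirm the variables took effect, check them from the same shell; spark-submit ships in $SPARK_HOME/bin and prints a version banner:

   echo $SPARK_HOME          # should print /usr/local/spark
   spark-submit --version    # the banner should report Spark 2.4.0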

6. Start pyspark

  pyspark
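
On success the shell shows the Spark banner and a >>> prompt with a ready-made SparkSession bound to the name spark. As an end-to-end sanity check, you can also run an example job bundled with the distribution:

  $SPARK_HOME/bin/run-example SparkPi 10   # should end with a line like "Pi is roughly 3.14..."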


7. Change Spark's default log level

(1) Change to the Spark configuration directory

        cd /usr/local/spark/conf
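
The directory should contain the configuration templates shipped with the release, including the log4j template used in the next step:

        ls   # expect log4j.properties.template, spark-defaults.conf.template, spark-env.sh.template, ...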

(2) Copy the log4j template to log4j.properties

        cp log4j.properties.template log4j.properties

(3) Lower the log level in log4j.properties

        gedit log4j.properties

        # change the root logger line (INFO in the stock template) to:
        log4j.rootCategory=WARN, console
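
The same change can be made non-interactively; the sed command below assumes the template's stock line "log4j.rootCategory=INFO, console" is still present:

        sed -i 's/^log4j.rootCategory=INFO, console$/log4j.rootCategory=WARN, console/' log4j.properties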