8. Spark Standalone Cluster Setup and Verification

Prerequisite: a Hadoop cluster environment has already been set up.

1. Configure spark-env.sh on the master VM

(1) Create spark-env.sh from the template file

         cp /usr/local/spark/conf/spark-env.sh.template /usr/local/spark/conf/spark-env.sh

(2) Edit spark-env.sh

        sudo vim /usr/local/spark/conf/spark-env.sh

        export SPARK_MASTER_IP=master

        export SPARK_WORKER_CORES=1

        export SPARK_WORKER_MEMORY=128m

        export SPARK_WORKER_INSTANCES=4
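
What these four settings control, briefly (note that Spark 2.0 and later renamed SPARK_MASTER_IP to SPARK_MASTER_HOST; check your version's conf/spark-env.sh.template):

         # SPARK_MASTER_IP         hostname or IP the master binds to
         # SPARK_WORKER_CORES      CPU cores each worker offers to executors
         # SPARK_WORKER_MEMORY     total memory each worker offers to executors
         # SPARK_WORKER_INSTANCES  number of worker daemons started per machine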

2. Copy the Spark installation to data1, data2, and data3 (a scripted version of all three sub-steps follows step (3))

(1) Copy Spark to data1

         ssh data1

         sudo mkdir /usr/local/spark

         sudo chown hduser:hduser /usr/local/spark

         exit

         sudo scp -r /usr/local/spark hduser@data1:/usr/local

(2) Copy Spark to data2

         ssh data2

         sudo mkdir /usr/local/spark

         sudo chown hduser:hduser /usr/local/spark

         exit

         sudo scp -r /usr/local/spark hduser@data2:/usr/local

(3) Copy Spark to data3

         ssh data3

         sudo mkdir /usr/local/spark

         sudo chown hduser:hduser /usr/local/spark

         exit

         sudo scp -r /usr/local/spark hduser@data3:/usr/local
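
Steps (1) to (3) differ only in the hostname, so the whole of step 2 can be scripted. A minimal sketch, assuming hduser can ssh to each data node and run sudo there:

         # same as steps (1)-(3): create the target directory remotely, then copy
         for node in data1 data2 data3; do
             ssh -t $node "sudo mkdir -p /usr/local/spark && sudo chown hduser:hduser /usr/local/spark"
             sudo scp -r /usr/local/spark hduser@$node:/usr/local
         done

The -t flag allocates a terminal so that sudo on the remote side can prompt for a password.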

3. Edit the slaves file on the master VM

(1) Edit the slaves file

           sudo vim /usr/local/spark/conf/slaves

           data1

           data2

           data3
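
If conf/slaves does not exist yet, it can first be created from the bundled template, the same way as spark-env.sh. Note that Spark 3.0 renamed this file to conf/workers (and start-slaves.sh to start-workers.sh); the names below apply to older versions:

           cp /usr/local/spark/conf/slaves.template /usr/local/spark/conf/slaves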

4. Start the Spark Standalone Cluster

        /usr/local/spark/sbin/start-all.sh

     or

        /usr/local/spark/sbin/start-master.sh

        /usr/local/spark/sbin/start-slaves.sh

     Note:

     If startup fails because Java cannot be found, JAVA_HOME needs to be configured in sbin/spark-config.sh (a sketch follows); workers launched over ssh do not always inherit the login shell's environment.
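
A minimal sketch of that fix; the JDK path below is only an example and must match where Java is actually installed on each node:

         # add near the top of /usr/local/spark/sbin/spark-config.sh (on master and every data node)
         export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64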
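
Once startup succeeds, the daemons can be confirmed with jps: the master VM should show one Master process, and each data node should show four Worker processes (one per SPARK_WORKER_INSTANCES):

         jps             # on master: Master
         ssh data1 jps   # on data1: Worker x 4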

5. Run pyspark

     pyspark --master spark://master:7077 --num-executors 1 --total-executor-cores 3 --executor-memory 512m
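
Two quick checks verify the cluster. The master web UI at http://master:8080 (the default port) should list the registered workers, and a small job in the pyspark shell should run on the cluster executors. Two caveats about the command above: on many Spark versions --num-executors is honored only on YARN and is ignored in standalone mode, and --executor-memory must fit within SPARK_WORKER_MEMORY (with the 128m workers configured in step 1, a 512m executor cannot be scheduled, so lower one value or raise the other):

         sc.master                              # should print spark://master:7077
         sc.parallelize(range(1000)).count()    # should return 1000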
