How do I correctly build Spark 2.0 from source so that it includes PySpark?

Problem description:

I just built Spark 2.0 on an Ubuntu host with "sbt assembly". Everything completed fine, but when I tried to submit a PySpark job:

bin/spark-submit --master spark://localhost:7077 examples/src/main/python/pi.py 1000 

I got this error:

Failed to find Spark jars directory (/home/ubuntu/spark/spark-2.0.0/assembly/target/scala-2.10/jars). 
You need to build Spark with the target "package" before running this program. 

What should I do to rebuild Spark 2.0 so that it includes PySpark?
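For reference, the error above asks for the "package" target. A minimal sketch of such a build, assuming the build/mvn wrapper that ships in the Spark source tree (the -DskipTests flag is optional and simply skips the test suite):

    # Sketch: build Spark 2.0 with the "package" target, as the error message suggests.
    # build/mvn is the Maven wrapper bundled with the Spark sources.
    cd spark-2.0.0
    ./build/mvn -DskipTests clean package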

What I have tried:

  1. Install sbt

  2. Build:

    git clone https://github.com/apache/spark.git 
    cd spark 
    git checkout v2.0.0 
    sbt package
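
After the build finishes, one way to check whether spark-submit can find its jars before resubmitting the job (a sketch; the scala-2.XX directory depends on which Scala version the build used, e.g. 2.10 or 2.11):

    # Sketch: confirm the jars directory the launcher looks for actually exists,
    # then retry the PySpark example.
    ls assembly/target/scala-2.11/jars
    bin/spark-submit --master spark://localhost:7077 examples/src/main/python/pi.py 1000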