Building a Spark Jar with IntelliJ + Maven and Running It on a Spark Cluster

Since the job reads from and writes to Hadoop's HDFS, Hadoop must be started first.

1. Start Hadoop
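Starting HDFS is typically done with the scripts shipped in Hadoop's sbin directory. A sketch, assuming HADOOP_HOME is set and its sbin directory is on the PATH (script names are the standard Hadoop 2.x ones):

```shell
# Start HDFS daemons (NameNode, DataNodes, SecondaryNameNode)
start-dfs.sh
# Verify the daemons are up: jps should list NameNode and DataNode
jps
```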


2. Start the Spark cluster
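For a standalone Spark cluster, the master's sbin scripts start the Master and the Workers listed in conf/slaves. A sketch, assuming SPARK_HOME points at the Spark installation on the master node:

```shell
# Start the Master and all Workers configured in conf/slaves
$SPARK_HOME/sbin/start-all.sh
# jps should now additionally list Master (and Worker on worker nodes)
jps
```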


3. Open IntelliJ

Create a Maven project.

Then configure the project's pom.xml file.

The contents are as follows:

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <!-- Project coordinates are placeholders; replace with your own -->
    <groupId>com.example</groupId>
    <artifactId>spark-demo</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
        <encoding>UTF-8</encoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>2.11.8</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.1.1</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>2.2.0</version>
        </dependency>
    </dependencies>

    <build>
        <pluginManagement>
            <plugins>
                <!-- Plugin for compiling Scala -->
                <plugin>
                    <groupId>net.alchim31.maven</groupId>
                    <artifactId>scala-maven-plugin</artifactId>
                    <version>3.2.2</version>
                </plugin>
                <!-- Plugin for compiling Java -->
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-compiler-plugin</artifactId>
                    <version>3.5.1</version>
                </plugin>
            </plugins>
        </pluginManagement>
        <plugins>
            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <executions>
                    <execution>
                        <id>scala-compile-first</id>
                        <phase>process-resources</phase>
                        <goals>
                            <goal>add-source</goal>
                            <goal>compile</goal>
                        </goals>
                    </execution>
                    <execution>
                        <id>scala-test-compile</id>
                        <phase>process-test-resources</phase>
                        <goals>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <executions>
                    <execution>
                        <phase>compile</phase>
                        <goals>
                            <goal>compile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
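With the pom in place, the application code goes under src/main/scala. The original post does not show its source, so the WordCount below is an illustrative sketch; the object name and the use of args(0)/args(1) for the input and output paths are assumptions:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical WordCount job; paths are passed on the command line
object WordCount {
  def main(args: Array[String]): Unit = {
    // No setMaster here: the master is supplied by spark-submit
    val conf = new SparkConf().setAppName("WordCount")
    val sc = new SparkContext(conf)
    sc.textFile(args(0))            // read input from HDFS
      .flatMap(_.split(" "))        // split lines into words
      .map((_, 1))                  // pair each word with a count of 1
      .reduceByKey(_ + _)           // sum counts per word
      .saveAsTextFile(args(1))      // write results back to HDFS
    sc.stop()
  }
}
```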

4. Build the jar
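The screenshots below use IntelliJ's Artifacts dialog, but with the scala-maven-plugin configured you can also build from the command line; a sketch:

```shell
# Compile Scala + Java sources and package the jar under target/
mvn clean package -DskipTests
```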


In the artifact settings, delete all the dependency jars except the last entry (your module's compile output). This makes the jar far smaller, and the cluster usually already provides the jars you removed. Click Apply, then OK.
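An alternative to trimming jars by hand in the artifact dialog is to mark cluster-provided dependencies with Maven's provided scope, so they are available at compile time but never packaged. A sketch for the spark-core dependency:

```xml
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.1.1</version>
    <!-- supplied by the cluster at runtime, excluded from the jar -->
    <scope>provided</scope>
</dependency>
```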


Copy the jar to a convenient location.


5. Upload the jar and run it
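Uploading and submitting can be sketched as below. The host names, ports, class name, and HDFS paths are all placeholders; adjust them to your cluster (the default standalone master port is 7077):

```shell
# Copy the jar to the master node (placeholder host and path)
scp spark-demo-1.0-SNAPSHOT.jar user@master:/opt/jobs/

# Submit to the standalone cluster; the two trailing arguments are
# the input and output paths read by the application
spark-submit \
  --master spark://master:7077 \
  --class WordCount \
  /opt/jobs/spark-demo-1.0-SNAPSHOT.jar \
  hdfs://master:9000/input/words.txt \
  hdfs://master:9000/output/wordcount
```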


6. View the results
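The output directory on HDFS contains one part file per partition plus a _SUCCESS marker. A sketch for inspecting it, with a placeholder output path:

```shell
# List the job's output files
hdfs dfs -ls /output/wordcount
# Print the contents of the first part file
hdfs dfs -cat /output/wordcount/part-00000
```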
