Spark Installation and Word Count
I. Standalone Mode Installation
1. Upload and extract the Spark installation package
[user@hadoop101 software]$ tar -zxvf spark-2.4.4-bin-hadoop2.7.tgz -C /usr/local/
[user@hadoop101 software]$ mv /usr/local/spark-2.4.4-bin-hadoop2.7 /usr/local/spark
2. Enter the conf folder under the Spark installation directory
[user@hadoop101 software]$ cd /usr/local/spark/conf/
3. Rename the configuration file templates
[user@hadoop101 conf]$ mv slaves.template slaves
[user@hadoop101 conf]$ mv spark-env.sh.template spark-env.sh
4. Edit the slaves file and add the worker nodes (list only the two worker machines):
[user@hadoop101 conf]$ vim slaves
hadoop102
hadoop103
5. Edit the spark-env.sh file and add the following configuration:
[user@hadoop101 conf]$ vim spark-env.sh
SPARK_MASTER_HOST=hadoop101
SPARK_MASTER_PORT=7077   # service port
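If you also want to cap the resources each worker offers, spark-env.sh accepts further variables. A minimal sketch; the values below are illustrative, not from the original setup:
SPARK_WORKER_CORES=2     # CPU cores a worker may hand to executors
SPARK_WORKER_MEMORY=2g   # total memory a worker may hand to executors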
6. Add the following to the spark-config.sh file in the sbin directory (daemons launched over SSH by start-all.sh do not inherit a login shell's environment, so JAVA_HOME must be set here or the remote workers may fail to start):
export JAVA_HOME=/usr/local/jdk1.8.0_91
7. Distribute the Spark directory to the other nodes
[user@hadoop101 local]$ scp -r spark hadoop102:/usr/local
[user@hadoop101 local]$ scp -r spark hadoop103:/usr/local
8. Start the cluster
[user@hadoop101 spark]$ sbin/start-all.sh
Check the Master web UI at hadoop101:8080
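To confirm the daemons actually came up, jps should list a Master process on hadoop101 and a Worker process on hadoop102 and hadoop103 (the process IDs below are illustrative):
[user@hadoop101 spark]$ jps
3456 Master
[user@hadoop102 ~]$ jps
2345 Worker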
II. Word Count
1. Create the input file
[root@hadoop101 spark]# mkdir input
[root@hadoop101 spark]# cd input
[root@hadoop101 input]# vim hello.txt
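The contents of hello.txt are up to you; as a running example, assume it contains:
hello world
hello spark
hello scala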
[root@hadoop101 input]# scp -r /usr/local/spark/input hadoop102:/usr/local/spark/input
[root@hadoop101 input]# scp -r /usr/local/spark/input hadoop103:/usr/local/spark/input
Note: a file:// prefix indicates the local file system. Make sure the file exists on every node, since each executor reads it from its own local disk.
2. Launch the Spark shell
/usr/local/spark/bin/spark-shell \
--master spark://hadoop101:7077 \
--executor-memory 1g \
--total-executor-cores 2
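The --master URL attaches the shell to the standalone master, while --executor-memory and --total-executor-cores bound what this application consumes. Once the scala> prompt appears, a quick sanity check (not part of the original steps) verifies that executors are reachable:
scala> sc.parallelize(1 to 100).sum
res0: Double = 5050.0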
3. Run tests at the scala> prompt
1) Test against the local file
scala> sc.textFile("input/hello.txt").flatMap(_.split(" ")).map((_,1)).reduceByKey(_+_).collect
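With the sample hello.txt assumed above, the result would look roughly like this (pair order may vary across runs):
res1: Array[(String, Int)] = Array((scala,1), (hello,3), (world,1), (spark,1))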
2) Test against a file on HDFS
scala> sc.textFile("hdfs://hadoop101:9000/hello/hello.txt").flatMap(_.split(" ")).map((_,1)).reduceByKey(_+_).collect
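To persist the counts instead of collecting them to the driver, the same pipeline can end in saveAsTextFile. A sketch; the output path here is an assumption, and it must not already exist on HDFS:
scala> sc.textFile("hdfs://hadoop101:9000/hello/hello.txt").flatMap(_.split(" ")).map((_,1)).reduceByKey(_+_).sortBy(_._2, false).saveAsTextFile("hdfs://hadoop101:9000/hello/output")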