Job Submission Process Analysis (Source Code)
sc.textFile("README.md").flatMap(line => line.split(" ")).map(word => (word,1)).reduceByKey(_ + _).collect
sc.textFile("README.md").flatMap(line => line.split(" ")).map(word => (word,1)).reduceByKey((a,b) => a + b).collect
Summary:
First stage:
HadoopRDD -> MapPartitionsRDD -> MapPartitionsRDD -> MapPartitionsRDD -> MapPartitionsRDD
Second stage:
ShuffledRDD -> MapPartitionsRDD
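The stage breakdown above can be checked against a running Spark installation by printing the RDD lineage. The following is a minimal sketch, not taken from the original post: the object name LineageDemo, the local[2] master, and the app name are illustrative assumptions, and it assumes README.md exists in the working directory (computing the lineage needs the input's partitions). The exact RDD class names printed depend on the Spark version, so no expected output is shown.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical demo object; the name and the settings below are illustrative only.
object LineageDemo {
  def main(args: Array[String]): Unit = {
    // Run locally with two threads (an assumption for this sketch).
    val conf = new SparkConf().setAppName("LineageDemo").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Same word-count pipeline as in the post, without the final collect.
      val counts = sc.textFile("README.md")
        .flatMap(line => line.split(" "))
        .map(word => (word, 1))
        .reduceByKey(_ + _)

      // toDebugString prints the lineage of this RDD; a change in
      // indentation marks a shuffle boundary, i.e. a new stage.
      println(counts.toDebugString)

      // Print the dependency types of the final RDD. A ShuffleDependency
      // anywhere in the lineage is where the scheduler splits stages.
      counts.dependencies.foreach(dep => println(dep.getClass.getSimpleName))
    } finally {
      sc.stop()
    }
  }
}
```

In spark-shell, where sc is already defined, building counts and calling counts.toDebugString is enough.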
This article is reposted from the cnblogs blog 大数据躺过的坑. Original link: http://www.cnblogs.com/zlslch/p/5906198.html. To repost it, please contact the original author.