Scala - Spark: returning vertex attributes from a specific node
Question:
I have a graph and I want to compute its maximum degree. In particular, I want to know all the attributes of the vertex that has the maximum degree. Here is the code snippet:
def max(a: (VertexId, Int), b: (VertexId, Int)): (VertexId, Int) = {
if (a._2 > b._2) a else b
}
val maxDegrees : (VertexId, Int) = graphX.degrees.reduce(max)
max: (a: (org.apache.spark.graphx.VertexId, Int), b: (org.apache.spark.graphx.VertexId, Int))(org.apache.spark.graphx.VertexId, Int)
maxDegrees: (org.apache.spark.graphx.VertexId, Int) = (2063726182,56387)
val startVertexRDD = graphX.vertices.filter{case (hash_id, (id, state)) => hash_id == maxDegrees._1}
startVertexRDD.collect()
But it throws this exception:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 145.0 failed 1 times, most recent failure: Lost task 0.0 in stage 145.0 (TID 5380, localhost, executor driver): scala.MatchError: (1009147972,null) (of class scala.Tuple2)
How can I solve this problem?
Answer:
I think this is the problem. Here:
val startVertexRDD = graphX.vertices.filter{case (hash_id, (id, state)) => hash_id == maxDegrees._1}
it tries to pattern-match tuples like this:
(2063726182,56387)
against a pattern expecting this shape:
(hash_id, (id, state))
which raises scala.MatchError, because it compares a Tuple2 of (VertexId, Int) with a Tuple2 of (VertexId, Tuple2(id, state)).
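The mismatch described above can be reproduced without Spark. The snippet below is a minimal sketch; the hasMaxId helper and the sample tuples are hypothetical stand-ins for the RDD contents:

```scala
// Plain-Scala sketch (no Spark needed) of why the filter raises scala.MatchError:
// a pattern that destructures the attribute into (id, state) cannot match a
// vertex whose attribute is null, as in (1009147972,null) from the stack trace.
def hasMaxId(v: (Long, Any), maxId: Long): Boolean = v match {
  case (hashId, (id, state)) => hashId == maxId // same shape as the question's filter
}

val ok = hasMaxId((2063726182L, ("someId", "someState")), 2063726182L)
println(ok) // true: the nested tuple pattern matches

val failed =
  try { hasMaxId((1009147972L, null), 2063726182L); false }
  catch { case _: MatchError => true }
println(failed) // true: null is not a Tuple2, so no case matches
```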
Be careful with this:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 145.0 failed 1 times, most recent failure: Lost task 0.0 in stage 145.0 (TID 5380, localhost, executor driver): scala.MatchError: (1009147972,null) (of class scala.Tuple2)
Specifically here:
scala.MatchError: (1009147972,null)
No degree was computed for vertex 1009147972, so that can cause problems during the comparison as well.
Hope this helps.
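One possible fix is to match only on the vertex id and ignore the attribute's shape with a wildcard, so a null (or differently shaped) attribute can no longer break the pattern. A plain-Scala sketch, where the sample tuples are hypothetical stand-ins for graphX.vertices:

```scala
// Sketch of the fix (no Spark): filter on the vertex id only, using a
// wildcard for the attribute so null attributes do not raise MatchError.
val maxId = 2063726182L
val vertices: Seq[(Long, Any)] = Seq(
  (2063726182L, ("someId", "someState")), // a well-formed vertex attribute
  (1009147972L, null)                     // the problematic null attribute
)

// The wildcard _ matches anything, including null.
val startVertices = vertices.filter { case (hashId, _) => hashId == maxId }
println(startVertices) // keeps only the vertex whose id equals maxId
```

In the question's code this would become graphX.vertices.filter { case (hash_id, _) => hash_id == maxDegrees._1 }, which no longer destructures the attribute.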
I checked whether there are disconnected nodes with this snippet:
val vertexDegree: VertexRDD[Int] = graphX.degrees
val vertexNoDegree = vertexDegree.filter { case (id, degree) => degree == null }
vertexNoDegree.isEmpty()
res6: Boolean = true
There are no isolated nodes... I don't know what to do. – alukard990