Scala - Spark: returning vertex attributes from a specific node
Question:
I have a graph and I want to compute its maximum degree. In particular, I want to know all the attributes of the vertex that has the maximum degree. Here is the code snippet:
def max(a: (VertexId, Int), b: (VertexId, Int)): (VertexId, Int) = {
if (a._2 > b._2) a else b
}
val maxDegrees : (VertexId, Int) = graphX.degrees.reduce(max)
max: (a: (org.apache.spark.graphx.VertexId, Int), b: (org.apache.spark.graphx.VertexId, Int))(org.apache.spark.graphx.VertexId, Int)
maxDegrees: (org.apache.spark.graphx.VertexId, Int) = (2063726182,56387)
val startVertexRDD = graphX.vertices.filter{case (hash_id, (id, state)) => hash_id == maxDegrees._1}
startVertexRDD.collect()
But it throws this exception:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 145.0 failed 1 times, most recent failure: Lost task 0.0 in stage 145.0 (TID 5380, localhost, executor driver): scala.MatchError: (1009147972,null) (of class scala.Tuple2)
How can I solve this problem?
Answer:
I think this is the problem. Here:
val startVertexRDD = graphX.vertices.filter{case (hash_id, (id, state)) => hash_id == maxDegrees._1}
it tries to pattern-match tuples like this:
(2063726182,56387)
against a pattern expecting this shape:
(hash_id, (id, state))
which raises scala.MatchError, because it compares a Tuple2 of (VertexId, Int) with a Tuple2 of (VertexId, Tuple2(id, state)).
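The mismatch described above can be reproduced without Spark. The snippet below is a minimal sketch; the hasMaxId helper and the sample tuples are hypothetical stand-ins for the RDD contents:

```scala
// Plain-Scala sketch (no Spark needed) of why the filter raises scala.MatchError:
// a pattern that destructures the attribute into (id, state) cannot match a
// vertex whose attribute is null, as in (1009147972,null) from the stack trace.
def hasMaxId(v: (Long, Any), maxId: Long): Boolean = v match {
  case (hashId, (id, state)) => hashId == maxId // same shape as the question's filter
}

val ok = hasMaxId((2063726182L, ("someId", "someState")), 2063726182L)
println(ok) // true: the nested tuple pattern matches

val failed =
  try { hasMaxId((1009147972L, null), 2063726182L); false }
  catch { case _: MatchError => true }
println(failed) // true: null is not a Tuple2, so no case matches
```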
Be careful with this:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 145.0 failed 1 times, most recent failure: Lost task 0.0 in stage 145.0 (TID 5380, localhost, executor driver): scala.MatchError: (1009147972,null) (of class scala.Tuple2)
Specifically here:
scala.MatchError: (1009147972,null)
No degree was computed for vertex 1009147972, so that can cause problems during the comparison as well.
Hope this helps.
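One possible fix is to match only on the vertex id and ignore the attribute's shape with a wildcard, so a null (or differently shaped) attribute can no longer break the pattern. A plain-Scala sketch, where the sample tuples are hypothetical stand-ins for graphX.vertices:

```scala
// Sketch of the fix (no Spark): filter on the vertex id only, using a
// wildcard for the attribute so null attributes do not raise MatchError.
val maxId = 2063726182L
val vertices: Seq[(Long, Any)] = Seq(
  (2063726182L, ("someId", "someState")), // a well-formed vertex attribute
  (1009147972L, null)                     // the problematic null attribute
)

// The wildcard _ matches anything, including null.
val startVertices = vertices.filter { case (hashId, _) => hashId == maxId }
println(startVertices) // keeps only the vertex whose id equals maxId
```

In the question's code this would become graphX.vertices.filter { case (hash_id, _) => hash_id == maxDegrees._1 }, which no longer destructures the attribute.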
I checked whether there are disconnected nodes with this snippet:
val vertexDegree: VertexRDD[Int] = graphX.degrees
val vertexNoDegree = vertexDegree.filter { case (id, degree) => degree == null }
vertexNoDegree.isEmpty()
res6: Boolean = true
There are no isolated nodes... I don't know what to do. – alukard990