Casting error on a Pig join

Problem description:

I have a script that performs a JOIN. It succeeds when I run it on a small dataset, but when I increase the size of the data I get this error:

14/10/07 19:10:19 ERROR executionengine.Launcher: Backend error message 
org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POProject (Name: Project[tuple][0] - scope-577 Operator Key: scope-577) children: null at []]: java.lang.ClassCastException: java.lang.Long cannot be cast to org.apache.pig.data.Tuple 
     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:339) 
     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNextTuple(POLocalRearrange.java:304) 
     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POUnion.getNextTuple(POUnion.java:167) 
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282) 
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277) 
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64) 
     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) 
     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) 
     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364) 
     at org.apache.hadoop.mapred.Child$4.run(Child.java:255) 
     at java.security.AccessController.doPrivileged(Native Method) 
     at javax.security.auth.Subject.doAs(Subject.java:415) 
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) 
     at org.apache.hadoop.mapred.Child.main(Child.java:249) 
Caused by: java.lang.ClassCastException: java.lang.Long cannot be cast to org.apache.pig.data.Tuple 
     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNextTuple(POProject.java:475) 
     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:334) 
     ... 13 more 

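I wonder whether this problem is caused not by bad input but by the size of the data (a medium-sized dataset does not run on the dev server, but it does run on a larger cluster).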

Can you help me understand the cause of the error?

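My guess is that some row in the large dataset contains a Long value where a tuple is expected. That would cause the ClassCastException (java.lang.Long cannot be cast to org.apache.pig.data.Tuple). It would also help if you posted your Pig script and a few sample rows.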
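
One way to check for such a row before rerunning the job is to scan a sample of the input for records whose field is not written as a tuple. The sketch below is only an illustration under assumptions that are not in the question: it assumes the input is tab-delimited text in which a tuple is written as "(a,b)", and that the suspect field is column 0. The function name find_non_tuple_rows and the sample rows are hypothetical, so adjust the delimiter and column to match your own load statement.

```python
def find_non_tuple_rows(lines, field_index, delimiter="\t"):
    """Return (line_number, value) pairs whose field is not written as a tuple.

    Assumption: tuples appear in the text as "(...)". A missing field is
    reported with value None.
    """
    bad = []
    for line_number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(delimiter)
        if field_index >= len(fields):
            # The row is too short to contain the field at all.
            bad.append((line_number, None))
            continue
        value = fields[field_index].strip()
        if not (value.startswith("(") and value.endswith(")")):
            # For example a bare number such as 42 where (1,2) was expected.
            bad.append((line_number, value))
    return bad


if __name__ == "__main__":
    # Hypothetical sample rows: the second one holds a bare number.
    sample = ["(1,2)\ta\n", "42\tb\n", "(3,4)\tc\n"]
    print(find_non_tuple_rows(sample, 0))  # prints [(2, '42')]
```

If this reports nothing on the raw input, the mismatch may instead come from an earlier step in the script, which is another reason the script itself would be useful to see.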