Hadoop streaming failed with error code 5

Problem description:

RHadoop word count program: hadoop streaming failed with error code 5


 
Sys.setenv(HADOOP_CMD="/usr/local/hadoop/bin/hadoop")
Sys.setenv(HADOOP_STREAMING="/usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.4.1.jar")
Sys.setenv(HADOOP_HOME="/usr/local/hadoop")
library(rmr2)

## map function
map <- function(k, lines) {
    words.list <- strsplit(lines, '\\s')
    words <- unlist(words.list)
    return(keyval(words, 1))
}

## reduce function
reduce <- function(word, counts) {
    keyval(word, sum(counts))
}

wordcount <- function(input, output=NULL) {
    mapreduce(input=input, output=output, input.format="text",
      map=map, reduce=reduce)
}

## Submit job
hdfs.root <- 'input'
#hdfs.data <- file.path(hdfs.root, 'data')
hdfs.out <- file.path(hdfs.root, 'out')
out <- wordcount(hdfs.root, hdfs.out)

## Fetch results from HDFS
results <- from.dfs(out)

## check top 2 frequent words
results.df <- as.data.frame(results, stringsAsFactors=FALSE)
colnames(results.df) <- c('word', 'count')
head(results.df[order(results.df$count, decreasing=TRUE), ], 2)

To check the RHadoop integration, I ran the word count program above with Rscript, but I got the error shown below.

15/01/21 13:48:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
packageJobJar: [/usr/local/hadoop/data/hadoop-unjar5866699842450503195/] [] /tmp/streamjob7335081573862861018.jar tmpDir=null
15/01/21 13:48:53 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8050
15/01/21 13:48:53 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8050
15/01/21 13:48:53 ERROR streaming.StreamJob: Error Launching job : Permission denied: user=pgl-26, access=EXECUTE, inode="/tmp":hduser:supergroup:drwxrwx---
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:265)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:251)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:205)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:168)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5523)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3521)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:779)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:764)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, :
    hadoop streaming failed with error code 5

Please help me with this error. I am new to both R and Hadoop, and I cannot figure out where I went wrong.

I think this is a permissions problem. The log says "ERROR streaming.StreamJob: Error Launching job : Permission denied:". Please check the connection.
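To see what the NameNode is actually complaining about, it may help to pull the error line apart. The snippet below is a minimal POSIX shell sketch: the helper name `others_can_traverse` is made up for illustration, and the error fragment is copied from the log in the question. It assumes pgl-26 is neither the owner (hduser) nor a member of supergroup, in which case only the last three characters of the mode string apply to that user.

```shell
#!/bin/sh
# Error fragment copied from the log in the question.
err='user=pgl-26, access=EXECUTE, inode="/tmp":hduser:supergroup:drwxrwx---'

# The mode string is everything after the last colon: drwxrwx---
mode=${err##*:}

# Hypothetical helper: succeeds if "other" users have the execute
# (traverse) bit. Character 10 of the mode string is the execute slot
# for "other"; "x" means execute, "t" means execute plus sticky bit.
others_can_traverse() {
    other_x=$(printf '%s' "$1" | cut -c10)
    [ "$other_x" = "x" ] || [ "$other_x" = "t" ]
}

if others_can_traverse "$mode"; then
    echo "others may traverse /tmp"
else
    echo "others may NOT traverse /tmp (mode $mode)"
fi
```

With the mode from the log, the check fails: /tmp on HDFS is owned by hduser with group supergroup, and everyone else has no permissions at all, so the job submitted by pgl-26 cannot even look inside /tmp.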

Grant permission on the tmp directory, for example: hadoop fs -chown -R rhadoop /tmp

where rhadoop is the user name.
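The command above can be wrapped in a small dry-run script so that nothing is changed until the output has been reviewed. This is only a sketch under some assumptions: the names `run`, `fix_tmp_permissions` and `DRY_RUN` are invented here, the user pgl-26 is taken from the error log, and the real commands would have to be run by an account allowed to change /tmp on HDFS (hduser in the log). The commented-out `chmod 1777` line is an alternative that is not part of this answer: it keeps the current owner and opens /tmp to all users with the sticky bit set.

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper; $1 is the user who submits the streaming job.
fix_tmp_permissions() {
    user="$1"
    run hadoop fs -ls -d /tmp              # inspect current owner and mode
    run hadoop fs -chown -R "$user" /tmp   # the fix suggested in this answer
    # Alternative (assumption, not from the answer): keep the owner and
    # open /tmp to everyone with the sticky bit set.
    # run hadoop fs -chmod 1777 /tmp
}

fix_tmp_permissions pgl-26
```

Once the printed commands look right, running the script with `DRY_RUN=0` executes them; after that, resubmitting the R job should get past the `/tmp` permission check.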


Could you give a better explanation? – 2017-07-03 09:24:58