spark-submit: Client cannot authenticate via: [TOKEN, KERBEROS]


Problem description:

I set up a Hadoop cluster with Kerberos, but when I run spark-submit it throws an exception: Client cannot authenticate via: [TOKEN, KERBEROS].

17/10/19 08:46:53 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 192.168.92.4, executor 1): java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "slave2/192.168.92.4"; destination host is: "master.hadoop":9000; 
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:776) 
    at org.apache.hadoop.ipc.Client.call(Client.java:1479) 
    at org.apache.hadoop.ipc.Client.call(Client.java:1412) 
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) 
    at com.sun.proxy.$Proxy15.getBlockLocations(Unknown Source) 
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:255) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:498) 
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191) 
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) 
    at com.sun.proxy.$Proxy16.getBlockLocations(Unknown Source) 
    at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1226) 
    at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1213) 
    at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1201) 
    at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:306) 
    at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:272) 
    at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:264) 
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1526) 
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:304) 
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:299) 
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) 
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:312) 
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769) 
    at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:109) 
    at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67) 
    at org.apache.spark.rdd.HadoopRDD$$anon$1.liftedTree1$1(HadoopRDD.scala:246) 
    at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:245) 
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:203) 
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:94) 
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) 
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) 
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) 
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) 
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) 
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) 
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) 
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) 
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) 
    at org.apache.spark.scheduler.Task.run(Task.scala:108) 
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
    at java.lang.Thread.run(Thread.java:748) 

Looking at the Kerberos logs, only the master sends an authentication request to the KDC while the Spark application is running; the slaves never send authentication requests to the KDC.


How did you set up Kerberos? Via kinit, or via a keytab file passed to spark-submit? –


Which Spark execution mode are you using: local, standalone, yarn-client, or yarn-cluster? Which Spark version, and from which distribution? Which Hadoop version, and from which distribution? –


@ThiagoBaldim @SamsonScharfrichter Thank you very much. The exception occurred when I used 'client' as the deploy-mode argument. I solved the problem by changing the spark-submit arguments as follows: '--master yarn --deploy-mode cluster --keytab /etc/krb5.keytab --principal root/bigdataserver03@EXAMPLE.COM' –
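The fix described in the comment above can be sketched as a full invocation. This is only a sketch assuming a standard YARN setup; the main class and application jar are hypothetical placeholders not given in the thread, while the keytab path and principal come from the comment:

```shell
# Sketch of the working cluster-mode submission from the comment above.
# --keytab/--principal let YARN renew Kerberos credentials for the
# application, so executors on the slave nodes can authenticate to HDFS.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --keytab /etc/krb5.keytab \
  --principal root/bigdataserver03@EXAMPLE.COM \
  --class com.example.MyApp \
  my-app.jar
```

In cluster mode the driver runs inside YARN, so the keytab is shipped with the application and used to log in on the cluster side rather than relying on the submitter's local ticket cache.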

Maybe you can use 'spark-submit --master yarn --deploy-mode client --keytab $keytab_file_path --principal $your_principal'

It seems that this command only works in yarn-client mode.
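For yarn-client mode, an alternative (assuming the cluster accepts ticket-cache logins and that the same keytab and principal as above are available) is to obtain a Kerberos ticket with kinit before submitting:

```shell
# Sketch: authenticate via the local ticket cache first (yarn-client mode).
# The keytab path, principal, and application jar are placeholders.
kinit -kt /etc/krb5.keytab root/bigdataserver03@EXAMPLE.COM
klist   # verify that a valid ticket was obtained
spark-submit --master yarn --deploy-mode client my-app.jar
```

Note that a ticket obtained this way expires at the ticket lifetime, so long-running client-mode jobs still benefit from passing --keytab/--principal so Spark can renew credentials itself.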