在Apache Spark中。如何设置worker/executor的环境变量?
问题描述:
在EMR我的星火计划不断收到此错误:在Apache Spark中。如何设置worker/executor的环境变量?
Caused by: javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated
at sun.security.ssl.SSLSessionImpl.getPeerCertificates(SSLSessionImpl.java:421)
at org.apache.http.conn.ssl.AbstractVerifier.verify(AbstractVerifier.java:128)
at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:397)
at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:148)
at org.apache.http.impl.conn.AbstractPoolEntry.open(AbstractPoolEntry.java:149)
at org.apache.http.impl.conn.AbstractPooledConnAdapter.open(AbstractPooledConnAdapter.java:121)
at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:573)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:425)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:334)
at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:281)
at org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestHead(RestStorageService.java:942)
at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:2148)
at org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectDetailsImpl(RestStorageService.java:2075)
at org.jets3t.service.StorageService.getObjectDetails(StorageService.java:1093)
at org.jets3t.service.StorageService.getObjectDetails(StorageService.java:548)
at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:172)
at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
at org.apache.hadoop.fs.s3native.$Proxy8.retrieveMetadata(Unknown Source)
at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:414)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1398)
at org.apache.hadoop.fs.s3native.NativeS3FileSystem.create(NativeS3FileSystem.java:341)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:887)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:784)
我做了一些研究,发现这个认证可以在低安全性的情况下被禁用,通过设置环境变量:
com.amazonaws.sdk.disableCertChecking=true
但我只能使用spark-submit.sh --conf进行设置,它只影响驱动程序,而大部分错误都在工作人员身上。
有没有办法将它们传播给工人?
非常感谢。
答
只是在一些在Spark documentation跌跌撞撞:
spark.executorEnv.[EnvironmentVariableName]
Add the environment variable specified by EnvironmentVariableName to the Executor process. The user can specify multiple of these to set multiple environment variables.
所以你的情况,我会星火配置选项spark.executorEnv.com.amazonaws.sdk.disableCertChecking
设置为true
,看看有没有什么帮助。
非常感谢!我刚刚在官方文档中发现,不知道为什么它被忽略之前 – tribbloid 2015-04-20 00:10:21
容易被忽视的选项。这节省了我一些头痛,谢谢。 – 2016-09-15 01:57:42
我需要同样的东西,但在'工人'上:去看看是否可用.. – javadba 2017-06-04 18:05:20