Troubleshooting a Hadoop error: Caused by: java.io.IOException: Stream closed

【1. Symptoms】

Over the past few days, a handful of jobs on our offline compute platform have been failing. To track down the cause, we checked the platform logs and found the following key error:

WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Error: org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. java.io.IOException: Stream closed
at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:257)
at org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:348)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:362)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: java.io.IOException: Stream closed
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2639)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:981)
at org.apache.hadoop.mapred.JobConf.checkAndWarnDeprecation(JobConf.java:2007)
at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:479)
at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:469)
at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:188)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:580)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:578)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.mapred.JobClient.getJobUsingCluster(JobClient.java:578)
at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:596)
at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:295)
at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:559)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:424)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:151)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2183)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1839)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1526)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1232)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:255)
        ... 11 more
Caused by: java.io.IOException: Stream closed
at java.util.zip.InflaterInputStream.ensureOpen(InflaterInputStream.java:67)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:142)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at org.apache.xerces.impl.XMLEntityManager$RewindableInputStream.read(Unknown Source)
at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
at org.apache.xerces.impl.XMLEntityScanner.scanContent(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
        at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2480)
        at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2468)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2539)
        ... 37 more (state=08S01,code=1)
Closing: 0: jdbc:hive2://60.8.1.23:10000

【2. Diagnosis】

1. The error log points at Hadoop itself, so on the Apache JIRA we selected ALL ISSUES and searched for the key phrase java.io.IOException: Stream closed. There were many hits; comparing them one by one against our log, HADOOP-12404 matched.


Our production environment runs Hadoop 2.7; the issue is fixed in Hadoop 2.8. The issue description reads:

When loading resources from a URL in the Configuration class, disable JarURLConnection's caching to avoid sharing the JarFile with other users.

Configuration's parse method calls url.openStream to obtain an InputStream for DocumentBuilder to parse.

According to the JDK source, the call chain is url.openStream => handler.openConnection.getInputStream => new JarURLConnection => JarURLConnection.connect => factory.get(getJarFileURL(), getUseCaches()) => URLJarFile.getInputStream => JarFile.getInputStream => ZipFile.getInputStream

If URLConnection.getUseCaches() returns true (the default), the same URLJarFile is shared for the same URL. If the shared URLJarFile is closed by another user, all InputStreams returned by URLJarFile.getInputStream are, per the documented behavior, closed as well.

Hence, when the cluster is under heavy load, this exception can occur.
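The race is easy to reproduce outside Hadoop. The sketch below (class and method names are ours, for illustration) builds a throwaway jar, lets one "user" close the cached, shared JarFile underneath another user's InputStream, and then shows that setUseCaches(false) — the HADOOP-12404 fix — keeps the reader isolated:

```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.net.JarURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

public class JarCacheDemo {

    // Build a throwaway jar with a single XML entry so the demo is self-contained.
    static URL makeEntryUrl() throws Exception {
        Path jar = Files.createTempFile("demo", ".jar");
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
            out.putNextEntry(new JarEntry("conf.xml"));
            out.write("<configuration/>".getBytes("UTF-8"));
            out.closeEntry();
        }
        return new URL("jar:" + jar.toUri().toURL() + "!/conf.xml");
    }

    // Default behaviour (useCaches = true): connections to the same jar URL
    // share one JarFile, so a close() by one user kills the other's stream.
    static String readWithSharedCache() throws Exception {
        URL url = makeEntryUrl();
        JarURLConnection reader = (JarURLConnection) url.openConnection();
        JarURLConnection other  = (JarURLConnection) url.openConnection();
        InputStream in = reader.getInputStream();
        other.getJarFile().close();  // another "user" closes the shared JarFile
        try {
            in.read();
            return "ok";
        } catch (Exception e) {      // typically IOException: Stream closed
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }

    // The HADOOP-12404 fix: a non-caching connection gets a private JarFile
    // that no other user can close underneath it.
    static String readWithoutCache() throws Exception {
        JarURLConnection conn = (JarURLConnection) makeEntryUrl().openConnection();
        conn.setUseCaches(false);
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            for (int b; (b = in.read()) != -1; ) bos.write(b);
            return bos.toString("UTF-8");
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readWithSharedCache());
        System.out.println(readWithoutCache());
    }
}
```

The first call fails exactly like the stack trace above (the read trips over a closed stream); the second succeeds because the connection owns its JarFile.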


Hadoop 2.8 fixes the problem by disabling the connection cache (setUseCaches(false)); the private parse(DocumentBuilder, URL) overload is reworked to read from the URLConnection instead of url.openStream:

diff --git hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
index 0b45429..8801c6c 100644
--- hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
+++ hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
@@ -34,7 +34,9 @@
 import java.io.Writer;
 import java.lang.ref.WeakReference;
 import java.net.InetSocketAddress;
+import java.net.JarURLConnection;
 import java.net.URL;
+import java.net.URLConnection;
 import java.util.ArrayList;
 import java.util.Arrays;
 import java.util.Collection;
@@ -2531,7 +2533,14 @@ private Document parse(DocumentBuilder builder, URL url)
     if (url == null) {
       return null;
     }
-    return parse(builder, url.openStream(), url.toString());
+
+    URLConnection connection = url.openConnection();
+    if (connection instanceof JarURLConnection) {
+      // Disable caching for JarURLConnection to avoid sharing JarFile
+      // with other users.
+      connection.setUseCaches(false);
+    }
+    return parse(builder, connection.getInputStream(), url.toString());
   }

   private Document parse(DocumentBuilder builder, InputStream is,

【3. Resolution】

Option 1: upgrade Hadoop from 2.7 to 2.8

Our Hadoop installation underpins more than ten platform products; upgrading from 2.7 would force corresponding changes across all of them. That is far too much churn, so this option is not feasible at this stage.

Option 2: apply the fix as a patch

Use Arthas to find out which jar the hiveserver2 process loads the class from. First unpack Arthas:

[[email protected] ~]$ unzip arthas-bin.zip

The hiveserver2 process has pid 1244551; at Arthas's process-selection prompt, enter the corresponding serial number, 3:

[[email protected] ~]$ ./as.sh

Arthas script version: 3.4.4

 

[INFO] JAVA_HOME: /export/server/jdk-1.8.0_211

Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.

* [1]: 3970879 org.apache.spark.executor.CoarseGrainedExecutorBackend

  [2]: 69477 org.apache.hadoop.hbase.regionserver.HRegionServer

  [3]: 1244551 org.apache.hadoop.util.RunJar

  [4]: 1884169 org.apache.hadoop.yarn.server.nodemanager.NodeManager

  [5]: 3950379 org.apache.spark.executor.CoarseGrainedExecutorBackend

  [6]: 3948779 org.apache.spark.executor.CoarseGrainedExecutorBackend

  [7]: 3968885 org.apache.spark.executor.CoarseGrainedExecutorBackend

  [8]: 547699 org.apache.spark.deploy.yarn.ExecutorLauncher

  [9]: 62382 org.apache.hadoop.hdfs.server.datanode.DataNode

  [10]: 1839602 org.apache.hadoop.util.RunJar

3

Arthas home: /home/hdfs

Calculating attach execution time...

Attaching to 1244551 using version /home/hdfs...

 

real    0m0.885s

user    0m0.522s

sys     0m0.126s

Attach success.

telnet connecting to arthas server... current timestamp is 1604659919

Trying 127.0.0.1...

Connected to 127.0.0.1.

Escape character is '^]'.

  ,---.  ,------. ,--------.,--.  ,--.  ,---.   ,---.

 /  O  \ |  .--. ''--.  .--'|  '--'  | /  O  \ '   .-'

|  .-.  ||  '--'.'   |  |   |  .--.  ||  .-.  |`.  `-.

|  | |  ||  |\  \    |  |   |  |  |  ||  | |  |.-'    |

`--' `--'`--' '--'   `--'   `--'  `--'`--' `--'`-----'

 

 

wiki      https://arthas.aliyun.com/doc

tutorials https://arthas.aliyun.com/doc/arthas-tutorials.html

version   3.4.4

pid       1244551

time      2020-11-06 18:51:59

 

View the classloader inheritance tree:

[[email protected]]$ classloader -t

+-BootstrapClassLoader

[email protected]

  [email protected]

  +-sun.misc.Launcher$AppClassLoader@2f7c7260

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

    [email protected]

Affect(row-cnt:44) cost in 44 ms.
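The tree Arthas prints is just each loader followed by its parent. The same chain can be walked with plain JDK calls (the class name here is a hypothetical stand-in):

```java
public class LoaderChain {
    // Walk the parent chain of the loader that loaded the given class --
    // the same hierarchy `classloader -t` renders as a tree. A null parent
    // means the bootstrap classloader.
    public static String chain(Class<?> cls) {
        StringBuilder sb = new StringBuilder();
        for (ClassLoader c = cls.getClassLoader(); c != null; c = c.getParent()) {
            sb.append(c.getClass().getName()).append(" -> ");
        }
        return sb.append("BootstrapClassLoader").toString();
    }

    public static void main(String[] args) {
        System.out.println(chain(LoaderChain.class));
    }
}
```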

Using the classloader's hashcode, locate the Configuration.class file. The jar the process loads it from is /export/server/hadoop-2.7.3/share/hadoop/common/hadoop-common-2.7.3.jar:

[[email protected]]$ classloader -c 2f7c7260 -r org/apache/hadoop/conf/Configuration.class

 

jar:file:/export/server/hadoop-2.7.3/share/hadoop/common/hadoop-common-2.7.3.jar!/org/apache/hadoop/conf/Configuration.class

 

Affect(row-cnt:1) cost in 3 ms.

[[email protected]]$ exit

Connection closed by foreign host.

[[email protected] ~]$
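Outside Arthas, the same lookup can be done with the JDK alone: ClassLoader.getResource returns the jar:file:…!/… URL a .class resource is served from. A minimal sketch (class name is illustrative):

```java
public class WhereIsClass {
    // Returns the URL the system classloader would serve the given .class
    // resource from (a jar:file:...!/... URL when it lives in a jar), or
    // null if it is not on the classpath -- the answer `classloader -r` gives.
    public static String locate(String resource) {
        java.net.URL url = ClassLoader.getSystemResource(resource);
        return url == null ? null : url.toString();
    }

    public static void main(String[] args) {
        // On a cluster node with hadoop-common on the classpath this would
        // print the hadoop-common jar path; here we probe a JDK class instead.
        System.out.println(locate("java/lang/String.class"));
    }
}
```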

 

【Compiling the patched source】

Search Maven Central (https://search.maven.org/) for org.apache.hadoop:hadoop-common:2.7.3 and download the Configuration.java source file.


Create a new project in MyEclipse, import Configuration.java, and change the project's artifactId in pom.xml.


Apply the changes from the officially published bug fix to the file.


On an internet-connected machine, open a Windows CMD prompt, change to the Maven project root, D:\Users\lenovo\Workspaces\MyEclipse 2017 CI\test, and run:

mvn clean compile


The first build failed: the error message pointed at a spot in the source where a stray "+" had been carried over from the diff. Fix the formatting there and recompile.


Maven downloads some dependencies along the way; wait for the build to finish with "BUILD SUCCESS".


mvn clean package

Package the compiled classes and again wait for "BUILD SUCCESS".


Set ownership on the packaged jar and copy it into Hive's lib directory:

[[email protected] lib]$ pwd

/export/hive/lib

[[email protected] bin]# chown hdfs.hadoop /home/hdfs/hadoop-common-my-0.0.1.jar

[[email protected] bin]# cp -p /home/hdfs/hadoop-common-my-0.0.1.jar /export/hive/lib/

[[email protected] bin]# su - hdfs

Last login: Fri Nov  6 16:32:42 CST 2020 on pts/0

[[email protected] ~]$ cd /export/hive/lib/

[[email protected] lib]$ ll hadoop-common-my-0.0.1.jar

-rw-r--r-- 1 hdfs hadoop 38689 Nov  6 17:36 hadoop-common-my-0.0.1.jar

hadoop-common-my-0.0.1.jar now sits in /export/server/hive-2.3.2/lib, Hive's library directory.
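The reason dropping the jar into Hive's lib directory works comes down to classpath ordering: a URL-based classloader searches its URLs in order, so the first jar that contains a resource shadows identical entries in later jars. A self-contained sketch of that mechanism (jar names and contents here are illustrative, not Hadoop's):

```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

public class FirstJarWins {

    // Write a tiny jar containing one entry with the given body.
    static Path jarWith(String entry, String body) throws Exception {
        Path jar = Files.createTempFile("cp", ".jar");
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
            out.putNextEntry(new JarEntry(entry));
            out.write(body.getBytes("UTF-8"));
            out.closeEntry();
        }
        return jar;
    }

    // Two jars carry the same entry path; the loader returns the copy from
    // whichever jar appears first in its URL list.
    static String firstCopy() throws Exception {
        Path patched = jarWith("org/example/marker.txt", "patched");
        Path stock   = jarWith("org/example/marker.txt", "stock");
        // parent = null so only our two jars are searched, in this order.
        try (URLClassLoader cl = new URLClassLoader(
                 new URL[]{patched.toUri().toURL(), stock.toUri().toURL()}, null);
             InputStream in = cl.getResourceAsStream("org/example/marker.txt")) {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            for (int b; (b = in.read()) != -1; ) bos.write(b);
            return bos.toString("UTF-8");
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstCopy()); // the "patched" jar shadows the "stock" one
    }
}
```

The same shadowing is what the Arthas output below confirms for the real process: the patched jar resolves ahead of hadoop-common-2.7.3.jar.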

Restart hiveserver2, then look up Configuration.class through the classloader's hashcode again: hadoop-common-my-0.0.1.jar is now resolved first.

[[email protected]]$ classloader -c 6bdf28bb -r org/apache/hadoop/conf/Configuration.class

 

jar:file:/export/server/hive-2.3.2/lib/hadoop-common-my-0.0.1.jar!/org/apache/hadoop/conf/Configuration.class

 

jar:file:/export/server/hadoop-2.7.3/share/hadoop/common/hadoop-common-2.7.3.jar!/org/apache/hadoop/conf/Configuration.class

 

Affect(row-cnt:2) cost in 2 ms.

[[email protected]]$

[[email protected]]$

[[email protected]]$ classloader -c 6bdf28bb | grep hadoop-common

file:/export/server/hive-2.3.2/lib/hadoop-common-my-0.0.1.jar

file:/export/server/hive-2.3.2/lib/hadoop-common-my-0.0.1.jar

file:/export/server/hive-2.3.2/lib/hadoop-common-my-0.0.1.jar

file:/export/server/hive-2.3.2/lib/hadoop-common-my-0.0.1.jar

file:/export/server/hive-2.3.2/lib/hadoop-common-my-0.0.1.jar

file:/export/server/hadoop-2.7.3/share/hadoop/common/hadoop-common-2.7.3-tests.jar

file:/export/server/hadoop-2.7.3/share/hadoop/common/hadoop-common-2.7.3.jar

[[email protected]]$

The sc command shows details about a loaded class: which jar it was loaded from, which classloader loaded it, whether it is an interface, and so on.

[[email protected]]$ sc -d org.apache.hadoop.conf.Configuration

 class-info        org.apache.hadoop.conf.Configuration

 code-source       /export/server/hive-2.3.2/lib/hadoop-common-my-0.0.1.jar

 name              org.apache.hadoop.conf.Configuration

 isInterface       false

 isAnnotation      false

 isEnum            false

 isAnonymousClass  false

 isArray           false

 isLocalClass      false

 isMemberClass     false

 isPrimitive       false

 isSynthetic       false

 simple-name       Configuration

 modifier          public

 annotation        org.apache.hadoop.classification.InterfaceAudience$Public,org.apache.hadoop.classification.InterfaceStability$Stable

 interfaces        java.lang.Iterable,org.apache.hadoop.io.Writable

 super-class       +-java.lang.Object

 class-loader      [email protected]

                     [email protected]

 classLoaderHash   6bdf28bb

 

 class-info        org.apache.hadoop.hdfs.HdfsConfiguration

 code-source       /export/server/hadoop-2.7.3/share/hadoop/hdfs/hadoop-hdfs-2.7.3.jar

 name              org.apache.hadoop.hdfs.HdfsConfiguration

 isInterface       false

 isAnnotation      false

 isEnum            false

 isAnonymousClass  false

 isArray           false

 isLocalClass      false

 isMemberClass     false

 isPrimitive       false

 isSynthetic       false

 simple-name       HdfsConfiguration

 modifier          public

 annotation        org.apache.hadoop.classification.InterfaceAudience$Private

 interfaces

 super-class       +-org.apache.hadoop.conf.Configuration

                     +-java.lang.Object

 class-loader      [email protected]

                     [email protected]

 classLoaderHash   6bdf28bb

 

 class-info        org.apache.hadoop.hive.conf.HiveConf

 code-source       /export/server/hive-2.3.2/lib/hive-common-2.3.2.jar

 name              org.apache.hadoop.hive.conf.HiveConf

 isInterface       false

 isAnnotation      false

 isEnum            false

 isAnonymousClass  false

 isArray           false

 isLocalClass      false

 isMemberClass     false

 isPrimitive       false

 isSynthetic       false

 simple-name       HiveConf

 modifier          public

 annotation

 interfaces

 super-class       +-org.apache.hadoop.conf.Configuration

                     +-java.lang.Object

 class-loader      [email protected]

                     [email protected]

 classLoaderHash   6bdf28bb

 

 class-info        org.apache.hadoop.mapred.JobConf

 code-source       /export/server/apache-tez-0.9.1-bin/lib/hadoop-mapreduce-client-core-2.7.0.jar

 name              org.apache.hadoop.mapred.JobConf

 isInterface       false

 isAnnotation      false

 isEnum            false

 isAnonymousClass  false

 isArray           false

 isLocalClass      false

 isMemberClass     false

 isPrimitive       false

 isSynthetic       false

 simple-name       JobConf

 modifier          public

 annotation        org.apache.hadoop.classification.InterfaceAudience$Public,org.apache.hadoop.classification.InterfaceStability$Stable

 interfaces

 super-class       +-org.apache.hadoop.conf.Configuration

                     +-java.lang.Object

 class-loader      [email protected]

                     [email protected]

 classLoaderHash   6bdf28bb

 

 class-info        org.apache.hadoop.yarn.conf.YarnConfiguration

 code-source       /export/server/hadoop-2.7.3/share/hadoop/yarn/hadoop-yarn-api-2.7.3.jar

 name              org.apache.hadoop.yarn.conf.YarnConfiguration

 isInterface       false

 isAnnotation      false

 isEnum            false

 isAnonymousClass  false

 isArray           false

 isLocalClass      false

 isMemberClass     false

 isPrimitive       false

 isSynthetic       false

 simple-name       YarnConfiguration

 modifier          public

 annotation        org.apache.hadoop.classification.InterfaceAudience$Public,org.apache.hadoop.classification.InterfaceStability$Evolving

 interfaces

 super-class       +-org.apache.hadoop.conf.Configuration

                     +-java.lang.Object

 class-loader      [email protected]

                     [email protected]

 classLoaderHash   6bdf28bb

 

 class-info        org.apache.tez.dag.api.TezConfiguration

 code-source       /export/server/apache-tez-0.9.1-bin/tez-api-0.9.1.jar

 name              org.apache.tez.dag.api.TezConfiguration

 isInterface       false

 isAnnotation      false

 isEnum            false

 isAnonymousClass  false

 isArray           false

 isLocalClass      false

 isMemberClass     false

 isPrimitive       false

 isSynthetic       false

 simple-name       TezConfiguration

 modifier          public

 annotation        org.apache.hadoop.classification.InterfaceAudience$Public

 interfaces

 super-class       +-org.apache.hadoop.conf.Configuration

                     +-java.lang.Object

 class-loader      [email protected]

                     [email protected]

 classLoaderHash   6bdf28bb

 

Affect(row-cnt:6) cost in 99 ms.

[[email protected]]$
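Most of what sc -d reports is available through plain reflection; a minimal sketch (the class name is a hypothetical stand-in, probed here against a JDK class):

```java
public class ClassInfo {
    // Gather a few of the fields `sc -d` reports, via plain reflection.
    // getCodeSource() is null for classes loaded by the bootstrap loader.
    public static String describe(Class<?> c) {
        return c.getName()
                + " isInterface=" + c.isInterface()
                + " super=" + (c.getSuperclass() == null ? "none" : c.getSuperclass().getName())
                + " source=" + c.getProtectionDomain().getCodeSource();
    }

    public static void main(String[] args) {
        System.out.println(describe(java.util.ArrayList.class));
    }
}
```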

Decompile the loaded class to confirm that the org.apache.hadoop.conf.Configuration the hiveserver2 process is using is the patched version:

[[email protected]]$ jad org.apache.hadoop.conf.Configuration

 

ClassLoader:

[email protected]

  [email protected]

 

Location:

/export/server/hive-2.3.2/lib/hadoop-common-my-0.0.1.jar

 

/*
 * Decompiled with CFR.
 */
package org.apache.hadoop.conf;
……
        URLConnection connection = url.openConnection();
        if (connection instanceof JarURLConnection) {
            connection.setUseCaches(false);
        }
        return this.parse(builder, connection.getInputStream(), url.toString());
}
……

Restart the hiveserver2 service and run an MR query job: it completes without errors. Problem solved.