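Hive on Tez: Deployment and Verification Testing

This post documents installing Hive on Tez and verifying that it works, with detailed notes on the pitfalls hit during verification.

For Tez installation, see: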

http://blog.****.net/hqwang4/article/details/72773654


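Hive can run on Tez in two ways:

One is to change mapreduce.framework.name in mapred-site.xml from yarn to yarn-tez (this switches every MR job running on YARN to the Tez engine);

The other is to change only Hive's execution engine to Tez: set hive.execution.engine=tez;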

Edit hive-env.sh

export HIVE_AUX_JARS_PATH=$HADOOP_HOME/share/hadoop/mapreduce1/hadoop-core-2.6.0-mr1-cdh5.5.0.jar:$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.6.0-cdh5.5.0.jar:$HADOOP_HOME/share/hadoop/common/hadoop-common-2.6.0-cdh5.5.0.jar
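The long colon-separated value is easy to mistype. Below is a minimal sketch of building it from a list of jars. The helper name `join_jars` and the fallback `/opt/hadoop` are hypothetical; the jar locations are the CDH 5.5.0 ones used in this post, so adjust them to your install.

```shell
#!/usr/bin/env bash
# Hypothetical helper: join its arguments with ':' and warn on stderr
# about any jar that does not exist on the local filesystem.
join_jars() {
  local joined="" jar
  for jar in "$@"; do
    [ -f "$jar" ] || echo "warning: missing jar: $jar" >&2
    joined="${joined:+$joined:}$jar"
  done
  printf '%s\n' "$joined"
}

# /opt/hadoop is only a placeholder used when HADOOP_HOME is unset (assumption).
hadoop_home="${HADOOP_HOME:-/opt/hadoop}"

# Jar locations as used in this post.
HIVE_AUX_JARS_PATH="$(join_jars \
  "$hadoop_home/share/hadoop/mapreduce1/hadoop-core-2.6.0-mr1-cdh5.5.0.jar" \
  "$hadoop_home/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.6.0-cdh5.5.0.jar" \
  "$hadoop_home/share/hadoop/common/hadoop-common-2.6.0-cdh5.5.0.jar")"
export HIVE_AUX_JARS_PATH
```

A warning at this point means the path would reference a jar that is not there, which is cheaper to find out in hive-env.sh than from a failed Tez vertex.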

Verification test

# Start hive

$HIVE_HOME/bin/hive

# Set the Hive execution engine to Tez

set hive.execution.engine=tez;

# Run a SQL statement

select count(*) from student;
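The same check can also be run non-interactively. A sketch, assuming the `hive` CLI lives at `$HIVE_HOME/bin/hive` and the `student` table from this post exists. `--hiveconf` and `-e` are standard Hive CLI options; the wrapper `run_on_tez`, the `DRY_RUN` switch, and the `/opt/hive` fallback are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run one SQL statement with Tez as the execution engine.
# With DRY_RUN=1 it only prints the command instead of running it.
run_on_tez() {
  local sql="$1"
  local hive_bin="${HIVE_HOME:-/opt/hive}/bin/hive"   # fallback path is a placeholder
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s --hiveconf hive.execution.engine=tez -e "%s"\n' "$hive_bin" "$sql"
  else
    "$hive_bin" --hiveconf hive.execution.engine=tez -e "$sql"
  fi
}

# Example (dry run, prints the command only):
DRY_RUN=1 run_on_tez 'select count(*) from student;'
```

Passing the engine with `--hiveconf` limits the change to that one invocation, which matches the second of the two approaches described earlier.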

The test result is as follows: [Figure: Hive on Tez deployment and verification test]

Common problems

1. Problem:

For more detailed output, check application tracking page: http://node01:8088/proxy/application_1495800987987_0003/ Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1495800987987_0003_02_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:543)
        at org.apache.hadoop.util.Shell.run(Shell.java:460)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:720)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:210)

Solution:

    Upload ${TEZ_HOME}/share/tez.tar.gz to HDFS and point the tez.lib.uris property in tez-site.xml at it.
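A sketch of that fix. The HDFS directory `/apps/tez` is an assumption (any HDFS location readable by the job should do), and the helper `tez_lib_uris_property` is hypothetical; it only prints the property to paste into tez-site.xml.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the tez-site.xml property for a given HDFS URI.
tez_lib_uris_property() {
  cat <<EOF
<property>
  <name>tez.lib.uris</name>
  <value>$1</value>
</property>
EOF
}

# Upload the tarball (commented out so the sketch can be read without a cluster;
# /apps/tez is an assumed target directory):
# hdfs dfs -mkdir -p /apps/tez
# hdfs dfs -put "${TEZ_HOME}/share/tez.tar.gz" /apps/tez/

# The single quotes keep ${fs.defaultFS} literal; Hadoop expands it when it reads the file.
tez_lib_uris_property '${fs.defaultFS}/apps/tez/tez.tar.gz'
```

After editing tez-site.xml, start a new Hive session so the client picks up the changed value.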


2. Problem

Vertex failed, vertexName=Map 1, vertexId=vertex_1496371729424_0004_1_00, diagnostics=[Vertex vertex_1496371729424_0004_1_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: student initializer failed, vertex=vertex_1496371729424_0004_1_00 [Map 1], java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/MRVersion
        at org.apache.hadoop.hive.shims.Hadoop23Shims.isMR2(Hadoop23Shims.java:852)
        at org.apache.hadoop.hive.shims.Hadoop23Shims.getHadoopConfNames(Hadoop23Shims.java:923)
        at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:356)
        at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:371)
        at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:296)
        at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:106)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapred.MRVersion
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 17 more
]
Vertex killed, vertexName=Reducer 2, vertexId=vertex_1496371729424_0004_1_01, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_1496371729424_0004_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask

Solution:

    Add hadoop-core-2.6.0-mr1-cdh5.5.0.jar to HIVE_AUX_JARS_PATH, as follows:

export HIVE_AUX_JARS_PATH=/zxhbase/cdh5.5/hadoop/share/hadoop/mapreduce1/hadoop-core-2.6.0-mr1-cdh5.5.0.jar

3. Problem

Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1496371729424_0007_1_00, diagnostics=[Task failed, taskId=task_1496371729424_0007_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1496371729424_0007_1_00_000000_0:java.lang.Exception: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: org.apache.hadoop.mapred.TaskID: method <init>(Ljava/lang/String;ILorg/apache/hadoop/mapreduce/TaskType;I)V not found
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.initialize(LogicalIOProcessorRuntimeTask.java:267)
        at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:69)
        at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
        at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: org.apache.hadoop.mapred.TaskID: method <init>(Ljava/lang/String;ILorg/apache/hadoop/mapreduce/TaskType;I)V not found
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.initialize(LogicalIOProcessorRuntimeTask.java:261)
        ... 12 more
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.mapred.TaskID: method <init>(Ljava/lang/String;ILorg/apache/hadoop/mapreduce/TaskType;I)V not found
        at org.apache.tez.mapreduce.input.base.MRInputBase.initialize(MRInputBase.java:93)

Solution:

    Add hadoop-mapreduce-client-core-2.6.0-cdh5.5.0.jar to HIVE_AUX_JARS_PATH, as follows:

export HIVE_AUX_JARS_PATH=/zxhbase/cdh5.5/hadoop/share/hadoop/mapreduce1/hadoop-core-2.6.0-mr1-cdh5.5.0.jar:/zxhbase/cdh5.5/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.6.0-cdh5.5.0.jar

4. Problem

Vertex failed, vertexName=Map 1, vertexId=vertex_1496388544237_0004_1_00, diagnostics=[Vertex vertex_1496388544237_0004_1_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: student initializer failed, vertex=vertex_1496388544237_0004_1_00 [Map 1], java.lang.NoClassDefFoundError: org/apache/hadoop/util/StopWatch
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:314)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:306)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:408)
        at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:129)

Solution:

    Add hadoop-common-2.6.0-cdh5.5.0.jar to HIVE_AUX_JARS_PATH, as follows:

export HIVE_AUX_JARS_PATH=/zxhbase/cdh5.5/hadoop/share/hadoop/mapreduce1/hadoop-core-2.6.0-mr1-cdh5.5.0.jar:/zxhbase/cdh5.5/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.6.0-cdh5.5.0.jar:/zxhbase/cdh5.5/hadoop/share/hadoop/common/hadoop-common-2.6.0-cdh5.5.0.jar
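Problems 2 to 4 are all class-loading failures, and each fix came down to finding the jar that provides the missing class. Here is a sketch of that search, assuming `unzip` is installed; both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Convert a class name (dotted or slash-separated) to its entry name inside a jar.
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Print every jar under a directory whose listing contains the given class.
find_class_jar() {
  local entry dir="$2" jar
  entry="$(class_to_entry "$1")"
  find "$dir" -name '*.jar' -print | while IFS= read -r jar; do
    # -F: match the entry name as a fixed string, not as a regular expression.
    if unzip -l "$jar" 2>/dev/null | grep -F "$entry" >/dev/null; then
      printf '%s\n' "$jar"
    fi
  done
}

# Example:
# find_class_jar org.apache.hadoop.util.StopWatch "$HADOOP_HOME/share/hadoop"
```

If more than one jar is printed, the NoSuchMethodError case from problem 3 is a reminder that which copy of the class ends up on the classpath matters, not just whether one exists.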

5. Problem

Hive on Tez reports the following error:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask

Solution:

   Extract the tez.tar.gz package, upload the extracted contents to HDFS, and update the tez.lib.uris parameter to match.
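A sketch of the extract-and-upload variant. The function `upload_unpacked_tez`, the `DRY_RUN` switch, and the example paths are assumptions; with `DRY_RUN=1` the function only prints the commands. As far as I know, directories listed in `tez.lib.uris` are not searched recursively, so the `lib` subdirectory is listed as well; check this against the Tez version you run.

```shell
#!/usr/bin/env bash
# Hypothetical helper: unpack tez.tar.gz locally and upload the result to HDFS.
# With DRY_RUN=1 the commands are printed instead of executed.
upload_unpacked_tez() {
  local tarball="$1" staging="$2" hdfs_dir="$3"
  local run=""
  [ "${DRY_RUN:-0}" = "1" ] && run="echo"
  $run mkdir -p "$staging"
  $run tar -xzf "$tarball" -C "$staging"
  $run hdfs dfs -mkdir -p "$hdfs_dir"
  $run hdfs dfs -put "$staging"/* "$hdfs_dir"/
}

# Example (paths are placeholders):
# upload_unpacked_tez "${TEZ_HOME}/share/tez.tar.gz" /tmp/tez-unpacked /apps/tez-unpacked
#
# Matching tez-site.xml value (assumed layout with a lib/ subdirectory):
#   ${fs.defaultFS}/apps/tez-unpacked,${fs.defaultFS}/apps/tez-unpacked/lib
```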