Using NiFi to Read Data from Hive and Write It to a New Hive Table

Note: the new Hive table was created in advance with SQL.

Option 1:

SelectHiveQL -> PutHiveStreaming

I configured the Hive cluster environment with the following settings:

  • hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
  • hive.compactor.initiator.on = true
  • hive.compactor.worker.threads > 0
  • hive.support.concurrency = true
  • ACID Transactions switched On
Configure the Table Name property of the PutHiveStreaming processor:
Table name: the name of the table into which you want to insert the data. Note that the table must meet these requirements:
  1. ORC is the only format currently supported, so the table must have "stored as orc"
  2. transactional = "true" must be set in the table create statement
  3. Bucketed but not sorted, so the table must have "clustered by (colName) into (n) buckets"
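
The three table requirements above can be sketched as a single create statement. This is only an illustration: the database name, table name, columns, and bucket count are hypothetical and do not come from the original setup.

```sql
-- Hypothetical names (new_db, target_table, id, name) and bucket count;
-- only the three storage requirements come from the notes above.
CREATE TABLE new_db.target_table (
  id   INT,
  name STRING
)
CLUSTERED BY (id) INTO 4 BUCKETS           -- bucketed, but not sorted
STORED AS ORC                              -- ORC is the only supported format
TBLPROPERTIES ('transactional' = 'true');  -- required for streaming ingest
```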

This option failed with an error. The cause is unclear for now, and it did not succeed.

Option 2: SelectHiveQL -> ConvertAvroToJSON -> SplitJson -> EvaluateJsonPath -> ReplaceText -> PutHiveQL


Option 2 successfully inserted the data into Hive.

The PROPERTIES of each processor were configured as follows:

(1)SelectHiveQL


(2)ConvertAvroToJSON


(3)SplitJson


(4)EvaluateJsonPath


(5)ReplaceText 

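The exact property values used here are not given in the text, so the following is only a sketch of what the ReplaceText replacement value might look like. It assumes EvaluateJsonPath was set to write to flowfile attributes and extracted two hypothetical attributes, id and name, from each split JSON record; the table name is also hypothetical.

```sql
-- Hypothetical ReplaceText "Replacement Value".
-- ${id} and ${name} are NiFi Expression Language references to flowfile
-- attributes assumed to have been set by EvaluateJsonPath (e.g. id = $.id).
INSERT INTO new_db.target_table VALUES (${id}, '${name}')
```

With a template like this, each flowfile's content becomes one HiveQL statement, which PutHiveQL then executes against the target table.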

(6)PutHiveQL
