Hive表导入Elasticsearch
1,添加elasticsearch-hadoop-hive-2.1.2.jar到Hive。Hive添加第三方包,查看:http://blog.csdn.net/qianshangding0708/article/details/50381966
2,在hive中建立Elasticsearch外表:
- @Test
- public void testESTable() {
- try {
- HiveHelper
- .excuteNonQuery("CREATE EXTERNAL TABLE es_user(id String ,name String ,age int ,create_date String) STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' TBLPROPERTIES('es.resource' = 'es_hive/user_{create_date}','es.index.auto.create' = 'true','es.nodes' = '10.0.1.75:9200,10.0.1.76:9200,10.0.1.77:9200')");
- } catch (Exception e) {
- e.printStackTrace();
- }
- }
为了让SQL语句看的清晰点,再贴一次SQL语句:
- CREATE EXTERNAL TABLE es_user (
- id String,
- NAME String,
- age INT,
- create_date String
- ) STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' TBLPROPERTIES (
- 'es.resource' = 'es_hive/user_{create_date}',
- 'es.index.auto.create' = 'true',
- 'es.nodes' = '10.0.1.75:9200,10.0.1.76:9200,10.0.1.77:9200'
- );
es.index.auto.create:设置是否自动创建索引
es.nodes:Elasticsearch集群地址。
3,创建Hive表
- @Test
- public void testHiveTable() {
- try {
- HiveHelper
- .excuteNonQuery("CREATE TABLE IF NOT EXISTS hive_user(id String ,name String ,age int) PARTITIONED BY (create_date String) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' STORED AS TEXTFILE");
- } catch (Exception e) {
- e.printStackTrace();
- }
- }
SQL语句:
- CREATE TABLE IF NOT EXISTS hive_user(
- id String ,
- name String ,
- age int
- ) PARTITIONED BY (create_date String)
- ROW FORMAT DELIMITED
- FIELDS TERMINATED BY ','
- LINES TERMINATED BY '\n'
- STORED AS TEXTFILE;
4,上传数据文件,并将文件导入到hive表(hive_user)
数据原文:kkk.txt
- 1,fish1,1
- 2,fish2,2
- 3,fish3,3
- 4,fish4,4
- 5,fish5,5
- 6,fish6,6
- 7,fish7,7
- 8,fish8,8
- 9,fish9,9
- @Test
- public void testLoadHiveTable() {
- try {
- HiveHelper
- .excuteNonQuery("LOAD DATA INPATH '/fish/hive/kkk.txt' INTO TABLE hive_user PARTITION(CREATE_DATE='2015-12-22')");
- } catch (Exception e) {
- e.printStackTrace();
- }
- }
查看hive_user表:
- hive> select * from hive_user;
- OK
- 1 fish1 1 2015-12-22
- 2 fish2 2 2015-12-22
- 3 fish3 3 2015-12-22
- 4 fish4 4 2015-12-22
- 5 fish5 5 2015-12-22
- 6 fish6 6 2015-12-22
- 7 fish7 7 2015-12-22
- 8 fish8 8 2015-12-22
- 9 fish9 9 2015-12-22
- Time taken: 0.041 seconds, Fetched: 9 row(s)
OK,数据已经Load到Hive。
5,将Hive表的数据插入到Elasticsearch
- @Test
- public void testInsertElasticSearch() {
- try {
- HiveHelper
- .excuteNonQuery("INSERT OVERWRITE TABLE es_user SELECT s.id, s.name, s.age, s.create_date FROM hive_user s where s.create_date='2015-12-22'");
- } catch (Exception e) {
- e.printStackTrace();
- }
- }
数据已经成功上传到Elasticsearch。
更多细节,参考:https://www.elastic.co/guide/en/elasticsearch/hadoop/current/hive.html