Apache Flume 1.9: Installation, Configuration, and Testing (log2hive)


Version: Apache Flume 1.9.0


1. Download

wget https://mirrors.tuna.tsinghua.edu.cn/apache/flume/1.9.0/apache-flume-1.9.0-bin.tar.gz

2. Extract

tar -zxvf apache-flume-1.9.0-bin.tar.gz

3. Data flow

Log files under /data/log/tracy -> TAILDIR source -> file channel -> HDFS sink -> HDFS directory of the Hive table action_log

4. Create the target Hive table

create table action_log (
  id string,
  write_date string,
  name string
)
COMMENT 'click action log'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

5. Check the table's HDFS directory


hdfs://bigdata-dev1.nexttao:8020/warehouse/tablespace/managed/hive/flume.db/action_log

6. Insert a test row

insert into action_log values ('1','2019-12-13 00:00:00','Raymond');


Insert another row directly through HDFS.
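One way to add a row through HDFS is to upload a small text file into the table directory. The following is a minimal sketch, assuming the table reads plain text files placed in its directory; the row values and the target file name manual_row.txt are made up for illustration, and the upload is skipped when no hdfs client is installed.

```shell
# Sketch: add one row to the Hive table by uploading a text file into its HDFS directory.
# TABLE_DIR is the directory from step 5; the row values and file name are hypothetical.
TABLE_DIR="hdfs://bigdata-dev1.nexttao:8020/warehouse/tablespace/managed/hive/flume.db/action_log"
ROW_FILE="$(mktemp)"

# One comma-separated line per row, matching FIELDS TERMINATED BY ','.
printf '%s,%s,%s\n' '2' '2019-12-13 00:00:01' 'Tom' > "$ROW_FILE"

# Upload only when an HDFS client is available on this machine.
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -put "$ROW_FILE" "$TABLE_DIR/manual_row.txt"
else
  echo "hdfs client not found; row left in $ROW_FILE" >&2
fi
```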


7. Write a simple Flume configuration file

# agent1 is the agent name
agent1.sources = source1
agent1.sinks = sink1
agent1.channels = channel1

# Configure source1
agent1.sources.source1.type = TAILDIR
agent1.sources.source1.filegroups = f1
agent1.sources.source1.filegroups.f1 = /data/log/tracy/.*log.*
agent1.sources.source1.channels = channel1
agent1.sources.source1.fileHeader = false

# Add an interceptor
agent1.sources.source1.interceptors = i1
# Timestamp interceptor
agent1.sources.source1.interceptors.i1.type = timestamp

# Configure channel1
agent1.channels.channel1.type = file
agent1.channels.channel1.checkpointDir = /data/flume/tracy/cheackpointDir
agent1.channels.channel1.dataDirs = /data/flume/tracy/dataDirs

# Configure sink1
agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.hdfs.path = hdfs://bigdata-dev1.nexttao:8020/warehouse/tablespace/managed/hive/flume.db/action_log
# DataStream is similar to a plain text file
agent1.sinks.sink1.hdfs.fileType = DataStream
# Write only the event body
agent1.sinks.sink1.hdfs.writeFormat = TEXT
# Seconds before rolling to a new HDFS file; 0 disables time-based rolling
agent1.sinks.sink1.hdfs.rollInterval = 1
agent1.sinks.sink1.channel = channel1
# File name prefix; the date escapes use the timestamp header set by interceptor i1
agent1.sinks.sink1.hdfs.filePrefix = %Y-%m-%d
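The timestamp interceptor and the %Y-%m-%d file prefix work together: the interceptor stores the event time as epoch milliseconds in a header named timestamp, and the HDFS sink formats that value to build the prefix. The sketch below only illustrates that mapping in the shell. The value of ts_ms is a made-up example, GNU date is assumed, and UTC is used here for a stable result, whereas the sink uses its own time zone setting.

```shell
# Sketch: map a Flume "timestamp" header (epoch milliseconds) to the %Y-%m-%d prefix.
# ts_ms is a hypothetical header value; GNU date is assumed.
ts_ms=1576195200000

# Convert milliseconds to seconds, then format the date in UTC.
prefix="$(date -u -d "@$((ts_ms / 1000))" +%Y-%m-%d)"
echo "$prefix"
```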


8. Start flume-ng (run from Flume's bin directory)

./flume-ng agent -n agent1 -c ../conf -f ../conf/log2hive.properties -Dflume.root.logger=DEBUG,console

9. Startup error

Exception in thread "SinkRunner-PollingRunner-DefaultSinkProcessor" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V

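Fix: replace the guava jar in Flume's lib directory with a newer version. This checkArgument overload does not exist in older Guava releases, so the error usually means the guava jar bundled with Flume is older than the one the Hadoop libraries were compiled against.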
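The jar swap can be scripted roughly as follows. This is a sketch under assumptions: swap_guava is a hypothetical helper, and both directory arguments depend on where Flume and Hadoop are installed on your machine. It renames Flume's own guava jar rather than deleting it, so the change is easy to undo.

```shell
# Sketch: swap Flume's bundled guava jar for the one shipped with Hadoop.
# swap_guava is a hypothetical helper; both paths depend on your installation.
swap_guava() {
  local flume_lib="$1" hadoop_lib="$2" jar
  # Move Flume's own guava jar(s) aside; the .bak suffix keeps them off the classpath.
  for jar in "$flume_lib"/guava-*.jar; do
    if [ -e "$jar" ]; then
      mv "$jar" "$jar.bak"
    fi
  done
  # Copy in the guava jar(s) that Hadoop ships with.
  cp "$hadoop_lib"/guava-*.jar "$flume_lib"/
}

# Example call (both paths are assumptions):
# swap_guava /opt/apache-flume-1.9.0-bin/lib "$HADOOP_HOME/share/hadoop/common/lib"
```

Restart the agent after the swap so the new jar is picked up.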


10. Add a new log file to the monitored log directory
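A test record can be written with a small helper like the one below. This is only a sketch: append_record is a hypothetical function, the example values are made up, and the file must sit in the directory matched by filegroups.f1 for the TAILDIR source to pick it up.

```shell
# Sketch: append one comma-separated record to a log file watched by the TAILDIR source.
# append_record is a hypothetical helper; field order follows the action_log table.
append_record() {
  local log_file="$1" id="$2" write_date="$3" name="$4"
  # Create the log directory if it does not exist yet.
  mkdir -p "$(dirname "$log_file")"
  # Append id,write_date,name as one line.
  printf '%s,%s,%s\n' "$id" "$write_date" "$name" >> "$log_file"
}

# Example call (values are made up):
# append_record /data/log/tracy/1.log 3 '2019-12-13 00:00:02' 'Jerry'
```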


11. Check the Flume log

The log shows that the data has been written to HDFS.

12. Check HDFS
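A quick way to inspect the result is to list the table directory recursively. The sketch below assumes an hdfs client on the PATH and reuses the table directory from step 5; it prints a notice instead of failing when the client is missing.

```shell
# Sketch: list the files Flume has written into the Hive table directory.
# TABLE_DIR is the directory from step 5.
TABLE_DIR="hdfs://bigdata-dev1.nexttao:8020/warehouse/tablespace/managed/hive/flume.db/action_log"

if command -v hdfs >/dev/null 2>&1; then
  # Files written by the sink start with the date prefix from hdfs.filePrefix.
  hdfs dfs -ls -R "$TABLE_DIR"
else
  echo "hdfs client not found" >&2
fi
```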


13. Check the data in Hive
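The table can be queried from the shell as sketched below. Two assumptions apply: the hive command-line client is installed, and the table lives in a database named flume, which is inferred from the flume.db segment of the HDFS path.

```shell
# Sketch: query the target table from the shell.
# The database name "flume" is an assumption inferred from the flume.db path segment.
QUERY="select * from flume.action_log;"

if command -v hive >/dev/null 2>&1; then
  hive -e "$QUERY"
else
  echo "hive client not found" >&2
fi
```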


14. Append another record to 1.log

Then check the Hive table again for the new record.

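Summary: This walkthrough covers the basic path from downloading Flume to configuring log2hive. It only gets the pipeline running end to end; tuning and load testing still need to follow.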