Logstash log collection and analysis with Elasticsearch & Kibana
This article covers fairly old versions; see http://bbotte.com/ for a newer version.
Logstash provides a powerful pipeline for storing, querying, and analyzing your logs. When using Elasticsearch as a backend data store and Kibana as a frontend reporting tool, Logstash acts as the workhorse. It includes an arsenal of built-in inputs, filters, codecs, and outputs, enabling you to harness some powerful functionality with a small amount of effort.
http://semicomplete.com/files/logstash/ Logstash collects the logs; it requires a Java runtime
logstash-1.4.2.tar.gz jdk-7u67-linux-x64.rpm
http://www.elasticsearch.org/overview/elkdownloads Elasticsearch search engine; this page also links to the documentation
http://www.elasticsearch.org/overview/kibana/installation/ Kibana provides the web interface
http://redis.io/download Redis: redis-2.8.19.tar.gz
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-plugins.html Elasticsearch plugins
https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns Logstash grok patterns
Documentation:
http://www.elasticsearch.org/guide/
http://logstash.net/docs/1.4.2/
https://github.com/elasticsearch/kibana/blob/master/README.md
http://kibana.logstash.es/content/
http://shgy.gitbooks.io/mastering-elasticsearch/content/
System: CentOS 6.5, 64-bit
Packages installed:
jdk-7u67-linux-x64.rpm
redis-2.8.19.tar.gz
logstash-1.4.2.tar.gz
elasticsearch-1.4.2.zip # install the newer 1.4.4 release instead if you can (it fixes a vulnerability); keep the Logstash and Elasticsearch versions matched
kibana-3.1.2.zip
# Install Java and Redis
# rpm -ivh jdk-7u67-linux-x64.rpm
# /usr/java/jdk1.7.0_67/bin/java -version
# vim ~/.bashrc
export JAVA_HOME=/usr/java/jdk1.7.0_67
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
# . ~/.bashrc
# java -version    # verify Java
java version "1.7.0_67"
Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
# tar -xzf redis-2.8.19.tar.gz
# cd redis-2.8.19
# make
# make install
# ./utils/install_server.sh
Port           : 6379
Config file    : /etc/redis/6379.conf
Log file       : /var/log/redis_6379.log
Data dir       : /var/lib/redis/6379
Executable     : /usr/local/bin/redis-server
Cli Executable : /usr/local/bin/redis-cli
# service redis_6379 restart    # start Redis
# redis-cli ping
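Redis will later act as the broker between log shippers and the indexer. A minimal sanity check, assuming the list key "logstash" that the indexer configuration later in this article consumes:
# redis-cli ping                                         # expect PONG
# redis-cli LPUSH logstash '{"message":"broker test"}'   # push a test event onto the list
# redis-cli LLEN logstash                                # list length; a running consumer drains this back to 0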
# Install Logstash and Elasticsearch
# mkdir /var/www/logstash
# unzip elasticsearch-1.4.2.zip -d /var/www/logstash
# cd /var/www/logstash
# ln -s elasticsearch-1.4.2/ elasticsearch
# cd elasticsearch
# ./bin/elasticsearch -f    # start Elasticsearch with the default config; 1.4 runs in the foreground by default, so the old -f flag only triggers the warning below
getopt: invalid option -- 'f'
[2015-02-09 16:15:24,502][INFO ][node            ] [Amergin] version[1.4.2], pid[4718], build[927caff/2014-12-16T14:11:12Z]
[2015-02-09 16:15:24,502][INFO ][node            ] [Amergin] initializing ...
[2015-02-09 16:15:24,518][INFO ][plugins         ] [Amergin] loaded [], sites []
[2015-02-09 16:15:27,945][INFO ][node            ] [Amergin] initialized
[2015-02-09 16:15:27,945][INFO ][node            ] [Amergin] starting ...
[2015-02-09 16:15:28,232][INFO ][transport       ] [Amergin] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.10.1:9300]}
[2015-02-09 16:15:28,300][INFO ][discovery       ] [Amergin] elasticsearch/mvrxUfixSPKQKzb3s_nFug
[2015-02-09 16:15:32,091][INFO ][cluster.service ] [Amergin] new_master [Amergin][mvrxUfixSPKQKzb3s_nFug][manager][inet[/192.168.10.1:9300]], reason: zen-disco-join (elected_as_master)
[2015-02-09 16:15:32,143][INFO ][http            ] [Amergin] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.10.1:9200]}
[2015-02-09 16:15:32,143][INFO ][node            ] [Amergin] started
[2015-02-09 16:15:32,162][INFO ][gateway         ] [Amergin] recovered [0] indices into cluster_state
# curl -X GET http://localhost:9200    # or open http://192.168.10.1:9200/ in a browser
{
  "status" : 200,
  "name" : "Amergin",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "1.4.2",
    "build_hash" : "927caff6f05403e936c20bf4529f144f0c89fd8c",
    "build_timestamp" : "2014-12-16T14:11:12Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.2"
  },
  "tagline" : "You Know, for Search"
}
# tar -xzf logstash-1.4.2.tar.gz
# cd logstash-1.4.2
# ./bin/logstash -h    # show the help
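Before wiring Logstash to it, it is worth confirming the node is healthy. A quick check using the standard cluster-health endpoint:
# curl 'http://localhost:9200/_cluster/health?pretty'    # "status" : "green" on an empty node;
#   Logstash indices default to one replica, so a single node will show "yellow" once data arrives,
#   which is normal: the replica shards simply have nowhere to be allocated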
# A quick test to see how the Logstash pipeline works
# echo "`date` hello world"
Mon Feb 9 16:36:15 CST 2015 hello world
# Test Logstash with stdin and stdout, as follows:
# bin/logstash -e 'input { stdin { } } output { stdout {} }'
Mon Feb 9 16:36:15 CST 2015 hello world    # paste this line in (copy it rather than retyping)
2015-02-09T08:36:23.190+0000 manager Mon Feb 9 16:36:15 CST 2015 hello world    # the event as emitted by Logstash
# Next, send stdin through Elasticsearch and look at the stored result:
# /var/www/logstash/elasticsearch/bin/elasticsearch -f    # keep Elasticsearch running alongside
# bin/logstash -e 'input { stdin { } } output { elasticsearch { host => localhost } }'
you know, for logs    # type this line
# curl 'http://localhost:9200/_search?pretty'    # show the data as stored by Elasticsearch
{
  "took" : 64,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "logstash-2015.02.09",
      "_type" : "logs",
      "_id" : "IFmPqi0dQjSNZR5-94NuHg",
      "_score" : 1.0,
      "_source" : { "message" : "you know, for logs", "@version" : "1", "@timestamp" : "2015-02-09T08:48:48.747Z", "host" : "manager" }
    } ]
  }
}
# You’ve successfully stashed logs in Elasticsearch via Logstash
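The _search call above queries every index at once. Once real events are flowing you will usually want to narrow it to one daily Logstash index plus a query term; a sketch using the index name from the output above:
# curl 'http://localhost:9200/logstash-2015.02.09/_search?pretty&q=message:logs'    # only today's index, only matching events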
# Install an Elasticsearch plugin and try it out
# cd /var/www/logstash/elasticsearch/bin/
# ./plugin -install lmenezes/elasticsearch-kopf    # install the kopf plugin
# Now test the kopf plugin:
# /var/www/logstash/elasticsearch/bin/elasticsearch -f
# bin/logstash -e 'input { stdin { } } output { elasticsearch { host => localhost } stdout { } }'
hello world
2015-02-09T09:07:35.590+0000 manager hello world
hello logstash
2015-02-09T09:09:26.981+0000 manager hello logstash
# curl 'http://localhost:9200/_search?pretty'    # shows the log events you just produced, in indexed form
# curl 'http://localhost:9200/_plugin/kopf/'     # the plugin page; there is not much to see from curl
# Open 192.168.10.1:9200/_plugin/kopf/ in a browser instead: it shows the data stored in Elasticsearch, plus settings and mappings
Several other excellent plugins can be installed the same way:
es_head: mainly provides cluster health information, plus a tab with a simple form for submitting API requests. es_head can now be installed directly with elasticsearch/bin/plugin -install mobz/elasticsearch-head
then point a browser at http://$eshost:9200/_plugin/head/
to see the state of the cluster, nodes, indices, and shards.
bigdesk: provides real-time monitoring of a node, covering the JVM, the Linux host, and Elasticsearch itself; very useful when diagnosing performance problems. It can likewise be installed directly with elasticsearch/bin/plugin -install lukas-vlcek/bigdesk
and then viewed at http://$eshost:9200/_plugin/bigdesk/
Note that when bulk indexing with too long a refresh interval,
the indexing-per-second figure will not be accurate.
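A quick way to confirm which plugins a node has actually picked up, assuming the _cat API that ships with Elasticsearch 1.x:
# /var/www/logstash/elasticsearch/bin/plugin --list    # plugins present in this installation
# curl 'http://localhost:9200/_cat/plugins?v'          # plugins loaded by the running node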
# Processing logs into Elasticsearch
# /var/www/logstash/elasticsearch/bin/elasticsearch -d /var/run/elasticsearch.pid    # start Elasticsearch as a daemon
# Have Logstash process the Apache error log, as follows:
# vi logstash-apache.conf
input {
  file {
    path => "/var/log/httpd/error_log"
    start_position => beginning
  }
}
filter {
  if [path] =~ "error" {
    mutate { replace => { "type" => "apache_error" } }
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  elasticsearch {
    host => localhost
  }
  stdout { codec => rubydebug }
}
# bin/logstash -f logstash-apache.conf
# Wait about twenty seconds; if nothing appears, open the log in vim, copy the last line, and paste it in again to simulate a fresh log entry
# Logstash now reads the Apache error log and prints events here; the same data shows up at http://192.168.10.1:9200/_search?pretty in a browser
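One caveat: %{COMBINEDAPACHELOG} describes access-log lines, so genuine error_log entries will not match it (they come back tagged _grokparsefailure). A hedged sketch of a filter better suited to a 2.2-style error_log line, assuming the stock HTTPDERROR_DATE, LOGLEVEL, and IPORHOST patterns from the patterns repository linked earlier are available:
filter {
  grok {
    # matches e.g.: [Mon Feb 09 16:15:24 2015] [error] [client 192.168.10.2] File does not exist: /var/www/html/favicon.ico
    match => { "message" => "\[%{HTTPDERROR_DATE:timestamp}\] \[%{LOGLEVEL:loglevel}\] (?:\[client %{IPORHOST:clientip}\] )?%{GREEDYDATA:errormsg}" }
  }
}
Note the timestamp format also differs from access logs, so the date filter would need a matching pattern such as "EEE MMM dd HH:mm:ss yyyy".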
Further testing
# Have Logstash process all of the Apache logs, as follows:
# vi logstash-apache.conf
input {
  file {
    path => "/var/log/httpd/*_log"
  }
}
filter {
  if [path] =~ "access" {
    mutate { replace => { type => "apache_access" } }
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    date {
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  } else if [path] =~ "error" {
    mutate { replace => { type => "apache_error" } }
  } else {
    mutate { replace => { type => "random_logs" } }
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
# bin/logstash -f logstash-apache.conf
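While iterating on a filter like the one above, waiting for the file input to notice new lines is slow. A small test harness that takes sample lines on stdin instead (a sketch; test.conf is just a scratch file name):
# vi test.conf
input { stdin { } }
filter {
  grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }
}
output { stdout { codec => rubydebug } }
# bin/logstash -f test.conf
# Paste a real access-log line: matched fields print as a rubydebug hash,
# while unmatched lines come back tagged _grokparsefailure.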
Notes: the life of an event
Inputs, filters, codecs, and outputs make up the core of a Logstash configuration. Logstash builds an event-processing pipeline that extracts data from your logs and stores it in Elasticsearch, laying the groundwork for efficient queries.
Inputs: an input is how log data enters Logstash. Common inputs include:
file: reads from a file on the filesystem, much like the UNIX command "tail -0f"
syslog: listens on port 514 and parses log data according to RFC 3164
redis: reads from a Redis server, supporting both channels (pub/sub) and lists. Redis usually plays the "broker" role in a Logstash deployment, holding a queue of events for Logstash instances to consume.
Filters: filters are the intermediate processing stages of the Logstash chain. They are often combined with conditionals to act on events that match particular criteria. Common filters:
grok: parses arbitrary unstructured text into structured data, and is currently the best way to make unstructured logs structured and queryable. With more than 120 built-in patterns, one is likely to fit your needs.
mutate: modifies an event's fields; you can rename, remove, replace, and edit fields while the event is being processed.
drop: discards events outright, e.g. debug events.
clone: copies an event, optionally adding or removing fields in the process.
geoip: adds geographic information (used by Kibana for map visualizations).
Outputs: outputs are the final stage of the pipeline. An event can pass through multiple outputs, but once all outputs have run, its life is over. Commonly used outputs:
elasticsearch: if you plan to store your data efficiently and query it simply and conveniently
file: writes event data to a file on disk
graphite: sends event data to Graphite, a popular open-source tool for storing and graphing metrics: http://graphite.wikidot.com/
statsd: a service that aggregates statistics, such as counters and timers, over UDP and forwards them to one or more backends; if you already run statsd, this output will be useful.
Codecs: codecs are stream filters that can be attached to an input or output. They make it easy to separate the transport of your messages from their serialization. Popular codecs include json, msgpack, and plain (text).
json: encodes/decodes event data in JSON format
multiline: merges multiple lines into a single event, e.g. a Java exception and its stack trace.
For the full set of options, see the "plugin configuration" section of the Logstash documentation. A sketch combining all four concepts follows.
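The sketch below ties the four concepts together in one pipeline: a multiline codec attached to a file input, grok and drop filters, and two outputs (the application log path and field names are hypothetical examples):
input {
  file {
    path => "/var/log/app/*.log"    # hypothetical application logs
    codec => multiline {            # codec: merge indented stack-trace lines into one event
      pattern => "^\s"
      what => "previous"
    }
  }
}
filter {
  grok {                            # filter: give the raw line structure
    match => { "message" => "%{TIMESTAMP_ISO8601:ts} %{LOGLEVEL:level} %{GREEDYDATA:msg}" }
  }
  if [level] == "DEBUG" { drop { } }    # filter: end a debug event's life early
}
output {                            # outputs: the last stop in an event's life
  elasticsearch { host => "192.168.10.1" }
  file { path => "/var/log/logstash/archive-%{+yyyy.MM.dd}.log" }    # archive a copy to disk
}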
# The above should make Logstash's processing model clear; next, pair it with Kibana to browse the logs in a web page
# Kibana: Logstash already bundles Kibana under its vendor/kibana/ directory, but you can also download kibana-3.1.2.zip and unpack it yourself
# unzip kibana-3.1.2.zip -d /var/www/logstash/kibana
# ln -s /var/www/logstash/kibana/kibana-3.1.2 /var/www/logstash/kibana/kibana
# vim /var/www/logstash/kibana/kibana/config.js    # point Kibana at Elasticsearch
    /*
    elasticsearch: "http://"+window.location.hostname+":9200",
    */
    elasticsearch: "http://192.168.10.1:9200",
# vim /etc/httpd/conf.d/kibana.conf
<VirtualHost *:80>
    DocumentRoot /var/www/logstash/kibana/kibana
    ServerName 192.168.10.1
    <Directory "/var/www/logstash/kibana/kibana">
        Options FollowSymLinks
        AllowOverride None
        Order allow,deny
        Allow from all
        php_value max_execution_time 300
        php_value memory_limit 128M
        php_value post_max_size 16M
        php_value upload_max_filesize 2M
        php_value max_input_time 300
        php_value date.timezone Asia/Shanghai
    </Directory>
</VirtualHost>
# vim logstash.conf
input {
  file {
    type => "syslog"
    # path => [ "/var/log/*.log", "/var/log/messages", "/var/log/syslog" ]
    path => [ "/var/log/messages", "/var/log/syslog" ]
    sincedb_path => "/var/sincedb"
  }
  redis {
    host => "192.168.10.1"
    type => "redis-input"
    data_type => "list"
    key => "logstash"
  }
  syslog {
    type => "syslog"
    port => "5544"
  }
}
filter {
  grok {
    type => "syslog"
    match => [ "message", "%{SYSLOGBASE2}" ]
    add_tag => [ "syslog", "grokked" ]
  }
}
output {
  elasticsearch { host => "192.168.10.1" }
}
# service httpd restart
# vim /etc/redis/6379.conf
bind 192.168.10.1
# service redis_6379 restart
# ps aux | grep redis | grep -v grep
root  8340  0.1  0.7  40536  7448 ?  Ssl  07:23  0:00 /usr/local/bin/redis-server 192.168.10.1:6379
# vim /var/www/logstash/elasticsearch/config/elasticsearch.yml
http.cors.enabled: true    # add this line
# see https://github.com/elastic/kibana/issues/1637
# /var/www/logstash/elasticsearch/bin/elasticsearch -d /var/run/elasticsearch.pid    # restart Elasticsearch as well
# ./bin/logstash --configtest -f logstash.conf    # validate the config file
Configuration OK
# ./bin/logstash -v -f logstash.conf &
# With all services running, open http://192.168.10.1 in a browser to see Kibana's default page
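The redis input above is what makes this box an indexer: remote machines can push events into the same Redis list and this pipeline will drain it. A sketch of a matching shipper config for another host (the file paths are examples; data_type and key must mirror the redis input above):
# logstash-shipper.conf, run on the remote host
input {
  file {
    type => "syslog"
    path => [ "/var/log/messages", "/var/log/secure" ]
  }
}
output {
  redis {
    host => "192.168.10.1"    # the Redis broker installed earlier
    data_type => "list"
    key => "logstash"         # must match the key of the indexer's redis input
  }
}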
As logs are written, the data on the http://192.168.10.1/index.html#/dashboard/file/guided.json page updates along with them; the next step is to dig into Elasticsearch search itself.
Directory where Elasticsearch stores the indexed log data:
/var/www/logstash/elasticsearch/data/elasticsearch/nodes/0/indices
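To see the same indices from the API side rather than the filesystem, a quick check:
# ls /var/www/logstash/elasticsearch/data/elasticsearch/nodes/0/indices
# curl 'http://localhost:9200/_cat/indices?v'    # one logstash-YYYY.MM.DD index per day of logs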