Collecting and Analyzing Nginx Logs with ELK - Elasticsearch 6.3.1 + Filebeat 7.0 + Logstash 6.6.0 + Kibana 6.3.2

Filebeat has completely replaced Logstash-Forwarder as the new-generation log shipper. Because it is lightweight and secure, Filebeat collects the logs and writes them to a message queue through its output plugin. Logstash currently supports common message queues such as Kafka, Redis, and RabbitMQ. Logstash then pulls the data from the queue through the corresponding input plugin, parses and filters it, sends it to Elasticsearch through the output plugin, and finally the data is visualized in Kibana.
(Architecture diagram: Filebeat → message queue → Logstash → Elasticsearch → Kibana)

I won't cover installing Elasticsearch, Redis, and Nginx here; there are plenty of guides online covering every installation method.

Installing Kibana:

Here I installed Kibana as a Docker container started through Rancher.

Image: docker.elastic.co/kibana/kibana:6.3.2. Map host port 5601 and mount a kibana.yml configuration file with the following content:

server.port: 5601
server.name: kibana
server.host: "0"
elasticsearch.url: "http://192.168.1.149:9200"
xpack.monitoring.ui.container.elasticsearch.enabled: false
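
For reference, outside of Rancher the equivalent plain Docker command looks roughly like this (a sketch; the host path of kibana.yml is an assumption, the in-container path is the image default):

docker run -d --name kibana \
  -p 5601:5601 \
  -v /data/kibana/kibana.yml:/usr/share/kibana/config/kibana.yml \
  docker.elastic.co/kibana/kibana:6.3.2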

Start the container; if everything is fine, open http://192.168.1.149:5601 and you should see the Kibana welcome page.


Installing Logstash:

Logstash is also installed as a container started through Rancher.


Image: docker.elastic.co/logstash/logstash:6.6.0. No port mapping is needed, but the configuration files must be mounted.

logstash.yml:

http.host: "0.0.0.0"
xpack.monitoring.enabled: true
xpack.monitoring.elasticsearch.url: http://192.168.1.149:9200

The /data/logstash/pipeline directory contains a pipeline configuration file named logstash.conf with the following content:

input {
    redis {
        data_type => "list"
        key => "filebeat"
        host => "192.168.1.149"
        port => 6379
        db => 2
        threads => 1
    }
}


filter {
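     # nginx writes non-printable bytes as \xNN escapes, which is not valid JSON;
     # double the backslash so the json filter below can parse the message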
     mutate {
        gsub => ["message", "\\x", "\\\\x"]
     }
     json {
        source => "message"
     }
     if [http_user_agent] != "-" {
      useragent {
         target => "ua"
         source => "http_user_agent"
       }
     }
     geoip {
        # use the client-IP field produced by the nginx JSON log format defined below
        source => "remote_addr"
        database => "/usr/etc/GeoLite2-City.mmdb"
        remove_field => ["[geoip][latitude]", "[geoip][longitude]", "[geoip][country_code]", "[geoip][country_code2]", "[geoip][country_code3]", "[geoip][timezone]", "[geoip][continent_code]", "[geoip][region_code]", "[geoip][ip]"]
        #add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
        #add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
        target => "geoip"
         }
    mutate {
        convert => { 
                     "bytes_sent" => "integer"
                     "body_bytes_sent" => "integer"
                     "content_length" => "integer"
                     "request_length" => "integer"
                     "request_time" => "float"
                   }
        rename => { "[host][name]" => "host" }
    }
}

output {
    if [fields][service] == "nginx_access" {
        elasticsearch {
            hosts => ["192.168.1.149:9200"]
            index => "logstash-nginx_access-%{+yyyy.MM.dd}"
        }
    }
}
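
For reference, a plain docker run roughly equivalent to the Rancher setup (a sketch; the host paths are assumptions, the in-container paths are the image defaults plus the GeoLite2 path used in the pipeline above):

docker run -d --name logstash \
  -v /data/logstash/logstash.yml:/usr/share/logstash/config/logstash.yml \
  -v /data/logstash/pipeline:/usr/share/logstash/pipeline \
  -v /data/logstash/GeoLite2-City.mmdb:/usr/etc/GeoLite2-City.mmdb \
  docker.elastic.co/logstash/logstash:6.6.0

Before starting it for real, the pipeline syntax can be checked with logstash -t -f /usr/share/logstash/pipeline/logstash.conf inside the container (-t is short for --config.test_and_exit).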

 

There is one more file, GeoLite2-City.mmdb, which Logstash uses to resolve an IP address to city-level location information. Download it from: https://dev.maxmind.com/geoip/geoip2/geolite2/

After downloading it, put it on the host and mount it into the container. The database needs to be refreshed every so often, but I could not get MaxMind's automatic update tool to work; if anyone knows how to fix this, please share.

For the official update procedure, see: https://dev.maxmind.com/geoip/geoipupdate/
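
In case it helps, here is a rough sketch of a periodic refresh using the official geoipupdate tool. It assumes geoipupdate is installed and /etc/GeoIP.conf is configured for GeoLite2-City per the MaxMind docs, and that the database is mounted from /data/logstash/GeoLite2-City.mmdb into a container named logstash (both assumptions carried over from the sketch above):

# crontab entry: refresh weekly, copy the new database to the host path mapped into the
# Logstash container, then restart the container so the geoip filter picks it up
# (geoipupdate's output directory may differ by distro; /usr/share/GeoIP is common)
0 3 * * 3 geoipupdate && cp /usr/share/GeoIP/GeoLite2-City.mmdb /data/logstash/GeoLite2-City.mmdb && docker restart logstash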

Next, configure Nginx to write its access log in JSON format:

    #log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
    #                  '$status $body_bytes_sent "$http_referer" '
    #                  '"$http_user_agent" "$http_x_forwarded_for"';
    log_format json '{ "time_local": "$time_local", '
                    '"time": "$time_iso8601", '
                    '"remote_addr": "$remote_addr", '
                    '"remote_user": "$remote_user", '
                    '"body_bytes_sent": "$body_bytes_sent", '
                    '"request_time": "$request_time", '
                    '"up_resp_time": "$upstream_response_time",'
                    '"status": "$status", '
                    '"host": "$host", '
                    '"request": "$request", '
                    '"request_method": "$request_method", '
                    '"uri": "$uri", '
                    '"http_referrer": "$http_referer", '
                    '"bytes_sent": "$bytes_sent",'
                    '"content_length": "$content_length",'
                    '"request_length": "$request_length",'
                    '"http_x_forwarded_for": "$http_x_forwarded_for", '
                    '"http_user_agent": "$http_user_agent" '
                    '}';
    access_log  /data/nginx/log/access.log  json;
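
A quick way to verify the new format took effect (the curl request is just an example against the local server):

nginx -t && nginx -s reload                                    # validate the config and reload nginx
curl -s http://localhost/ > /dev/null                          # generate one access-log entry
tail -n 1 /data/nginx/log/access.log | python -m json.tool     # should pretty-print valid JSON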

Finally, install Filebeat. See the official documentation: https://www.elastic.co/guide/en/beats/filebeat/7.0/setup-repositories.html


I installed it on CentOS 7. The configuration file is /etc/filebeat/filebeat.yml; configure it to watch the Nginx log file:

###################### Filebeat Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html

# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.

#=========================== Filebeat inputs =============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

- type: log

  # Change to true to enable this input configuration.
  enabled: true  # note: this must be changed to true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /data/nginx/log/access.log
    #- c:\programdata\elasticsearch\logs\*
  tags: ["nginx-access"]
  fields: 
    service: nginx_access
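  # "service: nginx_access" must match the [fields][service] condition in the Logstash
  # output section shown earlier; events without it are not written to Elasticsearch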
  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

  ### Multiline options

  # Multiline can be used for log messages spanning multiple lines. This is common
  # for Java Stack Traces or C-Line Continuation

  # The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
  #multiline.pattern: ^\[

  # Defines if the pattern set under pattern should be negated or not. Default is false.
  #multiline.negate: false

  # Match can be set to "after" or "before". It is used to define if lines should be append to a pattern
  # that was (not) matched before or after or as long as a pattern is not matched based on negate.
  # Note: After is the equivalent to previous and before is the equivalent to to next in Logstash
  #multiline.match: after


#============================= Filebeat modules ===============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

#==================== Elasticsearch template setting ==========================

#setup.template.settings:
  #index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false

#================================ General =====================================

# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:

# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]

# Optional fields that you can specify to add additional information to the
# output.
#fields:
#  env: staging


#============================== Dashboards =====================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false

# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:

#============================== Kibana =====================================

# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
#setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify and additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  #host: "localhost:5601"

  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:

#============================= Elastic Cloud ==================================

# These settings simplify using filebeat with the Elastic Cloud (https://cloud.elastic.co/).

# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:

# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:

#================================ Outputs =====================================
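# Send events to Redis instead of directly to Elasticsearch/Logstash; the key and db
# must match the redis input block in logstash.conf shown earlier.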
output.redis:
  hosts: ["192.168.1.149:6379"]
  key: "filebeat"
  db: 2
  timeout: 5
# Configure what output to use when sending the data collected by the beat.

#-------------------------- Elasticsearch output ------------------------------
#output.elasticsearch:
  # Array of hosts to connect to.
  #hosts: ["localhost:9200"]

  # Optional protocol and basic auth credentials.
  #protocol: "https"
  #username: "elastic"
  #password: "changeme"

#----------------------------- Logstash output --------------------------------
#output.logstash:
  # The Logstash hosts
  #hosts: ["localhost:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

#================================ Processors =====================================

# Configure processors to enhance or manipulate events generated by the beat.

processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~

#================================ Logging =====================================

# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug

# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publish", "service".
#logging.selectors: ["*"]

#============================== Xpack Monitoring ===============================
# filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster.  This requires xpack monitoring to be enabled in Elasticsearch.  The
# reporting is disabled by default.

# Set to true to enable the monitoring reporter.
#xpack.monitoring.enabled: false

# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well. Any setting that is not set is
# automatically inherited from the Elasticsearch output configuration, so if you
# have the Elasticsearch output configured, you can simply uncomment the
# following line.
#xpack.monitoring.elasticsearch:

#================================= Migration ==================================

# This allows to enable 6.7 migration aliases
#migration.6_to_7.enabled: true

Once configured, start Filebeat: systemctl start filebeat

If all goes well, go into Kibana's Management page, create an index pattern for logstash-nginx_access-*, and you should see the logs produced by Nginx in Discover.
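
If nothing shows up, each hop of the pipeline can be checked separately (using the addresses from this article):

redis-cli -h 192.168.1.149 -n 2 llen filebeat                              # events queued by Filebeat
curl 'http://192.168.1.149:9200/_cat/indices/logstash-nginx_access-*?v'    # indices written by Logstash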


Note that our production environment uses Alibaba Cloud Elasticsearch, which by default does not allow indices to be created automatically; you need to configure it so that logstash-* indices can be auto-created.
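
This can usually be changed in the Alibaba Cloud console; the equivalent Elasticsearch cluster setting looks like this (a sketch; substitute your own ES endpoint):

curl -X PUT 'http://<your-es-endpoint>:9200/_cluster/settings' -H 'Content-Type: application/json' -d '
{
  "persistent": {
    "action.auto_create_index": "+logstash-*"
  }
}'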

The next article will cover configuring dashboards and the various charts for these logs in Kibana.