ElasticSearch/td-agent installation and integration example on CentOS 6.5

3 years ago

文, 翔

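Introduction

I work with ES and td-agent regularly and also need a test environment on my local machine, so these are my notes from setting one up (CentOS 6.5 on Vagrant).
Using the local Apache log as the example, I get td-agent and ES working together.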

Environment

td-agent 0.10.55

Assumptions

The virtual machine is in its initial state.
All commands are run as root.

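Procedure notes

1. Install the development tools

Installing the Fluentd plugins later fails without these, so install them now. This pulls in GCC and many other tools.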

yum groupinstall "Development Tools"
yum install libcurl-devel # it did not work without this

2. Install the JDK

I use Oracle's jdk-7u71. I downloaded it on the Mac and copied it to the Vagrant guest through the shared folder.

rpm -ivh jdk-7u71-linux-x64.rpm

$ java -version
java version "1.7.0_71"

3. Install Apache

yum install httpd
chkconfig httpd on
chkconfig --list | grep httpd
service httpd start

4. Install ElasticSearch

For work reasons, I download the rpm from the official ES site and install version 1.3.6.

wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.3.6.noarch.rpm
rpm -ivh elasticsearch-1.3.6.noarch.rpm
chkconfig --add elasticsearch
chkconfig elasticsearch on
service elasticsearch start
tail -f /var/log/elasticsearch/elasticsearch.log

If the log shows that it started, you are good.
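
Elasticsearch can take a few seconds before it answers HTTP requests, so besides tailing the log it can help to poll the HTTP port. The following is a minimal sketch and not part of the original procedure: the `wait_for` helper and its retry counts are hypothetical, and it assumes the default HTTP port 9200 on localhost.

```shell
# wait_for <tries> <delay-seconds> <command...>
# Runs the command until it succeeds or the tries are used up.
wait_for() {
  wf_tries=$1
  wf_delay=$2
  shift 2
  wf_i=0
  while [ "$wf_i" -lt "$wf_tries" ]; do
    if "$@" > /dev/null 2>&1; then
      return 0
    fi
    wf_i=$((wf_i + 1))
    sleep "$wf_delay"
  done
  return 1
}

# Example on the VM (assumes the default HTTP port 9200):
#   wait_for 30 2 curl -sf http://localhost:9200/ && echo "ES is up"
```

Passing the check as a command keeps the helper reusable for other services, such as Apache on port 80.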

5. Install the ElasticSearch plugins

cd /usr/share/elasticsearch/

marvel

A plugin for management and monitoring.

./bin/plugin -install elasticsearch/marvel/latest

After installation, start Elasticsearch and the plugin is available at:
http://{Vagrant IP}:9200/_plugin/marvel/

head

A GUI plugin for operating Elasticsearch.

./bin/plugin -install mobz/elasticsearch-head

After installation, start Elasticsearch and the plugin is available at:
http://{Vagrant IP}:9200/_plugin/head/

kuromoji

A Japanese analysis plugin.
With ES 1.3, version 2.3.0 of the plugin appears to be required.
https://github.com/elasticsearch/elasticsearch-analysis-kuromoji

./bin/plugin -install elasticsearch/elasticsearch-analysis-kuromoji/2.3.0

inquisitor

For debugging queries and the like.

./bin/plugin -install polyfractal/elasticsearch-inquisitor

After installation, start Elasticsearch and the plugin is available at:
http://{Vagrant IP}:9200/_plugin/inquisitor/#/

6. Install Kibana

Kibana is set up to be served through Apache rather than as an Elasticsearch plugin.

curl -sL https://download.elasticsearch.org/kibana/kibana/kibana-3.1.2.tar.gz | sudo tar zxf - -C /var/www/html
mv /var/www/html/kibana-3.1.2 /var/www/html/kibana

If Kibana shows up at http://{Vagrant IP}/kibana/index.html, it is working.

7. Initial settings before installing td-agent

The official site says some OS parameters need to be changed before installation, so do that first. See: http://docs.fluentd.org/ja/articles/before-install

# Add the following (to /etc/security/limits.conf, as in the guide above)
root soft nofile 65536
root hard nofile 65536
* soft nofile 65536
* hard nofile 65536

# Add the following (to /etc/sysctl.conf, as in the guide above)
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 10240    65535

Reboot the server.
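
After the reboot it is worth confirming that the new limits are actually in effect. The following is a minimal sketch assuming the values listed above; `nofile_at_least` is a hypothetical helper name, not something provided by td-agent.

```shell
# Succeeds when the current shell's open-file limit is at least $1.
nofile_at_least() {
  nf_current=$(ulimit -n)
  [ "$nf_current" = "unlimited" ] || [ "$nf_current" -ge "$1" ]
}

# Example on the VM, in a fresh login shell:
#   nofile_at_least 65536 && echo "nofile OK"
#   sysctl net.ipv4.tcp_tw_reuse net.ipv4.ip_local_port_range
```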

8. Install td-agent

Install it by running the sh script described in the official guide:
http://docs.fluentd.org/ja/articles/install-by-rpm

curl -L http://toolbelt.treasuredata.com/sh/install-redhat.sh | sh
chkconfig td-agent on
chkconfig --list | grep td-agent
service td-agent start
tail -f /var/log/td-agent/td-agent.log

It is fine as long as it starts normally.

9. Install the td-agent plugins

Install the required plugins along with some handy ones.

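fluent-plugin-rewrite-tag-filter

Rewrites the tag based on the contents of the record.

fluent-plugin-record-reformer

Converts tag information into record fields.

fluent-plugin-rewrite

Rewrites values based on the contents of the record (for example, dropping records or replacing values).

fluent-plugin-parser

Parses a record field with a regular expression and stores the result as fields, or changes the format.

fluent-plugin-tail-ex

An extension of the in_tail plugin. It allows the tailed path to be specified with wildcards.

fluent-plugin-typecast

Converts data types.

fluent-plugin-forest

Automatically creates an output plugin instance for each tag.

fluent-plugin-filter

Filters records with allow/deny rules.

fluent-plugin-multiprocess

Runs Fluentd as multiple processes.

fluent-plugin-elasticsearch

Sends data to Elasticsearch.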

cd /usr/lib64/fluent/ruby/bin/
./fluent-gem install fluent-plugin-rewrite-tag-filter --no-ri --no-rdoc -V
./fluent-gem install fluent-plugin-rewrite --no-ri --no-rdoc -V
./fluent-gem install fluent-plugin-elasticsearch --no-ri --no-rdoc -V
./fluent-gem install fluent-plugin-parser --no-ri --no-rdoc -V
./fluent-gem install fluent-plugin-tail-ex --no-ri --no-rdoc -V
./fluent-gem install fluent-plugin-record-reformer --no-ri --no-rdoc -V
./fluent-gem install fluent-plugin-typecast --no-ri --no-rdoc -V
./fluent-gem install fluent-plugin-forest --no-ri --no-rdoc -V
./fluent-gem install fluent-plugin-filter --no-ri --no-rdoc -V
./fluent-gem install fluent-plugin-multiprocess --no-ri --no-rdoc -V
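
With ten installs in a row, a failed one is easy to miss in the verbose output. The following is a minimal sketch for checking the result afterwards; `missing_plugins` is a hypothetical helper, and it assumes `fluent-gem list` prints one `name (version)` entry per line, as RubyGems normally does.

```shell
# Reads `fluent-gem list` output on stdin and prints each plugin name
# given as an argument that does not appear in it.
missing_plugins() {
  mp_installed=$(cat)
  for mp_name in "$@"; do
    # Match "name " at line start so that fluent-plugin-rewrite does not
    # match fluent-plugin-rewrite-tag-filter.
    printf '%s\n' "$mp_installed" | grep -q "^$mp_name " || printf '%s\n' "$mp_name"
  done
}

# Example on the VM (prints nothing when both are installed):
#   ./fluent-gem list | missing_plugins fluent-plugin-parser fluent-plugin-forest
```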

10. Integrating ElasticSearch and td-agent

I set up a configuration that collects the local Apache log.

Allow the user that runs td-agent to read the Apache logs.

chown -R td-agent:td-agent /var/log/httpd/

Creating the ES template

Defining every field by hand is tedious, so a dynamic template is used and everything is defined as a string.

curl -XPUT 'http://localhost:9200/_template/itopantest01' -d '
{
  "template": "itopantest01-*",
  "mappings": {
    "_default_": {
      "dynamic_templates": [{
        "string_with_raw": {
          "match": "*",
          "match_mapping_type": "string",
          "mapping": {
            "type": "string",
            "fields": {
              "raw": {
                "type": "string",
                "index": "not_analyzed" 
              }
            }
          }
        }
      }]
    }
  }
}
'

Confirm that {"acknowledged":true} is returned.
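
The template only applies to indices whose names match `itopantest01-*`. With `logstash_format true`, fluent-plugin-elasticsearch writes to daily indices named `<logstash_prefix>-YYYY.MM.DD`, so the prefix used in the td-agent configuration has to line up with that pattern. The following is a small sketch of that relationship; both helper names are hypothetical and the date is an arbitrary example.

```shell
# Builds the daily index name for a prefix and an ISO date (YYYY-MM-DD).
index_for_date() {
  printf '%s-%s\n' "$1" "$(printf '%s' "$2" | tr '-' '.')"
}

# Succeeds when the index name $1 matches the template pattern $2.
# Shell glob matching is used here as a stand-in for the template wildcard.
matches_template() {
  case "$1" in
    $2) return 0 ;;
    *)  return 1 ;;
  esac
}

# Example:
#   matches_template "$(index_for_date itopantest01 2015-01-15)" 'itopantest01-*' && echo "template applies"
# The registered template itself can be inspected on the VM with:
#   curl -s 'http://localhost:9200/_template/itopantest01?pretty'
```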

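td-agent configuration

Records are registered with their fields split apart so that each one can be searched on its own, for example:

client
path
method
size
context

The apache2 format alone does not yield the URL query string or the context, so extra parsing stages are added in between to extract them.
The configuration file below also puts the plugins installed earlier to use.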

# Input
# Tail the local access log
<source>
  type tail
  path /var/log/httpd/access_log
  format apache2
  pos_file /var/log/td-agent/itopantest01_access_log.pos
  tag raw.apache.itopantest01.access
</source>

# Split the query string off the path field of the access log
#  raw.apache.itopantest01.access
#    -> separated.apache.itopantest01.access
<match raw.**>
  type parser
  reserve_data yes
  key_name path
  format /^(?<path>[^?]+)(\?(?<query>.+))?$/
  tag separated.apache.itopantest01.access
</match>

# Extract the context from the path field of the access log
#
# separated.apache.itopantest01.access.**
#   -> separated2.apache.itopantest01.access.**
<match separated.**>
  type parser
  reserve_data yes
  key_name path
  format /^(?<context>/[^/]*).*$/
  tag separated2.apache.itopantest01.access
</match>

# Cast the size field to integer
#
# separated2.apache.itopantest01.access.**
#   -> casted.apache.itopantest01.access.**
#
<match separated2.**>
   type typecast
   item_types size:integer
   tag casted.apache.itopantest01.access
</match>

# Add the host name.
# The field that the apache2 format parsed as host is actually
# the client IP, so it is moved to the client field.
# casted.apache.itopantest01.access.**
#   -> toelastic.apache.itopantest01.access.**
#
<match casted.**>
  type record_reformer
  renew_record false
  enable_ruby false
  <record>
    client ${host}
    host ${hostname}
  </record>
  tag toelastic.apache.itopantest01.access
</match>

#
# Index the records into elasticsearch
#
<match toelastic.apache.itopantest01.access.**>
  type elasticsearch
  host localhost
  port 9200
  type_name itopantest01 # type name; anything works
  include_tag_key true
  logstash_format true
  logstash_template itopantest01 # the template name specified earlier
  logstash_prefix itopantest01 # the index name is set here
  buffer_type file
  buffer_path /var/log/td-agent/tmp/out_elasticsearch.*.buffer
  buffer_chunk_limit 8m
  buffer_queue_limit 256
  flush_interval 10s
  retry_limit 17
  retry_wait 1s
</match>
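
To see what the two parser stages in the configuration above pull out of a request path, the same splitting can be mimicked in plain shell. This is only an illustration: it uses parameter expansion instead of the regular expressions in the configuration, and the function names are hypothetical.

```shell
# Mimics /^(?<path>[^?]+)(\?(?<query>.+))?$/ : everything before the
# first "?" is the path, everything after it is the query.
split_path_query() {
  sp_path=${1%%[?]*}
  case "$1" in
    *[?]*) sp_query=${1#*[?]} ;;
    *)     sp_query= ;;
  esac
  printf 'path=%s query=%s\n' "$sp_path" "$sp_query"
}

# Mimics /^(?<context>/[^/]*).*$/ : the context is the first path segment.
context_of() {
  co_rest=${1#/}
  printf '/%s\n' "${co_rest%%/*}"
}

# Example:
#   split_path_query '/kibana/index.html?foo=bar'   # path=/kibana/index.html query=foo=bar
#   context_of '/kibana/index.html'                 # /kibana
```

Because `reserve_data yes` is set on both stages, the extracted `query` and `context` fields are added alongside the fields that the apache2 format already produced.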

Restart td-agent.

Verification

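Watch the td-agent log and browse Kibana locally a few times to generate access log entries. If the index shows up in the Overview or Browser tab of the head plugin, the setup works.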
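
The same check can be made from the command line. The following is a sketch assuming the settings used in this post (`logstash_prefix itopantest01`, `flush_interval 10s`) and assuming the daily index is named with the UTC date, which I believe is the plugin's default. The helper names and the `ES_URL` variable are hypothetical.

```shell
ES_URL="${ES_URL:-http://localhost:9200}"

# Name of today's index for a prefix, using the UTC date.
todays_index() {
  date -u "+$1-%Y.%m.%d"
}

# URL of the _count API for today's index.
count_url() {
  printf '%s/%s/_count\n' "$ES_URL" "$(todays_index "$1")"
}

# Example on the VM: hit Apache once, wait past flush_interval (10s),
# then ask how many documents today's index holds.
#   curl -s http://localhost/kibana/index.html > /dev/null
#   sleep 15
#   curl -s "$(count_url itopantest01)"
```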

From there, just point Kibana at the new index to turn the data into graphs, and a fun world opens up.
