Installing Elasticsearch and Apache on CentOS 7

Purpose

Whenever a failure occurred, we had no proper tool for analyzing the Apache logs and had to spend a lot of time investigating.
We wanted to see how Apache logs can be analyzed in an ELK environment, and to use this as a chance to learn the stack properly.
Real-time synchronization with the production servers is out of scope this time.
⇒ We import the production logs by hand and analyze them.

Environment

Item      Value
VM        VirtualBox 6.1
Host OS   Win10 Professional
Guest OS  CentOS Linux release 7.8.2003 (Core)
Guest IP  192.168.56.10

Setting up the environment

The following sites were used as references; I relied mainly on site 1.

    1. Graphing Apache logs with ELK (Elasticsearch+Logstash+Kibana)

    2. Elastic Stack 7: Installing Elasticsearch

Installing Apache

yum install -y httpd
systemctl enable httpd
systemctl start httpd
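If the freshly installed server has no real traffic yet, a few entries in Apache common log format can be fabricated so there is something to import later. Everything below (the file path, client IPs, timestamp, and URL) is made up for illustration:

```shell
#!/bin/sh
# Write a few fake entries in Apache common log format.
# The path, client IPs, timestamp, and URL are illustrative only.
LOG=/tmp/sample_access_log
: > "$LOG"
for ip in 192.168.56.1 192.168.56.2 192.168.56.3; do
  echo "$ip - - [16/May/2020:00:43:30 +0900] \"GET /index.html HTTP/1.1\" 200 4897" >> "$LOG"
done
cat "$LOG"
```

Lines in this format parse cleanly with the %{HTTPD_COMMONLOG} grok pattern used in the Logstash config later on.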

Installing ELK

cat > /etc/yum.repos.d/elasticsearch.repo <<EOF
[elasticsearch-7.x]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF 

yum install -y java-1.8.0-openjdk-devel
yum install -y elasticsearch kibana logstash
# Add a plugin so Elasticsearch can handle Japanese text
/usr/share/elasticsearch/bin/elasticsearch-plugin install analysis-kuromoji 

Configuring Elasticsearch

cd /etc/elasticsearch/
cp -p elasticsearch.yml elasticsearch.yml.org
vi elasticsearch.yml   
----------------------------------
#network.host: 192.168.0.1
network.host: 0.0.0.0
discovery.type: single-node   # (see note below)
----------------------------------
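The same two changes can also be scripted instead of edited by hand. The sketch below works on a scratch copy under /tmp so nothing real is touched; on the actual host you would point CONF at /etc/elasticsearch/elasticsearch.yml (after taking the backup shown above):

```shell
#!/bin/sh
# Script the network.host / discovery.type changes (sketch on a scratch copy).
CONF=/tmp/elasticsearch.yml.test
printf '#network.host: 192.168.0.1\n' > "$CONF"   # stand-in for the stock file
# Uncomment and override network.host, then append the single-node setting once.
sed -i 's|^#network.host:.*|network.host: 0.0.0.0|' "$CONF"
grep -q '^discovery.type:' "$CONF" || echo 'discovery.type: single-node' >> "$CONF"
cat "$CONF"
```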

Reference 1 does not mention it, but without this setting Elasticsearch fails with the error below. See the following site for details and add the setting:

Elasticsearch 7.0.0 がリリースされるもアップグレードでハマる

[1]: the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured

Configuring Kibana

cd /etc/kibana/
cp -p kibana.yml kibana.yml.org
vi kibana.yml
----------------------------------
#server.host: "localhost"
server.host: "0.0.0.0"

#i18n.locale: "en"
i18n.locale: "ja-JP"
----------------------------------

Setting i18n.locale to "ja-JP" switches Kibana to Japanese, but with that locale some Kibana pages failed with errors and could not be displayed.
(Not a real problem here, since this is just a learning exercise.)

Verifying startup

systemctl daemon-reload
# Confirm that Elasticsearch starts
systemctl restart elasticsearch
systemctl status elasticsearch

Enabling automatic startup

systemctl start kibana
systemctl start logstash
systemctl enable elasticsearch
systemctl enable kibana
systemctl enable logstash

Verifying connectivity

Service   URL
elastic   http://<server IP>:9200/
kibana    http://<server IP>:5601/

Trying to import the Apache logs

Reference site:

    https://knowledge.sakura.ad.jp/2736/

Creating the Logstash config file

# The config file can be placed anywhere; here it goes under /etc/logstash.
cd /etc/logstash
vi apache_import.conf
----------------------------------
input {
  stdin { }
}

filter {
  grok {
#     https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns
#     the default Apache access log uses the COMBINED format
#     match => { "message" => "%{COMBINEDAPACHELOG}" }
      match => { "message" => "%{HTTPD_COMMONLOG}" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
    locale => "en"
  }
  mutate {
    replace => { "type" => "apache_access" }
  }
}

output {
  # stdout { codec => rubydebug }
  # elasticsearch { host => '172.17.4.199' } (as written on the reference site; this causes an error)
  elasticsearch { hosts => '192.168.56.10' }
}
----------------------------------
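For intuition about what %{HTTPD_COMMONLOG} pulls out of each line, here is a rough plain-awk approximation on one sample entry (this is not Logstash; the field names just mirror the grok pattern's):

```shell
#!/bin/sh
# Rough approximation of the fields %{HTTPD_COMMONLOG} extracts from one line.
LINE='192.168.56.1 - - [16/May/2020:00:43:30 +0900] "GET /index.html HTTP/1.1" 200 4897'
echo "$LINE" | awk '{
  gsub(/[]["]/, "")                # strip brackets and quotes; $0 re-splits
  print "clientip:  " $1
  print "timestamp: " $4 " " $5
  print "verb:      " $6
  print "request:   " $7
  print "response:  " $9
  print "bytes:     " $10
}'
```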

Importing the data into Elasticsearch

[root@centos7 logstash]# /usr/share/logstash/bin/logstash --path.settings /etc/logstash -f apache_import.conf < /etc/httpd/logs/access_log 
Sending Logstash logs to /var/log/logstash which is now configured via log4j2.properties
[2020-05-16T00:43:30,750][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-05-16T00:43:30,878][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.7.0"}
[2020-05-16T00:43:33,376][INFO ][org.reflections.Reflections] Reflections took 52 ms to scan 1 urls, producing 21 keys and 41 values 
[2020-05-16T00:43:34,491][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://192.168.56.10:9200/]}}
[2020-05-16T00:43:34,917][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"http://192.168.56.10:9200/"}
[2020-05-16T00:43:36,135][INFO ][logstash.outputs.elasticsearch][main] ES Output version determined {:es_version=>7}
[2020-05-16T00:43:36,141][WARN ][logstash.outputs.elasticsearch][main] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>7}
[2020-05-16T00:43:37,592][INFO ][logstash.outputs.elasticsearch][main] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//192.168.56.10"]}
[2020-05-16T00:43:37,754][INFO ][logstash.outputs.elasticsearch][main] Using default mapping template
[2020-05-16T00:43:38,003][INFO ][logstash.outputs.elasticsearch][main] Attempting to install template {:manage_template=>{"index_patterns"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s", "number_of_shards"=>1, "index.lifecycle.name"=>"logstash-policy", "index.lifecycle.rollover_alias"=>"logstash"}, "mappings"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}
[2020-05-16T00:43:38,202][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.specialized.RubyArrayOneObject) has been created for key: cluster_uuids. This may result in invalid serialization.  It is recommended to log an issue to the responsible developer/development team.
[2020-05-16T00:43:38,207][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "pipeline.sources"=>["/etc/logstash/apache_import.conf"], :thread=>"#<Thread:0x23adbb83 run>"}
[2020-05-16T00:43:38,213][INFO ][logstash.outputs.elasticsearch][main] Installing elasticsearch template to _template/logstash
[2020-05-16T00:43:39,821][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2020-05-16T00:43:39,960][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2020-05-16T00:43:40,845][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2020-05-16T00:43:43,213][INFO ][logstash.outputs.elasticsearch][main] Creating rollover alias <logstash-{now/d}-000001>
[2020-05-16T00:43:46,277][INFO ][logstash.outputs.elasticsearch][main] Installing ILM policy {"policy"=>{"phases"=>{"hot"=>{"actions"=>{"rollover"=>{"max_size"=>"50gb", "max_age"=>"30d"}}}}}} to _ilm/policy/logstash-policy
[2020-05-16T00:43:56,232][INFO ][logstash.runner          ] Logstash shut down.

Modified version
Specify the input file in the conf file, and add geoip (enabling geoip makes the import take noticeably longer).
Specifying the input via the file plugin did not work for me (cause unknown). With the command-line redirection shown above, the import succeeds even with geoip enabled.

[root@localhost logstash]# cat /etc/logstash/apache_import.conf
input {
#  stdin { }
  file {
    path => "/root/logs/ssl_access_log.2020-05-13"   
  }
}

filter {
  grok {
#     https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns
#     match => { "message" => "%{COMBINEDAPACHELOG}" }
      match => { "message" => "%{HTTPD_COMMONLOG}" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
    locale => "en"
  }
  mutate {
    replace => { "type" => "apache_access" }
  }
  geoip {
    source => ["clientip"]
  }
}

output {
  # stdout { codec => rubydebug }
  # elasticsearch { host => '172.17.4.199' } (as written on the reference site; this causes an error)
  elasticsearch { hosts => '192.168.56.10' }
}

/usr/share/logstash/bin/logstash --path.settings /etc/logstash -f /etc/logstash/apache_import.conf
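A guess at why the file input did not work (an assumption, not verified here): the Logstash file plugin tails files from the end by default and remembers its read position in a sincedb file, so a pre-existing log may never be picked up. For one-off imports, settings like these are commonly added:

```
input {
  file {
    path => "/root/logs/ssl_access_log.2020-05-13"
    start_position => "beginning"   # read existing content instead of tailing
    sincedb_path => "/dev/null"     # do not persist the read position
  }
}
```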