Prometheus Sample Configuration

Overview

    • Building a k8s test environment
    • Prometheus sample configuration

Table of contents

    • Full table of contents

Environment

    • Rancher: v2.4.8
    • kubernetes(Client): v1.19.1
    • kubernetes(Server): v1.18.8
    • kube-prometheus-stack Chart: v9.4.3
    • kube-prometheus-stack App: v0.38.1

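Sample configuration summary

    1. Expose the Nginx metrics
    2. Add Nginx to the Prometheus target list
    3. Create a Grafana dashboard that displays the Nginx HTTP request count
    4. Create an alert based on the HTTP request count → forward the alert to an external service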

Nginx metrics configuration

    • Where to work: ClientPC
    • ngx_http_stub_status_module Configuration Page
    • https://nginx.org/en/docs/http/ngx_http_stub_status_module.html#stub_status
    • Add the following to Nginx's nginx.conf to enable ngx_http_stub_status_module

server {
    listen   8080;
    location /metrics {
        stub_status;
    }
}
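To show what this endpoint serves, here is a minimal sketch that parses a stub_status response with awk. The sample response is made-up data that only follows the layout documented for ngx_http_stub_status_module; it was not captured from this environment. sample_status and parse_requests are hypothetical helper names.

```shell
#!/bin/sh
# Illustrative stub_status response. The numbers are made up; only the
# layout follows the ngx_http_stub_status_module documentation.
sample_status() {
  cat <<'EOF'
Active connections: 2
server accepts handled requests
 16 16 31
Reading: 0 Writing: 1 Waiting: 1
EOF
}

# parse_requests (hypothetical helper): read a stub_status response on stdin
# and print the total request counter, i.e. the third number on the line
# that follows the "server accepts handled requests" header.
parse_requests() {
  awk '/^server accepts handled requests/ { getline; print $3 }'
}

sample_status | parse_requests
```

Inside the Nginx container the same helper could read the live endpoint, for example with curl -s http://localhost:8080/metrics | parse_requests, assuming curl is available there.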
    • nginx-prometheus-exporter Page
    • https://github.com/nginxinc/nginx-prometheus-exporter
    • Add nginx-prometheus-exporter as a sidecar so that Prometheus can collect the Nginx metrics
    • Create the manifest (test-nginx.yaml)

apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-configmap
  labels:
    app: nginx
data:
  nginx.conf: |2

    user  nginx;
    worker_processes  1;
    error_log  /var/log/nginx/error.log warn;
    pid        /var/run/nginx.pid;
    events {
        worker_connections  1024;
    }
    http {
        include       /etc/nginx/mime.types;
        default_type  application/octet-stream;
        log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                          '$status $body_bytes_sent "$http_referer" '
                          '"$http_user_agent" "$http_x_forwarded_for"';
        access_log  /var/log/nginx/access.log  main;

        server {
            listen   8080;
            location /metrics {
                stub_status;
            }
        }

        sendfile        on;
        keepalive_timeout  65;
        include /etc/nginx/conf.d/*.conf;
    }
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-svc
  labels:
    app: nginx
spec:
  selector:
    app: nginx
  ports:
  - name: nginx-http
    protocol: TCP
    port: 80
    targetPort: 80
  - name: nginx-exporter
    protocol: TCP
    port: 9113
    targetPort: 9113
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deploy
  labels:
    app: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.19.2
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
        volumeMounts:
        - name: nginx-conf
          mountPath: /etc/nginx/nginx.conf
          subPath: nginx.conf
      - name: nginx-exporter
        image: nginx/nginx-prometheus-exporter:0.8.0
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9113
        args:
          - -nginx.scrape-uri=http://localhost:8080/metrics
      volumes:
      - name: nginx-conf
        configMap:
          name: nginx-configmap
          items:
          - key: nginx.conf
            path: nginx.conf
    • Start Nginx
$ kubectl apply -f test-nginx.yaml
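
Once the Pod is running, the sidecar should serve the Nginx counters in the Prometheus text format on port 9113. The sketch below extracts one value from such output. The sample text and its values are illustrative, and sample_metrics and metric_value are hypothetical helper names.

```shell
#!/bin/sh
# Illustrative exporter output in the Prometheus text format.
# The values are made up for this sketch.
sample_metrics() {
  cat <<'EOF'
# TYPE nginx_http_requests_total counter
nginx_http_requests_total 31
# TYPE nginx_up gauge
nginx_up 1
EOF
}

# metric_value NAME (hypothetical helper): print the value of an unlabelled
# metric from Prometheus text-format input on stdin.
metric_value() {
  awk -v name="$1" '$1 == name { print $2 }'
}

sample_metrics | metric_value nginx_http_requests_total
sample_metrics | metric_value nginx_up
```

Against the running Pod, one way to feed it live data would be kubectl port-forward deploy/nginx-deploy 9113:9113 in one terminal and curl -s http://localhost:9113/metrics | metric_value nginx_http_requests_total in another, assuming curl is installed on the client.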

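Adding the Prometheus target

Targets are added with ServiceMonitor, a CRD that is installed together with Prometheus Operator.

    • Check serviceMonitorSelector
    • Add the value of serviceMonitorSelector.matchLabels to the labels of the ServiceMonitor
    • In this environment, add release: prometheus
    → Without this label, Prometheus does not pick up the ServiceMonitor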
$ kubectl get prometheus -n monitoring prometheus-kube-prometheus-prometheus -o yaml
..........
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector:
    matchLabels:
      release: prometheus
..........
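
The matchLabels rule can be sketched as follows: an object is selected only when every key/value pair of the selector appears among its labels. This is a simplified illustration of the matching logic, not the operator's code. labels_match is a hypothetical helper, and both arguments are written as comma-separated key=value lists.

```shell
#!/bin/sh
# labels_match SELECTOR LABELS (hypothetical helper)
# Succeeds when every key=value pair in SELECTOR is present in LABELS.
# An empty SELECTOR matches everything, like "{}" in the output above.
labels_match() {
  selector=$1
  labels=",$2,"          # wrap in commas so only whole pairs can match
  old_ifs=$IFS
  IFS=,
  for pair in $selector; do
    case "$labels" in
      *",$pair,"*) ;;                       # pair found, check the next one
      *) IFS=$old_ifs; return 1 ;;          # pair missing, not selected
    esac
  done
  IFS=$old_ifs
  return 0
}

if labels_match "release=prometheus" "app=nginx,release=prometheus"; then
  echo "selected"
fi
if ! labels_match "release=prometheus" "app=nginx"; then
  echo "ignored"
fi
```

On the cluster, kubectl get servicemonitor -n monitoring -l release=prometheus lists the ServiceMonitors that carry the label.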
    • Create the manifest (nginx-servicemonitor.yaml)
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: nginx-servicemonitor
  namespace: monitoring
  labels:
    app: nginx
    release: prometheus # label confirmed in serviceMonitorSelector
spec:
  endpoints:
  - port: nginx-exporter
  namespaceSelector:
    matchNames:
    - default
  selector:
    matchLabels:
      app: nginx # label of the Service
    • Create the ServiceMonitor
$ kubectl apply -f nginx-servicemonitor.yaml

## Verify ##
$ kubectl get servicemonitor -n monitoring
NAME                             AGE
..........
nginx-servicemonitor             64s
..........

Creating the Grafana dashboard

Dashboard variables:

    Name       Data source  Refresh            Query
    cluster    Prometheus   On Dashboard Load  label_values(kube_pod_info, cluster)
    namespace  Prometheus   On Dashboard Load  label_values(kube_pod_info{cluster="$cluster"}, namespace)

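Alert configuration

Alert rules are added with PrometheusRule, a CRD that is installed together with Prometheus Operator.

    • Check ruleSelector
    • Add the values of ruleSelector.matchLabels to the labels of the PrometheusRule
    • In this environment, add app: kube-prometheus-stack and release: prometheus
    → Without these labels, Prometheus does not pick up the PrometheusRule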
$ kubectl get prometheus -n monitoring prometheus-kube-prometheus-prometheus -o yaml
..........
  ruleNamespaceSelector: {}
  ruleSelector:
    matchLabels:
      app: kube-prometheus-stack
      release: prometheus
..........
    • Create the manifest (nginx-http-request-rule.yaml)

Rule to create: raise an alert when the HTTP request count exceeds 20 within one minute

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    app: kube-prometheus-stack # label confirmed in ruleSelector
    release: prometheus # label confirmed in ruleSelector
  name: http-request-rule.rules
  namespace: monitoring
spec:
  groups:
  - name: nginx-http-request.rules
    rules:
    - alert: NginxTooManyRequest
      annotations:
        message: HTTP Request to {{ $labels.namespace }}/{{ $labels.pod }} is over 20 / 1 minute.
      expr: sum(increase(nginx_http_requests_total{job="nginx-svc", namespace=~".*"}[1m])) by (namespace, pod) > 20
      for: 1m
      labels:
        severity: critical
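
The condition in expr can be illustrated with a small calculation over counter samples. This is a simplified sketch: Prometheus' increase() also extrapolates to the window boundaries and handles counter resets, which this sketch does not do. The sample data is made up, and sample_window and window_increase are hypothetical helper names.

```shell
#!/bin/sh
# Illustrative counter samples: "<seconds> <nginx_http_requests_total>".
# Made-up data covering a one-minute window.
sample_window() {
  cat <<'EOF'
0 100
15 104
30 112
45 120
60 125
EOF
}

# window_increase (hypothetical helper): last value minus first value.
# Simplification: no extrapolation and no counter-reset handling.
window_increase() {
  awk 'NR == 1 { first = $2 } { last = $2 } END { print last - first }'
}

increase=$(sample_window | window_increase)
if [ "$increase" -gt 20 ]; then
  echo "condition true: increase=$increase"
else
  echo "condition false: increase=$increase"
fi
```

Because of for: 1m, the alert stays pending while the condition is true and fires only after the condition has held for one minute.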
    • Create the PrometheusRule
$ kubectl apply -f nginx-http-request-rule.yaml

## Verify ##
$ kubectl get prometheusrule -n monitoring
NAME                                AGE
..........
http-request-rule.rules             27m
..........

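Forwarding alerts to an external service

Forward the alerts to Slack. The Alertmanager configuration file (alertmanager.yaml) is as follows.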
global:
  resolve_timeout: 1m
  slack_api_url: 'https://hooks.slack.com/services/xxxxx..............................'
route:
  receiver: 'slack_notifications'
  group_interval: 5m
  group_wait: 30s
  repeat_interval: 12h
  routes:
  - match:
      alertname: 'NginxTooManyRequest'
    receiver: 'nginx_request_count_error'
receivers:
- name: 'slack_notifications'
- name: 'nginx_request_count_error'
  slack_configs:
  - channel: '#it-test'
    send_resolved: true
    icon_url: https://avatars3.githubusercontent.com/u/3380462
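
The routing tree in this configuration can be read as a simple decision: an alert whose alertname is NginxTooManyRequest goes to the child route's receiver, and everything else falls back to the root receiver. The sketch below mirrors only that decision and is not how Alertmanager is implemented. pick_receiver is a hypothetical helper, and KubePodCrashLooping is used purely as an example of another alert name.

```shell
#!/bin/sh
# pick_receiver ALERTNAME (hypothetical helper): mirror the routing tree
# of the configuration above.
pick_receiver() {
  case "$1" in
    NginxTooManyRequest) echo "nginx_request_count_error" ;;  # child route
    *)                   echo "slack_notifications" ;;        # root receiver
  esac
}

pick_receiver NginxTooManyRequest
pick_receiver KubePodCrashLooping
```

Note that slack_notifications is defined without slack_configs, so with this configuration only NginxTooManyRequest is actually posted to Slack.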
    • Turn the configuration file into a Secret
    • Check the base64 value of the configuration file above
$ kubectl create secret generic test --from-file=./alertmanager.yaml --dry-run=client -o yaml
apiVersion: v1
data:
  alertmanager.yaml: {base64 value}
kind: Secret
metadata:
  creationTimestamp: null
  name: test
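
The value under data is simply the base64 encoding of the file contents. As a sketch, the same value can be produced without kubectl, assuming the GNU coreutils base64 command. encode_file is a hypothetical helper, and the demo uses a throwaway file in place of alertmanager.yaml.

```shell
#!/bin/sh
# encode_file PATH (hypothetical helper): print the file's base64 encoding
# on a single line, the form used under "data:" in a Secret manifest.
encode_file() {
  base64 < "$1" | tr -d '\n'
}

# Demo with a throwaway file standing in for alertmanager.yaml.
tmp=$(mktemp)
printf 'route:\n  receiver: slack_notifications\n' > "$tmp"
encoded=$(encode_file "$tmp")
echo "$encoded"

# Round trip: decoding must give back the original bytes.
printf '%s' "$encoded" | base64 -d | cmp -s - "$tmp" && echo "round trip ok"
rm -f "$tmp"
```

With the real file, encode_file ./alertmanager.yaml would print the string to paste in place of {base64 value}.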
    • Create the manifest for updating the Secret
    • Create the update manifest (alertmanager-slack.yaml) using the base64 value above
apiVersion: v1
kind: Secret
metadata:
  labels:
    release: prometheus
  name: alertmanager-prometheus-kube-prometheus-alertmanager
  namespace: monitoring
type: Opaque
data:
  alertmanager.yaml: {base64 value}
    • Update the Secret
## Check the Secret ##
$ kubectl get secret -n monitoring
NAME                                                          TYPE              DATA   AGE
..........
alertmanager-prometheus-kube-prometheus-alertmanager          Opaque            1      3d10h
..........

## Update ##
$ kubectl apply -f alertmanager-slack.yaml