Introducing Prometheus Operator, which makes running Prometheus on Kubernetes easier
Introduction
Hello, I'm 烏龜貓.
This article is the day-13 entry of the Kubernetes Advent Calendar 2019.
It's a Kubernetes Advent Calendar, but about all I can do is kubectl get pod, so I'd like to talk about Prometheus instead.
That said, just talking about Prometheus on its own isn't very interesting, so this time I'd like to recommend Prometheus Operator, the best way I know to run Prometheus on Kubernetes.
What is Prometheus Operator?
As the name suggests, Prometheus Operator is an Operator for Prometheus.
coreos/prometheus-operator – github.com
If you don't know what an Operator is, don't worry. I didn't know what an Operator was either.
Fortunately, there are plenty of resources out there on the internet. For example, this one:
How do Operators change things? Database operations going forward / cndt2019_k8s_operator – Speaker Deck
These are excellent slides by @yukirii of CyberAgent. They're very easy to follow, and I like them a lot. Read them and you'll have Operators completely figured out.
So, what is Prometheus Operator?
OK. Now that you've read the slides above and (hopefully) understood Operators, let's get to Prometheus Operator.
Prometheus Operator was released by CoreOS in 2016. It provides a number of features for running Prometheus on Kubernetes, such as the following.
- Automatically creates Prometheus and Alertmanager instances and handles things like redundancy for you
- Lets you define the Prometheus version, data persistence, replica count, and so on in a simple, Kubernetes-like way
- With ServiceMonitor, you can add scrape targets for Pods and Services simply by defining labels
There are a few other features, but those are the main ones.
ServiceMonitor in particular is very powerful: it simplifies the scrape config, one of the fiddlier parts of Prometheus.
And by funneling everything through ServiceMonitors, the scrape config, which easily becomes messy, stays in one consistent, simple format.
Prometheus Operator architecture

When you deploy Prometheus Operator, it deploys as many Prometheus servers as the number of replicas you configure, and deploys Alertmanager as well. The Operator also defines the ServiceMonitor CRD, and Endpoints matching the labels defined in a ServiceMonitor are scraped automatically.
Prometheus Operator custom resources
Prometheus Operator defines the following custom resources. I won't cover every one in detail, but briefly:
- Prometheus
- ServiceMonitor
- PodMonitor
- Alertmanager
- PrometheusRule
I've honestly never used PodMonitor, so I can't say much about it, but the other four are roughly as follows.
Prometheus
This defines a Prometheus server.
Normally you would write the Prometheus configuration in prometheus.yml, but here it is defined as a Kubernetes custom resource. An example looks like this.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  labels:
    prometheus: k8s
  name: k8s
  namespace: monitoring
spec:
  alerting:
    alertmanagers:
    - name: alertmanager-main
      namespace: monitoring
      port: web
  baseImage: quay.io/prometheus/prometheus
  nodeSelector:
    kubernetes.io/os: linux
  podMonitorNamespaceSelector: {}
  podMonitorSelector: {}
  replicas: 2
  retention: 7d
  resources:
    requests:
      memory: 400Mi
  ruleSelector:
    matchLabels:
      prometheus: k8s
      role: alert-rules
  remoteWrite: {}
  remoteRead: {}
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: prometheus-k8s
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector: {}
  version: v2.11.0
replicas is set to 2, so two Prometheus instances are deployed automatically. Change it to 3 and three are deployed. Simple, right?
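Incidentally, because replicas is just a field on the custom resource, you don't even have to edit the file; here is a sketch using kubectl patch, assuming the resource above, named k8s in the monitoring namespace:

> kubectl -n monitoring patch prometheus k8s --type merge -p '{"spec":{"replicas":3}}'

The Operator picks up the change and scales the underlying StatefulSet (statefulset.apps/prometheus-k8s in the output further below) accordingly.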
Fields such as retention and remoteWrite/remoteRead correspond to Prometheus's own configuration. For details, see the API reference:
https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#prometheusspec
Config reloader
Prometheus Operator doesn't just deploy Prometheus; it also reloads its configuration automatically.
If you look at the Prometheus Pod after deploying Prometheus Operator, you'll find a container called prometheus-config-reloader deployed as a sidecar.
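You can see this for yourself by listing the Pod's container names; a quick sketch (the exact container names can differ between kube-prometheus versions):

> kubectl -n monitoring get pod prometheus-k8s-0 -o jsonpath='{.spec.containers[*].name}'

On the version used here this should list something like prometheus, prometheus-config-reloader, and rules-configmap-reloader, which is also why the prometheus-k8s Pods show 3/3 containers later on.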
For details: https://github.com/coreos/prometheus-operator/tree/master/cmd/prometheus-config-reloader
This periodically reloads the kind: Prometheus settings and additionalScrapeConfigs (used to add scrape configs that can't be expressed as a ServiceMonitor) and applies them to the running Prometheus. Normally, applying a Prometheus configuration change means editing prometheus.yml and restarting the instance, but when Prometheus has many scrape targets, it takes a long time after a restart for all targets to come back Up, and a broken config can leave it unable to start at all (though reverting the config gets it going again). With GitOps-style workflows in particular, it's easy to end up with a ConfigMap that has been changed but a configuration that was never applied, because the Prometheus instance was never restarted; the config reloader avoids all of this.
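For reference, additionalScrapeConfigs points at a key in a Secret holding raw scrape config; a minimal sketch (the Secret name, key, and job below are made up for illustration, not from this article):

# prometheus-additional.yaml: a raw scrape config with no ServiceMonitor equivalent
- job_name: external-endpoint
  static_configs:
  - targets:
    - example.com:9100

# create the Secret from that file, e.g.
#   kubectl -n monitoring create secret generic additional-scrape-configs \
#     --from-file=prometheus-additional.yaml
# then reference it from the kind: Prometheus resource:
spec:
  additionalScrapeConfigs:
    name: additional-scrape-configs
    key: prometheus-additional.yaml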
ServiceMonitor
A ServiceMonitor looks at Service resources on Kubernetes, compares the labels on those resources with the labels configured in the ServiceMonitor, and automatically scrapes whatever matches.
Normally, adding a scrape target to Prometheus means adding it to scrape_configs one entry at a time. In particular, something as simple as "monitor the Pods labelled app: hoge-app" can't be written intuitively and needs some workarounds.
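To make that concrete, a hand-written scrape config for this kind of label-based selection might look roughly like the following sketch using kubernetes_sd_configs and relabeling (the job name and port name are illustrative):

scrape_configs:
- job_name: hoge-app
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  # keep only endpoints whose Service carries the label app: hoge-app
  - source_labels: [__meta_kubernetes_service_label_app]
    regex: hoge-app
    action: keep
  # keep only the port named "web"
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    regex: web
    action: keep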
With a ServiceMonitor, however, the same thing can be defined as simply as this.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: prometheus
  name: prometheus
  namespace: monitoring
spec:
  endpoints:
  - interval: 30s
    port: web
  selector:
    matchLabels:
      prometheus: k8s
This is the ServiceMonitor, deployed as part of Prometheus Operator, that monitors Prometheus itself. The part worth noting is this:
selector:
  matchLabels:
    prometheus: k8s
Just as when defining a Deployment in Kubernetes, you specify the target simply by putting the target's label key/value under matchLabels.
Note, however, that the labels to specify here are not the Pod's labels but the Service's... or, more precisely, the Endpoints' labels.
Deploying a Service resource automatically creates an Endpoints resource, but the reverse also works: even without a Service resource, this works as long as an Endpoints resource exists.
Incidentally, if you were monitoring Services through plain Prometheus config before, you would have set role: service.
With role: service, Prometheus refers to the Service resources on Kubernetes, so no matter how many Pods sit behind them, the number of targets equals the number of Service resources. For example, a Service matching 3 Pods yields only 1 target.
A ServiceMonitor, by contrast, refers to Endpoints, so a Service matching 3 Pods gives you 3 targets to monitor.
Also, because you can create a Service for arbitrary targets, for example by using something like Istio's VirtualService, a ServiceMonitor can be used even when you want Prometheus Operator to monitor external endpoints.
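Even without Istio, the Endpoints behaviour described above gives you another option: a selector-less Service plus a manually defined Endpoints object. A rough sketch for scraping an external target (all names, the port, and the IP are made up for illustration):

apiVersion: v1
kind: Service
metadata:
  name: external-db
  namespace: monitoring
  labels:
    app: external-db   # a ServiceMonitor can select on this label
spec:
  ports:
  - name: metrics
    port: 9187
---
apiVersion: v1
kind: Endpoints
metadata:
  name: external-db    # must match the Service name
  namespace: monitoring
  labels:
    app: external-db
subsets:
- addresses:
  - ip: 192.0.2.10     # the external target
  ports:
  - name: metrics
    port: 9187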
For other details about ServiceMonitor, see the API reference:
https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#servicemonitorspec
Alertmanager
This defines an Alertmanager.
It's mostly the same as kind: Prometheus. Here is an example Alertmanager manifest.
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  labels:
    alertmanager: main
  name: main
  namespace: monitoring
spec:
  baseImage: quay.io/prometheus/alertmanager
  externalUrl: 'alertmanager.example.com'
  nodeSelector:
    kubernetes.io/os: linux
  replicas: 3
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: alertmanager-main
  version: v0.18.0
Here too, just like Prometheus, the replicas are scaled automatically. What's quite different from Prometheus, though, is that the cluster configuration is also set up automatically.
Unlike Prometheus, you can't make Alertmanager redundant just by running several identical copies side by side.
You have to pass --cluster.peer to the Alertmanager binary and point it at the endpoints of the other Alertmanager instances.
https://github.com/prometheus/alertmanager#high-availability
On top of that, when deploying on Kubernetes, Pod IP addresses change all the time, so you normally need some tricks, such as creating a separate Service resource for each Alertmanager (for example AlertManager-Blue and AlertManager-Green).
With Prometheus Operator, though, all of this is configured automatically, which saves a great deal of that effort.
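To give a sense of what the Operator is doing for you, each hand-rolled Alertmanager replica would need startup flags along these lines (a sketch only; the peer DNS names here assume the alertmanager-operated headless Service that the Operator itself creates, and real generated manifests may differ):

# args for one Alertmanager replica when wiring up HA by hand
- --config.file=/etc/alertmanager/config/alertmanager.yaml
- --cluster.listen-address=0.0.0.0:9094
- --cluster.peer=alertmanager-main-0.alertmanager-operated.monitoring.svc:9094
- --cluster.peer=alertmanager-main-1.alertmanager-operated.monitoring.svc:9094
- --cluster.peer=alertmanager-main-2.alertmanager-operated.monitoring.svc:9094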
For details on Alertmanager, see the API reference:
https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#alertmanagerspec
PrometheusRule
PrometheusRule is for configuring Prometheus rules.
Besides the settings for Prometheus itself and the scrape config, Prometheus has a kind of configuration called rules.
You write them in PromQL; Prometheus evaluates the queries periodically and fires alerts through Alertmanager when they match.
Defining that configuration as a custom resource is what PrometheusRule is.
Writing one is not much different from writing the usual YAML, but you no longer need a separate ConfigMap.
An example PrometheusRule looks like this.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: prometheus-k8s-rules
  namespace: monitoring
spec:
  groups:
  - name: general.rules
    rules:
    - alert: TargetDown
      annotations:
        message: '{{ printf "%.4g" $value }}% of the {{ $labels.job }} targets in
          {{ $labels.namespace }} namespace are down.'
      expr: 100 * (count(up == 0) BY (job, namespace, service) / count(up) BY (job,
        namespace, service)) > 10
      for: 10m
      labels:
        severity: warning
  - name: node-network
    rules:
    - alert: NodeNetworkInterfaceFlapping
      annotations:
        message: Network interface "{{ $labels.device }}" changing it's up status
          often on node-exporter {{ $labels.namespace }}/{{ $labels.pod }}"
      expr: |
        changes(node_network_up{job="node-exporter",device!~"veth.+"}[2m]) > 2
      for: 2m
      labels:
        severity: warning
Partially excerpted from https://github.com/coreos/kube-prometheus/blob/master/manifests/prometheus-rules.yaml
For details on PrometheusRule, see the API reference:
https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#prometheusrulespec
Running Prometheus Operator
So how do you actually run Prometheus Operator?
You could write all the manifests from scratch yourself, but that would be fairly painful. CoreOS officially provides sample manifests, so let's use those.
The repository is coreos/kube-prometheus.
Clone it and just kubectl apply. Let's give it a try.
First, clone the repository.
git clone https://github.com/coreos/kube-prometheus
cd kube-prometheus
Next, apply everything under the manifests folder.
The manifests for Prometheus Operator itself live under ./manifests/setup, so be sure to pass the -R option so that subdirectories are included.
Running it creates a long list of resources; the Namespace is monitoring.
> kubectl apply -f manifests/ -R
alertmanager.monitoring.coreos.com/main created
secret/alertmanager-main created
service/alertmanager-main created
serviceaccount/alertmanager-main created
servicemonitor.monitoring.coreos.com/alertmanager created
secret/grafana-datasources created
configmap/grafana-dashboard-apiserver created
configmap/grafana-dashboard-cluster-total created
configmap/grafana-dashboard-controller-manager created
configmap/grafana-dashboard-k8s-resources-cluster created
configmap/grafana-dashboard-k8s-resources-namespace created
configmap/grafana-dashboard-k8s-resources-node created
configmap/grafana-dashboard-k8s-resources-pod created
configmap/grafana-dashboard-k8s-resources-workload created
configmap/grafana-dashboard-k8s-resources-workloads-namespace created
configmap/grafana-dashboard-kubelet created
configmap/grafana-dashboard-namespace-by-pod created
configmap/grafana-dashboard-namespace-by-workload created
configmap/grafana-dashboard-node-cluster-rsrc-use created
configmap/grafana-dashboard-node-rsrc-use created
configmap/grafana-dashboard-nodes created
configmap/grafana-dashboard-persistentvolumesusage created
configmap/grafana-dashboard-pod-total created
configmap/grafana-dashboard-pods created
configmap/grafana-dashboard-prometheus-remote-write created
configmap/grafana-dashboard-prometheus created
configmap/grafana-dashboard-proxy created
configmap/grafana-dashboard-scheduler created
configmap/grafana-dashboard-statefulset created
configmap/grafana-dashboard-workload-total created
configmap/grafana-dashboards created
deployment.apps/grafana created
service/grafana created
serviceaccount/grafana created
servicemonitor.monitoring.coreos.com/grafana created
clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
deployment.apps/kube-state-metrics created
role.rbac.authorization.k8s.io/kube-state-metrics created
rolebinding.rbac.authorization.k8s.io/kube-state-metrics created
service/kube-state-metrics created
serviceaccount/kube-state-metrics created
servicemonitor.monitoring.coreos.com/kube-state-metrics created
clusterrole.rbac.authorization.k8s.io/node-exporter created
clusterrolebinding.rbac.authorization.k8s.io/node-exporter created
daemonset.apps/node-exporter created
service/node-exporter created
serviceaccount/node-exporter created
servicemonitor.monitoring.coreos.com/node-exporter created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
clusterrole.rbac.authorization.k8s.io/prometheus-adapter created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-adapter created
clusterrolebinding.rbac.authorization.k8s.io/resource-metrics:system:auth-delegator created
clusterrole.rbac.authorization.k8s.io/resource-metrics-server-resources created
configmap/adapter-config created
deployment.apps/prometheus-adapter created
rolebinding.rbac.authorization.k8s.io/resource-metrics-auth-reader created
service/prometheus-adapter created
serviceaccount/prometheus-adapter created
clusterrole.rbac.authorization.k8s.io/prometheus-k8s created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-k8s created
servicemonitor.monitoring.coreos.com/prometheus-operator created
prometheus.monitoring.coreos.com/k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s-config created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s-config created
role.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
prometheusrule.monitoring.coreos.com/prometheus-k8s-rules created
service/prometheus-k8s created
serviceaccount/prometheus-k8s created
servicemonitor.monitoring.coreos.com/prometheus created
servicemonitor.monitoring.coreos.com/kube-apiserver created
servicemonitor.monitoring.coreos.com/coredns created
servicemonitor.monitoring.coreos.com/kube-controller-manager created
servicemonitor.monitoring.coreos.com/kube-scheduler created
servicemonitor.monitoring.coreos.com/kubelet created
namespace/monitoring created
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com created
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
deployment.apps/prometheus-operator created
service/prometheus-operator created
serviceaccount/prometheus-operator created
Let's check the resources with kubectl get all.
> kubectl get all
NAME READY STATUS RESTARTS AGE
pod/alertmanager-main-0 2/2 Running 0 4m56s
pod/alertmanager-main-1 2/2 Running 0 4m56s
pod/alertmanager-main-2 2/2 Running 0 4m56s
pod/grafana-5db74b88f4-2rhk9 1/1 Running 0 5m32s
pod/kube-state-metrics-54f98c4687-9wbhh 3/3 Running 0 5m31s
pod/node-exporter-4f6qx 0/2 Pending 0 5m31s
pod/node-exporter-s22tr 0/2 Pending 0 5m31s
pod/node-exporter-vdrxc 0/2 Pending 0 5m31s
pod/prometheus-adapter-8667948d79-qk5t8 1/1 Running 0 5m29s
pod/prometheus-k8s-0 3/3 Running 1 4m45s
pod/prometheus-k8s-1 3/3 Running 1 4m45s
pod/prometheus-operator-548c6dc45c-44gzx 1/1 Running 0 5m24s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/alertmanager-main ClusterIP 10.43.124.83 <none> 9093/TCP 5m38s
service/alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 4m56s
service/grafana ClusterIP 10.43.49.25 <none> 3000/TCP 5m33s
service/kube-state-metrics ClusterIP None <none> 8443/TCP,9443/TCP 5m32s
service/node-exporter ClusterIP None <none> 9100/TCP 5m31s
service/prometheus-adapter ClusterIP 10.43.231.15 <none> 443/TCP 5m30s
service/prometheus-k8s ClusterIP 10.43.3.81 <none> 9090/TCP 5m28s
service/prometheus-operated ClusterIP None <none> 9090/TCP 4m45s
service/prometheus-operator ClusterIP None <none> 8080/TCP 5m25s
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/node-exporter 3 3 0 3 0 kubernetes.io/os=linux 5m31s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/grafana 1/1 1 1 5m34s
deployment.apps/kube-state-metrics 1/1 1 1 5m33s
deployment.apps/prometheus-adapter 1/1 1 1 5m31s
deployment.apps/prometheus-operator 1/1 1 1 5m26s
NAME DESIRED CURRENT READY AGE
replicaset.apps/grafana-5db74b88f4 1 1 1 5m34s
replicaset.apps/kube-state-metrics-54f98c4687 1 1 1 5m33s
replicaset.apps/prometheus-adapter-8667948d79 1 1 1 5m31s
replicaset.apps/prometheus-operator-548c6dc45c 1 1 1 5m26s
NAME READY AGE
statefulset.apps/alertmanager-main 3/3 4m57s
statefulset.apps/prometheus-k8s 2/2 4m46s
node-exporter is stuck in Pending, but everything else seems to have deployed correctly.
Among the Pods, pod/prometheus-k8s-0 is the Prometheus server itself. replicas: 2 is defined by default, so two Prometheus instances are deployed.
The pod/prometheus-operator below it is the Operator itself.
For now, let's try connecting to Prometheus.
Just like a regular Prometheus, it can be reached on 0.0.0.0:9090, so let's port-forward and open it in a browser.
> kubectl port-forward service/prometheus-k8s 9090:9090 --address 0.0.0.0
Forwarding from 0.0.0.0:9090 -> 9090
Looking at the Targets page in the browser, the various targets are all being scraped successfully.

And with that, we've got it up and running.
Now let's try changing the replica counts of the Prometheus server and Alertmanager. We'll edit each of the following files.
# in the kind: Prometheus manifest
podMonitorSelector: {}
replicas: 10 # feeling good today, so let's make it 10
retention: 7d
resources:

# in the kind: Alertmanager manifest
nodeSelector:
  kubernetes.io/os: linux
replicas: 20 # AlertManager is important, so 20
securityContext:
Apply it the same way as before and run kubectl get pod.
> kubectl apply -f ./manifests -R
(output omitted)
> kubectl get pod
NAME READY STATUS RESTARTS AGE
pod/alertmanager-main-0 2/2 Running 0 16m
pod/alertmanager-main-1 2/2 Running 0 16m
pod/alertmanager-main-10 2/2 Running 0 32s
pod/alertmanager-main-11 2/2 Running 0 32s
pod/alertmanager-main-12 2/2 Running 0 32s
pod/alertmanager-main-13 2/2 Running 0 32s
pod/alertmanager-main-14 2/2 Running 0 32s
pod/alertmanager-main-15 2/2 Running 0 32s
pod/alertmanager-main-16 2/2 Running 0 32s
pod/alertmanager-main-17 2/2 Running 0 32s
pod/alertmanager-main-18 2/2 Running 0 32s
pod/alertmanager-main-19 2/2 Running 0 32s
pod/alertmanager-main-2 2/2 Running 0 10s
pod/alertmanager-main-3 2/2 Running 0 32s
pod/alertmanager-main-4 2/2 Running 0 32s
pod/alertmanager-main-5 2/2 Running 0 32s
pod/alertmanager-main-6 2/2 Running 0 32s
pod/alertmanager-main-7 2/2 Running 0 32s
pod/alertmanager-main-8 2/2 Running 0 32s
pod/alertmanager-main-9 2/2 Running 0 32s
pod/grafana-5db74b88f4-2rhk9 1/1 Running 0 16m
pod/kube-state-metrics-54f98c4687-9wbhh 3/3 Running 0 16m
pod/node-exporter-4f6qx 0/2 Pending 0 16m
pod/node-exporter-s22tr 0/2 Pending 0 16m
pod/node-exporter-vdrxc 0/2 Pending 0 16m
pod/prometheus-adapter-8667948d79-qk5t8 1/1 Running 0 16m
pod/prometheus-k8s-0 3/3 Running 1 16m
pod/prometheus-k8s-1 3/3 Running 1 16m
pod/prometheus-k8s-2 3/3 Running 0 22s
pod/prometheus-k8s-3 3/3 Running 0 22s
pod/prometheus-k8s-4 3/3 Running 0 22s
pod/prometheus-k8s-5 3/3 Running 1 22s
pod/prometheus-k8s-6 3/3 Running 1 22s
pod/prometheus-k8s-7 3/3 Running 1 22s
pod/prometheus-k8s-8 3/3 Running 0 22s
pod/prometheus-k8s-9 3/3 Running 1 21s
pod/prometheus-operator-548c6dc45c-44gzx 1/1 Running 0 16m
Look at me go…
And just like that, the Pods are rolled out.
Checking the Alertmanager configuration afterwards, all 20 instances are correctly registered as cluster peers. (Amazing)
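One quick way to check this, as a sketch: port-forward the alertmanager-main Service and look at the cluster peer list on Alertmanager's Status page.

> kubectl -n monitoring port-forward service/alertmanager-main 9093:9093

Then open http://localhost:9093/#/status in a browser and look at the cluster status section.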
I'm exhausted.
I wanted to also cover how to write and operate ServiceMonitors, but I've run out of steam, so I'll stop here.
If there's any interest, I'll consider writing about how to write ServiceMonitors and how to operate this setup (how do you actually run Prometheus on Kubernetes, and so on).
Prometheus itself is a very convenient and simple monitoring product, but running it on Kubernetes adds a few extra things to think about. The Operator smooths those out and adds plenty of conveniences on top, so please give it a try!