Trying Out the Prometheus Operator
Run the Prometheus Operator on GKE.
What is the Prometheus Operator?
It was announced by CoreOS in 2016.
I'll skip a full explanation of Kubernetes Operators here, but the Operator pattern itself was also proposed by CoreOS.
The Prometheus Operator ensures that Prometheus servers run with the desired configuration, including:
- Data retention period
- Persistent volume claims (PVCs)
- Number of replicas
- Prometheus version
- Alertmanager instances to send alerts to
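The settings above map onto fields of the Prometheus custom resource managed by the operator. A rough sketch follows; the field names are taken from the operator's design docs, but the values are illustrative assumptions, and available fields vary by operator version:

```yaml
# Illustrative sketch only -- values are assumptions, not from this walkthrough.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example
spec:
  version: v2.2.1         # Prometheus version to run
  replicas: 2             # number of Prometheus replicas
  retention: 24h          # data retention period
  storage:
    volumeClaimTemplate:  # persistent volume claim (PVC) for TSDB data
      spec:
        resources:
          requests:
            storage: 10Gi
  alerting:
    alertmanagers:        # Alertmanager instances to send alerts to
    - namespace: default
      name: alertmanager
      port: web
```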
By using the Operator, most of the operational knowledge of running Prometheus is encapsulated, and only the parts that are meaningful to the end user are exposed.
The source lives at https://github.com/coreos/prometheus-operator, and the getting-started guide is at https://coreos.com/operators/prometheus/docs/latest/user-guides/getting-started.html.
The Operator manages the following resources through CRDs (Custom Resource Definitions): https://coreos.com/operators/prometheus/docs/latest/design.html
- Prometheus
- ServiceMonitor
- Alertmanager
In short, it sets up a nice Prometheus environment for you.

Introducing the Prometheus Operator
Creating a Kubernetes cluster
This time, we'll run it on GKE.
Master version: 1.15.11-gke.3
Node spec: n1-standard-2
Cluster: regional cluster
gcloud beta container --project "project-name" clusters create "sakon-prometheus-operator" --region "asia-northeast1" \
--no-enable-basic-auth --cluster-version "1.15.11-gke.3" --machine-type "n1-standard-2" --image-type "COS" \
--disk-type "pd-standard" --disk-size "100" --metadata disable-legacy-endpoints=true \
--scopes "https://www.googleapis.com/auth/devstorage.read_only","https://www.googleapis.com/auth/logging.write","https://www.googleapis.com/auth/monitoring","https://www.googleapis.com/auth/servicecontrol","https://www.googleapis.com/auth/service.management.readonly","https://www.googleapis.com/auth/trace.append" \
--num-nodes "1" --enable-stackdriver-kubernetes --enable-ip-alias --network "projects/project-name/global/networks/sakon-network" \
--subnetwork "projects/project-name/regions/asia-northeast1/subnetworks/sakon-gke-subnet-1" \
--cluster-secondary-range-name "sakon-gkesub1-2nd-range-1" --services-secondary-range-name "sakon-gkesub1-2nd-range-2" \
--default-max-pods-per-node "110" --no-enable-master-authorized-networks \
--addons HorizontalPodAutoscaling,HttpLoadBalancing --enable-autoupgrade --enable-autorepair
Deploying the Prometheus Operator
Basically, we follow the getting-started guide:
https://coreos.com/operators/prometheus/docs/latest/user-guides/getting-started.html
Create the Prometheus Operator along with the ClusterRole, ClusterRoleBinding, and ServiceAccount it requires.
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: prometheus-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-operator
subjects:
- kind: ServiceAccount
  name: prometheus-operator
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: prometheus-operator
rules:
- apiGroups:
  - extensions
  resources:
  - thirdpartyresources
  verbs:
  - "*"
- apiGroups:
  - apiextensions.k8s.io
  resources:
  - customresourcedefinitions
  verbs:
  - "*"
- apiGroups:
  - monitoring.coreos.com
  resources:
  - alertmanagers
  - prometheuses
  - prometheuses/finalizers
  - servicemonitors
  verbs:
  - "*"
- apiGroups:
  - apps
  resources:
  - statefulsets
  verbs: ["*"]
- apiGroups: [""]
  resources:
  - configmaps
  - secrets
  verbs: ["*"]
- apiGroups: [""]
  resources:
  - pods
  verbs: ["list", "delete"]
- apiGroups: [""]
  resources:
  - services
  - endpoints
  verbs: ["get", "create", "update"]
- apiGroups: [""]
  resources:
  - nodes
  verbs: ["list", "watch"]
- apiGroups: [""]
  resources:
  - namespaces
  verbs: ["list"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-operator
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    k8s-app: prometheus-operator
  name: prometheus-operator
spec:
  replicas: 1
  template:
    metadata:
      labels:
        k8s-app: prometheus-operator
    spec:
      containers:
      - args:
        - --kubelet-service=kube-system/kubelet
        - --config-reloader-image=quay.io/coreos/configmap-reload:v0.0.1
        image: quay.io/coreos/prometheus-operator:v0.17.0
        name: prometheus-operator
        ports:
        - containerPort: 8080
          name: http
        resources:
          limits:
            cpu: 200m
            memory: 100Mi
          requests:
            cpu: 100m
            memory: 50Mi
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
      serviceAccountName: prometheus-operator
$ kubectl apply -f prometheus-operator.yml
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
serviceaccount/prometheus-operator created
deployment.extensions/prometheus-operator created
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
prometheus-operator-5fff966576-lw6f7 1/1 Running 0 19s
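At this point the operator should also have registered its CRDs. A quick way to check (the exact CRD names and output depend on the operator version, so treat this as a sketch):

```shell
# List the CRDs registered by the operator (monitoring.coreos.com group)
$ kubectl get customresourcedefinitions | grep monitoring.coreos.com
```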
Deploy example-app as the guide's basic sample application.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: example-app
        image: fabxc/instrumented_app
        ports:
        - name: web
          containerPort: 8080
$ kubectl apply -f example-app.yml
deployment.extensions/example-app created
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
example-app-66db748757-btvfp 1/1 Running 0 44s
example-app-66db748757-rdl5h 1/1 Running 0 44s
example-app-66db748757-t65fp 1/1 Running 0 44s
prometheus-operator-5fff966576-lw6f7 1/1 Running 0 11m
Next, create a Service for the sample application.
A ServiceMonitor uses a label selector to select Services and their underlying Endpoints objects.
The sample application's Service object selects the Pods carrying the label app: example-app.
The Service object also specifies the port on which the metrics are exposed.
kind: Service
apiVersion: v1
metadata:
  name: example-app
  labels:
    app: example-app
spec:
  selector:
    app: example-app
  ports:
  - name: web
    port: 8080
$ kubectl apply -f example-app-service.yml
service/example-app created
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
example-app ClusterIP 10.126.124.18 <none> 8080/TCP 6s
kubernetes ClusterIP 10.124.0.1 <none> 443/TCP 30m
This Service object is discovered by a ServiceMonitor in the same way, this time selecting on the label app: example-app.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: web
$ kubectl apply -f example-app-ServiceMonitor.yml
servicemonitor.monitoring.coreos.com/example-app created
$ kubectl get ServiceMonitor
NAME AGE
example-app 39s
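The ServiceMonitor can carry more than just the label selector. As an illustrative sketch (these fields exist in the ServiceMonitor spec, but the values here are assumptions, not part of this walkthrough): a per-endpoint scrape interval, and a namespaceSelector restricting which namespaces are searched for matching Services:

```yaml
# Illustrative sketch -- interval and namespaceSelector values are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  namespaceSelector:   # which namespaces to look for matching Services in
    matchNames:
    - default
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: web
    interval: 30s      # scrape interval for this endpoint
```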
Enabling RBAC rules for the Prometheus Pods
If RBAC authorization is enabled, you need to create RBAC rules for both Prometheus and the Prometheus Operator. We created the ClusterRole and ClusterRoleBinding for the Prometheus Operator above, but the Prometheus Pods need the same treatment.
Create a ClusterRole and ClusterRoleBinding for the Prometheus Pods.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["get"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: default
$ kubectl apply -f prometheus-ServiceAccount.yml
serviceaccount/prometheus created
clusterrole.rbac.authorization.k8s.io/prometheus created
clusterrolebinding.rbac.authorization.k8s.io/prometheus created
Including the ServiceMonitor
Finally, the Prometheus object defines a serviceMonitorSelector that specifies which ServiceMonitors to include.
Because it specifies the team: frontend label, the Prometheus object selects the ServiceMonitor created above.
The getting-started example does not specify a serviceAccountName and so on, so refer to the following:
https://github.com/coreos/prometheus-operator/blob/v0.17.0/example/rbac/prometheus/prometheus.yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  labels:
    prometheus: prometheus
spec:
  replicas: 2
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchLabels:
      team: frontend
  alerting:
    alertmanagers:
    - namespace: default
      name: alertmanager
      port: web
  resources:
    requests:
      memory: 400Mi
$ kubectl apply -f frontend-prometheus.yml
prometheus.monitoring.coreos.com/prometheus created
$ kubectl get prometheus
NAME AGE
prometheus 12s
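Under the hood, the operator turns this Prometheus object into a StatefulSet. A quick sanity check; the `prometheus-<name>` StatefulSet naming is the operator's convention (an assumption to verify on your version), while the `prometheus: prometheus` Pod label is the one the Service below selects on:

```shell
# StatefulSet created by the operator (conventionally prometheus-<name>)
$ kubectl get statefulset
# The Prometheus Pods, selected by the prometheus: prometheus label
$ kubectl get pod -l prometheus=prometheus
```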
Exposing the Prometheus instance
To access the Prometheus instance, it has to be exposed externally. In this example, we expose it with a NodePort.
apiVersion: v1
kind: Service
metadata:
  name: prometheus
spec:
  type: NodePort
  ports:
  - name: web
    nodePort: 30900
    port: 9090
    protocol: TCP
    targetPort: web
  selector:
    prometheus: prometheus
$ kubectl apply -f frontend-prometheus-svc.yml
service/prometheus created
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
example-app ClusterIP 10.126.124.18 <none> 8080/TCP 25m
kubernetes ClusterIP 10.124.0.1 <none> 443/TCP 55m
prometheus NodePort 10.125.116.34 <none> 9090:30900/TCP 37s
prometheus-operated ClusterIP None <none> 9090/TCP 5m13s
I created the Service with a NodePort as in the example, but in the end I used port forwarding instead:
$ kubectl port-forward svc/prometheus 9090

With that, I could confirm that metrics were being collected successfully.
