Trying Out the Prometheus Operator

Run the Prometheus Operator on GKE.

What is the Prometheus Operator?

CoreOS announced the Prometheus Operator in 2016.
I won't explain Kubernetes Operators in general here, but the Operator concept itself was also proposed by CoreOS.

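The Prometheus Operator ensures that a Prometheus server matching the desired configuration is always running. That configuration includes:

    • Data retention period
    • Persistent volume claims (PVCs)
    • Number of replicas
    • Prometheus version
    • Alertmanager instances to send alerts to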

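The Operator encapsulates most of the operational knowledge needed to run Prometheus and exposes only the parts that are meaningful to the end user.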
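As a rough illustration of how the settings listed above map onto a single Prometheus custom resource, here is a minimal sketch. The resource name, the version, the retention period, and the storage size are assumptions chosen for illustration, not values from this walkthrough. Pick a version that your Operator release supports.

```yaml
# Hypothetical sketch: one Prometheus object carrying the settings listed above.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus-sketch          # hypothetical name
spec:
  version: v2.0.0                  # Prometheus version (assumed value)
  replicas: 2                      # number of replicas
  retention: 24h                   # data retention period (assumed value)
  storage:
    volumeClaimTemplate:           # persistent volume claim (PVC)
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi          # assumed size
  alerting:
    alertmanagers:                 # Alertmanager instances to send alerts to
    - namespace: default
      name: alertmanager
      port: web
```

The Operator turns an object like this into a StatefulSet and the matching Prometheus configuration, so none of those details have to be written by hand.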

Prometheus Operator repository: https://github.com/coreos/prometheus-operator
Getting-started user guide: https://coreos.com/operators/prometheus/docs/latest/user-guides/getting-started.html

The following resources are defined through CRDs (Custom Resource Definitions): https://coreos.com/operators/prometheus/docs/latest/design.html

    • Prometheus
    • ServiceMonitor
    • Alertmanager

The Operator builds a nicely configured Prometheus environment for you.

image.png

Installing the Prometheus Operator

Creating the Kubernetes cluster

This time I'll use GKE.

Master version: 1.15.11-gke.3
Node spec: n1-standard-2
Cluster: regional cluster


gcloud beta container --project "project-name" clusters create "sakon-prometheus-operator" --region "asia-northeast1" \
--no-enable-basic-auth --cluster-version "1.15.11-gke.3" --machine-type "n1-standard-2" --image-type "COS" \
--disk-type "pd-standard" --disk-size "100" --metadata disable-legacy-endpoints=true \
--scopes "https://www.googleapis.com/auth/devstorage.read_only","https://www.googleapis.com/auth/logging.write","https://www.googleapis.com/auth/monitoring","https://www.googleapis.com/auth/servicecontrol","https://www.googleapis.com/auth/service.management.readonly","https://www.googleapis.com/auth/trace.append" \
--num-nodes "1" --enable-stackdriver-kubernetes --enable-ip-alias --network "projects/project-name/global/networks/sakon-network" \
--subnetwork "projects/project-name/regions/asia-northeast1/subnetworks/sakon-gke-subnet-1" \
--cluster-secondary-range-name "sakon-gkesub1-2nd-range-1" --services-secondary-range-name "sakon-gkesub1-2nd-range-2" \
--default-max-pods-per-node "110" --no-enable-master-authorized-networks \
--addons HorizontalPodAutoscaling,HttpLoadBalancing --enable-autoupgrade --enable-autorepair

Deploying the Prometheus Operator

Basically, I follow the getting-started guide:
https://coreos.com/operators/prometheus/docs/latest/user-guides/getting-started.html

Create the Prometheus Operator Deployment together with the ClusterRole, ClusterRoleBinding, and ServiceAccount it needs.

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: prometheus-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-operator
subjects:
- kind: ServiceAccount
  name: prometheus-operator
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: prometheus-operator
rules:
- apiGroups:
  - extensions
  resources:
  - thirdpartyresources
  verbs:
  - "*"
- apiGroups:
  - apiextensions.k8s.io
  resources:
  - customresourcedefinitions
  verbs:
  - "*"
- apiGroups:
  - monitoring.coreos.com
  resources:
  - alertmanagers
  - prometheuses
  - prometheuses/finalizers
  - servicemonitors
  verbs:
  - "*"
- apiGroups:
  - apps
  resources:
  - statefulsets
  verbs: ["*"]
- apiGroups: [""]
  resources:
  - configmaps
  - secrets
  verbs: ["*"]
- apiGroups: [""]
  resources:
  - pods
  verbs: ["list", "delete"]
- apiGroups: [""]
  resources:
  - services
  - endpoints
  verbs: ["get", "create", "update"]
- apiGroups: [""]
  resources:
  - nodes
  verbs: ["list", "watch"]
- apiGroups: [""]
  resources:
  - namespaces
  verbs: ["list"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-operator
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    k8s-app: prometheus-operator
  name: prometheus-operator
spec:
  replicas: 1
  template:
    metadata:
      labels:
        k8s-app: prometheus-operator
    spec:
      containers:
      - args:
        - --kubelet-service=kube-system/kubelet
        - --config-reloader-image=quay.io/coreos/configmap-reload:v0.0.1
        image: quay.io/coreos/prometheus-operator:v0.17.0
        name: prometheus-operator
        ports:
        - containerPort: 8080
          name: http
        resources:
          limits:
            cpu: 200m
            memory: 100Mi
          requests:
            cpu: 100m
            memory: 50Mi
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
      serviceAccountName: prometheus-operator
$ kubectl apply -f prometheus-oprator.yml
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
serviceaccount/prometheus-operator created
deployment.extensions/prometheus-operator created

$ kubectl get pod
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-operator-5fff966576-lw6f7   1/1     Running   0          19s

Deploy example-app as a basic sample application.


apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: example-app
        image: fabxc/instrumented_app
        ports:
        - name: web
          containerPort: 8080
$ kubectl apply -f example-app.yml
deployment.extensions/example-app created

$ kubectl get pod
NAME                                   READY   STATUS    RESTARTS   AGE
example-app-66db748757-btvfp           1/1     Running   0          44s
example-app-66db748757-rdl5h           1/1     Running   0          44s
example-app-66db748757-t65fp           1/1     Running   0          44s
prometheus-operator-5fff966576-lw6f7   1/1     Running   0          11m
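The manifests in this walkthrough use extensions/v1beta1 for Deployments, which works on the 1.15 cluster used here but is no longer served from Kubernetes 1.16 onward. A sketch of the same example-app Deployment under apps/v1 follows. The only structural change is the spec.selector field, which apps/v1 requires.

```yaml
# Sketch: the example-app Deployment rewritten for apps/v1.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 3
  selector:                  # required in apps/v1; must match the Pod template labels
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: example-app
        image: fabxc/instrumented_app
        ports:
        - name: web
          containerPort: 8080
```

The same change (apps/v1 plus a selector on k8s-app: prometheus-operator) would apply to the Operator's own Deployment.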

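The sample application is now deployed.

A ServiceMonitor has a label selector that selects Services and their underlying Endpoints objects.
The Service object for the sample application selects Pods by the label app: example-app.
The Service object also specifies the port on which metrics are exposed.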

kind: Service
apiVersion: v1
metadata:
  name: example-app
  labels:
    app: example-app
spec:
  selector:
    app: example-app
  ports:
  - name: web
    port: 8080
$ kubectl apply -f example-app-service.yml
service/example-app created

$ kubectl get svc
NAME          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
example-app   ClusterIP   10.126.124.18   <none>        8080/TCP   6s
kubernetes    ClusterIP   10.124.0.1      <none>        443/TCP    30m

This Service object is discovered by a ServiceMonitor, which selects in the same way: the Service must carry the label app: example-app.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: web
$ kubectl apply -f example-app-ServiceMonitor.yml
servicemonitor.monitoring.coreos.com/example-app created

$ kubectl get ServiceMonitor
NAME          AGE
example-app   39s
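The ServiceMonitor above relies on defaults for everything except the port name. As a sketch of what can be set explicitly, the variant below spells out the metrics path, a scrape interval, and the namespace to search. The interval value and the namespace restriction are assumptions for illustration, not settings used in this walkthrough.

```yaml
# Sketch: the same ServiceMonitor with some optional fields made explicit.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend           # the label the Prometheus object selects on
spec:
  namespaceSelector:
    matchNames:
    - default                # only look for Services in this namespace
  selector:
    matchLabels:
      app: example-app       # matches the Service's label
  endpoints:
  - port: web                # the named port on the Service
    path: /metrics           # metrics path
    interval: 30s            # scrape interval (assumed value)
```

Note that two different labels are in play: app: example-app connects the ServiceMonitor to the Service, while team: frontend connects the Prometheus object to the ServiceMonitor.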

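Enabling RBAC rules for the Prometheus Pods

If RBAC is enabled, RBAC rules are needed for both Prometheus and the Prometheus Operator. The ClusterRole and ClusterRoleBinding for the Prometheus Operator were created in the section above; the Prometheus Pods need the same treatment.

Create the ServiceAccount, ClusterRole, and ClusterRoleBinding for the Prometheus Pods.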

apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["get"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: default
$ kubectl apply -f prometheus-ServiceAccount.yml
serviceaccount/prometheus created
clusterrole.rbac.authorization.k8s.io/prometheus created
clusterrolebinding.rbac.authorization.k8s.io/prometheus created

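Including ServiceMonitors

Finally, the Prometheus object defines a serviceMonitorSelector to specify which ServiceMonitors to include.
The ServiceMonitor above was given the label team: frontend, so the Prometheus object selects on that label.
The getting-started example does not specify serviceAccountName and similar fields, so refer to the following: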
https://github.com/coreos/prometheus-operator/blob/v0.17.0/example/rbac/prometheus/prometheus.yaml

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  labels:
    prometheus: prometheus
spec:
  replicas: 2
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchLabels:
      team: frontend
  alerting:
    alertmanagers:
    - namespace: default
      name: alertmanager
      port: web
  resources:
    requests:
      memory: 400Mi

$ kubectl apply -f frontend-prometheus.yml
prometheus.monitoring.coreos.com/prometheus created

$ kubectl get prometheus
NAME         AGE
prometheus   12s
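The Prometheus object above points its alerting section at a Service named alertmanager in the default namespace, but this walkthrough never creates one. A minimal sketch of what that could look like follows, assuming an Alertmanager object named main (a hypothetical name). The Operator also expects a Secret named alertmanager-main containing the Alertmanager configuration, which is not shown here.

```yaml
# Hypothetical sketch: an Alertmanager cluster and a Service that the
# Prometheus object's alerting section could resolve.
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: main                 # hypothetical name
spec:
  replicas: 3
---
apiVersion: v1
kind: Service
metadata:
  name: alertmanager         # matches alerting.alertmanagers[].name
spec:
  selector:
    alertmanager: main       # label set on the Alertmanager Pods
  ports:
  - name: web                # matches alerting.alertmanagers[].port
    port: 9093
    targetPort: web
```

Until something like this exists, Prometheus simply has no Alertmanager endpoints to send alerts to; scraping still works.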

Exposing the Prometheus instance

To access the Prometheus instance, it has to be exposed outside the cluster. This example exposes it with a NodePort Service.

apiVersion: v1
kind: Service
metadata:
  name: prometheus
spec:
  type: NodePort
  ports:
  - name: web
    nodePort: 30900
    port: 9090
    protocol: TCP
    targetPort: web
  selector:
    prometheus: prometheus
$ kubectl apply -f frontend-prometheus-svc.yml
service/prometheus created

$ kubectl get svc
NAME                  TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
example-app           ClusterIP   10.126.124.18   <none>        8080/TCP         25m
kubernetes            ClusterIP   10.124.0.1      <none>        443/TCP          55m
prometheus            NodePort    10.125.116.34   <none>        9090:30900/TCP   37s
prometheus-operated   ClusterIP   None            <none>        9090/TCP         5m13s

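I created the Service as a NodePort following the example, but in the end I used port forwarding instead.

Use kubectl to forward svc/prometheus to local port 9090 (for example, kubectl port-forward svc/prometheus 9090:9090).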

Screenshot 2020-04-12 14.30.54.png

The metrics were retrieved successfully, as shown here.

Screenshot 2020-04-12 14.47.50.png