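Deploying a custom Ansible Operator on Red Hat OpenShift on IBM Cloud (installing via OLM)

Goal

In the previous article, we manually installed the Operator we had built into the cluster. This time we will try the Operator Lifecycle Manager (OLM). With OLM, the Operator can be managed from the OpenShift console.

Steps

We start from the state left after completing the steps in the previous article.

Create the bundle

The package information that OLM uses to manage an Operator is called a bundle. Generate the bundle definition with make bundle. It prompts for several values that have to be entered by hand.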

$ make bundle
operator-sdk generate kustomize manifests -q

Display name for the operator (required):
> Hello Operator

Description for the operator (required):
> An example of Ansible Operator

Provider's name for the operator (required):
> teruq

Any relevant URL for the provider name (optional):
> https://qiita.com/teruq

Comma-separated list of keywords for your operator (required):
> ansible,hello

Comma-separated list of maintainers and their emails (e.g. 'name1:email1, name2:email2') (required):
> teruq
cd config/manager && /.../hello-ansible-operator/bin/kustomize edit set image controller=jp.icr.io/teruq/hello-ansible-operator:0.0.1
/.../hello-ansible-operator/bin/kustomize build config/manifests | operator-sdk generate bundle -q --overwrite --version 0.0.1
INFO[0002] Creating bundle.Dockerfile
INFO[0002] Creating bundle/metadata/annotations.yaml
INFO[0002] Bundle metadata generated suceessfully
operator-sdk bundle validate ./bundle
INFO[0000] All validation tests have completed successfully

Build the bundle image

The bundle itself is also a container image. Build it.

$ make bundle-build
docker build -f bundle.Dockerfile -t jp.icr.io/teruq/hello-ansible-operator-bundle:v0.0.1 .
...
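
The bundle image name differs from the operator image name in two ways: it has a -bundle suffix, and its tag has a v prefix. The sketch below reproduces both names from a base name and a version. The variable names IMAGE_TAG_BASE, VERSION, IMG and BUNDLE_IMG are assumptions modeled on the Makefile that operator-sdk scaffolds, and the values are inferred from the logs in this article, not read from the actual Makefile.

```shell
#!/bin/sh
# Assumed values, taken from the image names that appear in this article.
IMAGE_TAG_BASE="jp.icr.io/teruq/hello-ansible-operator"
VERSION="0.0.1"

# Operator image, as passed to "kustomize edit set image" in the make bundle log.
IMG="${IMAGE_TAG_BASE}:${VERSION}"

# Bundle image: "-bundle" suffix plus a "v" prefix on the tag,
# as seen in the docker build line of make bundle-build.
BUNDLE_IMG="${IMAGE_TAG_BASE}-bundle:v${VERSION}"

echo "$IMG"
echo "$BUNDLE_IMG"
```

If your Makefile defines these variables, passing one on the make command line (for example make bundle-build BUNDLE_IMG=...) should override the default, since command-line variables take precedence in make.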

Push the bundle image

If some time has passed since you ran the steps in the previous article, your authentication may have expired, so log in again.

$ export IBMCLOUD_API_KEY=********
$ ibmcloud login
$ ibmcloud cr login

Push the image to ICR.

$ make bundle-push
make docker-push IMG=jp.icr.io/teruq/hello-ansible-operator-bundle:v0.0.1
make[1]: ディレクトリ '/.../hello-ansible-operator' に入ります
docker push jp.icr.io/teruq/hello-ansible-operator-bundle:v0.0.1
The push refers to repository [jp.icr.io/teruq/hello-ansible-operator-bundle]
598401fa8392: Layer already exists
14f153f590a2: Layer already exists
9d6970a7da8f: Layer already exists
v0.0.1: digest: sha256:aa57c38c38d897e1275e4b95dbc62c3b53d37399bdbb7529e9ff15ab4c098494 size: 939
make[1]: ディレクトリ '/.../hello-ansible-operator' から出ます

Confirm that the image is registered in ICR.

$ ibmcloud cr images | grep hello-ansible
jp.icr.io/teruq/hello-ansible-operator               0.0.1                              031595ea6f53   teruq      2 hours ago    156 MB   55 件の問題
jp.icr.io/teruq/hello-ansible-operator-bundle        v0.0.1                             aa57c38c38d8   teruq      1 hour ago     3.4 kB   サポート対象外 OS

Copy and link the ImagePullSecret

To pull the images from ICR, copy the ImagePullSecret as usual.

Create the namespace

Create a namespace named qiita-operators in advance, to install the Operator into.

$ oc create ns qiita-operators

Copy the ImagePullSecret

Copy it from the default namespace.

$ oc get secret all-icr-io -n default -o yaml | grep -v namespace: | oc create -n qiita-operators -f-
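
The copy works because the namespace line is filtered out of the exported YAML, so oc create -n decides where the Secret lands. The following sketch shows that filter on its own, without a cluster. The function names strip_namespace and sample_secret are hypothetical, and the manifest is a trimmed-down stand-in for real oc get secret output. Unlike the one-liner above, this version only matches a namespace key indented by two spaces, which is an assumption about how the YAML is laid out.

```shell
#!/bin/sh
# strip_namespace: drop the metadata.namespace line from a manifest read on stdin,
# so that "oc create -n <target>" decides the namespace.
strip_namespace() {
  grep -v '^  namespace:'
}

# Hypothetical, trimmed-down manifest standing in for "oc get secret ... -o yaml" output.
sample_secret() {
  cat <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: all-icr-io
  namespace: default
type: kubernetes.io/dockerconfigjson
EOF
}

# Print the manifest with the namespace line removed.
sample_secret | strip_namespace
```

Against a real cluster the same function would sit in the middle of the pipeline, in place of grep -v namespace:.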

Link the ImagePullSecret to the ServiceAccount

Link all-icr-io to the default ServiceAccount, which is the one the bundle uses.

$ oc secrets link default all-icr-io --for pull -n qiita-operators

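Install with OLM

Check OLM

OLM is installed by default on OpenShift 4.x and later, and ROKS is no exception. You can check its status as follows. In the output, the OLM CRDs are reported as Installed, while the resources that belong to the upstream olm and operators namespaces are reported as not found.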

$ operator-sdk olm status --olm-namespace openshift-operator-lifecycle-manager
I0930 15:08:43.304528   15260 request.go:668] Waited for 1.0293062s due to client-side throttling, not priority and fairness, request: GET:https://c100-e.jp-tok.containers.cloud.ibm.com:32154/apis/samples.operator.openshift.io/v1?timeout=32s
INFO[0004] Fetching CRDs for version "0.16.1"
INFO[0004] Using locally stored resource manifests
INFO[0004] Successfully got OLM status for version "0.16.1"

NAME                                            NAMESPACE    KIND                        STATUS
operators.operators.coreos.com                               CustomResourceDefinition    Installed
operatorgroups.operators.coreos.com                          CustomResourceDefinition    Installed
installplans.operators.coreos.com                            CustomResourceDefinition    Installed
clusterserviceversions.operators.coreos.com                  CustomResourceDefinition    Installed
subscriptions.operators.coreos.com                           CustomResourceDefinition    Installed
system:controller:operator-lifecycle-manager                 ClusterRole                 Installed
aggregate-olm-edit                                           ClusterRole                 Installed
aggregate-olm-view                                           ClusterRole                 Installed
catalogsources.operators.coreos.com                          CustomResourceDefinition    Installed
olm                                                          Namespace                   namespaces "olm" not found
olm-operator-binding-olm                                     ClusterRoleBinding          clusterrolebindings.rbac.authorization.k8s.io "olm-operator-binding-olm" not found
olm-operator                                    olm          Deployment                  deployments.apps "olm-operator" not found
catalog-operator                                olm          Deployment                  deployments.apps "catalog-operator" not found
olm-operator-serviceaccount                     olm          ServiceAccount              serviceaccounts "olm-operator-serviceaccount" not found
operators                                                    Namespace                   namespaces "operators" not found
global-operators                                operators    OperatorGroup               operatorgroups.operators.coreos.com "global-operators" not found
olm-operators                                   olm          OperatorGroup               operatorgroups.operators.coreos.com "olm-operators" not found
packageserver                                   olm          ClusterServiceVersion       clusterserviceversions.operators.coreos.com "packageserver" not found
operatorhubio-catalog                           olm          CatalogSource               catalogsources.operators.coreos.com "operatorhubio-catalog" not found

Run the bundle

Run the bundle with the following command. Here you have to specify, once more, the ImagePullSecret that you just linked to the ServiceAccount.

$ operator-sdk run bundle jp.icr.io/teruq/hello-ansible-operator-bundle:v0.0.1 --pull-secret-name all-icr-io -n qiita-operators
INFO[0009] Successfully created registry pod: jp-icr-io-teruq-hello-ansible-operator-bundle-v0-0-1
INFO[0010] Created CatalogSource: hello-ansible-operator-catalog
INFO[0010] OperatorGroup "operator-sdk-og" created
INFO[0010] Created Subscription: hello-ansible-operator-v0-0-1-sub
INFO[0012] Approved InstallPlan install-lcqlh for the Subscription: hello-ansible-operator-v0-0-1-sub
INFO[0012] Waiting for ClusterServiceVersion "qiita-operators/hello-ansible-operator.v0.0.1" to reach 'Succeeded' phase
INFO[0012]   Waiting for ClusterServiceVersion "qiita-operators/hello-ansible-operator.v0.0.1" to appear
INFO[0018]   Found ClusterServiceVersion "qiita-operators/hello-ansible-operator.v0.0.1" phase: Pending
INFO[0020]   Found ClusterServiceVersion "qiita-operators/hello-ansible-operator.v0.0.1" phase: Installing
INFO[0107]   Found ClusterServiceVersion "qiita-operators/hello-ansible-operator.v0.0.1" phase: Succeeded
INFO[0108] OLM has successfully installed "hello-ansible-operator.v0.0.1"
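
operator-sdk run bundle already waits for the ClusterServiceVersion to reach Succeeded, as the log shows. If you want to re-check the phase later, for example from a script, a small polling helper is one option. This is only a sketch: wait_for_output and POLL_INTERVAL are hypothetical names, not part of oc or the SDK.

```shell
#!/bin/sh
# wait_for_output EXPECTED TRIES CMD...
# Run CMD up to TRIES times until its stdout equals EXPECTED.
# Returns 0 on a match, 1 if every attempt fails.
wait_for_output() {
  expected="$1"
  tries="$2"
  shift 2
  i=0
  while [ "$i" -lt "$tries" ]; do
    if [ "$("$@" 2>/dev/null)" = "$expected" ]; then
      return 0
    fi
    i=$((i + 1))
    # POLL_INTERVAL (seconds) is a hypothetical knob; defaults to 5.
    sleep "${POLL_INTERVAL:-5}"
  done
  return 1
}

# Example against a cluster (not run here):
#   wait_for_output Succeeded 60 \
#     oc get csv hello-ansible-operator.v0.0.1 -n qiita-operators \
#     -o jsonpath='{.status.phase}'
```

The command to poll is passed as arguments, so the helper can be tried with a stand-in command before pointing it at a cluster.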

The Pods look like this.

$ oc get pods -n qiita-operators
NAME                                                              READY   STATUS      RESTARTS   AGE
9cbeaa081bdcc6ecb2ba12064d0e68820efc754b3a48e9acca07d3987bcgsgl   0/1     Completed   0          28s
hello-ansible-operator-controller-manager-7cb6dcf49d-99rnt        2/2     Running     0          14s
jp-icr-io-teruq-hello-ansible-operator-bundle-v0-0-1              1/1     Running     0          40s

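A supplementary note on the ImagePullSecret. The jp-icr-io-teruq... Pod uses the secret linked to the default ServiceAccount, the 9cbeaa081bd... Pod uses the one specified on the CLI, and the hello-ansible-operator... Pod uses the one specified in the Deployment. From this inconsistency, one can guess that the developers of the SDK may not have considered the case where the images live in a private registry.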
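
To tell the three Pods apart, it helps to notice that the registry Pod's name looks like the bundle image reference with every character that is not a letter or digit replaced by a hyphen. The sketch below reproduces that mapping. It is inferred only from the log output in this article, not from the SDK source, and image_to_pod_name is a hypothetical helper name.

```shell
#!/bin/sh
# image_to_pod_name IMAGE
# Replace every character that is not a letter or digit with "-".
# This mirrors the Pod name seen in the run bundle log; it is an observation,
# not the SDK's documented naming rule.
image_to_pod_name() {
  printf '%s\n' "$1" | sed 's/[^A-Za-z0-9]/-/g'
}

image_to_pod_name "jp.icr.io/teruq/hello-ansible-operator-bundle:v0.0.1"
```

With the name in hand, something like oc get pod NAME -n qiita-operators -o jsonpath='{.spec.imagePullSecrets}' should show which pull secret a given Pod ended up with.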

Check the OpenShift console

Select OperatorHub in the OpenShift console. Hello Operator is listed there.


Under Installed Operators, Hello Operator is shown as installed.


Deploy the application

Under Installed Operators, switch the project at the top of the screen to the project where the application will be deployed. Then select Hello Operator.


Select "Create instance".


Last time we made replicas a required property in the CustomResourceDefinition, so it can now be entered on this screen. Select "Create".


The status changes to Running.


Confirm that the Pods are running.

$ oc get pods -n qiita | grep hello
sample-hello-c54fb8b58-7hmbn      1/1     Running   0          28s
sample-hello-c54fb8b58-sdjlk      1/1     Running   0          28s

See the previous article for how to verify that the application works.

Cleanup

You can clean up with the SDK.

$ operator-sdk cleanup hello-ansible-operator
INFO[0002] subscription "hello-ansible-operator-v0-0-1-sub" deleted
INFO[0002] customresourcedefinition "hellos.example.teruq.example.com" deleted
INFO[0002] clusterserviceversion "hello-ansible-operator.v0.0.1" deleted
INFO[0002] catalogsource "hello-ansible-operator-catalog" deleted
INFO[0002] operatorgroup "operator-sdk-og" deleted
INFO[0002] Operator "hello-ansible-operator" uninstalled

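To clean up completely, delete the remaining resources you created by hand, one at a time. The first two commands act on the current project, so add -n qiita-operators if that is not your current project.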

$ oc secrets unlink default all-icr-io
$ oc delete secret all-icr-io
$ oc delete ns qiita-operators

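Impressions

Using a private registry does make the procedure more complicated. As things stand, the Operator cannot be put in a catalog and installed easily, so I would like to find a workaround. I want to investigate further whether there is a good way to do this.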

References

    • https://operatorhub.io/getting-started
    • https://sdk.operatorframework.io/docs/building-operators/ansible/tutorial/