Lifecycle Operations for Running a Redis Cluster on Kubernetes
This is a memo of the K8s and Redis operational steps for the lifecycle of a Redis cluster on Kubernetes: building it, scaling it out, scaling it in, and deleting it.

Building the Redis cluster
This is a record of setting up a Redis cluster with three Redis master nodes and three Redis slave nodes.
K8s cluster node list
The environment is a three-node K8s cluster on IBM Cloud Kubernetes Service (IKS).
$ kubectl get node
NAME STATUS ROLES AGE VERSION
10.193.10.14 Ready <none> 22m v1.13.8+IKS
10.193.10.20 Ready <none> 21m v1.13.8+IKS
10.193.10.58 Ready <none> 22m v1.13.8+IKS
Starting the Redis cluster from a manifest
The manifest used here is taken directly from https://github.com/sanderploegsma/redis-cluster. Although it was not written specifically for IKS, it works unmodified.
$ kubectl apply -f redis-cluster.yml
configmap/redis-cluster created
service/redis-cluster created
statefulset.apps/redis-cluster created
Pod and PersistentVolume creation status
At this stage, each Pod of the Redis cluster has a persistent volume mounted. In the first listing below, redis-cluster-0 is still Pending while its volume is being provisioned; once provisioning completes, all six Pods are Running.
$ kubectl get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
redis-cluster-0 0/1 Pending 0 40s <none> <none> <none> <none>
$ kubectl get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE
redis-cluster-0 1/1 Running 0 28m 172.30.254.8 10.193.10.14
redis-cluster-1 1/1 Running 0 26m 172.30.103.132 10.193.10.58
redis-cluster-2 1/1 Running 0 24m 172.30.63.2 10.193.10.20
redis-cluster-3 1/1 Running 0 22m 172.30.254.9 10.193.10.14
redis-cluster-4 1/1 Running 0 20m 172.30.63.3 10.193.10.20
redis-cluster-5 1/1 Running 0 18m 172.30.103.133 10.193.10.58
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
data-redis-cluster-0 Bound pvc-43b58162 20Gi RWO ibmc-file-bronze 28m
data-redis-cluster-1 Bound pvc-86a09801 20Gi RWO ibmc-file-bronze 26m
data-redis-cluster-2 Bound pvc-d3110510 20Gi RWO ibmc-file-bronze 24m
data-redis-cluster-3 Bound pvc-19828267 20Gi RWO ibmc-file-bronze 22m
data-redis-cluster-4 Bound pvc-6515149b 20Gi RWO ibmc-file-bronze 20m
data-redis-cluster-5 Bound pvc-a8929ead 20Gi RWO ibmc-file-bronze 18m
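As a quick sanity check (a sketch; it assumes the manifest mounts the data volume at /data, which should be verified against the manifest itself), you can confirm the mount inside a Pod:
$ kubectl exec redis-cluster-0 -- df -h /data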
Initializing the Redis cluster
Run a command on one of the Redis Pods to form the Redis cluster.
$ kubectl exec -it redis-cluster-0 -- redis-cli --cluster create --cluster-replicas 1 \
> $(kubectl get pods -l app=redis-cluster -o jsonpath='{range .items[*]}{.status.podIP}:6379 {end}')
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 172.30.254.9:6379 to 172.30.254.8:6379
Adding replica 172.30.63.3:6379 to 172.30.103.132:6379
Adding replica 172.30.103.133:6379 to 172.30.63.2:6379
M: 901cdffd0f11f30e09ba2c22540ca2d31f1219e2 172.30.254.8:6379
slots:[0-5460] (5461 slots) master
M: e3246d584d2390f7775ff168da4f8838939c6e53 172.30.103.132:6379
slots:[5461-10922] (5462 slots) master
M: 9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe 172.30.63.2:6379
slots:[10923-16383] (5461 slots) master
S: 3309f5c2f345f4d4facd5af967588d794225fef1 172.30.254.9:6379
replicates 901cdffd0f11f30e09ba2c22540ca2d31f1219e2
S: cfc0a7bb96305308cc35fd6af37b59b2e2b117ae 172.30.63.3:6379
replicates e3246d584d2390f7775ff168da4f8838939c6e53
S: 2b4a9d44f3576c239101afc16a002db6cc19be2f 172.30.103.133:6379
replicates 9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
....
>>> Performing Cluster Check (using node 172.30.254.8:6379)
M: 901cdffd0f11f30e09ba2c22540ca2d31f1219e2 172.30.254.8:6379
slots:[0-5460] (5461 slots) master
1 additional replica(s)
S: 2b4a9d44f3576c239101afc16a002db6cc19be2f 172.30.103.133:6379
slots: (0 slots) slave
replicates 9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe
M: e3246d584d2390f7775ff168da4f8838939c6e53 172.30.103.132:6379
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
M: 9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe 172.30.63.2:6379
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
S: cfc0a7bb96305308cc35fd6af37b59b2e2b117ae 172.30.63.3:6379
slots: (0 slots) slave
replicates e3246d584d2390f7775ff168da4f8838939c6e53
S: 3309f5c2f345f4d4facd5af967588d794225fef1 172.30.254.9:6379
slots: (0 slots) slave
replicates 901cdffd0f11f30e09ba2c22540ca2d31f1219e2
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
Confirming the Redis cluster members
$ kubectl run -it redis-cli --rm --image redis --restart=Never -- bash
If you don't see a command prompt, try pressing enter.
root@redis-cli:/data# redis-cli -c -h 172.30.254.8 -p 6379
172.30.254.8:6379> cluster nodes
2b4a9d44f3576c239101afc16a002db6cc19be2f 172.30.103.133:6379@16379 slave 9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe 0 1564558966000 6 connected
e3246d584d2390f7775ff168da4f8838939c6e53 172.30.103.132:6379@16379 master - 0 1564558965000 2 connected 5461-10922
9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe 172.30.63.2:6379@16379 master - 0 1564558966134 3 connected 10923-16383
cfc0a7bb96305308cc35fd6af37b59b2e2b117ae 172.30.63.3:6379@16379 slave e3246d584d2390f7775ff168da4f8838939c6e53 0 1564558967137 5 connected
3309f5c2f345f4d4facd5af967588d794225fef1 172.30.254.9:6379@16379 slave 901cdffd0f11f30e09ba2c22540ca2d31f1219e2 0 1564558965128 4 connected
901cdffd0f11f30e09ba2c22540ca2d31f1219e2 172.30.254.8:6379@16379 myself,master - 0 1564558964000 1 connected 0-5460
Analysis: view sorted by Pod name
The masters start first, followed by the slaves, in ordinal order, so the master-slave relationships can be read off directly.
NAME IP NODE ROLE MASTER
redis-cluster-0 172.30.254.8 10.193.10.14 Master -
redis-cluster-1 172.30.103.132 10.193.10.58 Master -
redis-cluster-2 172.30.63.2 10.193.10.20 Master -
redis-cluster-3 172.30.254.9 10.193.10.14 Slave redis-cluster-0
redis-cluster-4 172.30.63.3 10.193.10.20 Slave redis-cluster-1
redis-cluster-5 172.30.103.133 10.193.10.58 Slave redis-cluster-2
From the above results, the master-to-slave mapping is as follows.
Master -> Slave (replica)
redis-cluster-0 -> redis-cluster-3
redis-cluster-1 -> redis-cluster-4
redis-cluster-2 -> redis-cluster-5
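The NAME, IP, and NODE columns of these views can be generated with a one-liner (a sketch; the ROLE and MASTER columns still have to be read from the cluster nodes output):
$ kubectl get pods -l app=redis-cluster -o custom-columns=NAME:.metadata.name,IP:.status.podIP,NODE:.spec.nodeName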
Analysis: view sorted by K8s node
Sorting by node shows which Pods are co-located on the same K8s node. The result: master redis-cluster-0 and slave redis-cluster-3 share node *.14. If that K8s node stops, hash slots [0-5460] are lost, because the master and its slave disappear together and there is no slave left to promote for continued service. (A sketch of an anti-affinity rule that mitigates this follows the table below.)
If K8s node (virtual server) *.14 stops, GCP starts a replacement K8s node (virtual server) immediately, so normal operation is restored in a short time. On IKS as well, the impact of the outage can be kept small as long as a replacement K8s node (virtual server) can be started quickly.
NAME IP NODE ROLE MASTER
redis-cluster-0 172.30.254.8 10.193.10.14 Master -
redis-cluster-3 172.30.254.9 10.193.10.14 Slave redis-cluster-0
redis-cluster-2 172.30.63.2 10.193.10.20 Master -
redis-cluster-4 172.30.63.3 10.193.10.20 Slave redis-cluster-1
redis-cluster-1 172.30.103.132 10.193.10.58 Master -
redis-cluster-5 172.30.103.133 10.193.10.58 Slave redis-cluster-2
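One way to reduce the chance of such co-location is pod anti-affinity. The following is a minimal sketch, not part of the original manifest: it patches the StatefulSet so the scheduler prefers to spread app=redis-cluster Pods across nodes. Note that patching the Pod template makes the StatefulSet re-create its Pods according to its update strategy, and a soft preference cannot strictly guarantee that a master and its own replica never share a node, since Kubernetes does not know which Pod holds which Redis role.
$ kubectl patch statefulset redis-cluster --type merge -p '
> spec:
>   template:
>     spec:
>       affinity:
>         podAntiAffinity:
>           preferredDuringSchedulingIgnoredDuringExecution:
>           - weight: 100
>             podAffinityTerm:
>               labelSelector:
>                 matchLabels:
>                   app: redis-cluster
>               topologyKey: kubernetes.io/hostname
> '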
The Redis cluster is now built. Clients inside the K8s cluster can reach it through the internal DNS name redis-cluster. Of course, a client library that supports Redis Cluster is required.
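As a quick connectivity check (a sketch; it assumes the redis-cluster Service name resolves from the client Pod's namespace):
$ kubectl run -it redis-cli-test --rm --image redis --restart=Never -- \
> redis-cli -c -h redis-cluster set hello world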
Scaling out the Redis cluster
Add Pods to the running Redis cluster to increase the number of Redis masters and slaves. The number of K8s nodes is not increased. The steps are as follows:
- Set the StatefulSet controller's replica count to 8
- Add a Redis master
- Add a Redis slave
- Run a rebalance to even out the data onto the added Redis master
Increasing the replica count from 6 to 8
Set the new replica count on the StatefulSet controller. New members are appended at the end of the ordinal sequence, and the added Pods mount persistent volumes just like the existing ones.
$ kubectl scale statefulset redis-cluster --replicas=8
statefulset.apps/redis-cluster scaled
$ kubectl get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE
redis-cluster-0 1/1 Running 0 36m 172.30.254.8 10.193.10.14
redis-cluster-1 1/1 Running 0 35m 172.30.103.132 10.193.10.58
redis-cluster-2 1/1 Running 0 32m 172.30.63.2 10.193.10.20
redis-cluster-3 1/1 Running 0 30m 172.30.254.9 10.193.10.14
redis-cluster-4 1/1 Running 0 28m 172.30.63.3 10.193.10.20
redis-cluster-5 1/1 Running 0 26m 172.30.103.133 10.193.10.58
redis-cluster-6 1/1 Running 0 3m41s 172.30.254.10 10.193.10.14
redis-cluster-7 1/1 Running 0 112s 172.30.63.4 10.193.10.20
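Before registering the new Pods with the Redis cluster, you can wait until they are Ready (a sketch):
$ kubectl rollout status statefulset redis-cluster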
Adding a Redis master
Add Pod redis-cluster-6 as a Redis master by announcing it to redis-cluster-0. Here, $(kubectl get pod redis-cluster-6 -o jsonpath='{.status.podIP}') returns the IP address of the named Pod. You could look up the IP address and type it in by hand, but this way is neater.
$ kubectl exec redis-cluster-0 -- redis-cli --cluster add-node \
> $(kubectl get pod redis-cluster-6 -o jsonpath='{.status.podIP}'):6379 \
> $(kubectl get pod redis-cluster-0 -o jsonpath='{.status.podIP}'):6379
>>> Adding node 172.30.254.10:6379 to cluster 172.30.254.8:6379
>>> Performing Cluster Check (using node 172.30.254.8:6379)
M: 901cdffd0f11f30e09ba2c22540ca2d31f1219e2 172.30.254.8:6379
slots:[0-5460] (5461 slots) master
1 additional replica(s)
S: 2b4a9d44f3576c239101afc16a002db6cc19be2f 172.30.103.133:6379
slots: (0 slots) slave
replicates 9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe
M: e3246d584d2390f7775ff168da4f8838939c6e53 172.30.103.132:6379
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
M: 9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe 172.30.63.2:6379
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
S: cfc0a7bb96305308cc35fd6af37b59b2e2b117ae 172.30.63.3:6379
slots: (0 slots) slave
replicates e3246d584d2390f7775ff168da4f8838939c6e53
S: 3309f5c2f345f4d4facd5af967588d794225fef1 172.30.254.9:6379
slots: (0 slots) slave
replicates 901cdffd0f11f30e09ba2c22540ca2d31f1219e2
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 172.30.254.10:6379 to make it join the cluster.
[OK] New node added correctly.
A supplementary note on how the Pod IP address is obtained: running just the part inside $() shows that it returns the IP address of the redis-cluster-0 Pod.
$ kubectl get pod redis-cluster-0 -o jsonpath='{.status.podIP}';echo
172.30.254.8
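The multi-Pod variant used when creating the cluster iterates over all Pods with {range ...}{end}; run on its own, it prints the space-separated address list:
$ kubectl get pods -l app=redis-cluster -o jsonpath='{range .items[*]}{.status.podIP}:6379 {end}'; echo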
Listing the Redis nodes
Starting another interactive Pod, we can confirm the state: four masters and three slaves.
$ kubectl run -it redis-cli --rm --image redis --restart=Never -- bash
If you don't see a command prompt, try pressing enter.
root@redis-cli:/data# redis-cli -c -h 172.30.254.8 -p 6379
172.30.254.8:6379> cluster nodes
2b4a9d44f3576c239101afc16a002db6cc19be2f 172.30.103.133:6379@16379 slave 9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe 0 1564558966000 6 connected
e3246d584d2390f7775ff168da4f8838939c6e53 172.30.103.132:6379@16379 master - 0 1564558965000 2 connected 5461-10922
9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe 172.30.63.2:6379@16379 master - 0 1564558966134 3 connected 10923-16383
f4631fb91f42024db54a829dfc00a12e9035761e 172.30.254.10:6379@16379 master - 0 1564558965000 0 connected <--- added
cfc0a7bb96305308cc35fd6af37b59b2e2b117ae 172.30.63.3:6379@16379 slave e3246d584d2390f7775ff168da4f8838939c6e53 0 1564558967137 5 connected
3309f5c2f345f4d4facd5af967588d794225fef1 172.30.254.9:6379@16379 slave 901cdffd0f11f30e09ba2c22540ca2d31f1219e2 0 1564558965128 4 connected
901cdffd0f11f30e09ba2c22540ca2d31f1219e2 172.30.254.8:6379@16379 myself,master - 0 1564558964000 1 connected 0-5460
Adding a Redis slave
Since we added a Redis master, we now add a Redis slave to replicate that master's data. We announce Pod redis-cluster-7 to Pod redis-cluster-0, addressed by IP and port. Because the Redis masters in a cluster are all peers, the command could just as well be run against any other Redis master Pod.
$ kubectl exec redis-cluster-0 -- redis-cli --cluster add-node --cluster-slave \
> $(kubectl get pod redis-cluster-7 -o jsonpath='{.status.podIP}'):6379 \
> $(kubectl get pod redis-cluster-0 -o jsonpath='{.status.podIP}'):6379
>>> Adding node 172.30.63.4:6379 to cluster 172.30.254.8:6379
>>> Performing Cluster Check (using node 172.30.254.8:6379)
M: 901cdffd0f11f30e09ba2c22540ca2d31f1219e2 172.30.254.8:6379
slots:[0-5460] (5461 slots) master
1 additional replica(s)
S: 2b4a9d44f3576c239101afc16a002db6cc19be2f 172.30.103.133:6379
slots: (0 slots) slave
replicates 9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe
M: e3246d584d2390f7775ff168da4f8838939c6e53 172.30.103.132:6379
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
M: 9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe 172.30.63.2:6379
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
M: f4631fb91f42024db54a829dfc00a12e9035761e 172.30.254.10:6379
slots: (0 slots) master
S: cfc0a7bb96305308cc35fd6af37b59b2e2b117ae 172.30.63.3:6379
slots: (0 slots) slave
replicates e3246d584d2390f7775ff168da4f8838939c6e53
S: 3309f5c2f345f4d4facd5af967588d794225fef1 172.30.254.9:6379
slots: (0 slots) slave
replicates 901cdffd0f11f30e09ba2c22540ca2d31f1219e2
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
Automatically selected master 172.30.254.10:6379
>>> Send CLUSTER MEET to node 172.30.63.4:6379 to make it join the cluster.
Waiting for the cluster to join
>>> Configure node as replica of 172.30.254.10:6379.
[OK] New node added correctly.
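As the output shows, add-node selected the master for the new replica automatically (Automatically selected master 172.30.254.10:6379). To pin the replica to a specific master instead, redis-cli accepts a master ID; a sketch:
$ kubectl exec redis-cluster-0 -- redis-cli --cluster add-node --cluster-slave \
> --cluster-master-id f4631fb91f42024db54a829dfc00a12e9035761e \
> $(kubectl get pod redis-cluster-7 -o jsonpath='{.status.podIP}'):6379 \
> $(kubectl get pod redis-cluster-0 -o jsonpath='{.status.podIP}'):6379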
List the masters and slaves and confirm the state.
172.30.254.8:6379> cluster nodes
2b4a9d44f3576c239101afc16a002db6cc19be2f 172.30.103.133:6379@16379 slave 9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe 0 1564559113000 6 connected
e3246d584d2390f7775ff168da4f8838939c6e53 172.30.103.132:6379@16379 master - 0 1564559115804 2 connected 5461-10922
9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe 172.30.63.2:6379@16379 master - 0 1564559113000 3 connected 10923-16383
f4631fb91f42024db54a829dfc00a12e9035761e 172.30.254.10:6379@16379 master - 0 1564559114799 0 connected
63e279bbd0fd303240248dc530968ff8e0177d33 172.30.63.4:6379@16379 slave f4631fb91f42024db54a829dfc00a12e9035761e 0 1564559112789 0 connected
cfc0a7bb96305308cc35fd6af37b59b2e2b117ae 172.30.63.3:6379@16379 slave e3246d584d2390f7775ff168da4f8838939c6e53 0 1564559114000 5 connected
3309f5c2f345f4d4facd5af967588d794225fef1 172.30.254.9:6379@16379 slave 901cdffd0f11f30e09ba2c22540ca2d31f1219e2 0 1564559114000 4 connected
901cdffd0f11f30e09ba2c22540ca2d31f1219e2 172.30.254.8:6379@16379 myself,master - 0 1564559111000 1 connected 0-5460
Rebalancing data onto the new master
Now that the Redis master and slave Pods have been added, rebalance the data from the existing Redis masters so that the new master and slave take on their share of the load.
$ kubectl exec redis-cluster-0 -- redis-cli --cluster rebalance --cluster-use-empty-masters \
> $(kubectl get pod redis-cluster-0 -o jsonpath='{.status.podIP}'):6379
>>> Performing Cluster Check (using node 172.30.254.8:6379)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Rebalancing across 4 nodes. Total weight = 4.00
Moving 1366 slots from 172.30.103.132:6379 to 172.30.254.10:6379
###### (output omitted)
Moving 1365 slots from 172.30.63.2:6379 to 172.30.254.10:6379
###### (output omitted)
Moving 1365 slots from 172.30.254.8:6379 to 172.30.254.10:6379
###### (output omitted)
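To verify the slot distribution after the rebalance, redis-cli provides a check subcommand:
$ kubectl exec redis-cluster-0 -- redis-cli --cluster check \
> $(kubectl get pod redis-cluster-0 -o jsonpath='{.status.podIP}'):6379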
With that, the scale-out of masters and slaves is complete.
Scaling in the Redis cluster: reducing the set of Redis masters and slaves
Next is how to shrink the Redis cluster. Simply reducing the StatefulSet replica count is not enough while the Redis nodes still hold data, so the procedure breaks down into the following steps:
- Remove the slave of the Redis master that is to be deleted
- Evict the data from that Redis master to the other masters
- Remove the Redis master from the Redis cluster
- Rebalance the data across the remaining Redis masters
- Reduce the StatefulSet Pod count to shrink the Redis cluster
When a StatefulSet is scaled in, Pods are removed in reverse ordinal order, last added first. So setting the replica count to 6 deletes the most recently added redis-cluster-7 and redis-cluster-6.
Removing the slave
Make a note of the Redis cluster node ID of redis-cluster-7, the Pod to be removed.
$ kubectl exec redis-cluster-7 -- redis-cli cluster nodes | grep myself
63e279bbd0fd303240248dc530968ff8e0177d33 172.30.63.4:6379@16379 myself,slave f4631fb91f42024db54a829dfc00a12e9035761e 0 1564559920000 7 connected
Run the command to remove redis-cluster-7 against the Redis master redis-cluster-0.
$ kubectl exec redis-cluster-0 -- redis-cli --cluster del-node \
> $(kubectl get pod redis-cluster-0 -o jsonpath='{.status.podIP}'):6379 \
> 63e279bbd0fd303240248dc530968ff8e0177d33
>>> Removing node 63e279bbd0fd303240248dc530968ff8e0177d33 from cluster 172.30.254.8:6379
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.
command terminated with exit code 1
It is not clear why the command ends with exit code 1 (most likely because the final SHUTDOWN drops the connection to the node), but the removal succeeded.
Let's confirm that the slave is gone. After the removal there are four masters and three slaves.
## Before removal
172.30.254.8:6379> cluster nodes
2b4a9d44f3576c239101afc16a002db6cc19be2f 172.30.103.133:6379@16379 slave 9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe 0 1564559113000 6 connected
e3246d584d2390f7775ff168da4f8838939c6e53 172.30.103.132:6379@16379 master - 0 1564559115804 2 connected 5461-10922
9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe 172.30.63.2:6379@16379 master - 0 1564559113000 3 connected 10923-16383
f4631fb91f42024db54a829dfc00a12e9035761e 172.30.254.10:6379@16379 master - 0 1564559114799 0 connected
63e279bbd0fd303240248dc530968ff8e0177d33 172.30.63.4:6379@16379 slave f4631fb91f42024db54a829dfc00a12e9035761e 0 1564559112789 0 connected
cfc0a7bb96305308cc35fd6af37b59b2e2b117ae 172.30.63.3:6379@16379 slave e3246d584d2390f7775ff168da4f8838939c6e53 0 1564559114000 5 connected
3309f5c2f345f4d4facd5af967588d794225fef1 172.30.254.9:6379@16379 slave 901cdffd0f11f30e09ba2c22540ca2d31f1219e2 0 1564559114000 4 connected
901cdffd0f11f30e09ba2c22540ca2d31f1219e2 172.30.254.8:6379@16379 myself,master - 0 1564559111000 1 connected 0-5460
## After removal
172.30.254.8:6379> cluster nodes
2b4a9d44f3576c239101afc16a002db6cc19be2f 172.30.103.133:6379@16379 slave 9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe 0 1564560086177 6 connected
e3246d584d2390f7775ff168da4f8838939c6e53 172.30.103.132:6379@16379 master - 0 1564560085000 2 connected 6827-10922
9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe 172.30.63.2:6379@16379 master - 0 1564560085176 3 connected 12288-16383
f4631fb91f42024db54a829dfc00a12e9035761e 172.30.254.10:6379@16379 master - 0 1564560085000 8 connected 0-1364 5461-6826 10923-12287
cfc0a7bb96305308cc35fd6af37b59b2e2b117ae 172.30.63.3:6379@16379 slave e3246d584d2390f7775ff168da4f8838939c6e53 0 1564560084171 5 connected
3309f5c2f345f4d4facd5af967588d794225fef1 172.30.254.9:6379@16379 slave 901cdffd0f11f30e09ba2c22540ca2d31f1219e2 0 1564560083000 4 connected
901cdffd0f11f30e09ba2c22540ca2d31f1219e2 172.30.254.8:6379@16379 myself,master - 0 1564560082000 1 connected 1365-5460
Removing the Redis master
Make a note of the ID of the Redis master to be removed. The command to get the Redis cluster ID held by the Pod redis-cluster-6 is as follows.
$ kubectl exec redis-cluster-6 -- redis-cli cluster nodes | grep myself
f4631fb91f42024db54a829dfc00a12e9035761e 172.30.254.10:6379@16379 myself,master - 0 1564562825000 8 connected 0-1364 5461-6826 10923-12287
$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE
redis-cli 1/1 Running 0 65m 172.30.254.11 10.193.10.14
redis-cluster-0 1/1 Running 0 104m 172.30.254.8 10.193.10.14
redis-cluster-1 1/1 Running 0 102m 172.30.103.132 10.193.10.58
redis-cluster-2 1/1 Running 0 100m 172.30.63.2 10.193.10.20
redis-cluster-3 1/1 Running 0 98m 172.30.254.9 10.193.10.14
redis-cluster-4 1/1 Running 0 96m 172.30.63.3 10.193.10.20
redis-cluster-5 1/1 Running 0 94m 172.30.103.133 10.193.10.58
redis-cluster-6 1/1 Running 0 70m 172.30.254.10 10.193.10.14
redis-cluster-7 1/1 Running 1 69m 172.30.63.4 10.193.10.20
Resharding to migrate the data out
Move the data stored on the Redis master redis-cluster-6 to another Redis master.
--cluster-from is the ID of redis-cluster-6.
--cluster-to is the ID of redis-cluster-0.
In other words, redis-cluster-6's data is transferred to redis-cluster-0. A rebalance is run again afterwards, so this is only a temporary destination. Incidentally, redis-cluster-7 in the listing above shows RESTARTS 1: the StatefulSet controller restarted the Pod after del-node shut it down.
$ kubectl exec redis-cluster-0 -- redis-cli --cluster reshard --cluster-yes \
> --cluster-from f4631fb91f42024db54a829dfc00a12e9035761e \
> --cluster-to 901cdffd0f11f30e09ba2c22540ca2d31f1219e2 \
> --cluster-slots 16384 $(kubectl get pod redis-cluster-0 -o jsonpath='{.status.podIP}'):6379
>>> Performing Cluster Check (using node 172.30.254.8:6379)
M: 901cdffd0f11f30e09ba2c22540ca2d31f1219e2 172.30.254.8:6379
slots:[1365-5460] (4096 slots) master
1 additional replica(s)
S: 2b4a9d44f3576c239101afc16a002db6cc19be2f 172.30.103.133:6379
slots: (0 slots) slave
replicates 901cdffd0f11f30e09ba2c22540ca2d31f1219e2
M: e3246d584d2390f7775ff168da4f8838939c6e53 172.30.103.132:6379
slots:[6827-10922] (4096 slots) master
1 additional replica(s)
M: 9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe 172.30.63.2:6379
slots:[12288-16383] (4096 slots) master
1 additional replica(s)
M: f4631fb91f42024db54a829dfc00a12e9035761e 172.30.254.10:6379
slots:[0-1364],[5461-6826],[10923-12287] (4096 slots) master
S: cfc0a7bb96305308cc35fd6af37b59b2e2b117ae 172.30.63.3:6379
slots: (0 slots) slave
replicates e3246d584d2390f7775ff168da4f8838939c6e53
S: 3309f5c2f345f4d4facd5af967588d794225fef1 172.30.254.9:6379
slots: (0 slots) slave
replicates 9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
Ready to move 16384 slots.
Source nodes:
M: f4631fb91f42024db54a829dfc00a12e9035761e 172.30.254.10:6379
slots:[0-1364],[5461-6826],[10923-12287] (4096 slots) master
Destination node:
M: 901cdffd0f11f30e09ba2c22540ca2d31f1219e2 172.30.254.8:6379
slots:[1365-5460] (4096 slots) master
1 additional replica(s)
Resharding plan:
Moving slot 0 from f4631fb91f42024db54a829dfc00a12e9035761e
Moving slot 1 from f4631fb91f42024db54a829dfc00a12e9035761e
Moving slot 2 from f4631fb91f42024db54a829dfc00a12e9035761e
No hash slot ranges are shown next to f4631fb91f42024db54a829dfc00a12e9035761e anymore, so the data has been moved off redis-cluster-6.
$ kubectl exec redis-cluster-6 -- redis-cli cluster nodes | grep master
f4631fb91f42024db54a829dfc00a12e9035761e 172.30.254.10:6379@16379 myself,master - 0 1564574487000 8 connected
901cdffd0f11f30e09ba2c22540ca2d31f1219e2 172.30.254.8:6379@16379 master - 0 1564574488000 10 connected 0-6826 10923-12287
e3246d584d2390f7775ff168da4f8838939c6e53 172.30.103.132:6379@16379 master - 0 1564574490000 2 connected 6827-10922
9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe 172.30.63.2:6379@16379 master - 0 1564574485735 9 connected 12288-16383
$ kubectl get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE
redis-cli 1/1 Running 0 4h19m 172.30.254.11 10.193.10.14
redis-cluster-0 1/1 Running 0 4h58m 172.30.254.8 10.193.10.14
redis-cluster-1 1/1 Running 0 4h56m 172.30.103.132 10.193.10.58
redis-cluster-2 1/1 Running 0 4h54m 172.30.63.2 10.193.10.20
redis-cluster-3 1/1 Running 0 4h52m 172.30.254.9 10.193.10.14
redis-cluster-4 1/1 Running 0 4h50m 172.30.63.3 10.193.10.20
redis-cluster-5 1/1 Running 0 4h48m 172.30.103.133 10.193.10.58
redis-cluster-6 1/1 Running 0 4h25m 172.30.254.10 10.193.10.14
redis-cluster-7 1/1 Running 1 4h23m 172.30.63.4 10.193.10.20
Removing the Redis master redis-cluster-6 from the Redis cluster
Now that the data has been migrated off the Redis master redis-cluster-6, remove it from the Redis cluster.
$ kubectl exec redis-cluster-0 -- redis-cli --cluster del-node \
> $(kubectl get pod redis-cluster-0 -o jsonpath='{.status.podIP}'):6379 \
> f4631fb91f42024db54a829dfc00a12e9035761e
Confirm the removal: the Redis masters are now down to three.
172.30.103.132:6379> cluster nodes
cfc0a7bb96305308cc35fd6af37b59b2e2b117ae 172.30.63.3:6379@16379 slave e3246d584d2390f7775ff168da4f8838939c6e53 0 1564574917531 5 connected
2b4a9d44f3576c239101afc16a002db6cc19be2f 172.30.103.133:6379@16379 slave 901cdffd0f11f30e09ba2c22540ca2d31f1219e2 0 1564574919544 10 connected
9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe 172.30.63.2:6379@16379 master - 0 1564574920548 9 connected 12288-16383
3309f5c2f345f4d4facd5af967588d794225fef1 172.30.254.9:6379@16379 slave 9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe 0 1564574919000 9 connected
e3246d584d2390f7775ff168da4f8838939c6e53 172.30.103.132:6379@16379 myself,master - 0 1564574919000 2 connected 6827-10922
901cdffd0f11f30e09ba2c22540ca2d31f1219e2 172.30.254.8:6379@16379 master - 0 1564574920000 10 connected 0-6826 10923-12287
Rebalancing the data after removal
Finally, run a rebalance to even out the data across the remaining masters.
$ kubectl exec redis-cluster-0 -- redis-cli --cluster rebalance --cluster-use-empty-masters \
> $(kubectl get pod redis-cluster-0 -o jsonpath='{.status.podIP}'):6379
>>> Performing Cluster Check (using node 172.30.254.8:6379)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Rebalancing across 3 nodes. Total weight = 3.00
Moving 1366 slots from 172.30.254.8:6379 to 172.30.103.132:6379
###### (output omitted)
Moving 1365 slots from 172.30.254.8:6379 to 172.30.63.2:6379
###### (output omitted)
Confirm the state after the rebalance.
172.30.103.132:6379> cluster nodes
cfc0a7bb96305308cc35fd6af37b59b2e2b117ae 172.30.63.3:6379@16379 slave e3246d584d2390f7775ff168da4f8838939c6e53 0 1564575387676 11 connected
2b4a9d44f3576c239101afc16a002db6cc19be2f 172.30.103.133:6379@16379 slave 901cdffd0f11f30e09ba2c22540ca2d31f1219e2 0 1564575390000 10 connected
9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe 172.30.63.2:6379@16379 master - 0 1564575389000 12 connected 1366-2730 12288-16383
3309f5c2f345f4d4facd5af967588d794225fef1 172.30.254.9:6379@16379 slave 9a3951b4d0fd3c78a927ccb6e43ff62fe7875fbe 0 1564575388000 12 connected
e3246d584d2390f7775ff168da4f8838939c6e53 172.30.103.132:6379@16379 myself,master - 0 1564575388000 11 connected 0-1365 6827-10922
901cdffd0f11f30e09ba2c22540ca2d31f1219e2 172.30.254.8:6379@16379 master - 0 1564575390694 10 connected 2731-6826 10923-12287
Reducing the StatefulSet Pod count to shrink the Redis cluster
Finally, reduce the StatefulSet replica count to 6 and watch redis-cluster-6 and redis-cluster-7 be deleted.
$ kubectl get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE
redis-cli 1/1 Running 0 4h35m 172.30.254.11 10.193.10.14
redis-cluster-0 1/1 Running 0 5h14m 172.30.254.8 10.193.10.14
redis-cluster-1 1/1 Running 0 5h12m 172.30.103.132 10.193.10.58
redis-cluster-2 1/1 Running 0 5h10m 172.30.63.2 10.193.10.20
redis-cluster-3 1/1 Running 0 5h8m 172.30.254.9 10.193.10.14
redis-cluster-4 1/1 Running 0 5h6m 172.30.63.3 10.193.10.20
redis-cluster-5 1/1 Running 0 5h4m 172.30.103.133 10.193.10.58
redis-cluster-6 1/1 Running 1 4h41m 172.30.254.10 10.193.10.14
redis-cluster-7 1/1 Running 1 4h39m 172.30.63.4 10.193.10.20
$ kubectl scale statefulset redis-cluster --replicas=6
statefulset.apps/redis-cluster scaled
$ kubectl get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE
redis-cli 1/1 Running 0 4h38m 172.30.254.11 10.193.10.14
redis-cluster-0 1/1 Running 0 5h17m 172.30.254.8 10.193.10.14
redis-cluster-1 1/1 Running 0 5h15m 172.30.103.132 10.193.10.58
redis-cluster-2 1/1 Running 0 5h13m 172.30.63.2 10.193.10.20
redis-cluster-3 1/1 Running 0 5h11m 172.30.254.9 10.193.10.14
redis-cluster-4 1/1 Running 0 5h9m 172.30.63.3 10.193.10.20
redis-cluster-5 1/1 Running 0 5h7m 172.30.103.133 10.193.10.58
The scale-in of the Redis cluster is complete.
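Note that scaling in a StatefulSet does not delete the PersistentVolumeClaims of the removed Pods: data-redis-cluster-6 and data-redis-cluster-7 remain until they are deleted explicitly, as the cleanup below confirms. To check:
$ kubectl get pvc -l app=redis-cluster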
Cleaning up the cluster
Finally, here is how to delete the StatefulSet controller together with the Pods and Service under its management, the ConfigMap, and every persistent volume claim.
$ kubectl delete statefulset,svc,configmap,pvc -l app=redis-cluster
statefulset.apps "redis-cluster" deleted
service "redis-cluster" deleted
configmap "redis-cluster" deleted
persistentvolumeclaim "data-redis-cluster-0" deleted
persistentvolumeclaim "data-redis-cluster-1" deleted
persistentvolumeclaim "data-redis-cluster-2" deleted
persistentvolumeclaim "data-redis-cluster-3" deleted
persistentvolumeclaim "data-redis-cluster-4" deleted
persistentvolumeclaim "data-redis-cluster-5" deleted
persistentvolumeclaim "data-redis-cluster-6" deleted
persistentvolumeclaim "data-redis-cluster-7" deleted
Closing remarks
I am interested in whether Operators, which Red Hat is promoting, can offer a convenient and practical way to handle this kind of work. Before that, though, I want to look into how to deploy across multiple zones.
References
- https://github.com/sanderploegsma/redis-cluster