Cassandra Cluster Configuration and Repair

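Introduction

We use Cassandra as our distributed big-data database. This article covers the operations we perform on a cluster: adding nodes, removing nodes, changing the replication factor, and running repairs.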

What is Cassandra?

Website: http://cassandra.apache.org/

(Figure not reproduced: DB-Engines database ranking.)

Source: https://db-engines.com/en/ranking

Cluster Replication Strategy

Reference: http://cassandra.apache.org/doc/latest/architecture/dynamo.html#replication-strategy

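NetworkTopologyStrategy

Choose NetworkTopologyStrategy when the cluster spans multiple data centers.

SimpleStrategy

Choose SimpleStrategy when the cluster runs in a single data center only.

Checking a keyspace's current strategy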

SELECT * FROM system_schema.keyspaces WHERE keyspace_name = 'my_keyspace';

Changing a keyspace's strategy

ALTER KEYSPACE my_keyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3};
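
For a cluster with multiple data centers, the same statement would use NetworkTopologyStrategy with one replication factor per data center. The following is a minimal Python sketch that only builds the CQL text for either strategy. The helper name alter_keyspace_cql and the data center names dc1 and dc2 are hypothetical; real data center names must match what the cluster's snitch reports.

```python
def alter_keyspace_cql(keyspace, strategy, replication):
    """Build an ALTER KEYSPACE statement (illustrative helper, not a driver API).

    replication: an int replication factor for SimpleStrategy, or a dict
                 {data_center_name: factor} for NetworkTopologyStrategy.
    """
    if strategy == "SimpleStrategy":
        options = {"class": strategy, "replication_factor": int(replication)}
    elif strategy == "NetworkTopologyStrategy":
        options = {"class": strategy}
        options.update({dc: int(rf) for dc, rf in replication.items()})
    else:
        raise ValueError(f"unsupported strategy: {strategy}")

    def literal(value):
        # Integers are written bare, strings are single-quoted.
        return str(value) if isinstance(value, int) else f"'{value}'"

    body = ", ".join(f"'{key}': {literal(value)}" for key, value in options.items())
    return f"ALTER KEYSPACE {keyspace} WITH replication = {{{body}}};"


# 'dc1' and 'dc2' are made-up data center names for illustration.
print(alter_keyspace_cql("my_keyspace", "NetworkTopologyStrategy", {"dc1": 3, "dc2": 2}))
```

The generated statement can be pasted into cqlsh. Changing the replication settings does not move existing data by itself, which is why a repair follows such a change.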

Tuning JVM parameters

Configuration file: /etc/cassandra/conf/jvm.options
This file holds the heap and garbage-collection settings.
For example:

#################
# HEAP SETTINGS #
#################

# Heap size is automatically calculated by cassandra-env based on this
# formula: max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB))
# That is:
# - calculate 1/2 ram and cap to 1024MB
# - calculate 1/4 ram and cap to 8192MB
# - pick the max
#
# For production use you may wish to adjust this for your environment.
# If that's the case, uncomment the -Xmx and Xms options below to override the
# automatic calculation of JVM heap memory.
#
# It is recommended to set min (-Xms) and max (-Xmx) heap sizes to
# the same value to avoid stop-the-world GC pauses during resize, and
# so that we can lock the heap in memory on startup to prevent any
# of it from being swapped out.
-Xms8G
-Xmx8G

See this article for details: https://docs.datastax.com/en/ddac/doc/datastax_enterprise/operations/opsTuneJVM.html
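
To see what the automatic calculation described in the comments above would give on a particular machine, the formula can be evaluated directly. This is a small Python sketch of that formula only; the function name auto_heap_mb is hypothetical, and the use of integer division is an assumption about how the startup script rounds.

```python
def auto_heap_mb(system_ram_mb):
    """Heap size in MB per the formula quoted in the jvm.options comments:
    max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB)).
    """
    half_capped = min(system_ram_mb // 2, 1024)      # 1/2 of RAM, capped at 1024 MB
    quarter_capped = min(system_ram_mb // 4, 8192)   # 1/4 of RAM, capped at 8192 MB
    return max(half_capped, quarter_capped)


for ram_mb in (2048, 16384, 65536):
    print(ram_mb, "MB RAM ->", auto_heap_mb(ram_mb), "MB heap")
```

Setting -Xms and -Xmx explicitly, as in the excerpt, overrides this calculation.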

Adding a node to the cluster

Edit cassandra.yaml

Configuration file: /etc/cassandra/conf/cassandra.yaml

# The name of the cluster. This is mainly used to prevent machines in
# one logical cluster from joining another.
cluster_name: 'MyCluster'

# Whether to start the thrift rpc server.
start_rpc: true

# For security reasons, you should not expose this port to the internet.  Firewall it if needed.
rpc_address: <Local ip>

# any class that implements the SeedProvider interface and has a
# constructor that takes a Map<String, String> of parameters will do.
seed_provider:
    # Addresses of hosts that are deemed contact points. 
    # Cassandra nodes use this list of hosts to find each other and learn
    # the topology of the ring.  You must change this if you are running
    # multiple nodes!
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # seeds is actually a comma-delimited list of addresses.
          # Ex: "<ip1>,<ip2>,<ip3>"
          - seeds: "<Local ip>,<Node1 ip>"

# Address or interface to bind to and tell other Cassandra nodes to connect to.
# You _must_ change this if you want multiple nodes to be able to communicate!
#
# Set listen_address OR listen_interface, not both.
#
# Leaving it blank leaves it up to InetAddress.getLocalHost(). This
# will always do the Right Thing _if_ the node is properly configured
# (hostname, name resolution, etc), and the Right Thing is to use the
# address associated with the hostname (it might not be).
#
# Setting listen_address to 0.0.0.0 is always wrong.
#
listen_address: <Local ip>

Reference: http://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html
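
Before starting a new node, it can help to sanity-check the handful of values edited above. The sketch below is a hypothetical Python helper (check_node_settings is not part of Cassandra) that takes the values as a plain dict and applies only the rules stated in the configuration comments; the IP addresses in the example are placeholders.

```python
def check_node_settings(settings, expected_cluster_name):
    """Return a list of problems found in a dict of cassandra.yaml values.

    The rules come from the comments in the configuration excerpt; the
    function and its argument names are illustrative only.
    """
    problems = []
    if settings.get("cluster_name") != expected_cluster_name:
        problems.append("cluster_name does not match the cluster being joined")
    if settings.get("listen_address") == "0.0.0.0":
        problems.append("listen_address must not be 0.0.0.0")
    if settings.get("listen_address") and settings.get("listen_interface"):
        problems.append("set listen_address or listen_interface, not both")
    # seeds is a single comma-delimited string of addresses.
    seeds = [s.strip() for s in settings.get("seeds", "").split(",") if s.strip()]
    if not seeds:
        problems.append("seeds must list at least one contact point")
    return problems


# Placeholder addresses for illustration.
new_node = {
    "cluster_name": "MyCluster",
    "listen_address": "10.0.0.2",
    "seeds": "10.0.0.2,10.0.0.1",
}
print(check_node_settings(new_node, "MyCluster"))
```

An empty list means none of these particular checks failed; it does not guarantee that the node will join successfully.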


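Clear the new node's local system data, then start Cassandra so that the node joins the cluster. If the service is already running, stop it before deleting the files.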

rm -rf /var/lib/cassandra/data/system/*

service cassandra start

Checking the cluster status

nodetool status

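When the cluster has multiple nodes, each one appears in the list together with the share of the data it owns.

With a single node, that node owns 100%. When `replication_factor` is 3 and the cluster also has 3 nodes, `Owns` is 100% for every node.

With 4 or more nodes and the same `replication_factor`, `Owns` drops below 100%.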
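
The relationship between replication factor, node count, and `Owns` can be approximated with simple arithmetic. The Python sketch below assumes a single data center with tokens spread evenly across nodes, which is an idealization; the figures reported by nodetool status on a real cluster will deviate from it. The function name is hypothetical.

```python
def effective_ownership_percent(replication_factor, node_count):
    """Approximate per-node `Owns` value, assuming evenly balanced tokens.

    Each piece of data is stored on min(replication_factor, node_count)
    nodes, so on average every node holds that fraction of the data.
    """
    if replication_factor < 1 or node_count < 1:
        raise ValueError("replication_factor and node_count must be >= 1")
    replicas = min(replication_factor, node_count)
    return replicas / node_count * 100


for nodes in (3, 4, 6):
    print(nodes, "nodes:", effective_ownership_percent(3, nodes), "%")
```

Under these assumptions, 3 nodes give 100% each, 4 nodes give 75% each, and 6 nodes give 50% each.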

Removing a node from the cluster

nodetool --host <IP of the node to remove> decommission -f

Alternatively, log in to the server you want to remove and run:

nodetool decommission -f

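Repair after changing the replication factor

Full repair

Run this whenever replication_factor has been changed. Expect it to take some time.

Log in to any one node and run the following command (once only).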

nodetool repair --full

Partitioner range repair

nodetool repair -pr

Repairing a single keyspace

nodetool repair mykeyspace

Repairing a specific table

Pass both the keyspace name and the table name to nodetool repair to repair only that table.

nodetool repair mykeyspace mytable

Reference: http://cassandra.apache.org/doc/latest/operating/repair.html?highlight=repair
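
The repair variants above differ only in their flags and in the optional keyspace and table arguments. As a rough illustration, the following Python sketch assembles the argument list for each variant; the helper repair_command is hypothetical and only builds the command, it does not run it.

```python
def repair_command(keyspace=None, tables=(), full=False, primary_range=False):
    """Build the argument list for one of the nodetool repair variants.

    Illustrative helper only: it covers the options shown in this article
    (--full, -pr, keyspace, table) and nothing else.
    """
    if tables and keyspace is None:
        raise ValueError("a keyspace is required when tables are given")
    args = ["nodetool", "repair"]
    if full:
        args.append("--full")
    if primary_range:
        args.append("-pr")
    if keyspace is not None:
        args.append(keyspace)
        args.extend(tables)
    return args


print(" ".join(repair_command(full=True)))
print(" ".join(repair_command(primary_range=True)))
print(" ".join(repair_command("mykeyspace", ["mytable"])))
```

The resulting list could be handed to subprocess.run on a machine where nodetool is installed; building it separately makes it easy to log or review the exact command first.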

That's all.
