A look at key behavior in [Oracle Cloud] Streaming
Introduction
Oracle Cloud Infrastructure (hereafter OCI) provides a service called Streaming for collecting and processing stream data in real time. Data stored in Streaming consists of a value together with a specified key, and the key largely controls which partition of the stream the data is stored in.
Let's observe how this actually behaves. Note that Streaming offers a Kafka-compatible API, so we can store data with a Kafka client and check which partition each record ends up in.
Prerequisites
The stream teststream01 is set up as follows, with 3 partitions.

No key specified
We run the following Java code.
package jp.test.sugi;

import java.util.Properties;
import java.util.concurrent.Future;

import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.config.SaslConfigs;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.serialization.StringSerializer;

public class App {
    public static void main(final String[] args) {
        System.out.println("Start.");

        // Build the connection settings as a Properties instance
        final Properties properties = new Properties();
        properties.put(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG,
                "cell-1.streaming.ap-tokyo-1.oci.oraclecloud.com:9092");
        properties.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
        properties.put(SaslConfigs.SASL_MECHANISM, "PLAIN");
        properties.put(SaslConfigs.SASL_JAAS_CONFIG,
                "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"poc02/oracleidentitycloudservice/suguru.sugiyama@oracle.com/ocid1.streampool.oc1.ap-tokyo-1.amaaaaaaycetm7yawtz56lnnerap4r45y4vheekgvhdaevxf3clfpuew6mla\" password=\"8t[shwUN}I-d+{}8Nx_a\";");
        properties.put(CommonClientConfigs.RETRIES_CONFIG, 5);
        properties.put("max.request.size", 1024 * 1024);
        properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

        // Build the producer
        final KafkaProducer<String, String> producer = new KafkaProducer<>(properties, new StringSerializer(),
                new StringSerializer());

        try {
            // Send messages to the specified topic
            for (int i = 0; i < 10; i++) {
                String value = "message=" + i;
                ProducerRecord<String, String> record = new ProducerRecord<>("teststream01", value);
                Future<RecordMetadata> send = producer.send(record);
                RecordMetadata recordMetadata = send.get();
                System.out.print("partition: " + recordMetadata.partition() + ", ");
                System.out.print("topic: " + recordMetadata.topic() + ", ");
                System.out.println("value: " + value);
            }
        } catch (final Exception e) {
            System.out.println("An exception occurred.");
            System.out.println(e);
        } finally {
            producer.close();
        }

        System.out.println("End.");
    }
}
This is the line where the actual record is created. Since no key is specified, the key is null.
ProducerRecord<String, String> record = new ProducerRecord<>("teststream01", value);
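For reference, this is equivalent to the three-argument constructor with an explicit null key, as in this minimal variation of the same line:

// Equivalent form: pass null explicitly as the key.
ProducerRecord<String, String> record = new ProducerRecord<>("teststream01", null, value);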
Looking at the execution results, we can confirm that the data is spread across the three partitions.
Start.
partition: 0, topic: teststream01, value: message=0
partition: 2, topic: teststream01, value: message=1
partition: 0, topic: teststream01, value: message=2
partition: 1, topic: teststream01, value: message=3
partition: 2, topic: teststream01, value: message=4
partition: 0, topic: teststream01, value: message=5
partition: 1, topic: teststream01, value: message=6
partition: 0, topic: teststream01, value: message=7
partition: 1, topic: teststream01, value: message=8
partition: 2, topic: teststream01, value: message=9
End.
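Incidentally, the partition count can also be confirmed from the client side. The snippet below is a hypothetical addition to the code above (placed inside the try block, before the producer is closed); partitionsFor() is a standard KafkaProducer API that returns the partition metadata reported by the broker for the topic.

// Requires: import java.util.List; import org.apache.kafka.common.PartitionInfo;
List<PartitionInfo> partitions = producer.partitionsFor("teststream01");
System.out.println("teststream01 has " + partitions.size() + " partitions");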
Specifying a key
Next, let's specify a key and verify the behavior. In Streaming, as in Kafka, the partition that data is stored in is determined from the hash of the key, so records with the same key are placed in the same partition. To verify this behavior, let's store data using the same key for every record.
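As background on how the hash maps to a partition: the standard Kafka client's default partitioner hashes the serialized key with murmur2 and takes the positive result modulo the number of partitions. The following is a minimal stand-alone sketch of that computation, assuming OCI Streaming's Kafka-compatible endpoint leaves partition selection to this client-side default partitioner; if that assumption holds, the printed partition should match the one observed in the run below.

package jp.test.sugi;

import java.nio.charset.StandardCharsets;

import org.apache.kafka.common.utils.Utils;

public class PartitionCheck {
    public static void main(final String[] args) {
        // Serialized form of the key, as produced by StringSerializer (UTF-8).
        final byte[] keyBytes = "key1".getBytes(StandardCharsets.UTF_8);
        // teststream01 has 3 partitions.
        final int numPartitions = 3;
        // Same formula as the Kafka default partitioner:
        // murmur2 hash of the key bytes, made positive, modulo the partition count.
        final int partition = Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
        System.out.println("key1 -> partition " + partition);
    }
}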
Code
package jp.test.sugi;

import java.util.Properties;
import java.util.concurrent.Future;

import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.config.SaslConfigs;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.serialization.StringSerializer;

public class App {
    public static void main(final String[] args) {
        System.out.println("Start.");

        // Build the connection settings as a Properties instance
        final Properties properties = new Properties();
        properties.put(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG,
                "cell-1.streaming.ap-tokyo-1.oci.oraclecloud.com:9092");
        properties.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
        properties.put(SaslConfigs.SASL_MECHANISM, "PLAIN");
        properties.put(SaslConfigs.SASL_JAAS_CONFIG,
                "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"poc02/oracleidentitycloudservice/suguru.sugiyama@oracle.com/ocid1.streampool.oc1.ap-tokyo-1.amaaaaaaycetm7yawtz56lnnerap4r45y4vheekgvhdaevxf3clfpuew6mla\" password=\"8t[shwUN}I-d+{}8Nx_a\";");
        properties.put(CommonClientConfigs.RETRIES_CONFIG, 5);
        properties.put("max.request.size", 1024 * 1024);
        properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

        // Build the producer
        final KafkaProducer<String, String> producer = new KafkaProducer<>(properties, new StringSerializer(),
                new StringSerializer());

        try {
            // Send messages to the specified topic, all with the same key
            for (int i = 0; i < 10; i++) {
                String value = "message=" + i;
                ProducerRecord<String, String> record = new ProducerRecord<>("teststream01", "key1", value);
                Future<RecordMetadata> send = producer.send(record);
                RecordMetadata recordMetadata = send.get();
                System.out.print("partition: " + recordMetadata.partition() + ", ");
                System.out.print("topic: " + recordMetadata.topic() + ", ");
                System.out.println("value: " + value);
            }
        } catch (final Exception e) {
            System.out.println("An exception occurred.");
            System.out.println(e);
        } finally {
            producer.close();
        }

        System.out.println("End.");
    }
}
The key is specified in the line below; the key used is key1.
ProducerRecord<String, String> record = new ProducerRecord<>("teststream01", "key1", value);
Looking at the execution results, we can see that every record went to partition 2.
Start.
partition: 2, topic: teststream01, value: message=0
partition: 2, topic: teststream01, value: message=1
partition: 2, topic: teststream01, value: message=2
partition: 2, topic: teststream01, value: message=3
partition: 2, topic: teststream01, value: message=4
partition: 2, topic: teststream01, value: message=5
partition: 2, topic: teststream01, value: message=6
partition: 2, topic: teststream01, value: message=7
partition: 2, topic: teststream01, value: message=8
partition: 2, topic: teststream01, value: message=9
End.
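Conversely, if you want records spread across partitions while still attaching keys, give each record a different key. The following is a hypothetical variation of the send loop above (the exact spread depends on how each key happens to hash):

// Different key per record, so records can hash to different partitions.
for (int i = 0; i < 10; i++) {
    String value = "message=" + i;
    ProducerRecord<String, String> record = new ProducerRecord<>("teststream01", "key" + i, value);
    RecordMetadata recordMetadata = producer.send(record).get();
    System.out.println("partition: " + recordMetadata.partition() + ", value: " + value);
}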