[Oracle Cloud] A look at Streaming's key-based behavior

Introduction

Oracle Cloud Infrastructure (OCI) provides a service called Streaming for collecting and processing stream data in real time. Data stored in Streaming takes the form of a specified key paired with the actual data (value), and the key you specify roughly determines which partition in Streaming the data is stored in.

Let's observe the actual behavior. Note that Streaming offers a Kafka-compatible API, so we can store data through a Kafka client and check which partition it ends up in.

Prerequisites

The stream teststream01 is configured as follows, with 3 partitions.

(Screenshot: the teststream01 stream configured with 3 partitions)

No key specified

I'll run the following Java code.

package jp.test.sugi;

import java.util.Properties;
import java.util.concurrent.Future;

import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.config.SaslConfigs;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.serialization.StringSerializer;

public class App {
    public static void main(final String[] args) {
        System.out.println("Start.");

        // Build the connection settings as a Properties instance
        final Properties properties = new Properties();

        properties.put(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG,
                "cell-1.streaming.ap-tokyo-1.oci.oraclecloud.com:9092");
        properties.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
        properties.put(SaslConfigs.SASL_MECHANISM, "PLAIN");
        properties.put(SaslConfigs.SASL_JAAS_CONFIG,
                "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"poc02/oracleidentitycloudservice/suguru.sugiyama@oracle.com/ocid1.streampool.oc1.ap-tokyo-1.amaaaaaaycetm7yawtz56lnnerap4r45y4vheekgvhdaevxf3clfpuew6mla\" password=\"8t[shwUN}I-d+{}8Nx_a\";");
        properties.put(CommonClientConfigs.RETRIES_CONFIG, 5);
        properties.put("max.request.size", 1024 * 1024);
        properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

        // Build the producer
        final KafkaProducer<String, String> producer = new KafkaProducer<>(properties, new StringSerializer(),
                new StringSerializer());

        try {
            // Send messages to the specified topic
            for (int i = 0; i < 10; i++) {
                String value = "message=" + i;
                ProducerRecord<String, String> record = new ProducerRecord<>("teststream01", value);

                Future<RecordMetadata> send = producer.send(record);
                RecordMetadata recordMetadata = send.get();

                System.out.print("partition: " + recordMetadata.partition() + ", ");
                System.out.print("topic: " + recordMetadata.topic() + ", ");
                System.out.println("value: " + value);
            }
        } catch (final Exception e) {
            System.out.println("例外が発生しました。");
            System.out.println(e);
        } finally {
            producer.close();
        }

        System.out.println("End.");
    }
}

Here is where the actual record is created. Since no key is specified, the key will be null.

ProducerRecord<String, String> record = new ProducerRecord<>("teststream01", value);

Looking at the execution results, we can confirm that the data was spread across the three partitions. With a null key there is nothing to hash, so the Kafka producer chooses partitions itself, which is why the records end up distributed.

Start.
partition: 0, topic: teststream01, value: message=0
partition: 2, topic: teststream01, value: message=1
partition: 0, topic: teststream01, value: message=2
partition: 1, topic: teststream01, value: message=3
partition: 2, topic: teststream01, value: message=4
partition: 0, topic: teststream01, value: message=5
partition: 1, topic: teststream01, value: message=6
partition: 0, topic: teststream01, value: message=7
partition: 1, topic: teststream01, value: message=8
partition: 2, topic: teststream01, value: message=9
End.
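
As a cross-check, the partition assignment can also be observed from the consuming side, since each consumed record reports the partition it was read from. Below is a minimal consumer sketch, not part of the original test, reusing the same connection settings; the group id testgroup and the poll timeout are assumptions.

package jp.test.sugi;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.config.SaslConfigs;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ConsumeApp {
    public static void main(final String[] args) {
        // Same endpoint and SASL_SSL settings as the producer examples above
        final Properties properties = new Properties();
        properties.put(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG,
                "cell-1.streaming.ap-tokyo-1.oci.oraclecloud.com:9092");
        properties.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
        properties.put(SaslConfigs.SASL_MECHANISM, "PLAIN");
        properties.put(SaslConfigs.SASL_JAAS_CONFIG,
                "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"poc02/oracleidentitycloudservice/suguru.sugiyama@oracle.com/ocid1.streampool.oc1.ap-tokyo-1.amaaaaaaycetm7yawtz56lnnerap4r45y4vheekgvhdaevxf3clfpuew6mla\" password=\"8t[shwUN}I-d+{}8Nx_a\";");
        // The consumer group name is an assumption for this sketch
        properties.put(ConsumerConfig.GROUP_ID_CONFIG, "testgroup");
        // Read from the beginning of each partition on the first run
        properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(properties,
                new StringDeserializer(), new StringDeserializer())) {
            consumer.subscribe(Collections.singletonList("teststream01"));

            // A single poll is enough for this check; production code would loop
            final ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(10));
            for (final ConsumerRecord<String, String> record : records) {
                // Each record reports the partition it was read from
                System.out.println("partition: " + record.partition() + ", key: " + record.key()
                        + ", value: " + record.value());
            }
        }
    }
}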

Specifying a key

Next, let's specify a key and verify the behavior. In Streaming, as in Kafka, the hash of the key determines which partition a record is stored in, and records with the same key are placed in the same partition. To verify this behavior, let's store data using the same key for every record.
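
For reference, when producing through the Kafka client the partition choice happens client-side: Kafka's default partitioner takes a murmur2 hash of the serialized key bytes and reduces it modulo the number of partitions. Here is a minimal sketch of that computation using the Utils helper from kafka-clients (the class name PartitionCalc is just for illustration):

package jp.test.sugi;

import java.nio.charset.StandardCharsets;

import org.apache.kafka.common.utils.Utils;

public class PartitionCalc {
    public static void main(final String[] args) {
        // StringSerializer encodes the key as UTF-8 bytes
        final byte[] keyBytes = "key1".getBytes(StandardCharsets.UTF_8);
        final int numPartitions = 3; // teststream01 has 3 partitions

        // Kafka's default partitioner: positive murmur2 hash of the key bytes,
        // modulo the partition count
        final int partition = Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
        System.out.println("key1 -> partition " + partition);
    }
}

On this three-partition stream, the run below shows that key1 maps to partition 2.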

Code

package jp.test.sugi;

import java.util.Properties;
import java.util.concurrent.Future;

import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.config.SaslConfigs;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.serialization.StringSerializer;

public class App {
    public static void main(final String[] args) {
        System.out.println("Start.");

        // Build the connection settings as a Properties instance
        final Properties properties = new Properties();

        properties.put(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG,
                "cell-1.streaming.ap-tokyo-1.oci.oraclecloud.com:9092");
        properties.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
        properties.put(SaslConfigs.SASL_MECHANISM, "PLAIN");
        properties.put(SaslConfigs.SASL_JAAS_CONFIG,
                "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"poc02/oracleidentitycloudservice/suguru.sugiyama@oracle.com/ocid1.streampool.oc1.ap-tokyo-1.amaaaaaaycetm7yawtz56lnnerap4r45y4vheekgvhdaevxf3clfpuew6mla\" password=\"8t[shwUN}I-d+{}8Nx_a\";");
        properties.put(CommonClientConfigs.RETRIES_CONFIG, 5);
        properties.put("max.request.size", 1024 * 1024);
        properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

        // Build the producer
        final KafkaProducer<String, String> producer = new KafkaProducer<>(properties, new StringSerializer(),
                new StringSerializer());

        try {
            // Send messages to the specified topic
            for (int i = 0; i < 10; i++) {
                String value = "message=" + i;
                ProducerRecord<String, String> record = new ProducerRecord<>("teststream01", "key1", value);

                Future<RecordMetadata> send = producer.send(record);
                RecordMetadata recordMetadata = send.get();

                System.out.print("partition: " + recordMetadata.partition() + ", ");
                System.out.print("topic: " + recordMetadata.topic() + ", ");
                System.out.println("value: " + value);
            }
        } catch (final Exception e) {
            System.out.println("例外が発生しました。");
            System.out.println(e);
        } finally {
            producer.close();
        }

        System.out.println("End.");
    }
}

Here is where the key is specified; the key used is key1.

ProducerRecord<String, String> record = new ProducerRecord<>("teststream01", "key1", value);

Looking at the execution results, we can see that every record went to partition 2.

Start.
partition: 2, topic: teststream01, value: message=0
partition: 2, topic: teststream01, value: message=1
partition: 2, topic: teststream01, value: message=2
partition: 2, topic: teststream01, value: message=3
partition: 2, topic: teststream01, value: message=4
partition: 2, topic: teststream01, value: message=5
partition: 2, topic: teststream01, value: message=6
partition: 2, topic: teststream01, value: message=7
partition: 2, topic: teststream01, value: message=8
partition: 2, topic: teststream01, value: message=9
End.
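
Conversely, the hash-based placement also means that giving each record a distinct key spreads the data across partitions again. A hypothetical variation of the send loop in the program above (the keys key0 through key9 are made up for illustration; everything else stays the same):

            // Replace the send loop above: one distinct key per record
            for (int i = 0; i < 10; i++) {
                final String key = "key" + i; // hypothetical keys key0..key9
                final String value = "message=" + i;
                ProducerRecord<String, String> record = new ProducerRecord<>("teststream01", key, value);

                RecordMetadata recordMetadata = producer.send(record).get();
                System.out.println("partition: " + recordMetadata.partition() + ", key: " + key
                        + ", value: " + value);
            }

Keep in mind that hashing gives no strict balance guarantee: with only a few distinct keys, several of them can still land on the same partition.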