{"id":36844,"date":"2023-07-11T20:36:11","date_gmt":"2023-04-21T08:56:22","guid":{"rendered":"https:\/\/www.silicloud.com\/zh\/blog\/%e5%85%b3%e4%ba%8eapache-flume%e5%92%8cspark-streaming%e7%9a%84%e6%95%b4%e5%90%88\/"},"modified":"2024-04-30T13:06:28","modified_gmt":"2024-04-30T05:06:28","slug":"%e5%85%b3%e4%ba%8eapache-flume%e5%92%8cspark-streaming%e7%9a%84%e6%95%b4%e5%90%88","status":"publish","type":"post","link":"https:\/\/www.silicloud.com\/zh\/blog\/%e5%85%b3%e4%ba%8eapache-flume%e5%92%8cspark-streaming%e7%9a%84%e6%95%b4%e5%90%88\/","title":{"rendered":"Integrating Apache Flume with Spark Streaming"},"content":{"rendered":"<p>Real-time stream processing of web server access logs is a common use case. This article shows how to integrate Apache Flume, a log collection platform, with Spark Streaming, the analytics platform on an E-MapReduce cluster.<\/p>\n<div><img decoding=\"async\" class=\"post-images\" title=\"\" src=\"https:\/\/cdn.silicloud.com\/blog-img\/blog\/img\/657d2af237434c4406c48ef3\/1-0.png\" alt=\"\" \/><\/div>\n<ul class=\"post-ul\">Prerequisites<\/ul>\n<ul class=\"post-ul\">\n<li style=\"list-style-type: none;\">\n<ul class=\"post-ul\">EMR-3.16.0<\/ul>\n<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<ul class=\"post-ul\">\n<li style=\"list-style-type: none;\">\n<ul class=\"post-ul\">Cluster type: Hadoop<\/ul>\n<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<ul class=\"post-ul\">\n<li style=\"list-style-type: none;\">\n<ul class=\"post-ul\">Hardware configuration (Header): one ecs.sn1ne.2xlarge instance<\/ul>\n<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<ul class=\"post-ul\">Hardware configuration (Worker): three ecs.sn1ne.2xlarge instances<\/ul>\n<pre class=\"post-pre\"><code># 
cat \/etc\/redhat-release\r\nCentOS Linux release 7.4.1708 (Core) \r\n# uname -r\r\n3.10.0-693.2.2.el7.x86_64\r\n# flume-ng version\r\nFlume 1.8.0\r\nSource code repository: https:\/\/git-wip-us.apache.org\/repos\/asf\/flume.git\r\nRevision: Unknown\r\nCompiled by root on Wed Nov 28 11:09:28 CST 2018\r\nFrom source with checksum 63b5d03c9afd862ff786f7826ffe55d0\r\n# hadoop version\r\nHadoop 2.7.2\r\nSubversion http:\/\/gitlab.alibaba-inc.com\/soe\/emr-hadoop.git -r d2cd70f951304b8ca3d12991262e7e0d321abefc\r\nCompiled by root on 2018-11-30T09:31Z\r\nCompiled with protoc 2.5.0\r\nFrom source with checksum 4447ed9f24dcd981df7daaadd5bafc0\r\nThis command was run using \/opt\/apps\/ecm\/service\/hadoop\/2.7.2-1.3.2\/package\/hadoop-2.7.2-1.3.2\/share\/hadoop\/common\/hadoop-common-2.7.2.jar<\/code><\/pre>\n<ul class=\"post-ul\">Flume configuration<\/ul>\n<p>How to use Flume is not covered here; interested readers should refer to the official documentation. The Flume configuration file is shown below. The source is a spooling directory (spooldir) source and the sink is an Avro sink.<\/p>\n<pre class=\"post-pre\"><code>\r\na1.sources = r1\r\na1.sinks = k1\r\na1.channels = c1\r\n\r\na1.sources.r1.type = spooldir\r\na1.sources.r1.spoolDir = \/root\/spool\r\na1.sources.r1.fileHeader = true\r\n\r\na1.sinks.k1.type = avro\r\na1.sinks.k1.hostname = localhost\r\na1.sinks.k1.port = 9906\r\n\r\na1.channels.c1.type = memory\r\na1.channels.c1.capacity = 1000\r\na1.channels.c1.transactionCapacity = 100\r\n\r\na1.sources.r1.channels = c1\r\na1.sinks.k1.channel = c1<\/code><\/pre>\n<ul class=\"post-ul\">About Spark Streaming<\/ul>\n<p>Spark 
Streaming provides stream processing based on micro-batches, which run repeatedly at short intervals ranging from a few seconds to a few minutes. The intervals used in this article are as follows:<\/p>\n<div>\n<div class=\"post-table\">DStream batch interval: 1 second; sliding interval: 1 second; window size: 300 seconds<\/div>\n<\/div>\n<pre class=\"post-pre\"><code><span class=\"kn\">import<\/span> <span class=\"nn\">re<\/span>\r\n<span class=\"kn\">from<\/span> <span class=\"nn\">pyspark<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">SparkContext<\/span>\r\n<span class=\"kn\">from<\/span> <span class=\"nn\">pyspark.streaming<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">StreamingContext<\/span>\r\n<span class=\"kn\">from<\/span> <span class=\"nn\">pyspark.streaming.flume<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">FlumeUtils<\/span>\r\n\r\n<span class=\"n\">parts<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[<\/span>\r\n    <span class=\"sa\">r<\/span><span class=\"s\">'(?P&lt;host&gt;\\S+)'<\/span><span class=\"p\">,<\/span>                   \r\n    <span class=\"sa\">r<\/span><span class=\"s\">'\\S+'<\/span><span class=\"p\">,<\/span>                            \r\n    <span class=\"sa\">r<\/span><span class=\"s\">'(?P&lt;user&gt;\\S+)'<\/span><span class=\"p\">,<\/span>                  \r\n    <span class=\"sa\">r<\/span><span class=\"s\">'\\[(?P&lt;time&gt;.+)\\]'<\/span><span class=\"p\">,<\/span>               \r\n    <span class=\"sa\">r<\/span><span class=\"s\">'\"(?P&lt;request&gt;.+)\"'<\/span><span class=\"p\">,<\/span>               \r\n    <span class=\"sa\">r<\/span><span class=\"s\">'(?P&lt;status&gt;[0-9]+)'<\/span><span 
class=\"p\">,<\/span>              \r\n    <span class=\"sa\">r<\/span><span class=\"s\">'(?P&lt;size&gt;\\S+)'<\/span><span class=\"p\">,<\/span>                   \r\n    <span class=\"sa\">r<\/span><span class=\"s\">'\"(?P&lt;referer&gt;.*)\"'<\/span><span class=\"p\">,<\/span>               \r\n    <span class=\"sa\">r<\/span><span class=\"s\">'\"(?P&lt;agent&gt;.*)\"'<\/span><span class=\"p\">,<\/span> \r\n<span class=\"p\">]<\/span>\r\n<span class=\"n\">pattern<\/span> <span class=\"o\">=<\/span> <span class=\"n\">re<\/span><span class=\"p\">.<\/span><span class=\"nb\">compile<\/span><span class=\"p\">(<\/span><span class=\"sa\">r<\/span><span class=\"s\">'\\s+'<\/span><span class=\"p\">.<\/span><span class=\"n\">join<\/span><span class=\"p\">(<\/span><span class=\"n\">parts<\/span><span class=\"p\">)<\/span><span class=\"o\">+<\/span><span class=\"sa\">r<\/span><span class=\"s\">'\\s*\\Z'<\/span><span class=\"p\">)<\/span>\r\n\r\n<span class=\"k\">def<\/span> <span class=\"nf\">extractURLRequest<\/span><span class=\"p\">(<\/span><span class=\"n\">line<\/span><span class=\"p\">):<\/span>\r\n    <span class=\"n\">exp<\/span> <span class=\"o\">=<\/span> <span class=\"n\">pattern<\/span><span class=\"p\">.<\/span><span class=\"n\">match<\/span><span class=\"p\">(<\/span><span class=\"n\">line<\/span><span class=\"p\">)<\/span>\r\n    <span class=\"k\">if<\/span> <span class=\"n\">exp<\/span><span class=\"p\">:<\/span>\r\n        <span class=\"n\">request<\/span> <span class=\"o\">=<\/span> <span class=\"n\">exp<\/span><span class=\"p\">.<\/span><span class=\"n\">groupdict<\/span><span class=\"p\">()[<\/span><span class=\"s\">\"request\"<\/span><span class=\"p\">]<\/span>\r\n        <span class=\"k\">if<\/span> <span class=\"n\">request<\/span><span class=\"p\">:<\/span>\r\n           <span class=\"n\">requestFields<\/span> <span class=\"o\">=<\/span> <span class=\"n\">request<\/span><span class=\"p\">.<\/span><span 
class=\"n\">split<\/span><span class=\"p\">()<\/span>\r\n           <span class=\"k\">if<\/span> <span class=\"p\">(<\/span><span class=\"nb\">len<\/span><span class=\"p\">(<\/span><span class=\"n\">requestFields<\/span><span class=\"p\">)<\/span> <span class=\"o\">&gt;<\/span> <span class=\"mi\">1<\/span><span class=\"p\">):<\/span>\r\n                <span class=\"k\">return<\/span> <span class=\"n\">requestFields<\/span><span class=\"p\">[<\/span><span class=\"mi\">1<\/span><span class=\"p\">]<\/span>\r\n\r\n<span class=\"k\">if<\/span> <span class=\"n\">__name__<\/span> <span class=\"o\">==<\/span> <span class=\"s\">\"__main__\"<\/span><span class=\"p\">:<\/span>\r\n\r\n    <span class=\"n\">sc<\/span> <span class=\"o\">=<\/span> <span class=\"n\">SparkContext<\/span><span class=\"p\">(<\/span><span class=\"n\">appName<\/span><span class=\"o\">=<\/span><span class=\"s\">\"StreamingFlumeLogAggregator\"<\/span><span class=\"p\">)<\/span>\r\n    <span class=\"n\">sc<\/span><span class=\"p\">.<\/span><span class=\"n\">setLogLevel<\/span><span class=\"p\">(<\/span><span class=\"s\">\"ERROR\"<\/span><span class=\"p\">)<\/span>\r\n    <span class=\"n\">ssc<\/span> <span class=\"o\">=<\/span> <span class=\"n\">StreamingContext<\/span><span class=\"p\">(<\/span><span class=\"n\">sc<\/span><span class=\"p\">,<\/span> <span class=\"mi\">1<\/span><span class=\"p\">)<\/span>\r\n    <span class=\"n\">flumeStream<\/span> <span class=\"o\">=<\/span> <span class=\"n\">FlumeUtils<\/span><span class=\"p\">.<\/span><span class=\"n\">createStream<\/span><span class=\"p\">(<\/span><span class=\"n\">ssc<\/span><span class=\"p\">,<\/span> <span class=\"s\">\"localhost\"<\/span><span class=\"p\">,<\/span> <span class=\"mi\">9906<\/span><span class=\"p\">)<\/span>\r\n    <span class=\"n\">lines<\/span> <span class=\"o\">=<\/span> <span class=\"n\">flumeStream<\/span><span class=\"p\">.<\/span><span class=\"nb\">map<\/span><span 
class=\"p\">(<\/span><span class=\"k\">lambda<\/span> <span class=\"n\">x<\/span><span class=\"p\">:<\/span> <span class=\"n\">x<\/span><span class=\"p\">[<\/span><span class=\"mi\">1<\/span><span class=\"p\">])<\/span>\r\n    <span class=\"n\">urls<\/span> <span class=\"o\">=<\/span> <span class=\"n\">lines<\/span><span class=\"p\">.<\/span><span class=\"nb\">map<\/span><span class=\"p\">(<\/span><span class=\"n\">extractURLRequest<\/span><span class=\"p\">)<\/span>   \r\n    <span class=\"n\">urlCounts<\/span> <span class=\"o\">=<\/span> <span class=\"n\">urls<\/span><span class=\"p\">.<\/span><span class=\"nb\">map<\/span><span class=\"p\">(<\/span><span class=\"k\">lambda<\/span> <span class=\"n\">x<\/span><span class=\"p\">:<\/span> <span class=\"p\">(<\/span><span class=\"n\">x<\/span><span class=\"p\">,<\/span> <span class=\"mi\">1<\/span><span class=\"p\">)).<\/span><span class=\"n\">reduceByKeyAndWindow<\/span><span class=\"p\">(<\/span><span class=\"k\">lambda<\/span> <span class=\"n\">x<\/span><span class=\"p\">,<\/span> <span class=\"n\">y<\/span><span class=\"p\">:<\/span> <span class=\"n\">x<\/span> <span class=\"o\">+<\/span> <span class=\"n\">y<\/span><span class=\"p\">,<\/span> <span class=\"k\">lambda<\/span> <span class=\"n\">x<\/span><span class=\"p\">,<\/span> <span class=\"n\">y<\/span> <span class=\"p\">:<\/span> <span class=\"n\">x<\/span> <span class=\"o\">-<\/span> <span class=\"n\">y<\/span><span class=\"p\">,<\/span> <span class=\"mi\">300<\/span><span class=\"p\">,<\/span> <span class=\"mi\">1<\/span><span class=\"p\">)<\/span>\r\n\r\n    <span class=\"n\">sortedResults<\/span> <span class=\"o\">=<\/span> <span class=\"n\">urlCounts<\/span><span class=\"p\">.<\/span><span class=\"n\">transform<\/span><span class=\"p\">(<\/span><span class=\"k\">lambda<\/span> <span class=\"n\">rdd<\/span><span class=\"p\">:<\/span> <span class=\"n\">rdd<\/span><span class=\"p\">.<\/span><span class=\"n\">sortBy<\/span><span class=\"p\">(<\/span><span 
class=\"k\">lambda<\/span> <span class=\"n\">x<\/span><span class=\"p\">:<\/span> <span class=\"n\">x<\/span><span class=\"p\">[<\/span><span class=\"mi\">1<\/span><span class=\"p\">],<\/span> <span class=\"bp\">False<\/span><span class=\"p\">))<\/span>\r\n    <span class=\"n\">sortedResults<\/span><span class=\"p\">.<\/span><span class=\"n\">pprint<\/span><span class=\"p\">()<\/span>\r\n    <span class=\"n\">ssc<\/span><span class=\"p\">.<\/span><span class=\"n\">checkpoint<\/span><span class=\"p\">(<\/span><span class=\"s\">\"\/root\/checkpoint\"<\/span><span class=\"p\">)<\/span>\r\n    <span class=\"n\">ssc<\/span><span class=\"p\">.<\/span><span class=\"n\">start<\/span><span class=\"p\">()<\/span>\r\n    <span class=\"n\">ssc<\/span><span class=\"p\">.<\/span><span class=\"n\">awaitTermination<\/span><span class=\"p\">()<\/span>\r\n<\/code><\/pre>\n<ul class=\"post-ul\">Running it<\/ul>\n<p>Place the access log to be analyzed in the spooling directory (\/root\/spool), then start Flume NG with the command below. Then start the Spark Streaming application shown above in a separate process.<\/p>\n<pre class=\"post-pre\"><code># bin\/flume-ng agent --conf conf --conf-file ~\/sparkstreamingflume.conf --name a1 -Dflume.root.logger=INFO,console\r\n\r\nspark-submit --packages org.apache.spark:spark-streaming-flume_2.11:2.3.2 SparkFlume.py<\/code><\/pre>\n<pre class=\"post-pre\"><code># tail  access_log.txt \r\n46.166.139.20 - - [06\/Dec\/2015:03:14:54 +0000] \"POST \/xmlrpc.php HTTP\/1.0\" 200 370 \"-\" \"Mozilla\/4.0 (compatible: MSIE 7.0; Windows NT 6.0)\"\r\n46.166.139.20 - - [06\/Dec\/2015:03:14:54 +0000] \"POST \/xmlrpc.php HTTP\/1.0\" 200 370 \"-\" \"Mozilla\/4.0 (compatible: MSIE 7.0; Windows NT 6.0)\"\r\n46.166.139.20 - - [06\/Dec\/2015:03:14:55 +0000] \"POST \/xmlrpc.php HTTP\/1.0\" 200 370 \"-\" 
\"Mozilla\/4.0 (compatible: MSIE 7.0; Windows NT 6.0)\"\r\n46.166.139.20 - - [06\/Dec\/2015:03:14:55 +0000] \"POST \/xmlrpc.php HTTP\/1.0\" 200 370 \"-\" \"Mozilla\/4.0 (compatible: MSIE 7.0; Windows NT 6.0)\"\r\n46.166.139.20 - - [06\/Dec\/2015:03:14:56 +0000] \"POST \/xmlrpc.php HTTP\/1.0\" 200 370 \"-\" \"Mozilla\/4.0 (compatible: MSIE 7.0; Windows NT 6.0)\"\r\n46.166.139.20 - - [06\/Dec\/2015:03:14:56 +0000] \"POST \/xmlrpc.php HTTP\/1.0\" 200 370 \"-\" \"Mozilla\/4.0 (compatible: MSIE 7.0; Windows NT 6.0)\"\r\n46.166.139.20 - - [06\/Dec\/2015:03:14:57 +0000] \"POST \/xmlrpc.php HTTP\/1.0\" 200 370 \"-\" \"Mozilla\/4.0 (compatible: MSIE 7.0; Windows NT 6.0)\"\r\n46.166.139.20 - - [06\/Dec\/2015:03:14:58 +0000] \"POST \/xmlrpc.php HTTP\/1.0\" 200 370 \"-\" \"Mozilla\/4.0 (compatible: MSIE 7.0; Windows NT 6.0)\"\r\n46.166.139.20 - - [06\/Dec\/2015:03:14:58 +0000] \"POST \/xmlrpc.php HTTP\/1.0\" 200 370 \"-\" \"Mozilla\/4.0 (compatible: MSIE 7.0; Windows NT 6.0)\"\r\n46.166.139.20 - - [06\/Dec\/2015:03:14:59 +0000] \"POST \/xmlrpc.php HTTP\/1.0\" 200 370 \"-\" \"Mozilla\/4.0 (compatible: MSIE 7.0; Windows NT 6.0)\"<\/code><\/pre>\n<p>The log is now processed in real time, once per second, and the output lists the access count for each requested URL (top 10).<\/p>\n<pre class=\"post-pre\"><code>-------------------------------------------\r\n\r\nTime: 2019-03-15 14:02:19\r\n\r\n(u'\/xmlrpc.php', 8509)\r\n(u'\/wp-login.php', 1798)\r\n(u'\/', 119)\r\n(u'\/robots.txt', 44)\r\n(u'\/blog\/', 36)\r\n(u'\/page-sitemap.xml', 29)\r\n(u'\/post-sitemap.xml', 29)\r\n(u'\/category-sitemap.xml', 29)\r\n(u'\/sitemap_index.xml', 29)\r\n(u'http:\/\/51.254.206.142\/httptest.php', 26)\r\n...\r\n\r\n\r\n\r\nTime: 2019-03-15 14:02:20\r\n\r\n(u'\/xmlrpc.php', 68415)\r\n(u'\/wp-login.php', 1923)\r\n(u'\/', 440)\r\n(u'\/blog\/', 
138)\r\n(u'\/robots.txt', 123)\r\n(u'\/post-sitemap.xml', 118)\r\n(u'\/sitemap_index.xml', 118)\r\n(u'\/page-sitemap.xml', 117)\r\n(u'\/category-sitemap.xml', 117)\r\n(u'\/orlando-headlines\/', 95)\r\n...\r\n\r\n\r\n\r\nTime: 2019-03-15 14:02:21\r\n\r\n(u'\/xmlrpc.php', 68415)\r\n(u'\/wp-login.php', 1923)\r\n(u'\/', 440)\r\n(u'\/blog\/', 138)\r\n(u'\/robots.txt', 123)\r\n(u'\/post-sitemap.xml', 118)\r\n(u'\/sitemap_index.xml', 118)\r\n(u'\/page-sitemap.xml', 117)\r\n(u'\/category-sitemap.xml', 117)\r\n(u'\/orlando-headlines\/', 95)\r\n...<\/code><\/pre>\n<ul class=\"post-ul\">Conclusion<\/ul>\n<p>This article walked through a stream processing system built by combining Spark Streaming with Apache Flume, along with the verification results. Beyond web access logs, I believe the same kind of system can also be used for real-time analysis and monitoring of everyday business logs, traffic logs, mail data, and so on, so it is worth considering.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Real-time stream processing of web server access logs is a common use case. This article shows how to integrate Apache [&hellip;]<\/p>\n","protected":false},"author":9,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-36844","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v21.5 (Yoast SEO v21.5) - 
https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>\u5173\u4e8eApache Flume\u548cSpark Streaming\u7684\u6574\u5408 - Blog - Silicon Cloud<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.silicloud.com\/zh\/blog\/\u5173\u4e8eapache-flume\u548cspark-streaming\u7684\u6574\u5408\/\" \/>\n<meta property=\"og:locale\" content=\"zh_CN\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"\u5173\u4e8eApache Flume\u548cSpark Streaming\u7684\u6574\u5408\" \/>\n<meta property=\"og:description\" content=\"\u5047\u8bbe\u5206\u6790Web\u670d\u52a1\u5668\u7684\u8bbf\u95ee\u65e5\u5fd7\u7684\u5b9e\u65f6\u6d41\u5f0f\u5904\u7406\u662f\u4e00\u79cd\u5e38\u89c1\u7684\u5e94\u7528\u573a\u666f\u3002\u8fd9\u7bc7\u6587\u7ae0\u5c06\u4ecb\u7ecd\u5173\u4e8e\u5982\u4f55\u96c6\u6210\u540d\u4e3aApache [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.silicloud.com\/zh\/blog\/\u5173\u4e8eapache-flume\u548cspark-streaming\u7684\u6574\u5408\/\" \/>\n<meta property=\"og:site_name\" content=\"Blog - Silicon Cloud\" \/>\n<meta property=\"article:published_time\" content=\"2023-04-21T08:56:22+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-04-30T05:06:28+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/cdn.silicloud.com\/blog-img\/blog\/img\/657d2af237434c4406c48ef3\/1-0.png\" \/>\n<meta name=\"author\" content=\"\u6e05, \u626c\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"\u4f5c\u8005\" \/>\n\t<meta name=\"twitter:data1\" content=\"\u6e05, \u626c\" \/>\n\t<meta name=\"twitter:label2\" content=\"\u9884\u8ba1\u9605\u8bfb\u65f6\u95f4\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 \u5206\" \/>\n<script type=\"application\/ld+json\" 
class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/%e5%85%b3%e4%ba%8eapache-flume%e5%92%8cspark-streaming%e7%9a%84%e6%95%b4%e5%90%88\/\",\"url\":\"https:\/\/www.silicloud.com\/zh\/blog\/%e5%85%b3%e4%ba%8eapache-flume%e5%92%8cspark-streaming%e7%9a%84%e6%95%b4%e5%90%88\/\",\"name\":\"\u5173\u4e8eApache Flume\u548cSpark Streaming\u7684\u6574\u5408 - Blog - Silicon Cloud\",\"isPartOf\":{\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/#website\"},\"datePublished\":\"2023-04-21T08:56:22+00:00\",\"dateModified\":\"2024-04-30T05:06:28+00:00\",\"author\":{\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/#\/schema\/person\/cb5556d2501da73d864cac945e8d9461\"},\"breadcrumb\":{\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/%e5%85%b3%e4%ba%8eapache-flume%e5%92%8cspark-streaming%e7%9a%84%e6%95%b4%e5%90%88\/#breadcrumb\"},\"inLanguage\":\"zh-Hans\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.silicloud.com\/zh\/blog\/%e5%85%b3%e4%ba%8eapache-flume%e5%92%8cspark-streaming%e7%9a%84%e6%95%b4%e5%90%88\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/%e5%85%b3%e4%ba%8eapache-flume%e5%92%8cspark-streaming%e7%9a%84%e6%95%b4%e5%90%88\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"\u9996\u9875\",\"item\":\"https:\/\/www.silicloud.com\/zh\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"\u5173\u4e8eApache Flume\u548cSpark Streaming\u7684\u6574\u5408\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/#website\",\"url\":\"https:\/\/www.silicloud.com\/zh\/blog\/\",\"name\":\"Blog - Silicon Cloud\",\"description\":\"\",\"inLanguage\":\"zh-Hans\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/#\/schema\/person\/cb5556d2501da73d864cac945e8d9461\",\"name\":\"\u6e05, 
\u626c\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"zh-Hans\",\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/32a4239de8ff29adace466261d309424a1e5fe9f7e3036bf89fe03f2e3dbe717?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/32a4239de8ff29adace466261d309424a1e5fe9f7e3036bf89fe03f2e3dbe717?s=96&d=mm&r=g\",\"caption\":\"\u6e05, \u626c\"},\"url\":\"https:\/\/www.silicloud.com\/zh\/blog\/author\/qingyang\/\"},{\"@type\":\"ImageObject\",\"inLanguage\":\"zh-Hans\",\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/%e5%85%b3%e4%ba%8eapache-flume%e5%92%8cspark-streaming%e7%9a%84%e6%95%b4%e5%90%88\/#local-main-organization-logo\",\"url\":\"\",\"contentUrl\":\"\",\"caption\":\"Blog - Silicon Cloud\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"\u5173\u4e8eApache Flume\u548cSpark Streaming\u7684\u6574\u5408 - Blog - Silicon Cloud","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.silicloud.com\/zh\/blog\/\u5173\u4e8eapache-flume\u548cspark-streaming\u7684\u6574\u5408\/","og_locale":"zh_CN","og_type":"article","og_title":"\u5173\u4e8eApache Flume\u548cSpark Streaming\u7684\u6574\u5408","og_description":"\u5047\u8bbe\u5206\u6790Web\u670d\u52a1\u5668\u7684\u8bbf\u95ee\u65e5\u5fd7\u7684\u5b9e\u65f6\u6d41\u5f0f\u5904\u7406\u662f\u4e00\u79cd\u5e38\u89c1\u7684\u5e94\u7528\u573a\u666f\u3002\u8fd9\u7bc7\u6587\u7ae0\u5c06\u4ecb\u7ecd\u5173\u4e8e\u5982\u4f55\u96c6\u6210\u540d\u4e3aApache [&hellip;]","og_url":"https:\/\/www.silicloud.com\/zh\/blog\/\u5173\u4e8eapache-flume\u548cspark-streaming\u7684\u6574\u5408\/","og_site_name":"Blog - Silicon 
Cloud","article_published_time":"2023-04-21T08:56:22+00:00","article_modified_time":"2024-04-30T05:06:28+00:00","og_image":[{"url":"https:\/\/cdn.silicloud.com\/blog-img\/blog\/img\/657d2af237434c4406c48ef3\/1-0.png"}],"author":"\u6e05, \u626c","twitter_card":"summary_large_image","twitter_misc":{"\u4f5c\u8005":"\u6e05, \u626c","\u9884\u8ba1\u9605\u8bfb\u65f6\u95f4":"3 \u5206"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.silicloud.com\/zh\/blog\/%e5%85%b3%e4%ba%8eapache-flume%e5%92%8cspark-streaming%e7%9a%84%e6%95%b4%e5%90%88\/","url":"https:\/\/www.silicloud.com\/zh\/blog\/%e5%85%b3%e4%ba%8eapache-flume%e5%92%8cspark-streaming%e7%9a%84%e6%95%b4%e5%90%88\/","name":"\u5173\u4e8eApache Flume\u548cSpark Streaming\u7684\u6574\u5408 - Blog - Silicon Cloud","isPartOf":{"@id":"https:\/\/www.silicloud.com\/zh\/blog\/#website"},"datePublished":"2023-04-21T08:56:22+00:00","dateModified":"2024-04-30T05:06:28+00:00","author":{"@id":"https:\/\/www.silicloud.com\/zh\/blog\/#\/schema\/person\/cb5556d2501da73d864cac945e8d9461"},"breadcrumb":{"@id":"https:\/\/www.silicloud.com\/zh\/blog\/%e5%85%b3%e4%ba%8eapache-flume%e5%92%8cspark-streaming%e7%9a%84%e6%95%b4%e5%90%88\/#breadcrumb"},"inLanguage":"zh-Hans","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.silicloud.com\/zh\/blog\/%e5%85%b3%e4%ba%8eapache-flume%e5%92%8cspark-streaming%e7%9a%84%e6%95%b4%e5%90%88\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.silicloud.com\/zh\/blog\/%e5%85%b3%e4%ba%8eapache-flume%e5%92%8cspark-streaming%e7%9a%84%e6%95%b4%e5%90%88\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"\u9996\u9875","item":"https:\/\/www.silicloud.com\/zh\/blog\/"},{"@type":"ListItem","position":2,"name":"\u5173\u4e8eApache Flume\u548cSpark Streaming\u7684\u6574\u5408"}]},{"@type":"WebSite","@id":"https:\/\/www.silicloud.com\/zh\/blog\/#website","url":"https:\/\/www.silicloud.com\/zh\/blog\/","name":"Blog - Silicon 
Cloud","description":"","inLanguage":"zh-Hans"},{"@type":"Person","@id":"https:\/\/www.silicloud.com\/zh\/blog\/#\/schema\/person\/cb5556d2501da73d864cac945e8d9461","name":"\u6e05, \u626c","image":{"@type":"ImageObject","inLanguage":"zh-Hans","@id":"https:\/\/www.silicloud.com\/zh\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/32a4239de8ff29adace466261d309424a1e5fe9f7e3036bf89fe03f2e3dbe717?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/32a4239de8ff29adace466261d309424a1e5fe9f7e3036bf89fe03f2e3dbe717?s=96&d=mm&r=g","caption":"\u6e05, \u626c"},"url":"https:\/\/www.silicloud.com\/zh\/blog\/author\/qingyang\/"},{"@type":"ImageObject","inLanguage":"zh-Hans","@id":"https:\/\/www.silicloud.com\/zh\/blog\/%e5%85%b3%e4%ba%8eapache-flume%e5%92%8cspark-streaming%e7%9a%84%e6%95%b4%e5%90%88\/#local-main-organization-logo","url":"","contentUrl":"","caption":"Blog - Silicon Cloud"}]}},"_links":{"self":[{"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/posts\/36844","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/users\/9"}],"replies":[{"embeddable":true,"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/comments?post=36844"}],"version-history":[{"count":2,"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/posts\/36844\/revisions"}],"predecessor-version":[{"id":92085,"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/posts\/36844\/revisions\/92085"}],"wp:attachment":[{"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/media?parent=36844"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/categories?post=36844"},{"taxonomy":"post_tag","embeddabl
e":true,"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/tags?post=36844"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}