{"id":46118,"date":"2023-09-25T13:53:58","date_gmt":"2023-11-30T11:01:32","guid":{"rendered":"https:\/\/www.silicloud.com\/zh\/blog\/46118-2\/"},"modified":"2024-05-04T01:30:20","modified_gmt":"2024-05-03T17:30:20","slug":"46118-2","status":"publish","type":"post","link":"https:\/\/www.silicloud.com\/zh\/blog\/46118-2\/","title":{"rendered":""},"content":{"rendered":"<h1>Introduction<\/h1>\n<p>I do data science in my research, but my lab has nothing like a data analysis platform.<br \/>\nWe do have a few compute servers, so I wanted to build a cluster and try distributed processing, and started experimenting with spark.<br \/>\nSince python is what I always use, I took on pyspark, which lets you drive the spark API from python.<\/p>\n<p>I could not find an article that covers this systematically, so I am writing it up in several parts.<br \/>\nBuilding the cluster on physical machines also took a lot of trial and error, so I want to record that as well.<\/p>\n<h1>Goals<\/h1>\n<p>For now, the aim is simply to get distributed processing running.<br \/>\n1. Get pyspark running<br \/>\n2. Build the cluster<br \/>\n3. Run distributed processing in standalone mode<br \/>\n4. 
Use pyspark from jupyter notebook<\/p>\n<h1>Goal \u2460 Get pyspark running<\/h1>\n<p>First, get pyspark, the most important piece, up and running.<br \/>\nThere are plenty of articles on this, so it is easy.<\/p>\n<h2>Environment<\/h2>\n<p>This machine will act as the master once the setup becomes distributed. It runs in a VM on Windows.<br \/>\n* OS:Windows<br \/>\n* VirtualMachineOS:Ubuntu<br \/>\n* python:3.6.0<br \/>\n* pyenv:1.1.5<br \/>\n* spark:2.2.0<br \/>\n* Assumed to be connected to a private network<\/p>\n<h2>Steps<\/h2>\n<h3>Installing the jdk<\/h3>\n<pre class=\"post-pre\"><code>$ sudo apt-get install -y openjdk-8-jdk\r\n<\/code><\/pre>\n<h3>Installing spark<\/h3>\n<p>The spark 2.2.0 build used here targets hadoop 2.7, so wget it from the URL below.<br \/>\nMove the extracted folder into \/usr\/local\/ and make it available under the name spark.<br \/>\nThis move matters: once the cluster is built later, the directory layout must match the one on the cluster side.<\/p>\n<pre class=\"post-pre\"><code>$ wget http:\/\/ftp.riken.jp\/net\/apache\/spark\/spark-2.2.0\/spark-2.2.0-bin-hadoop2.7.tgz\r\n$ tar zxvf spark-2.2.0-bin-hadoop2.7.tgz\r\n$ sudo mv spark-2.2.0-bin-hadoop2.7 \/usr\/local\/\r\n$ sudo ln -s \/usr\/local\/spark-2.2.0-bin-hadoop2.7 
\/usr\/local\/spark\r\n<\/code><\/pre>\n<p>Set up the path. Add the following to .bashrc and source it.<\/p>\n<pre class=\"post-pre\"><code>export SPARK_HOME=\/usr\/local\/spark\r\nexport PATH=$PATH:$SPARK_HOME\/bin\r\n<\/code><\/pre>\n<h3>Starting pyspark<\/h3>\n<p>Because pyspark sits directly under $SPARK_HOME\/bin, running<\/p>\n<pre class=\"post-pre\"><code>$ pyspark\r\n<\/code><\/pre>\n<p>should start it; if it does, you are good.<\/p>\n<div><img decoding=\"async\" class=\"post-images\" title=\"\" src=\"https:\/\/cdn.silicloud.com\/blog-img\/blog\/img\/657d62c337434c4406d00a70\/21-0.png\" alt=\"Screenshot 2017-10-12 9.54.39.png\" \/><\/div>\n<p>spark provides an object called SparkContext through which the driver program accesses spark. In the interactive shell it is created automatically.<br \/>\nSo, after starting pyspark with the command above, evaluate<\/p>\n<pre class=\"post-pre\"><code>&gt;&gt;&gt; sc\r\n<\/code><\/pre>\n<p>and if no error appears, it has started properly.<\/p>\n<h3>python version<\/h3>\n<p>The pyspark you started may turn out to be using python 2.x.<br 
\/>\nThis happens because pyspark starts with the system default python.<br \/>\nWhen building a cluster, the python versions on the nodes must match, so pyspark has to be started with python 3.6.<\/p>\n<p>I installed 3.6 with pyenv and changed the python version that pyspark uses.<br \/>\nThis site explains it clearly.<br \/>\nSetting up a Python development environment on Ubuntu<\/p>\n<h1>Goal \u2461 Build the cluster<\/h1>\n<h2>Environment<\/h2>\n<p>The nodes that do the work inside the cluster are called slaves (workers).<br \/>\nThe environment is set up with docker, so any OS will do as long as docker runs on it.<br \/>\nAs with the master, the nodes are assumed to be connected to a private network.<br \/>\nFor reference, the environment I built is as follows.<\/p>\n<ul class=\"post-ul\">\n<li style=\"list-style-type: none;\">\n<ul class=\"post-ul\">OS:Ubuntu<\/ul>\n<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<ul class=\"post-ul\">\n<li style=\"list-style-type: none;\">\n<ul class=\"post-ul\">core:12<\/ul>\n<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<ul class=\"post-ul\">\n<li style=\"list-style-type: none;\">\n<ul class=\"post-ul\">memory:64G<\/ul>\n<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<ul class=\"post-ul\">\n<li style=\"list-style-type: none;\">\n<ul 
class=\"post-ul\">docker:1.12.6<\/ul>\n<\/li>\n<\/ul>\n<p>spark-master0 (master node)<\/p>\n<p>ip:192.100.0.1<\/p>\n<p>spark-slave1 (slave node)<br \/>\nspark-slave2 (slave node)<\/p>\n<h2>Setting up the spark environment<\/h2>\n<p>The physical machines for the cluster ran a mix of OSes, so I used docker to get identical environments quickly.<br \/>\nPull this image from dockerhub.<br \/>\njupyter\/pyspark-notebook<br \/>\nIt comes with python 3.6 ready to use and spark ready to run.<\/p>\n<pre class=\"post-pre\"><code>$ docker pull jupyter\/pyspark-notebook\r\n<\/code><\/pre>\n<p>The image is now available locally; run it with the following command.<\/p>\n<pre class=\"post-pre\"><code>$ docker run -d --name spark-slave1 --user root --net=host jupyter\/pyspark-notebook\r\n<\/code><\/pre>\n<p>Option notes<br \/>\n* --name spark-slave1<br \/>\nNames the container.<br \/>\nTwo slave nodes are created this time, so I named them spark-slave1 and spark-slave2.<br \/>\n* --user root<br \/>\nSpecifies the user to log in as. Any other user has no sudo privileges, which makes things awkward.<br \/>\n* --net=host<br 
\/>\nThis one is very important. When spark runs distributed processing on a cluster, some operations open ports at random, so port forwarding with -p cannot keep up. With this option the container uses the same network as the host, so those ports are reachable at the host's IP address.<\/p>\n<h2>Assembling the cluster<\/h2>\n<p>Using the procedure above, I created one slave node on each of two physical machines.<br \/>\nA cluster can be assembled manually or automatically; here it is done manually.<br \/>\nFirst, run start-master.sh on the master node set up in Goal \u2460.<\/p>\n<pre class=\"post-pre\"><code>$ sudo \/usr\/local\/spark\/sbin\/start-master.sh\r\n<\/code><\/pre>\n<p>This starts the node as the master on port 7077.<br \/>\nOnce the master is running, its management UI can be viewed on port 8080.<br 
\/>\nhttp:\/\/$(master_node_IP):8080<\/p>\n<p>Next, log in to each container that will run as a slave node and run start-slave.sh.<\/p>\n<pre class=\"post-pre\"><code>$ sudo \/usr\/local\/spark\/sbin\/start-slave.sh spark:\/\/192.100.0.1:7077\r\n<\/code><\/pre>\n<p>The 192.100.0.1 in spark:\/\/192.100.0.1:7077 is the master's IP address.<br \/>\nWhen a slave starts properly, the master's management UI shows it joining the cluster.<\/p>\n<div><img decoding=\"async\" class=\"post-images\" title=\"\" src=\"https:\/\/cdn.silicloud.com\/blog-img\/blog\/img\/657d62c337434c4406d00a70\/45-0.png\" alt=\"Screenshot 2017-10-13 16.28.40.png\" \/><\/div>\n<p>By default, if I remember correctly, a slave gets all of the machine's cores and 1024m of memory.<br \/>\nLike the master, each slave has its own management UI, on port 8081.<br \/>\nhttp:\/\/$(slave_node_IP):8081<\/p>\n<div><img decoding=\"async\" class=\"post-images\" title=\"\" src=\"https:\/\/cdn.silicloud.com\/blog-img\/blog\/img\/657d62c337434c4406d00a70\/47-0.png\" alt=\"Screenshot 2017-10-13 16.38.07.png\" \/><\/div>\n<p>Incidentally, 40915 is the slave node's working port.<br \/>\nA random port is opened each time the slave starts.<\/p>\n<h1>Goal \u2462 
Run distributed processing<\/h1>\n<p>That said, there is not much left to do.<\/p>\n<h2>Starting in standalone mode<\/h2>\n<p>Once the master is up at http:\/\/$(master_node_IP):8080, the slaves are up at http:\/\/$(slave_node_IP):8081, and the master's management UI shows the slaves in the cluster, run the following command.<\/p>\n<pre class=\"post-pre\"><code>$ \/usr\/local\/spark\/bin\/pyspark --master spark:\/\/(master_node_IP):7077 --packages org.apache.hadoop:hadoop-aws:2.7.0\r\n\r\n<\/code><\/pre>\n<ul class=\"post-ul\">\n<li style=\"list-style-type: none;\">\n<ul class=\"post-ul\">--master<\/ul>\n<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<ul class=\"post-ul\">\n<li style=\"list-style-type: none;\">\n<ul class=\"post-ul\">Specifies the master's IP address.<\/ul>\n<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<ul class=\"post-ul\">\n<li style=\"list-style-type: none;\">\n<ul class=\"post-ul\">--packages<\/ul>\n<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<ul class=\"post-ul\">Specifies a package, pinned to the matching hadoop version.<\/ul>\n<div><img decoding=\"async\" class=\"post-images\" title=\"\" src=\"https:\/\/cdn.silicloud.com\/blog-img\/blog\/img\/657d62c337434c4406d00a70\/55-0.png\" alt=\"Screenshot 2017-10-16 12.01.06.png\" \/><\/div>\n<div><img decoding=\"async\" class=\"post-images\" title=\"\" src=\"https:\/\/cdn.silicloud.com\/blog-img\/blog\/img\/657d62c337434c4406d00a70\/56-0.png\" alt=\"Screenshot 2017-10-16 12.06.28.png\" \/><\/div>\n<p>Incidentally, the tabs at the top are:<\/p>\n<ul 
class=\"post-ul\">\n<li style=\"list-style-type: none;\">\n<ul class=\"post-ul\">Jobs: shows the progress of each executed process<\/ul>\n<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<ul class=\"post-ul\">\n<li style=\"list-style-type: none;\">\n<ul class=\"post-ul\">Stages: when a job is made up of multiple stages, shows the progress of each stage<\/ul>\n<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<ul class=\"post-ul\">\n<li style=\"list-style-type: none;\">\n<ul class=\"post-ul\">Storage: when an RDD is persisted in pyspark, shows how much of the data is cached<\/ul>\n<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<ul class=\"post-ul\">Environment: pyspark environment information<\/ul>\n<h2>Running an application<\/h2>\n<div><img decoding=\"async\" class=\"post-images\" title=\"\" src=\"https:\/\/cdn.silicloud.com\/blog-img\/blog\/img\/657d62c337434c4406d00a70\/60-0.png\" alt=\"Screenshot 2017-10-16 12.17.48.png\" \/><\/div>\n<p>Usage of the pyspark API is summarized here.<br \/>\nSpark API cheat sheet<\/p>\n<p>To confirm the slaves are working properly, open http:\/\/$(slave_node_IP):8081 and check that the stderr log under Running Executors shows log output with no errors.<\/p>\n<h1>Goal \u2463 
Run pyspark in jupyter<\/h1>\n<p>Working with pyspark in the interactive shell is a bit tedious, so make it usable from jupyter.<br \/>\nIf jupyter is not installed, install it with pip, anaconda, or similar.<br \/>\nOut of the box the config directory contains only templates, so create spark-env.sh with cp.<\/p>\n<pre class=\"post-pre\"><code>$ cp \/usr\/local\/spark\/conf\/spark-env.sh.template \/usr\/local\/spark\/conf\/spark-env.sh\r\n<\/code><\/pre>\n<p>Write the following into spark-env.sh.<\/p>\n<pre class=\"post-pre\"><code>export PYSPARK_DRIVER_PYTHON=\/$(jupyter_path)\/jupyter\r\nexport PYSPARK_DRIVER_PYTHON_OPTS=\"notebook\"\r\n<\/code><\/pre>\n<p>With this in place, starting pyspark with<\/p>\n<pre class=\"post-pre\"><code>$ \/usr\/local\/spark\/bin\/pyspark --master spark:\/\/(master_node_IP):7077 --packages org.apache.hadoop:hadoop-aws:2.7.0\r\n\r\n<\/code><\/pre>\n<p>launches it in jupyter. Type sc, and if the SparkContext information is printed, the spark API is usable.<\/p>\n<div><img decoding=\"async\" class=\"post-images\" title=\"\" src=\"https:\/\/cdn.silicloud.com\/blog-img\/blog\/img\/657d62c337434c4406d00a70\/71-0.png\" alt=\"Figure1.png\" \/><\/div>\n<h1>Conclusion<\/h1>\n<p>With the steps above, I was able to run pyspark in jupyter and execute distributed processing.<br 
\/>\nIt runs for now, but with spark, tuning becomes important.<br \/>\nI expect it takes trial and error: run heavy jobs repeatedly, check whether the slaves report errors or tasks fail partway through a process, and if they do, adjust the values in the config files.<\/p>\n<p>I hope this helps, even a little, with setting up a spark environment.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction I do data science in my research, but my lab has nothing like a data analysis platform. We do have a few compute servers [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-46118","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v21.5 (Yoast SEO v21.5) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>- Blog - Silicon Cloud<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.silicloud.com\/zh\/blog\/46118-2\/\" \/>\n<meta property=\"og:locale\" content=\"zh_CN\" \/>\n<meta property=\"og:type\" 
content=\"article\" \/>\n<meta property=\"og:description\" content=\"\u306f\u3058\u3081\u306b \u7814\u7a76\u3067\u30c7\u30fc\u30bf\u30b5\u30a4\u30a8\u30f3\u30b9\u3084\u3063\u3066\u308b\u3051\u3069\u3001\u7814\u7a76\u5ba4\u306b\u30c7\u30fc\u30bf\u5206\u6790\u57fa\u76e4\u7684\u306a\u306e\u304c\u306a\u3044\u3002 \u8a08\u7b97\u7528\u306e\u30b5\u30fc\u30d0\u30fc\u304c\u3044\u304f\u3064\u304b [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.silicloud.com\/zh\/blog\/46118-2\/\" \/>\n<meta property=\"og:site_name\" content=\"Blog - Silicon Cloud\" \/>\n<meta property=\"article:published_time\" content=\"2023-11-30T11:01:32+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-05-03T17:30:20+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/cdn.silicloud.com\/blog-img\/blog\/img\/657d62c337434c4406d00a70\/21-0.png\" \/>\n<meta name=\"author\" content=\"\u97f5, \u79d1\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"\u4f5c\u8005\" \/>\n\t<meta name=\"twitter:data1\" content=\"\u97f5, \u79d1\" \/>\n\t<meta name=\"twitter:label2\" content=\"\u9884\u8ba1\u9605\u8bfb\u65f6\u95f4\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 \u5206\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/46118-2\/\",\"url\":\"https:\/\/www.silicloud.com\/zh\/blog\/46118-2\/\",\"name\":\"- Blog - Silicon 
Cloud\",\"isPartOf\":{\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/#website\"},\"datePublished\":\"2023-11-30T11:01:32+00:00\",\"dateModified\":\"2024-05-03T17:30:20+00:00\",\"author\":{\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/#\/schema\/person\/6530331a63adef3b3443a1fab53a0e6e\"},\"inLanguage\":\"zh-Hans\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.silicloud.com\/zh\/blog\/46118-2\/\"]}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/#website\",\"url\":\"https:\/\/www.silicloud.com\/zh\/blog\/\",\"name\":\"Blog - Silicon Cloud\",\"description\":\"\",\"inLanguage\":\"zh-Hans\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/#\/schema\/person\/6530331a63adef3b3443a1fab53a0e6e\",\"name\":\"\u97f5, \u79d1\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"zh-Hans\",\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/429ccb39b3fff5188bc17986222cfb0936cbadb8cc933cff04ab5ca01bd30a08?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/429ccb39b3fff5188bc17986222cfb0936cbadb8cc933cff04ab5ca01bd30a08?s=96&d=mm&r=g\",\"caption\":\"\u97f5, \u79d1\"},\"url\":\"https:\/\/www.silicloud.com\/zh\/blog\/author\/yunke\/\"},{\"@type\":\"ImageObject\",\"inLanguage\":\"zh-Hans\",\"@id\":\"https:\/\/www.silicloud.com\/zh\/blog\/46118-2\/#local-main-organization-logo\",\"url\":\"\",\"contentUrl\":\"\",\"caption\":\"Blog - Silicon Cloud\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"- Blog - Silicon Cloud","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.silicloud.com\/zh\/blog\/46118-2\/","og_locale":"zh_CN","og_type":"article","og_description":"\u306f\u3058\u3081\u306b \u7814\u7a76\u3067\u30c7\u30fc\u30bf\u30b5\u30a4\u30a8\u30f3\u30b9\u3084\u3063\u3066\u308b\u3051\u3069\u3001\u7814\u7a76\u5ba4\u306b\u30c7\u30fc\u30bf\u5206\u6790\u57fa\u76e4\u7684\u306a\u306e\u304c\u306a\u3044\u3002 \u8a08\u7b97\u7528\u306e\u30b5\u30fc\u30d0\u30fc\u304c\u3044\u304f\u3064\u304b [&hellip;]","og_url":"https:\/\/www.silicloud.com\/zh\/blog\/46118-2\/","og_site_name":"Blog - Silicon Cloud","article_published_time":"2023-11-30T11:01:32+00:00","article_modified_time":"2024-05-03T17:30:20+00:00","og_image":[{"url":"https:\/\/cdn.silicloud.com\/blog-img\/blog\/img\/657d62c337434c4406d00a70\/21-0.png"}],"author":"\u97f5, \u79d1","twitter_card":"summary_large_image","twitter_misc":{"\u4f5c\u8005":"\u97f5, \u79d1","\u9884\u8ba1\u9605\u8bfb\u65f6\u95f4":"2 \u5206"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.silicloud.com\/zh\/blog\/46118-2\/","url":"https:\/\/www.silicloud.com\/zh\/blog\/46118-2\/","name":"- Blog - Silicon Cloud","isPartOf":{"@id":"https:\/\/www.silicloud.com\/zh\/blog\/#website"},"datePublished":"2023-11-30T11:01:32+00:00","dateModified":"2024-05-03T17:30:20+00:00","author":{"@id":"https:\/\/www.silicloud.com\/zh\/blog\/#\/schema\/person\/6530331a63adef3b3443a1fab53a0e6e"},"inLanguage":"zh-Hans","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.silicloud.com\/zh\/blog\/46118-2\/"]}]},{"@type":"WebSite","@id":"https:\/\/www.silicloud.com\/zh\/blog\/#website","url":"https:\/\/www.silicloud.com\/zh\/blog\/","name":"Blog - Silicon 
Cloud","description":"","inLanguage":"zh-Hans"},{"@type":"Person","@id":"https:\/\/www.silicloud.com\/zh\/blog\/#\/schema\/person\/6530331a63adef3b3443a1fab53a0e6e","name":"\u97f5, \u79d1","image":{"@type":"ImageObject","inLanguage":"zh-Hans","@id":"https:\/\/www.silicloud.com\/zh\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/429ccb39b3fff5188bc17986222cfb0936cbadb8cc933cff04ab5ca01bd30a08?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/429ccb39b3fff5188bc17986222cfb0936cbadb8cc933cff04ab5ca01bd30a08?s=96&d=mm&r=g","caption":"\u97f5, \u79d1"},"url":"https:\/\/www.silicloud.com\/zh\/blog\/author\/yunke\/"},{"@type":"ImageObject","inLanguage":"zh-Hans","@id":"https:\/\/www.silicloud.com\/zh\/blog\/46118-2\/#local-main-organization-logo","url":"","contentUrl":"","caption":"Blog - Silicon Cloud"}]}},"_links":{"self":[{"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/posts\/46118","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/comments?post=46118"}],"version-history":[{"count":2,"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/posts\/46118\/revisions"}],"predecessor-version":[{"id":95299,"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/posts\/46118\/revisions\/95299"}],"wp:attachment":[{"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/media?parent=46118"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/categories?post=46118"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.silicloud.com\/zh\/blog\/wp-json\/wp\/v2\/tags?
post=46118"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}