{"id":20362,"date":"2024-03-15T20:20:30","date_gmt":"2024-03-15T20:20:30","guid":{"rendered":"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/"},"modified":"2024-03-21T17:54:21","modified_gmt":"2024-03-21T17:54:21","slug":"how-does-hive-automatically-merge-small-files","status":"publish","type":"post","link":"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/","title":{"rendered":"How does Hive automatically merge small files?"},"content":{"rendered":"<p>To automatically merge small files in Hive, you can use the following methods:<\/p>\n<ol>\n<li>The task of merging small files is automated by Hive: using the hive.merge.smallfiles.avgsize parameter in Hive to set the threshold for the average size of small files, when the average size of small files is below this threshold, Hive will automatically merge them into larger files.<\/li>\n<li>Utilize Hive&#8217;s Dynamic Partition feature: When creating a table, partition the data using partition fields and set the appropriate partition fields. This allows merging small files into larger ones through dynamic partitioning.<\/li>\n<li>Utilizing Hive compression functionality can reduce the need for small file merging by storing data in a compressed format like Snappy or LZO when creating tables.<\/li>\n<li>Utilize Hive&#8217;s merge tools: Hive offers various merge tools such as Hive-5881 and Hive-5317 which can be used to manually merge small files. These tools allow for merging small files into larger files by executing specific HiveQL statements or running corresponding scripts.<\/li>\n<\/ol>\n<p>Regardless of the method used, the process of merging small files requires adjusting the configuration of Hive, such as changing the value of the hive.merge.smallfiles.avgsize parameter and setting the compression format. It is also important to choose the appropriate merging strategy based on the actual situation in order to achieve the goal of merging small files.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>To automatically merge small files in Hive, you can use the following methods: The task of merging small files is automated by Hive: using the hive.merge.smallfiles.avgsize parameter in Hive to set the threshold for the average size of small files, when the average size of small files is below this threshold, Hive will automatically merge [&hellip;]<\/p>\n","protected":false},"author":8,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_import_markdown_pro_load_document_selector":0,"_import_markdown_pro_submit_text_textarea":"","footnotes":""},"categories":[1],"tags":[],"class_list":["post-20362","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v21.5 (Yoast SEO v21.5) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>How does Hive automatically merge small files? - Blog - Silicon Cloud<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"How does Hive automatically merge small files?\" \/>\n<meta property=\"og:description\" content=\"To automatically merge small files in Hive, you can use the following methods: The task of merging small files is automated by Hive: using the hive.merge.smallfiles.avgsize parameter in Hive to set the threshold for the average size of small files, when the average size of small files is below this threshold, Hive will automatically merge [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/\" \/>\n<meta property=\"og:site_name\" content=\"Blog - Silicon Cloud\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/SiliCloudGlobal\/\" \/>\n<meta property=\"article:published_time\" content=\"2024-03-15T20:20:30+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-03-21T17:54:21+00:00\" \/>\n<meta name=\"author\" content=\"William Carter\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@SiliCloudGlobal\" \/>\n<meta name=\"twitter:site\" content=\"@SiliCloudGlobal\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"William Carter\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/\"},\"author\":{\"name\":\"William Carter\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/f697031891aacefc4b681d139781d3c0\"},\"headline\":\"How does Hive automatically merge small files?\",\"datePublished\":\"2024-03-15T20:20:30+00:00\",\"dateModified\":\"2024-03-21T17:54:21+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/\"},\"wordCount\":230,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#organization\"},\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/\",\"url\":\"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/\",\"name\":\"How does Hive automatically merge small files? - Blog - Silicon Cloud\",\"isPartOf\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#website\"},\"datePublished\":\"2024-03-15T20:20:30+00:00\",\"dateModified\":\"2024-03-21T17:54:21+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.silicloud.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"How does Hive automatically merge small files?\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#website\",\"url\":\"https:\/\/www.silicloud.com\/blog\/\",\"name\":\"Silicon Cloud Blog\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#organization\"},\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#organization\",\"name\":\"Silicon Cloud Blog\",\"url\":\"https:\/\/www.silicloud.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png\",\"contentUrl\":\"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png\",\"width\":1024,\"height\":1024,\"caption\":\"Silicon Cloud Blog\"},\"image\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/SiliCloudGlobal\/\",\"https:\/\/twitter.com\/SiliCloudGlobal\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/f697031891aacefc4b681d139781d3c0\",\"name\":\"William Carter\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/1786698071dd8d74bec894b512f9e3c610c3a2a32985f67e688976cee3c8bbef?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/1786698071dd8d74bec894b512f9e3c610c3a2a32985f67e688976cee3c8bbef?s=96&d=mm&r=g\",\"caption\":\"William Carter\"},\"url\":\"https:\/\/www.silicloud.com\/blog\/author\/williamcarter\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"How does Hive automatically merge small files? - Blog - Silicon Cloud","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/","og_locale":"en_US","og_type":"article","og_title":"How does Hive automatically merge small files?","og_description":"To automatically merge small files in Hive, you can use the following methods: The task of merging small files is automated by Hive: using the hive.merge.smallfiles.avgsize parameter in Hive to set the threshold for the average size of small files, when the average size of small files is below this threshold, Hive will automatically merge [&hellip;]","og_url":"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/","og_site_name":"Blog - Silicon Cloud","article_publisher":"https:\/\/www.facebook.com\/SiliCloudGlobal\/","article_published_time":"2024-03-15T20:20:30+00:00","article_modified_time":"2024-03-21T17:54:21+00:00","author":"William Carter","twitter_card":"summary_large_image","twitter_creator":"@SiliCloudGlobal","twitter_site":"@SiliCloudGlobal","twitter_misc":{"Written by":"William Carter","Est. reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/#article","isPartOf":{"@id":"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/"},"author":{"name":"William Carter","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/f697031891aacefc4b681d139781d3c0"},"headline":"How does Hive automatically merge small files?","datePublished":"2024-03-15T20:20:30+00:00","dateModified":"2024-03-21T17:54:21+00:00","mainEntityOfPage":{"@id":"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/"},"wordCount":230,"commentCount":0,"publisher":{"@id":"https:\/\/www.silicloud.com\/blog\/#organization"},"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/","url":"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/","name":"How does Hive automatically merge small files? - Blog - Silicon Cloud","isPartOf":{"@id":"https:\/\/www.silicloud.com\/blog\/#website"},"datePublished":"2024-03-15T20:20:30+00:00","dateModified":"2024-03-21T17:54:21+00:00","breadcrumb":{"@id":"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.silicloud.com\/blog\/how-does-hive-automatically-merge-small-files\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.silicloud.com\/blog\/"},{"@type":"ListItem","position":2,"name":"How does Hive automatically merge small files?"}]},{"@type":"WebSite","@id":"https:\/\/www.silicloud.com\/blog\/#website","url":"https:\/\/www.silicloud.com\/blog\/","name":"Silicon Cloud Blog","description":"","publisher":{"@id":"https:\/\/www.silicloud.com\/blog\/#organization"},"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.silicloud.com\/blog\/#organization","name":"Silicon Cloud Blog","url":"https:\/\/www.silicloud.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png","contentUrl":"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png","width":1024,"height":1024,"caption":"Silicon Cloud Blog"},"image":{"@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/SiliCloudGlobal\/","https:\/\/twitter.com\/SiliCloudGlobal"]},{"@type":"Person","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/f697031891aacefc4b681d139781d3c0","name":"William Carter","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/1786698071dd8d74bec894b512f9e3c610c3a2a32985f67e688976cee3c8bbef?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/1786698071dd8d74bec894b512f9e3c610c3a2a32985f67e688976cee3c8bbef?s=96&d=mm&r=g","caption":"William Carter"},"url":"https:\/\/www.silicloud.com\/blog\/author\/williamcarter\/"}]}},"_links":{"self":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts\/20362","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/users\/8"}],"replies":[{"embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/comments?post=20362"}],"version-history":[{"count":1,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts\/20362\/revisions"}],"predecessor-version":[{"id":54156,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts\/20362\/revisions\/54156"}],"wp:attachment":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/media?parent=20362"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/categories?post=20362"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/tags?post=20362"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}