{"id":2158,"date":"2024-03-12T09:20:54","date_gmt":"2024-03-12T09:20:54","guid":{"rendered":"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/"},"modified":"2024-04-08T15:23:16","modified_gmt":"2024-04-08T15:23:16","slug":"how-to-handle-imbalanced-datasets-in-keras","status":"publish","type":"post","link":"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/","title":{"rendered":"How to handle imbalanced datasets in Keras?"},"content":{"rendered":"<p>There are several methods for handling imbalanced datasets in <a href=\"https:\/\/keras.io\/\">Keras<\/a>.<\/p>\n<ol>\n<li>Class weighting: when training a classification model, pass a class_weight dictionary to model.fit so that errors on the minority class contribute more to the loss.<\/li>\n<\/ol>\n<pre class=\"post-pre\"><code>class_weight = {<span class=\"hljs-number\">0<\/span>: <span class=\"hljs-number\">1<\/span>, <span class=\"hljs-number\">1<\/span>: <span class=\"hljs-number\">10<\/span>}  <span class=\"hljs-comment\"># Set class weights, e.g. give the minority class a larger weight<\/span>\r\nmodel.fit(X_train, y_train, class_weight=class_weight)\r\n<\/code><\/pre>\n<ol start=\"2\">\n<li>Over-sampling\/under-sampling: A dataset can be balanced by over-sampling (adding samples of the minority class) or under-sampling (removing samples of the majority class). The RandomOverSampler and RandomUnderSampler classes from the imbalanced-learn library do this, respectively; resample the training data first, then train the model on the result.<\/li>\n<li>Custom loss function: you can define a loss function suited to your situation so that it places more emphasis on samples from minority classes. 
Using the backend module in Keras, you can define a custom loss function and then specify it during model compilation.<\/li>\n<\/ol>\n<pre class=\"post-pre\"><code><span class=\"hljs-keyword\">import<\/span> keras.backend <span class=\"hljs-keyword\">as<\/span> K\r\n\r\n<span class=\"hljs-keyword\">def<\/span> <span class=\"hljs-title function_\">custom_loss<\/span>(<span class=\"hljs-params\">y_true, y_pred<\/span>):\r\n    <span class=\"hljs-comment\"># Custom loss: binary cross-entropy weighted more heavily on minority-class samples<\/span>\r\n    y_true = K.cast(y_true, y_pred.dtype)\r\n    loss = K.binary_crossentropy(y_true, y_pred)  <span class=\"hljs-comment\"># Per-sample binary cross-entropy loss<\/span>\r\n    weights = y_true * <span class=\"hljs-number\">10.0<\/span> + (<span class=\"hljs-number\">1.0<\/span> - y_true)  <span class=\"hljs-comment\"># Example weights: 10 for class 1 (assumed to be the minority), 1 for class 0<\/span>\r\n    <span class=\"hljs-keyword\">return<\/span> loss * weights\r\n\r\nmodel.<span class=\"hljs-built_in\">compile<\/span>(loss=custom_loss, optimizer=<span class=\"hljs-string\">'adam'<\/span>)\r\n<\/code><\/pre>\n<p>Together, these methods can mitigate class imbalance and improve the model&#8217;s performance on minority classes.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>There are several methods for handling imbalanced datasets in Keras. Class weighting: assign a larger weight to the minority class when training the model. class_weight = {0: 1, 1: 10} # Set class weights, e.g. give the minority class a larger weight model.fit(X_train, y_train, class_weight=class_weight) Over-sampling\/under-sampling: Balancing a dataset can be achieved by either over-sampling (increasing samples of minority class) or under-sampling (reducing samples [&hellip;]<\/p>\n","protected":false},"author":5,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_import_markdown_pro_load_document_selector":0,"_import_markdown_pro_submit_text_textarea":"","footnotes":""},"categories":[1],"tags":[],"class_list":["post-2158","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v21.5 (Yoast SEO v21.5) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>How to handle imbalanced datasets in Keras? 
- Blog - Silicon Cloud<\/title>\n<meta name=\"description\" content=\"There are several methods for handling imbalanced datasets in Keras.Weighting of classes in a classification model\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"How to handle imbalanced datasets in Keras?\" \/>\n<meta property=\"og:description\" content=\"There are several methods for handling imbalanced datasets in Keras.Weighting of classes in a classification model\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/\" \/>\n<meta property=\"og:site_name\" content=\"Blog - Silicon Cloud\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/SiliCloudGlobal\/\" \/>\n<meta property=\"article:published_time\" content=\"2024-03-12T09:20:54+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-04-08T15:23:16+00:00\" \/>\n<meta name=\"author\" content=\"Emily Johnson\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@SiliCloudGlobal\" \/>\n<meta name=\"twitter:site\" content=\"@SiliCloudGlobal\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Emily Johnson\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/\"},\"author\":{\"name\":\"Emily Johnson\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/3b041b19cffc258705478ecfab895378\"},\"headline\":\"How to handle imbalanced datasets in Keras?\",\"datePublished\":\"2024-03-12T09:20:54+00:00\",\"dateModified\":\"2024-04-08T15:23:16+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/\"},\"wordCount\":222,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#organization\"},\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/\",\"url\":\"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/\",\"name\":\"How to handle imbalanced datasets in Keras? 
- Blog - Silicon Cloud\",\"isPartOf\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#website\"},\"datePublished\":\"2024-03-12T09:20:54+00:00\",\"dateModified\":\"2024-04-08T15:23:16+00:00\",\"description\":\"There are several methods for handling imbalanced datasets in Keras.Weighting of classes in a classification model\",\"breadcrumb\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.silicloud.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"How to handle imbalanced datasets in Keras?\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#website\",\"url\":\"https:\/\/www.silicloud.com\/blog\/\",\"name\":\"Silicon Cloud Blog\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#organization\"},\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#organization\",\"name\":\"Silicon Cloud Blog\",\"url\":\"https:\/\/www.silicloud.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png\",\"contentUrl\":\"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png\",\"width\":1024,\"height\":1024,\"caption\":\"Silicon Cloud 
Blog\"},\"image\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/SiliCloudGlobal\/\",\"https:\/\/twitter.com\/SiliCloudGlobal\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/3b041b19cffc258705478ecfab895378\",\"name\":\"Emily Johnson\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/a5cb4e73d02ab1d79f2dfe919389ff7c1de072baa97686392031c03d858cc358?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/a5cb4e73d02ab1d79f2dfe919389ff7c1de072baa97686392031c03d858cc358?s=96&d=mm&r=g\",\"caption\":\"Emily Johnson\"},\"url\":\"https:\/\/www.silicloud.com\/blog\/author\/emilyjohnson\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"How to handle imbalanced datasets in Keras? - Blog - Silicon Cloud","description":"There are several methods for handling imbalanced datasets in Keras.Weighting of classes in a classification model","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/","og_locale":"en_US","og_type":"article","og_title":"How to handle imbalanced datasets in Keras?","og_description":"There are several methods for handling imbalanced datasets in Keras.Weighting of classes in a classification model","og_url":"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/","og_site_name":"Blog - Silicon Cloud","article_publisher":"https:\/\/www.facebook.com\/SiliCloudGlobal\/","article_published_time":"2024-03-12T09:20:54+00:00","article_modified_time":"2024-04-08T15:23:16+00:00","author":"Emily 
Johnson","twitter_card":"summary_large_image","twitter_creator":"@SiliCloudGlobal","twitter_site":"@SiliCloudGlobal","twitter_misc":{"Written by":"Emily Johnson","Est. reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/#article","isPartOf":{"@id":"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/"},"author":{"name":"Emily Johnson","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/3b041b19cffc258705478ecfab895378"},"headline":"How to handle imbalanced datasets in Keras?","datePublished":"2024-03-12T09:20:54+00:00","dateModified":"2024-04-08T15:23:16+00:00","mainEntityOfPage":{"@id":"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/"},"wordCount":222,"commentCount":0,"publisher":{"@id":"https:\/\/www.silicloud.com\/blog\/#organization"},"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/","url":"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/","name":"How to handle imbalanced datasets in Keras? 
- Blog - Silicon Cloud","isPartOf":{"@id":"https:\/\/www.silicloud.com\/blog\/#website"},"datePublished":"2024-03-12T09:20:54+00:00","dateModified":"2024-04-08T15:23:16+00:00","description":"There are several methods for handling imbalanced datasets in Keras.Weighting of classes in a classification model","breadcrumb":{"@id":"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.silicloud.com\/blog\/how-to-handle-imbalanced-datasets-in-keras\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.silicloud.com\/blog\/"},{"@type":"ListItem","position":2,"name":"How to handle imbalanced datasets in Keras?"}]},{"@type":"WebSite","@id":"https:\/\/www.silicloud.com\/blog\/#website","url":"https:\/\/www.silicloud.com\/blog\/","name":"Silicon Cloud Blog","description":"","publisher":{"@id":"https:\/\/www.silicloud.com\/blog\/#organization"},"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.silicloud.com\/blog\/#organization","name":"Silicon Cloud Blog","url":"https:\/\/www.silicloud.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png","contentUrl":"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png","width":1024,"height":1024,"caption":"Silicon Cloud Blog"},"image":{"@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/SiliCloudGlobal\/","https:\/\/twitter.com\/SiliCloudGlobal"]},{"@type":"Person","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/3b041b19cffc258705478ecfab895378","name":"Emily 
Johnson","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/a5cb4e73d02ab1d79f2dfe919389ff7c1de072baa97686392031c03d858cc358?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/a5cb4e73d02ab1d79f2dfe919389ff7c1de072baa97686392031c03d858cc358?s=96&d=mm&r=g","caption":"Emily Johnson"},"url":"https:\/\/www.silicloud.com\/blog\/author\/emilyjohnson\/"}]}},"_links":{"self":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts\/2158","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/comments?post=2158"}],"version-history":[{"count":3,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts\/2158\/revisions"}],"predecessor-version":[{"id":77617,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts\/2158\/revisions\/77617"}],"wp:attachment":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/media?parent=2158"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/categories?post=2158"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/tags?post=2158"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}