{"id":15434,"date":"2024-03-15T11:09:41","date_gmt":"2024-03-15T11:09:41","guid":{"rendered":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/"},"modified":"2025-08-06T18:34:59","modified_gmt":"2025-08-06T18:34:59","slug":"what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns","status":"publish","type":"post","link":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/","title":{"rendered":"What is the method in pandas for removing duplicates ba&#8230;"},"content":{"rendered":"<p>To remove duplicate rows in pandas, use the DataFrame.drop_duplicates() method. By default, it returns a new DataFrame with the duplicate rows removed.<\/p>\n<p>Here is the specific usage:<\/p>\n<pre class=\"post-pre\"><code>df.drop_duplicates(subset=[column_name], keep=<span class=\"hljs-string\">'first'<\/span>, inplace=<span class=\"hljs-literal\">True<\/span>)\r\n<\/code><\/pre>\n<ol>\n<li>The subset parameter specifies the column name or list of column names to check for duplicates. By default, it is None, which means all columns are checked.<\/li>\n<li>The keep parameter specifies which duplicate row to retain. The options are 'first', 'last', and False. The default, 'first', keeps the first occurrence of each set of duplicates; 'last' keeps the last occurrence; False drops all rows that have a duplicate.<\/li>\n<li>The inplace parameter specifies whether to modify the original DataFrame. By default, it is False, which means the method returns a new DataFrame with the duplicates removed. 
If set to True, the method modifies the original DataFrame and returns None.<\/li>\n<\/ol>\n<p>Example:<\/p>\n<pre class=\"post-pre\"><code><span class=\"hljs-keyword\">import<\/span> pandas <span class=\"hljs-keyword\">as<\/span> pd\r\n\r\n<span class=\"hljs-comment\"># Create a DataFrame that contains duplicate values<\/span>\r\ndata = {<span class=\"hljs-string\">'A'<\/span>: [<span class=\"hljs-number\">1<\/span>, <span class=\"hljs-number\">2<\/span>, <span class=\"hljs-number\">2<\/span>, <span class=\"hljs-number\">3<\/span>, <span class=\"hljs-number\">4<\/span>, <span class=\"hljs-number\">4<\/span>],\r\n        <span class=\"hljs-string\">'B'<\/span>: [<span class=\"hljs-string\">'a'<\/span>, <span class=\"hljs-string\">'b'<\/span>, <span class=\"hljs-string\">'b'<\/span>, <span class=\"hljs-string\">'c'<\/span>, <span class=\"hljs-string\">'d'<\/span>, <span class=\"hljs-string\">'d'<\/span>]}\r\ndf = pd.DataFrame(data)\r\n\r\n<span class=\"hljs-comment\"># Remove duplicates based on column 'A'<\/span>\r\ndf.drop_duplicates(subset=[<span class=\"hljs-string\">'A'<\/span>], keep=<span class=\"hljs-string\">'first'<\/span>, inplace=<span class=\"hljs-literal\">True<\/span>)\r\n<span class=\"hljs-built_in\">print<\/span>(df)\r\n<\/code><\/pre>\n<p>Output:<\/p>\n<pre class=\"post-pre\"><code>   A  B\r\n0  1  a\r\n1  2  b\r\n3  3  c\r\n4  4  d\r\n<\/code><\/pre>\n<p>In the example above, duplicates were removed based on column &#8216;A&#8217;, keeping only the first occurrence of each value. The original index labels are preserved, which is why labels 2 and 5 are missing from the output.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>To remove duplicate rows in pandas, use the DataFrame.drop_duplicates() method. By default, it returns a new DataFrame with the duplicate rows removed. 
Here is the specific usage: df.drop_duplicates(subset=[column_name], keep=&#8217;first&#8217;, inplace=True) The subset parameter specifies the column name or list of column names to check for duplicates. By default, [&hellip;]<\/p>\n","protected":false},"author":5,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_import_markdown_pro_load_document_selector":0,"_import_markdown_pro_submit_text_textarea":"","footnotes":""},"categories":[1],"tags":[453,1402,299,1404,1403],"class_list":["post-15434","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-development","tag-guide","tag-programming","tag-technology","tag-tutorial"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v21.5 (Yoast SEO v21.5) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>What is the method in pandas for removing duplicates ba... - Blog - Silicon Cloud<\/title>\n<meta name=\"description\" content=\"Learn about what is the method in pandas for removing duplicates based on columns?. Comprehensive guide with examples and best practices.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"What is the method in pandas for removing duplicates ba...\" \/>\n<meta property=\"og:description\" content=\"Learn about what is the method in pandas for removing duplicates based on columns?. 
Comprehensive guide with examples and best practices.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/\" \/>\n<meta property=\"og:site_name\" content=\"Blog - Silicon Cloud\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/SiliCloudGlobal\/\" \/>\n<meta property=\"article:published_time\" content=\"2024-03-15T11:09:41+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-08-06T18:34:59+00:00\" \/>\n<meta name=\"author\" content=\"Emily Johnson\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@SiliCloudGlobal\" \/>\n<meta name=\"twitter:site\" content=\"@SiliCloudGlobal\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Emily Johnson\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/\"},\"author\":{\"name\":\"Emily Johnson\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/3b041b19cffc258705478ecfab895378\"},\"headline\":\"What is the method in pandas for removing duplicates 
ba&#8230;\",\"datePublished\":\"2024-03-15T11:09:41+00:00\",\"dateModified\":\"2025-08-06T18:34:59+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/\"},\"wordCount\":212,\"publisher\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#organization\"},\"keywords\":[\"Development\",\"guide\",\"programming\",\"technology\",\"tutorial\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/\",\"url\":\"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/\",\"name\":\"What is the method in pandas for removing duplicates ba... - Blog - Silicon Cloud\",\"isPartOf\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#website\"},\"datePublished\":\"2024-03-15T11:09:41+00:00\",\"dateModified\":\"2025-08-06T18:34:59+00:00\",\"description\":\"Learn about what is the method in pandas for removing duplicates based on columns?. 
Comprehensive guide with examples and best practices.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.silicloud.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"What is the method in pandas for removing duplicates ba&#8230;\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#website\",\"url\":\"https:\/\/www.silicloud.com\/blog\/\",\"name\":\"Silicon Cloud Blog\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#organization\"},\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#organization\",\"name\":\"Silicon Cloud Blog\",\"url\":\"https:\/\/www.silicloud.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png\",\"contentUrl\":\"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png\",\"width\":1024,\"height\":1024,\"caption\":\"Silicon Cloud Blog\"},\"image\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/SiliCloudGlobal\/\",\"https:\/\/twitter.com\/SiliCloudGlobal\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/3b041b19cffc258705478ecfab895378\",\"name\":\"Emily 
Johnson\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/a5cb4e73d02ab1d79f2dfe919389ff7c1de072baa97686392031c03d858cc358?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/a5cb4e73d02ab1d79f2dfe919389ff7c1de072baa97686392031c03d858cc358?s=96&d=mm&r=g\",\"caption\":\"Emily Johnson\"},\"url\":\"https:\/\/www.silicloud.com\/blog\/author\/emilyjohnson\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"What is the method in pandas for removing duplicates ba... - Blog - Silicon Cloud","description":"Learn about what is the method in pandas for removing duplicates based on columns?. Comprehensive guide with examples and best practices.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/","og_locale":"en_US","og_type":"article","og_title":"What is the method in pandas for removing duplicates ba...","og_description":"Learn about what is the method in pandas for removing duplicates based on columns?. Comprehensive guide with examples and best practices.","og_url":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/","og_site_name":"Blog - Silicon Cloud","article_publisher":"https:\/\/www.facebook.com\/SiliCloudGlobal\/","article_published_time":"2024-03-15T11:09:41+00:00","article_modified_time":"2025-08-06T18:34:59+00:00","author":"Emily Johnson","twitter_card":"summary_large_image","twitter_creator":"@SiliCloudGlobal","twitter_site":"@SiliCloudGlobal","twitter_misc":{"Written by":"Emily Johnson","Est. 
reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/#article","isPartOf":{"@id":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/"},"author":{"name":"Emily Johnson","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/3b041b19cffc258705478ecfab895378"},"headline":"What is the method in pandas for removing duplicates ba&#8230;","datePublished":"2024-03-15T11:09:41+00:00","dateModified":"2025-08-06T18:34:59+00:00","mainEntityOfPage":{"@id":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/"},"wordCount":212,"publisher":{"@id":"https:\/\/www.silicloud.com\/blog\/#organization"},"keywords":["Development","guide","programming","technology","tutorial"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/","url":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/","name":"What is the method in pandas for removing duplicates ba... - Blog - Silicon Cloud","isPartOf":{"@id":"https:\/\/www.silicloud.com\/blog\/#website"},"datePublished":"2024-03-15T11:09:41+00:00","dateModified":"2025-08-06T18:34:59+00:00","description":"Learn about what is the method in pandas for removing duplicates based on columns?. 
Comprehensive guide with examples and best practices.","breadcrumb":{"@id":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.silicloud.com\/blog\/what-is-the-method-in-pandas-for-removing-duplicates-based-on-columns\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.silicloud.com\/blog\/"},{"@type":"ListItem","position":2,"name":"What is the method in pandas for removing duplicates ba&#8230;"}]},{"@type":"WebSite","@id":"https:\/\/www.silicloud.com\/blog\/#website","url":"https:\/\/www.silicloud.com\/blog\/","name":"Silicon Cloud Blog","description":"","publisher":{"@id":"https:\/\/www.silicloud.com\/blog\/#organization"},"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.silicloud.com\/blog\/#organization","name":"Silicon Cloud Blog","url":"https:\/\/www.silicloud.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png","contentUrl":"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png","width":1024,"height":1024,"caption":"Silicon Cloud Blog"},"image":{"@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/SiliCloudGlobal\/","https:\/\/twitter.com\/SiliCloudGlobal"]},{"@type":"Person","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/3b041b19cffc258705478ecfab895378","name":"Emily 
Johnson","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/a5cb4e73d02ab1d79f2dfe919389ff7c1de072baa97686392031c03d858cc358?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/a5cb4e73d02ab1d79f2dfe919389ff7c1de072baa97686392031c03d858cc358?s=96&d=mm&r=g","caption":"Emily Johnson"},"url":"https:\/\/www.silicloud.com\/blog\/author\/emilyjohnson\/"}]}},"_links":{"self":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts\/15434","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/comments?post=15434"}],"version-history":[{"count":2,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts\/15434\/revisions"}],"predecessor-version":[{"id":159037,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts\/15434\/revisions\/159037"}],"wp:attachment":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/media?parent=15434"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/categories?post=15434"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/tags?post=15434"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}