{"id":18678,"date":"2024-03-15T17:25:40","date_gmt":"2024-03-15T17:25:40","guid":{"rendered":"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/"},"modified":"2024-03-21T13:53:00","modified_gmt":"2024-03-21T13:53:00","slug":"how-is-the-groupby-function-used-in-pandas","status":"publish","type":"post","link":"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/","title":{"rendered":"How is the &#8216;groupby&#8217; function used in Pandas?"},"content":{"rendered":"<p>In Pandas, the groupby() method splits a DataFrame or Series into groups. Data can be grouped by one or more columns or index levels, and an operation such as computing statistics, aggregating, or transforming is then applied to each group.<\/p>\n<p>The full signature of groupby(), shown here as in pandas 1.x, is:<\/p>\n<pre class=\"post-pre\"><code>df.groupby(by=<span class=\"hljs-literal\">None<\/span>, axis=<span class=\"hljs-number\">0<\/span>, level=<span class=\"hljs-literal\">None<\/span>, as_index=<span class=\"hljs-literal\">True<\/span>, sort=<span class=\"hljs-literal\">True<\/span>, group_keys=<span class=\"hljs-literal\">True<\/span>, squeeze=<span class=\"hljs-literal\">False<\/span>, observed=<span class=\"hljs-literal\">False<\/span>, dropna=<span class=\"hljs-literal\">True<\/span>)\r\n<\/code><\/pre>\n<p>Parameter Explanation:<\/p>\n<ol>\n<li>by: Specifies what to group by, which can be a single column name, a list of multiple column names, a Series, a dictionary, a function, etc. 
The default is None, but in that case level must be given instead: calling groupby() with neither by nor level raises a TypeError.<\/li>\n<li>axis: Specifies the axis to split along, where 0 (the default) groups rows and 1 groups columns. This parameter is deprecated since pandas 2.1.<\/li>\n<li>level: If the axis has a MultiIndex, specifies which level or levels (by name or position) to group by, with the default being None.<\/li>\n<li>as_index: For aggregated output, specifies whether the group labels become the index of the result, with the default being True. With as_index=False they are returned as regular columns instead.<\/li>\n<li>sort: Specifies whether the group keys are sorted in the result, with the default being True. It does not change the order of rows within each group.<\/li>\n<li>group_keys: When calling apply(), specifies whether the group keys are added to the index of the result to identify the pieces, default is True.<\/li>\n<li>squeeze: Specified whether to reduce the dimensionality of the return type where possible, default was False. It was deprecated in pandas 1.1 and removed in pandas 2.0, so omit it in current code.<\/li>\n<li>observed: Only applies when a grouper is a Categorical. If True, only the observed category values appear in the result; if False, all category values appear, including unobserved ones. The default shown here is False; pandas 2.1 deprecated that default in favor of True.<\/li>\n<li>dropna: Specifies whether rows whose group key is a missing value are dropped, with a default value of True. With dropna=False, missing values are treated as a group key of their own.<\/li>\n<\/ol>\n<p>The groupby() function returns a GroupBy object (a DataFrameGroupBy for a DataFrame), which can be used to perform various operations such as applying aggregation functions (like sum, mean, etc.), filtering data, and transforming data.<\/p>\n<p>Specific operations can be achieved through the methods of the GroupBy object, such as:<\/p>\n<ol>\n<li>agg(): Aggregate each group with one or more functions, producing one row per group.<\/li>\n<li>apply(): Apply a custom function to each group and combine the results.<\/li>\n<li>transform(): Apply a function to each group and return a result with the same index as the original data.<\/li>\n<li>filter(): Keep or drop entire groups based on a condition evaluated on each group.<\/li>\n<\/ol>\n<p>Sample code:<\/p>\n<pre class=\"post-pre\"><code><span class=\"hljs-keyword\">import<\/span> pandas <span class=\"hljs-keyword\">as<\/span> pd\r\n\r\n<span class=\"hljs-comment\"># Create a DataFrame<\/span>\r\ndata = {<span class=\"hljs-string\">'Name'<\/span>: [<span class=\"hljs-string\">'Tom'<\/span>, <span 
class=\"hljs-string\">'Nick'<\/span>, <span class=\"hljs-string\">'John'<\/span>, <span class=\"hljs-string\">'Tom'<\/span>, <span class=\"hljs-string\">'Nick'<\/span>, <span class=\"hljs-string\">'John'<\/span>],\r\n        <span class=\"hljs-string\">'Subject'<\/span>: [<span class=\"hljs-string\">'Math'<\/span>, <span class=\"hljs-string\">'English'<\/span>, <span class=\"hljs-string\">'Math'<\/span>, <span class=\"hljs-string\">'English'<\/span>, <span class=\"hljs-string\">'Math'<\/span>, <span class=\"hljs-string\">'English'<\/span>],\r\n        <span class=\"hljs-string\">'Score'<\/span>: [<span class=\"hljs-number\">85<\/span>, <span class=\"hljs-number\">90<\/span>, <span class=\"hljs-number\">92<\/span>, <span class=\"hljs-number\">78<\/span>, <span class=\"hljs-number\">82<\/span>, <span class=\"hljs-number\">88<\/span>]}\r\ndf = pd.DataFrame(data)\r\n\r\n<span class=\"hljs-comment\"># Group by the Name column and compute the average score of each group<\/span>\r\nresult = df.groupby(<span class=\"hljs-string\">'Name'<\/span>)[<span class=\"hljs-string\">'Score'<\/span>].mean()\r\n<span class=\"hljs-built_in\">print<\/span>(result)\r\n<\/code><\/pre>\n<p>Result output:<\/p>\n<pre class=\"post-pre\"><code>Name\r\nJohn    90.0\r\nNick    86.0\r\nTom     81.5\r\nName: Score, dtype: float64\r\n<\/code><\/pre>\n<p>In this example, the data is first grouped by the Name column, and then the mean of the Score column is computed for each group. The result is a Series indexed by the unique values of the Name column, with the average score of each group as the values. For instance, Tom's two scores, 85 and 78, average to 81.5.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In Pandas, the groupby() function is used to group data. It allows for data to be grouped based on specified columns, and then operations can be applied to each group, such as calculating statistics, aggregating, or transforming. 
The basic usage of groupby() is: df.groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=False, observed=False, dropna=True) Parameter Explanation: Specify [&hellip;]<\/p>\n","protected":false},"author":7,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_import_markdown_pro_load_document_selector":0,"_import_markdown_pro_submit_text_textarea":"","footnotes":""},"categories":[1],"tags":[],"class_list":["post-18678","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v21.5 (Yoast SEO v21.5) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>How is the &#039;groupby&#039; function used in Pandas? - Blog - Silicon Cloud<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"How is the &#039;groupby&#039; function used in Pandas?\" \/>\n<meta property=\"og:description\" content=\"In Pandas, the groupby() function is used to group data. It allows for data to be grouped based on specified columns, and then operations can be applied to each group, such as calculating statistics, aggregating, or transforming. 
The basic usage of groupby() is: df.groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=False, observed=False, dropna=True) Parameter Explanation: Specify [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/\" \/>\n<meta property=\"og:site_name\" content=\"Blog - Silicon Cloud\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/SiliCloudGlobal\/\" \/>\n<meta property=\"article:published_time\" content=\"2024-03-15T17:25:40+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-03-21T13:53:00+00:00\" \/>\n<meta name=\"author\" content=\"Sophia Anderson\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@SiliCloudGlobal\" \/>\n<meta name=\"twitter:site\" content=\"@SiliCloudGlobal\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Sophia Anderson\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/\"},\"author\":{\"name\":\"Sophia Anderson\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/19a24313de9c988db3d69226b4a40a30\"},\"headline\":\"How is the &#8216;groupby&#8217; function used in Pandas?\",\"datePublished\":\"2024-03-15T17:25:40+00:00\",\"dateModified\":\"2024-03-21T13:53:00+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/\"},\"wordCount\":368,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#organization\"},\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/\",\"url\":\"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/\",\"name\":\"How is the 'groupby' function used in Pandas? 
- Blog - Silicon Cloud\",\"isPartOf\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#website\"},\"datePublished\":\"2024-03-15T17:25:40+00:00\",\"dateModified\":\"2024-03-21T13:53:00+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.silicloud.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"How is the &#8216;groupby&#8217; function used in Pandas?\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#website\",\"url\":\"https:\/\/www.silicloud.com\/blog\/\",\"name\":\"Silicon Cloud Blog\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#organization\"},\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#organization\",\"name\":\"Silicon Cloud Blog\",\"url\":\"https:\/\/www.silicloud.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png\",\"contentUrl\":\"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png\",\"width\":1024,\"height\":1024,\"caption\":\"Silicon Cloud 
Blog\"},\"image\":{\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/SiliCloudGlobal\/\",\"https:\/\/twitter.com\/SiliCloudGlobal\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/19a24313de9c988db3d69226b4a40a30\",\"name\":\"Sophia Anderson\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/c726c09aa40e37115fb5c62d0c3ed62c16ca255d3763e2e3ae83a70ddf8c2175?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/c726c09aa40e37115fb5c62d0c3ed62c16ca255d3763e2e3ae83a70ddf8c2175?s=96&d=mm&r=g\",\"caption\":\"Sophia Anderson\"},\"url\":\"https:\/\/www.silicloud.com\/blog\/author\/sophiaanderson\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"How is the 'groupby' function used in Pandas? - Blog - Silicon Cloud","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/","og_locale":"en_US","og_type":"article","og_title":"How is the 'groupby' function used in Pandas?","og_description":"In Pandas, the groupby() function is used to group data. It allows for data to be grouped based on specified columns, and then operations can be applied to each group, such as calculating statistics, aggregating, or transforming. 
The basic usage of groupby() is: df.groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=False, observed=False, dropna=True) Parameter Explanation: Specify [&hellip;]","og_url":"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/","og_site_name":"Blog - Silicon Cloud","article_publisher":"https:\/\/www.facebook.com\/SiliCloudGlobal\/","article_published_time":"2024-03-15T17:25:40+00:00","article_modified_time":"2024-03-21T13:53:00+00:00","author":"Sophia Anderson","twitter_card":"summary_large_image","twitter_creator":"@SiliCloudGlobal","twitter_site":"@SiliCloudGlobal","twitter_misc":{"Written by":"Sophia Anderson","Est. reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/#article","isPartOf":{"@id":"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/"},"author":{"name":"Sophia Anderson","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/19a24313de9c988db3d69226b4a40a30"},"headline":"How is the &#8216;groupby&#8217; function used in Pandas?","datePublished":"2024-03-15T17:25:40+00:00","dateModified":"2024-03-21T13:53:00+00:00","mainEntityOfPage":{"@id":"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/"},"wordCount":368,"commentCount":0,"publisher":{"@id":"https:\/\/www.silicloud.com\/blog\/#organization"},"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/","url":"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/","name":"How is the 'groupby' function used in Pandas? 
- Blog - Silicon Cloud","isPartOf":{"@id":"https:\/\/www.silicloud.com\/blog\/#website"},"datePublished":"2024-03-15T17:25:40+00:00","dateModified":"2024-03-21T13:53:00+00:00","breadcrumb":{"@id":"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.silicloud.com\/blog\/how-is-the-groupby-function-used-in-pandas\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.silicloud.com\/blog\/"},{"@type":"ListItem","position":2,"name":"How is the &#8216;groupby&#8217; function used in Pandas?"}]},{"@type":"WebSite","@id":"https:\/\/www.silicloud.com\/blog\/#website","url":"https:\/\/www.silicloud.com\/blog\/","name":"Silicon Cloud Blog","description":"","publisher":{"@id":"https:\/\/www.silicloud.com\/blog\/#organization"},"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.silicloud.com\/blog\/#organization","name":"Silicon Cloud Blog","url":"https:\/\/www.silicloud.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png","contentUrl":"https:\/\/www.silicloud.com\/blog\/wp-content\/uploads\/2023\/11\/EN-SILICON-Full.png","width":1024,"height":1024,"caption":"Silicon Cloud Blog"},"image":{"@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/SiliCloudGlobal\/","https:\/\/twitter.com\/SiliCloudGlobal"]},{"@type":"Person","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/19a24313de9c988db3d69226b4a40a30","name":"Sophia 
Anderson","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.silicloud.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/c726c09aa40e37115fb5c62d0c3ed62c16ca255d3763e2e3ae83a70ddf8c2175?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/c726c09aa40e37115fb5c62d0c3ed62c16ca255d3763e2e3ae83a70ddf8c2175?s=96&d=mm&r=g","caption":"Sophia Anderson"},"url":"https:\/\/www.silicloud.com\/blog\/author\/sophiaanderson\/"}]}},"_links":{"self":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts\/18678","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/comments?post=18678"}],"version-history":[{"count":1,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts\/18678\/revisions"}],"predecessor-version":[{"id":52374,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/posts\/18678\/revisions\/52374"}],"wp:attachment":[{"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/media?parent=18678"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/categories?post=18678"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.silicloud.com\/blog\/wp-json\/wp\/v2\/tags?post=18678"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}