GROUP_CONCAT in Hive: Usage Guide

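In Hive environments that provide it, the GROUP_CONCAT function concatenates the values of each group into a single string. Availability depends on the distribution: Apache Hive itself does not ship a built-in function with this name.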

The syntax for GROUP_CONCAT is as follows:

GROUP_CONCAT(expr [, sep])

expr is the expression whose values are concatenated; it can be a column name, a constant, or a more complex expression. sep is an optional separator string placed between the values; it defaults to a comma (,).
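As a brief sketch of the optional separator, the query below passes ' | ' as the second argument. The orders table and its customer and item columns are hypothetical names used only for illustration, and the query assumes your environment provides GROUP_CONCAT with the signature shown above.

```sql
-- Hypothetical table: orders(customer STRING, item STRING)
-- Join each customer's items with ' | ' instead of the default comma.
SELECT customer, GROUP_CONCAT(item, ' | ') AS items
FROM orders
GROUP BY customer;
```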

Here is an example demonstrating how to use the GROUP_CONCAT function in Hive.

Suppose there is a table named “students” that contains the following data:

+-------+---------+
| name  | course  |
+-------+---------+
| John  | Math    |
| John  | Science |
| John  | English |
| Mary  | Math    |
| Mary  | Science |
| Alice | Math    |
+-------+---------+
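To reproduce this sample data, a table along the following lines should work. This is a minimal sketch: the STRING column types are an assumption, since the original only shows the rows, and INSERT ... VALUES requires Hive 0.14 or later.

```sql
-- Assumed schema: both columns are plain strings.
CREATE TABLE students (
  name   STRING,
  course STRING
);

-- Load the six sample rows shown in the table above.
INSERT INTO TABLE students VALUES
  ('John',  'Math'),
  ('John',  'Science'),
  ('John',  'English'),
  ('Mary',  'Math'),
  ('Mary',  'Science'),
  ('Alice', 'Math');
```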

The following query uses GROUP_CONCAT to collect each student's courses:

SELECT name, GROUP_CONCAT(course) AS courses
FROM students
GROUP BY name;

This produces a result like the following (the order of the rows, and of the values within each string, is not guaranteed):

+-------+----------------------+
| name  | courses              |
+-------+----------------------+
| John  | Math,Science,English |
| Mary  | Math,Science         |
| Alice | Math                 |
+-------+----------------------+

In this example, GROUP_CONCAT concatenates the courses for each name into a comma-separated string and returns it in a column named courses.
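On installations without GROUP_CONCAT, the same result can usually be obtained with the built-in concat_ws and collect_list functions. The sketch below assumes the students table from the example; it is an equivalent formulation rather than the query used above.

```sql
-- collect_list gathers each group's course values into an array;
-- concat_ws joins the array elements with the given separator.
SELECT name,
       concat_ws(',', collect_list(course)) AS courses
FROM students
GROUP BY name;
```

Replacing collect_list with collect_set drops duplicate values, and wrapping the array in sort_array, as in concat_ws(',', sort_array(collect_list(course))), returns the values in ascending order so the output is stable across runs.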
