How do I import data into a Hive table?
There are several ways to import data into a Hive table; the right method depends on the data's source and format. Here are some common approaches:
- By using the LOAD DATA statement, you can import data from a local file into a Hive table. For example, if you have a Hive table named my_table, you can use the following command to load data from a local file into that table:
LOAD DATA LOCAL INPATH '/path/to/local/data/file' INTO TABLE my_table;
Please note that without the LOCAL keyword, LOAD DATA moves the file from its current HDFS location into the table's data directory. With LOCAL, the file is copied from the local filesystem, so the original local copy is retained.
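If you want the load to replace the table's existing contents rather than append to them, add the OVERWRITE keyword (a sketch assuming the same my_table and file path as above):

```sql
-- Replace the current contents of my_table with the file's data;
-- without OVERWRITE, the new rows would be appended instead.
LOAD DATA LOCAL INPATH '/path/to/local/data/file' OVERWRITE INTO TABLE my_table;
```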
- To insert data from another Hive table or query results into an existing Hive table, you can use the INSERT statement. For example, if you have a Hive table called my_table and another table named source_table, you can use the following command to insert the data from source_table into my_table:
INSERT INTO TABLE my_table SELECT * FROM source_table;
Specific conditions can also be used to filter the data to be inserted, for example:
INSERT INTO TABLE my_table SELECT col1, col2 FROM source_table WHERE col3 > 10;
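Similarly, INSERT supports an OVERWRITE form that replaces the destination table's contents instead of appending (a sketch using the same hypothetical my_table and source_table):

```sql
-- Overwrite my_table with the filtered query results;
-- any existing rows in my_table are discarded first.
INSERT OVERWRITE TABLE my_table
SELECT col1, col2 FROM source_table WHERE col3 > 10;
```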
- You can also import data that already resides in the Hadoop Distributed File System (HDFS). Simply upload the data files to HDFS, then create a table using Hive’s CREATE TABLE statement, specifying the row format, the storage format (which determines the Hadoop InputFormat Hive uses to read the files), and the location of the data files. For example:
CREATE TABLE my_table (col1 INT, col2 STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/path/to/hdfs/data/file';
In this example, the LOCATION clause points the table at /path/to/hdfs/data/file, which can be a single file or a directory; Hive reads the data in place, so no separate load step is needed. Alternatively, if the table was created without a LOCATION clause, you can move a file from HDFS into the table with LOAD DATA:
LOAD DATA INPATH '/path/to/hdfs/data/file' INTO TABLE my_table;
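When the data in HDFS is shared with other tools, it is common to declare the table EXTERNAL so that dropping the table does not delete the underlying files. A sketch, reusing the same hypothetical schema and path (my_external_table is an illustrative name):

```sql
-- EXTERNAL: Hive manages only the metadata, not the files themselves.
CREATE EXTERNAL TABLE my_external_table (col1 INT, col2 STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/path/to/hdfs/data/file';
-- DROP TABLE my_external_table removes only the table definition;
-- the files under the LOCATION path are left intact.
```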
These are some common methods for importing data into Hive tables; choose the one that best fits your data's source and format.