How to remove duplicates from Teradata table data?

2 years ago

Liam

Teradata offers two common ways to deduplicate table data.

1. Using the DISTINCT keyword: You can use the SELECT statement with the DISTINCT keyword to select unique records. Here is an example:

   SELECT DISTINCT column1, column2, ...
   FROM your_table;

In the example above, replace `column1, column2, ...` with the columns that define a duplicate, and replace `your_table` with the name of the table you are querying. DISTINCT applies to the entire select list, so two rows are treated as duplicates only when every listed column matches.
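As a concrete illustration, here is a minimal sketch of DISTINCT on a small sample table. The table `customer_demo` and its columns are hypothetical names made up for this example. The table is declared MULTISET because a SET table would not accept the duplicate row.

```sql
-- Hypothetical sample table; MULTISET allows fully duplicated rows.
CREATE MULTISET VOLATILE TABLE customer_demo (
    customer_id INTEGER,
    city        VARCHAR(30)
) ON COMMIT PRESERVE ROWS;

INSERT INTO customer_demo VALUES (1, 'Berlin');
INSERT INTO customer_demo VALUES (1, 'Berlin');  -- exact duplicate of the row above
INSERT INTO customer_demo VALUES (2, 'Paris');

-- Returns two rows: (1, 'Berlin') and (2, 'Paris').
SELECT DISTINCT customer_id, city
FROM customer_demo;
```

The table itself still holds three rows afterwards; DISTINCT only affects the query result.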

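2. Using the QUALIFY clause with the ROW_NUMBER() function: Number the rows within each group of duplicates and keep only the first row of each group. Here is an example:

   SELECT column1, column2, ...
   FROM your_table
   QUALIFY ROW_NUMBER() OVER (
       PARTITION BY column1, column2, ...
       ORDER BY column1, column2, ...
   ) = 1;

PARTITION BY lists the columns that define a duplicate, and ORDER BY decides which row in each group is numbered 1 and therefore kept. The same result can be obtained without QUALIFY by computing ROW_NUMBER() in a derived table and filtering on it with WHERE in the outer query.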
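The ROW_NUMBER() approach is most useful when rows count as duplicates on a subset of columns and you want to choose which one survives. The sketch below, written with QUALIFY, keeps the most recent row per customer. The table `customer_addr_demo` and its columns are hypothetical names used only for illustration.

```sql
-- Hypothetical sample table: customer 1 has two rows that differ in city and date.
CREATE MULTISET VOLATILE TABLE customer_addr_demo (
    customer_id INTEGER,
    city        VARCHAR(30),
    updated_at  DATE
) ON COMMIT PRESERVE ROWS;

INSERT INTO customer_addr_demo VALUES (1, 'Berlin', DATE '2023-01-10');
INSERT INTO customer_addr_demo VALUES (1, 'Munich', DATE '2024-03-05');
INSERT INTO customer_addr_demo VALUES (2, 'Paris',  DATE '2022-07-01');

-- One row per customer_id; ordering by updated_at DESC makes the newest row number 1.
-- Returns (1, 'Munich', 2024-03-05) and (2, 'Paris', 2022-07-01).
SELECT customer_id, city, updated_at
FROM customer_addr_demo
QUALIFY ROW_NUMBER() OVER (
    PARTITION BY customer_id
    ORDER BY updated_at DESC
) = 1;
```

A plain `SELECT DISTINCT customer_id, city, updated_at` would return all three rows here, because no two rows match on every column.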

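Both methods return a deduplicated result set; neither modifies the underlying table. DISTINCT is the simpler choice when entire rows are repeated, while ROW_NUMBER() lets you define duplicates on a subset of columns and control which row in each group is kept. The choice between the two will depend on your specific requirements and performance needs.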
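If the goal is to remove the duplicates from the stored data rather than from a query result, one possible approach is to write the deduplicated rows into a new table and then swap it in. This is a sketch under assumptions, not a prescribed procedure: `your_table_dedup` and `your_table_old` are hypothetical names, and `column1, column2` stand in for whichever columns define a duplicate in your table.

```sql
-- Copy one row per duplicate group into a new table (hypothetical name).
CREATE MULTISET TABLE your_table_dedup AS (
    SELECT *
    FROM your_table
    QUALIFY ROW_NUMBER() OVER (
        PARTITION BY column1, column2
        ORDER BY column1
    ) = 1
) WITH DATA;

-- After verifying the copy, swap it in and keep the original as a backup.
RENAME TABLE your_table TO your_table_old;
RENAME TABLE your_table_dedup TO your_table;
```

Before swapping, check that the primary index, constraints, and access rights on the new table match what you need, since a table created from a query does not necessarily inherit them from the original.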