How is the CROSS operation implemented in Pig?

In Pig, the CROSS operation is written with the CROSS keyword. It computes the Cartesian product of two or more relations: each record from one relation is combined with every record from the others to produce a new relation.

For example, given two relations A and B, we can use CROSS to compute their Cartesian product:

A = LOAD 'data1.txt' AS (id: int, name: chararray);
B = LOAD 'data2.txt' AS (id: int, age: int);

C = CROSS A, B;

DUMP C;

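In the example above, relations A and B each have two fields (id and name in A, id and age in B). CROSS combines them into a new relation C whose records have four fields. Because both inputs contain a field named id, Pig disambiguates them in C's schema as A::id and B::id. If A has m records and B has n records, C has m × n records, so CROSS can be expensive on large inputs. Finally, the DUMP command prints the records of C to the console.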
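To make the result concrete, here is a minimal sketch in Python (not Pig) that models what CROSS computes. The sample rows are hypothetical stand-ins for the contents of data1.txt and data2.txt, and the cross function is an illustrative helper, not part of Pig. Pig relations are unordered bags, so the row order shown by DUMP may differ from the order this sketch produces.

```python
from itertools import product

# Hypothetical sample rows; the real contents of data1.txt and data2.txt are unknown.
A = [(1, "alice"), (2, "bob")]       # fields: (id, name)
B = [(1, 30), (3, 25), (4, 41)]      # fields: (id, age)


def cross(left, right):
    """Pair every tuple in `left` with every tuple in `right`.

    Each output tuple is the fields of the left record followed by the
    fields of the right record, mirroring the shape of C = CROSS A, B.
    """
    return [l + r for l, r in product(left, right)]


C = cross(A, B)

# Every record of A appears once with every record of B: 2 * 3 = 6 rows.
for row in C:
    print(row)
```

With 2 records in A and 3 in B, the sketch yields 6 four-field tuples, which matches the m × n growth of a Cartesian product.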
