How to merge two datasets in R and remove duplicates?

In R, you can combine two data frames with the merge() function and remove duplicate rows from the result with the unique() function.

Here is example code that merges two datasets and removes duplicates.

# Create two datasets
df1 <- data.frame(id = c(1, 2, 3), name = c("Alice", "Bob", "Charlie"))
df2 <- data.frame(id = c(2, 3, 4), age = c(25, 30, 35))

# Merge the datasets with merge()
merged_df <- merge(df1, df2, by = "id", all = TRUE)

# Remove duplicate rows with unique()
unique_df <- unique(merged_df)

In the code above, two datasets, df1 and df2, are created first. Then merge() combines them on the shared id column. Setting all = TRUE performs a full outer join: every row from both datasets is kept, and columns that have no matching id in the other dataset are filled with NA. The merged result is stored in merged_df.
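To make the effect of the all parameter concrete, here is a minimal sketch that compares it with the other join types merge() supports. The variable names inner_df, left_df, and outer_df are illustrative only and are not part of the original example.

```r
# Same sample data as in the example above
df1 <- data.frame(id = c(1, 2, 3), name = c("Alice", "Bob", "Charlie"))
df2 <- data.frame(id = c(2, 3, 4), age = c(25, 30, 35))

# Default (all = FALSE): inner join, keeps only ids found in both datasets
inner_df <- merge(df1, df2, by = "id")

# all.x = TRUE: left join, keeps every row of df1
left_df <- merge(df1, df2, by = "id", all.x = TRUE)

# all = TRUE: full outer join, keeps every row of both datasets
outer_df <- merge(df1, df2, by = "id", all = TRUE)

print(nrow(inner_df))  # 2 (ids 2 and 3)
print(nrow(left_df))   # 3 (ids 1, 2, 3)
print(nrow(outer_df))  # 4 (ids 1, 2, 3, 4)
print(outer_df)        # id 1 has age NA, id 4 has name NA
```

Which variant is appropriate depends on whether rows without a match in the other dataset should survive the merge.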

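Finally, unique() removes duplicate rows from merged_df, and the result is stored in unique_df. Two rows count as duplicates only when they match in every column. In this example each id appears once in each input, so merged_df contains no duplicate rows and unique_df has the same rows as merged_df; the step matters when the inputs contain repeated rows.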
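The following sketch shows a case where duplicates actually occur, using hypothetical input data with repeated ids (the extra rows and the name by_id_df are assumptions made for illustration). It also shows one possible way to drop rows that repeat only the key column, using the base R function duplicated().

```r
# Hypothetical input: id 2 appears twice with identical values,
# id 3 appears twice with different names
df1 <- data.frame(
  id = c(1, 2, 2, 3, 3),
  name = c("Alice", "Bob", "Bob", "Charlie", "Charles")
)
df2 <- data.frame(id = c(2, 3, 4), age = c(25, 30, 35))

merged_df <- merge(df1, df2, by = "id", all = TRUE)

# unique() drops rows that are identical in every column,
# so only the repeated "Bob" row is removed
unique_df <- unique(merged_df)

# duplicated() on the key column keeps the first row seen for each id
by_id_df <- merged_df[!duplicated(merged_df$id), ]

print(nrow(merged_df))  # 6
print(nrow(unique_df))  # 5
print(nrow(by_id_df))   # 4
```

Deduplicating by key discards rows whose other columns differ, so it is only suitable when any one row per id is acceptable.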
