How can two datasets be merged in R?

2 years ago

Liam

2 minutes

In R, you can combine two datasets using either the base merge() function or the join functions from the dplyr package (left_join(), right_join(), inner_join(), and full_join()).

Use the merge() function:
Syntax: merge(x, y, by, by.x, by.y)
Explanation of parameters:

x and y: the two datasets (data frames) to be merged.
by: the column name(s) to merge on. By default, merge() uses the columns that have the same name in both x and y.
by.x and by.y: the columns to merge on in x and y respectively. Use these when the key columns have different names in the two datasets.

Sample code:

# Create two datasets
df1 <- data.frame(ID = 1:5, Name = c("A", "B", "C", "D", "E"))
df2 <- data.frame(ID = 3:7, Age = c(20, 30, 40, 50, 60))

# Merge the datasets with merge(); by default only IDs present in both (3, 4, 5) are kept
merged_df <- merge(df1, df2, by = "ID")
merged_df
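The sample above relies on both data frames sharing an ID column. When the key columns are named differently, by.x and by.y can be used instead. The following is a minimal sketch; the data frames and the column names EmpID and StaffID are made up for illustration.

```r
# Hypothetical example data: the key column has a different name in each data frame
employees <- data.frame(EmpID = 1:4, Name = c("A", "B", "C", "D"))
salaries  <- data.frame(StaffID = c(2L, 3L, 5L), Salary = c(1000, 1500, 2000))

# Match EmpID in x against StaffID in y
merged <- merge(employees, salaries, by.x = "EmpID", by.y = "StaffID")
merged
```

In the result, the key column takes its name from x (EmpID here), and only the keys found in both data frames are kept.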

Alternatively, use the join functions in the dplyr package: left_join(), right_join(), inner_join(), and full_join(). dplyr has no single join() function; each of these functions performs a different type of join.

Sample code:

# Load the dplyr package
library(dplyr)

# Create two datasets
df1 <- data.frame(ID = 1:5, Name = c("A", "B", "C", "D", "E"))
df2 <- data.frame(ID = 3:7, Age = c(20, 30, 40, 50, 60))

# Merge the datasets with left_join(); all rows of df1 are kept, and Age is NA for IDs 1 and 2
merged_df <- left_join(df1, df2, by = "ID")
merged_df
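To see how the other join functions differ from left_join(), the same two data frames can be passed to each of them. This is a small sketch using the sample data from above; the variable names inner, right, and full are chosen only for illustration.

```r
library(dplyr)

df1 <- data.frame(ID = 1:5, Name = c("A", "B", "C", "D", "E"))
df2 <- data.frame(ID = 3:7, Age = c(20, 30, 40, 50, 60))

inner <- inner_join(df1, df2, by = "ID")  # only IDs present in both: 3, 4, 5
right <- right_join(df1, df2, by = "ID")  # all rows of df2: IDs 3 to 7
full  <- full_join(df1, df2, by = "ID")   # all IDs from either data frame: 1 to 7

nrow(inner)  # 3
nrow(right)  # 5
nrow(full)   # 7
```

Rows without a match in the other data frame are filled with NA in the columns that come from it.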

These are the two main ways to combine two datasets in R: the base merge() function and the dplyr join functions. Which one to use depends on your specific requirements.