How to filter duplicate data in R?

2 years ago

Ava Mitchell

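In R, the duplicated() function identifies duplicated data. It returns a logical vector with one entry per element: TRUE where the element repeats a value that appeared earlier in the vector, and FALSE otherwise, so the first occurrence of each value is not flagged. Indexing with this logical vector extracts the repeated values, and indexing with its negation removes them.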
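To make the return value concrete, here is a minimal sketch that prints the logical vector itself. The variable names x and flags are only illustrative.

```r
# Illustrative vector; the values 2 and 3 each appear twice
x <- c(1, 2, 3, 2, 4, 3, 5)

# duplicated() flags only the second and later occurrences of a value
flags <- duplicated(x)
print(flags)
# [1] FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE
```

The TRUE entries sit at positions 4 and 6, where 2 and 3 appear for the second time.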

Here is an example:

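# Create a vector that contains duplicate values
x <- c(1, 2, 3, 2, 4, 3, 5)

# Use duplicated() to flag the elements that repeat an earlier value
duplicated_indices <- duplicated(x)

# Use the logical vector as an index to select the duplicated values
duplicated_values <- x[duplicated_indices]

# Print the result
print(duplicated_values)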

The output is:

[1] 2 3

In the example above, the vector x contains repeated values. Calling duplicated(x) returns the logical vector duplicated_indices, which is TRUE at the fourth and sixth positions (the second occurrences of 2 and 3). Indexing x with this vector extracts those repeated elements, so the output shows the values 2 and 3.
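If the goal is to drop the duplicates rather than extract them, the same logical vector can be negated. The following is a minimal sketch, not the only way to do it. The names unique_values, all_copies, df, and df_unique are illustrative, and the data frame is made-up sample data.

```r
x <- c(1, 2, 3, 2, 4, 3, 5)

# Keep the first occurrence of each value by negating the logical vector
unique_values <- x[!duplicated(x)]
print(unique_values)
# [1] 1 2 3 4 5

# Base R's unique() gives the same result for a vector
print(unique(x))
# [1] 1 2 3 4 5

# Select every copy of a repeated value, including the first occurrence,
# by also scanning from the end with fromLast = TRUE
all_copies <- x[duplicated(x) | duplicated(x, fromLast = TRUE)]
print(all_copies)
# [1] 2 3 2 3

# duplicated() also works row-wise on data frames (hypothetical sample data)
df <- data.frame(id = c(1, 2, 2, 3), name = c("a", "b", "b", "c"))
df_unique <- df[!duplicated(df), ]
print(nrow(df_unique))
# [1] 3
```

Negating duplicated() keeps the original order and the first occurrence of each value. This is also the behavior of unique() on a vector.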