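How do you filter duplicate data in R?
In R, the duplicated() function identifies duplicated data. It returns a logical vector with one entry per element: TRUE if the element repeats a value that appeared earlier in the vector, and FALSE otherwise, so the first occurrence of each value is always FALSE. Using this logical vector as an index extracts the repeated elements; negating it with ! keeps only the first occurrence of each value.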
Here is an example:
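# Create a vector that contains duplicate values
x <- c(1, 2, 3, 2, 4, 3, 5)
# Use duplicated() to flag elements that repeat an earlier value
duplicated_indices <- duplicated(x)
# Use the logical vector as an index to extract the repeated elements
duplicated_values <- x[duplicated_indices]
# Print the result
print(duplicated_values)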
The output is:
[1] 2 3
In this example, the vector x contains the repeated values 2 and 3. Calling duplicated(x) returns the logical vector duplicated_indices, which is TRUE only at positions 4 and 6, the second occurrences of 2 and 3. Indexing x with that vector extracts those repeated elements, so the output is 2 and 3.
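If the goal is to remove the duplicates rather than list them, the same logical vector can be negated. The following is a minimal sketch that reuses the vector x from the example; the variable names first_only, same_as_unique, and all_occurrences are only illustrative.

```r
x <- c(1, 2, 3, 2, 4, 3, 5)

# Keep only the first occurrence of each value (drop later repeats)
first_only <- x[!duplicated(x)]

# For a plain vector, unique() gives the same result
same_as_unique <- unique(x)

# Select every occurrence of a repeated value, including the first one,
# by combining a forward scan with a backward scan (fromLast = TRUE)
all_occurrences <- x[duplicated(x) | duplicated(x, fromLast = TRUE)]

print(first_only)       # [1] 1 2 3 4 5
print(same_as_unique)   # [1] 1 2 3 4 5
print(all_occurrences)  # [1] 2 3 2 3
```

Negating duplicated() preserves the original order of the first occurrences. The fromLast = TRUE variant is useful when the first occurrence also needs to be treated as part of the duplicated group.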