文章背景:在工作生活中,有时需要进行删除重复行的操作。比如样品测试时,难免存在复测数据,一般需要保留最后测试的数据。之前通过拷贝行的方式保留最后一行的数据(参见文末的延伸阅读1
),但运行效率较低。目前通过借助数组和字典达到删除重复行的效果。
1 基于指定列,保留最后一行的数据2 基于指定列,保留最后一行的数据,同时剔除不需要的列3 效果演示
想要实现的效果:在原来测试数据的基础上,基于B列,如果存在重复的数据,保留最后一行的数据。
VBA代码如下:
Sub Delete_Duplicate1()
'基于指定列,删除重复行,保留最后出现的行数据。
Dim tar_sheet As Worksheet
Dim flag_r As Long, ii As Long, jj As Long, lastRow As Long
Dim dic As Object, arrIn, arrOut, sample
Set tar_sheet = ThisWorkbook.Sheets("post1")
Set dic = CreateObject("scripting.dictionary")
tar_sheet.Activate
lastRow = 7
With tar_sheet
'store source range region to Array
arrIn = .Range("A3:F" & lastRow).Value2
Debug.Print UBound(arrIn)
For ii = 1 To UBound(arrIn)
sample = Trim(arrIn(ii, 2))
'使用字典,达到去重效果,保留最后一个序号。
dic(sample) = ii
Next
ReDim arrOut(1 To dic.Count, 1 To 5)
ii = 0
For Each sample In dic.keys
flag_r = dic(sample)
ii = ii + 1
'column A to D
For jj = 1 To 5
arrOut(ii, jj) = arrIn(flag_r, jj)
Next
Next
End With
'清空旧数据
tar_sheet.Range("A3:E7").ClearContents
'导入新数据
tar_sheet.Range("A3").Resize(UBound(arrOut), 5) = arrOut
Set dic = Nothing
MsgBox "Done!"
End Sub
(1) 借助数组arrIn
,存放原有的测试数据;借助字典dic
,保存Sample对应的序号。由于字典的键值具有唯一性,因此,对于同一个样品,如果重复出现,保留最后一次出现的序号。
(2)关于Range.Value2: The only difference between this property and the Value property is that the Value2 property doesn't use the Currency and Date data types. You can return values formatted with these data types as floating-point numbers by using the Double data type.
想要实现的效果:针对原有的测试数据,基于B列,如果存在重复的数据,保留最后一行的数据;这里不需要E列的数据。将选取的数据拷贝到指定区域。
VBA代码如下:
Sub Delete_Duplicate2()
'基于指定列,保留唯一行(若重复),同时剔除不需要的列。
Dim tar_sheet As Worksheet
Dim flag_r As Long, ii As Long, jj As Long, lastRow As Long
Dim dic As Object, arrIn, arrOut, sample
Set tar_sheet = ThisWorkbook.Sheets("post2")
Set dic = CreateObject("scripting.dictionary")
tar_sheet.Activate
lastRow = 7
With tar_sheet
'store source range region to Array
arrIn = .Range("A3:F" & lastRow).Value2
Debug.Print UBound(arrIn)
For ii = 1 To UBound(arrIn)
sample = Trim(arrIn(ii, 2))
'使用字典,达到去重效果,保留最后一个序号。
dic(sample) = ii
Next
ReDim arrOut(1 To dic.Count, 1 To 5)
ii = 0
For Each sample In dic.keys
flag_r = dic(sample)
ii = ii + 1
'column A to D
For jj = 1 To 4
arrOut(ii, jj) = arrIn(flag_r, jj)
Next
'column F, 跳过不需要的E列
For jj = 5 To 5
arrOut(ii, jj) = arrIn(flag_r, jj + 1)
Next
Next
End With
'导入新数据
tar_sheet.Range("A12").Resize(UBound(arrOut), 5) = arrOut
Set dic = Nothing
MsgBox "Done!"
End Sub
http://mpvideo.qpic.cn/0bc3kqaaeaaaeqagwshynvrfavgdajkaaaqa.f10002.mp4?dis_k=adc545cd0915aa231b719acaaf26af6b&dis_t=1671335492&play_scene=0&vid=wxv_2530923864449761283&format_id=10002&support_redirect=0&mmversion=false
参考资料:
[1] How to use VBA dictionary to store discontinuous range(https://stackoverflow.com/questions/73343156/how-to-use-vba-dictionary-to-store-discontinuous-range)
[2] Range.Value2 property(https://docs.microsoft.com/en-us/office/vba/api/excel.range.value2)
延伸阅读
[1] 根据指定列删除重复行