One day, a friend from Department of Data Management asked me for help. She has a worksheet which contains more than 10k rows of data and she’d like to select 30% (or 3k rows) of data randomly and copy those 3k rows entirely from the worksheet into another.
Design the workbook
If you read above text closely, you will find that we have to complete the task following two steps.
- Create 1k random numbers which represent for row No.
- Copy each row from the first sheet into the second sheet
With the idea, I designed a workbook which consists of three woksheets. The first worksheet is to be used for source data. The second worksheet is for selected data. While the third data is for us to fill certain information and then call VBA macro to fullfill the task. Sometimes, there are more than 1 row to contain header information. Therefore, I designed a cell B2 to contain this information. As for cell B2, if we want to select only 10% data, we have to put 10 in cell B2. By clicking on button “Copy”, excel will do what as we expect.
Develop vba code
As I wrote above, we need to create random numbers at first. The random number should be within a certain range. If there are 100 rows of source data and only 1 row is to contain header information, the random numbers should be greater than or equal to 2 while less than or equal to 101. Therefore, a function to create a certain number of random integers within a certain range was created as below.
Public Function UniqueRandom(Minimum As Long, Maximum As Long, _
Number As Long, Optional ArrayBase As Long = 1, _ Optional Dummy As Variant) As Variant Dim SourceArr() As Long Dim ResultArr() As Long Dim SourceN As Long Dim ResultN As Long Dim TopN As Long Dim Temp As Long If Minimum > Maximum Then UniqueRandom = Null Exit Function End If If Number > (Maximum – Minimum + 1) Then UniqueRandom = Null Exit Function End If If Number <= 0 Then UniqueRandom = Null Exit Function End If Randomize ReDim SourceArr(Minimum To Maximum) ReDim ResultArr(ArrayBase To (ArrayBase + Number – 1))
For SourceN = Minimum To Maximum SourceArr(SourceN) = SourceN Next SourceN TopN = UBound(SourceArr) For ResultN = LBound(ResultArr) To UBound(ResultArr) SourceN = Int((TopN – Minimum + 1) * Rnd + Minimum) ResultArr(ResultN) = SourceArr(SourceN) Temp = SourceArr(SourceN) SourceArr(SourceN) = SourceArr(TopN) SourceArr(TopN) = Temp TopN = TopN – 1 Next ResultN UniqueRandom = ResultArr End Function |
With the above function, we can now apply below SUB to copy and paste rows. First of all, Excel need to determine the min and max of the range. By minus min from max, it will get the total number of rows of source data and by multiplying the percert number to get the total number of random numbers which we need to create with help of above function. After restoring those random numbers into an array, we can loop through the array to copy rows one by one.
Sub Copy()
Dim Res As Variant Dim Min As Long Dim Max As Long Dim N As Long ThisWorkbook.Worksheets(2).Cells.Clear For i = 1 To ThisWorkbook.Worksheets(3).Cells(1, 2) ThisWorkbook.Worksheets(1).Cells(i, 1).EntireRow.Copy ThisWorkbook.Worksheets(2).Cells(i, 1) Next Min = ThisWorkbook.Worksheets(3).Cells(1, 2) + 1 Max = ThisWorkbook.Worksheets(1).UsedRange.Rows.Count M = Max – ThisWorkbook.Worksheets(3).Cells(1, 2) N = Application.WorksheetFunction.Floor(ThisWorkbook.Worksheets(3).Cells(2, 2) * M / 100, 1) Res = UniqueRandom(Minimum:=Min, Maximum:=Max, Number:=N) j = Min For i = LBound(Res) To UBound(Res) x = Res(i) ThisWorkbook.Worksheets(1).Cells(x, 1).EntireRow.Copy ThisWorkbook.Worksheets(2).Cells(j, 1) j = j + 1 Next i End Sub |
Full program for your reference
Sub Copy()
Dim Res As Variant Dim Min As Long Dim Max As Long Dim N As Long
ThisWorkbook.Worksheets(2).Cells.Clear For i = 1 To ThisWorkbook.Worksheets(3).Cells(1, 2) ThisWorkbook.Worksheets(1).Cells(i, 1).EntireRow.Copy ThisWorkbook.Worksheets(2).Cells(i, 1) Next
Min = ThisWorkbook.Worksheets(3).Cells(1, 2) + 1 Max = ThisWorkbook.Worksheets(1).UsedRange.Rows.Count M = Max – ThisWorkbook.Worksheets(3).Cells(1, 2) N = Application.WorksheetFunction.Floor(ThisWorkbook.Worksheets(3).Cells(2, 2) * M / 100, 1)
Res = UniqueRandom(Minimum:=Min, Maximum:=Max, Number:=N)
If IsArrayAllocated(Res) = False Then MsgBox “Error from UniqueRandom.” Else j = Min For i = LBound(Res) To UBound(Res) x = Res(i) ThisWorkbook.Worksheets(1).Cells(x, 1).EntireRow.Copy ThisWorkbook.Worksheets(2).Cells(j, 1) j = j + 1 Next i End If
End Sub Function IsArrayAllocated(V As Variant) As Boolean On Error Resume Next IsArrayAllocated = Not (IsError(LBound(V)) And _ IsArray(V)) And (LBound(V) <= UBound(V)) End Function Public Function UniqueRandom(Minimum As Long, Maximum As Long, _ Number As Long, Optional ArrayBase As Long = 1, _ Optional Dummy As Variant) As Variant Dim SourceArr() As Long Dim ResultArr() As Long Dim SourceN As Long Dim ResultN As Long Dim TopN As Long Dim Temp As Long
If Minimum > Maximum Then UniqueRandom = Null Exit Function End If If Number > (Maximum – Minimum + 1) Then UniqueRandom = Null Exit Function End If If Number <= 0 Then UniqueRandom = Null Exit Function End If
Randomize
ReDim SourceArr(Minimum To Maximum) ReDim ResultArr(ArrayBase To (ArrayBase + Number – 1))
For SourceN = Minimum To Maximum SourceArr(SourceN) = SourceN Next SourceN
TopN = UBound(SourceArr) For ResultN = LBound(ResultArr) To UBound(ResultArr)
SourceN = Int((TopN – Minimum + 1) * Rnd + Minimum) ResultArr(ResultN) = SourceArr(SourceN)
Temp = SourceArr(SourceN) SourceArr(SourceN) = SourceArr(TopN) SourceArr(TopN) = Temp
TopN = TopN – 1 Next ResultN
UniqueRandom = ResultArr
End Function |