5 Ways to Remove Unwanted Duplicates in Excel

Let Us Help You – We Are For Hire

 

What is a “Duplicate” in Excel, and how can you spot them, and then remove them:  In Excel, a duplicate is considered to be a value that appears more than once in a given range of cells.  Here we cover 5 ways to spot and to remove unwanted duplicates in Excel.

For example, if you have a column of names and the name appears more than once in the column, then those repeated names would be considered duplicates.

Similarly, if you have a column showing the sales regions and if a sales region appears more than once, then that Region would be considered a duplicate.

The Unique function is one of the new Dynamic Array Functions, these Spill Functions simplify Excel programming and are available in Office 365.

 

 

What you need to know to Remove Duplicates in Excel:

  1. What are duplicates.
  2. How to spot duplicates.
    1. Use Conditional Formatting.
  3. How to remove duplicates.
    1. Using the “Remove Duplicates” Feature or the new Unique Function.
  4. How to prevent duplicates in the first place.
    1. They are 100% preventable.

 

 

Microsoft says this about duplicates.


When it comes to Duplicates in Excel, Microsoft has this to say:

“Sometimes duplicate data is useful, sometimes it just makes it harder to understand your data. Use conditional formatting to find and highlight duplicate data. That way you can review the duplicates and decide if you want to remove them, says Microsoft.

 

 


 

 

5 Ways to Remove Unwanted Duplicates in Excel 365

In this post we will go over the 5-ways to Spot and to Remove Duplicates in Microsoft Excel. With these, you can spot duplicates, and if desired, you can easily get rid of them.

Please note, most “duplicates cells” in Excel are okay, even expected. So here we explain all that you need to know about Excel duplicates.

  1. Manually, w/ formulas or strictly visually
  2. Advanced Filter
  3. VBA Code w/ Formulas
  4. Remove Duplicates Button
  5. Unique Function

 

Which Method Works Best:  What method(s) you use to spot, to reference, or to remove duplicates depends on the situation, there is not one acceptable method.  We hope this post helps you to understand what duplicates are, how you can easily spot them, how to leverage them, and of course, how to remove them, when needed.

 

If you have any questions on this, or if you need help programming your Excel files, our team of Excel Consultants are here to help.  Give us a call.

 

Toll-Free: 877.392.3539Irvine, California: 949.612.3366Manhattan, New York: 646.205.3261

 


 

Two Types of Duplicates in Microsoft Excel

There are two types of duplicates in Excel.  There are duplicate cells, and there are duplicate records.  They are not the same thing; not all duplicates are bad.

A “duplicate cell” is usually identified when you are looking at a column, for say sales regions.  If that region is listed more than once in that column, then you have duplicate regions in your data, which in and of itself is not a bad thing.

Duplicate Cells can be a good thing.  Duplicate Cells are usually not a problem. KPI Pivot Table data is based on duplicate cell values, i.e., criteria. (Product, Region, Division, Doctor, etc.). Learn to use them.

On the other hand, a “duplicate record” means that there are two or more rows with exactly the same data in each and every cell across the row (range/record).  An exact match of the record results in a “duplicate record“.

Duplicate Records are a Problem:  This is an example of when duplicates are an actual problem that you need to address.  You should not have the same exact row more than once in your data, as it will cause issues with accuracy.

Duplicate Records Cause Inaccurate Results:  If you have duplicate records, your XLookups will most likely not work, and your SumIfs will return overstated results. If you have duplicate records, you will want to remove them, and you will want to keep future records away.

 

It is common, even expected to have certain types of duplicates, such as Sales Region, but here we are talking about when it is not okay to have duplicates, and how to first spot them, and then to remove them.

 

 


 

Not all Duplicates are Bad or Unwanted 

But not all duplicates are bad.  Often Excel users will say they have duplicates, meaning that there are certain cells within the range that are repeated.  But they are typically not looking at the entire record, when they discuss having duplicates.

So you can have duplicate records, and you can have duplicate values, similar, but not exactly the same thing.  Duplicate Cells might be okay, Duplicate Records are not okay. You need to remove duplicate records using the 5 methods here.

 

Excel and Access, LLC company logo.Experts Note: It’s important to identify duplicates in your data because they can affect the accuracy of your calculations and analyses. So while having duplicates is not always bad, often it is expected, but when it is not, then you need to know how to find, and how to delete duplicates.

 

An Example of Good and Useful Duplicates in Excel

Duplicates in Excel can be very useful for your overall Excel file, how you use it, even how you program it.  Why do I say that?  The Unique Function is the perfect example.

By using this new Dynamic Array Spill Function against an Excel Sales Table, you can extract the unique list of regions, and you can use that in your file.

Example of Power:  The Unique Function is a great example of these new Spill Array Functions and just how useful they are.  In this simple example we are able to remove the duplicates from a table and then use that list to populate a Validation Control.

Without the Unique Function, this would not be as easy as it is, and as we have been saying, Excel should be easy to use.

=SORT(UNIQUE(tblEployeeData[Region])

 

 

Knowing what the New Spill Array Functions are is one thing, knowing how to best apply them, and how to best integrate them with the other Spill functions is another.

Every Excel expert should know these inside and out, if they do not, well then what kind of expert are they?

 

 

 


 

 

What Causes Unwanted Duplicates in Microsoft Excel

As we said earlier, there are basically two types of duplicates in Excel. The most common form of duplicates are duplicate cells/values.  These are often wanted/allowed.

The second type of duplicate are duplicate records. These are usually not wanted.  For your Excel files to work properly, you will need to remove unwanted duplicates.

Duplicates are caused by a variety of reasons, here we quickly go over them:

  1. Poor workbook design.
    1. Workbooks should be designed to prevent unwanted duplicates.
  2. User error.
    1. Simply not paying attention and appending a record set twice is a common reason for duplicate data.
  3. Importing flawed data containing duplicates
    1. Data from databases often has duplicate data in it.
  4. Desired/Acceptable duplicates.
    1. a sales region in your data is the perfect example of desired duplicate data.
      1. You need this sort of duplication in order to make sense of your data.
      2. Pivot Table KPI reports are based on duplicate values (Region, Product, Manager, etc.).
        1. These are the vary duplicates that Slicers are based on.

 

The key to not having duplicate records is in the prevention of them.  Do not allow them to enter your data.  Properly control what data is entered, and how it is entered.  This will eliminate this problem.

 

 

Unwanted Duplicates Cause Havoc in Excel

Unwanted duplicates often cause problems when working with data in Excel.  Duplicates and are the result of accidents, poor design or data entry errors.  Analyzing data accurately or performing certain operations, like sorting and filtering becomes difficult when there are duplicates.

Unwanted duplicates can cause serious issues when using Excel formulas such as the Xlookup.  if there are duplicate records the Xlookup will only return the first match and ignore the remaining records. As a result you will get inaccurate results.

Use the SumIfs Function against a table with repeated records and you will get very inaccurate results as all matching records will be summed.

The best way to avoid problems and ensure the Excel data is accurate and easy to work with is to prevent the creation of duplicates and not the remove the duplicates.

 

 


 

Example: Duplicate Records Result in Inaccurate Results

In the example (Image) below, there are three duplicate records.  Two of these records should not be there.  Running formulas against this table will cause result errors.

If you use an XLookup or the VLookup, you will get results, but only from the first matching record it finds. While in this case, it will still provide the correct results as these are duplicate records.

But if the “Pay” value for each of the three records were different, then the Lookup would have a 33% chance of returning the correct result.  We see this all the time.  Clients pay us to fix these issues for them.

If you use the SumIf or SumIfs Function against this table, you will get triple the pay for this employee, obviously inaccurate.

To Avoid Inaccurate Results:

  1. Prevent the creation of duplicate records in the first place.
    1. Control how data is entered.
  2. Use Conditional Formatting to spot duplicate records.
  3. If duplicates show up, use one of the 5 methods here to remove them.

 

First spot your duplicates, then remove them with the Remove Duplicates button, or apply the new Unique Function to get a duplicate free result set.


The SumIfs Function would return inaccurate results due to the duplicate records.

 

 

 


 

 

5 Ways of Removing Unwanted Duplicates

 

Five Ways to Remove Unwanted Duplicates in Excel

Here are five ways to remove unwanted duplicates in Excel. Which methods you use depends on your situation and skills. In order of difficulty to do.  Of course, manually is the hardest way to do this, and the Unique Function is the easiest and most powerful.

  • First, use Conditional Formatting to spot duplicates.
  • Manually remove duplicate records, using functions such as CountIf
  • Use Advanced Filter
  • Use VBA Code w/ formulas.
  • Use the “Remove Duplicates” button.
  • “Unique” Function

 

Step 1:  Apply Conditional Formatting to Spot Duplicates

The first step in removing duplicates is to identify them.  The “Conditional Formatting” feature will help you visually pinpoint the duplicate by highlighting them but doesn’t remove them.  However, by spotting them visually, you can then delete them.

 

Conditional Formatting does not remove duplicates, rather it highlights them.

 

This method does not require the use of VBA and is built into Excel.  It is easy to use but it increases file size to undesired limits when used against large datasets.  Use these steps to visually spot the duplicates in your workbook.

  1. Select the range of cells that you want to check for duplicates.
  2. Click on the “Home” tab in the ribbon at the top of the Excel window.
  3. Click on the “Conditional Formatting” button in the “Styles” group.
  4. Select “Highlight Cells Rules” and then “Duplicate Values” from the drop-down menus.
  5. In the “Duplicate Values” dialog box, select the formatting that you want to apply to the duplicates.
  6. Click the “OK” button to highlight the duplicates.

 

In the example below, Conditional Formatting is applied to column R.  These are duplicate cells, not duplicate records.

Duplicates cause issues when you try to extract meaningful data from flawed data sets.


The XLookup and the SumIfs would return incorrect results against this table containing duplicates.
Best to remove them using the Remove Duplicates feature in Excel.

 

 

 


 

 

Use Advanced Filter to Remove Duplicates – Old School

Use the Advanced filter when you want to set your own criteria for eliminating duplicates and generate a filtered list of unique values from a dataset. It will be harder to do than some tried and true methods, like “Remove Duplicates” and the Unique Function.  Of the 5 Ways to Remove Duplicates in Excel, this is not one of our favorites, and I had actually forgotten about this.

This method is more complex and does require familiarity with Excel Filtering. This be difficult for beginner Excel users. This is a 100% manual process. This is old school and we do not recommend this as your first approach.

Here is how to use the Advanced Auto Filter Method to Remove Duplicates:

  1. Copy the range of cells that you want to remove duplicates from to a new location in your worksheet.
  2. Click on the “Data” tab in the ribbon at the top of the Excel window.
  3. Click on the “Advanced” button in the “Sort & Filter” group.
  4. In the “Advanced Filter” dialog box, select “Copy to another location” and then select the range of cells that you want to copy the unique values to.
  5. Check the “Unique records only” checkbox.
  6. Click the “OK” button to copy the unique values to the new location.

 

Graphic shows using Advanced Filter to remove duplicates.


Advanced Filter is a tried-and-true way to remove duplicates from a data range.

 

Experts Advice:  As professional Excel consultants we do not recommend this approach.  It is old school, too manual, too time consuming, and there are far better ways to do this.

 

 


 

 

Use VBA Code w/ Formulas to Remove Duplicates

The most flexible & customizable option to remove duplicates is to use VBA code but advanced knowledge of VBA coding is required, which will make this a more difficult option for beginners. Using VBA code is useful when removing duplicates from large datasets or implementing customized criteria applied to the removal of the duplicates. For example, here is some VBA code that can remove duplicates from a large dataset.

 

Example:  Using VBA Code to remove duplicates:

Sub RemoveDuplicatesCode()

Dim rng As Range

‘ Set the range of cells to check for duplicate records

Set rng = Range(“SalesData”)

‘ Remove duplicates from the named range

rng.RemoveDuplicates Columns:=Array(SalesData), Header:=xlYes

End Sub

 

In this example, the range of cells to check for duplicates is set to cells in the “SalesData” named range. The `Remove Duplicates` method is then called on the range, specifying the columns to check for duplicates, and indicating that the range includes a header row.

The code can be modified to fit your needs. You can use advanced formulas in your code as needed.  The range of cells can be changed, the header row can be changed, or different columns can be specified to check for duplicates.

Another advantage to using VBA code is that code can be incorporated into a larger macro or a user-defined function, which is often the case.  Well, that was before the Unique Function was introduced.

 

 


 

Use the Remove Duplicates Button

As we said, there are 5 ways to remove unwanted duplicates in Excel.  One of the newer ways to remove duplicates within medium to small datasets is the “Remove Duplicates”. This easy-to-use built-in feature is great because it doesn’t require VBA code but you can’t customize the criteria for removing duplicates.

I will remove the entire row of duplicate values, depending on what is needed.

 

This is so easy to use; we love this! It takes seconds to do, no programming experience required.

 


What can be easier?

This only takes seconds, and you do not need to know Intermediate Excel programming skills to do this.

 

 

A great Way to Remove Duplicates is by using the “Remove Duplicates” Button:

  1. Select the range of cells that you want to remove duplicates from.
  2. Click on the “Data” tab in the ribbon at the top of the Excel window.
  3. Click on the “Remove Duplicates” button in the “Data Tools” group.
  4. In the “Remove Duplicates” dialog box, select the columns that you want to check for duplicates.
  5. Click the “OK” button to remove the duplicates.

 

 

Debra knows Excel. On her site she covers several of the 5 Ways to Remove Duplicates in Excel in this post. Great resource, check her site out.


Debra is a top rated Excel MVP.
Here is what Debra says about removing duplicates in Excel.

“With the Microsoft Excel Remove Duplicates feature, it’s easy to remove all the duplicate values from a list, based on the items in one or more columns.”

 

 


 

Use Validation Controls to Prevent Duplicates from being Entered

You can use Excel Data Validation Controls to prevent Duplicates from being entered, in the first place, without the use of VBA.

 

Image shows how to use Validation Controls to Prevent Duplicates from being entered.

Image shows how to use Validation Controls to Prevent Duplicates from being entered.

=COUNTIF($A$1:$C$20,A4)=1

 

 

 

 


 

The Dynamic Array Unique Spill Function

The newest and most powerful way to remove unwanted duplicates in MS Excel is to use the Dynamic Array Unique Spill Function. This function does not delete duplicates from the table, it does not delete anything but rather, produces a query like result without duplicates. The result set is based in memory, unlike a traditional Excel Table based on values.

If you would like, you can copy and paste special as values the result set, if you do not want a dynamic result.

The Unique Function extracts unique values from a selected range of cells and displays those values in a range.  It will return the unique values in the order they appear in the original range. It is fast and easy to use and does not require the use of the “Remove Duplicates” feature to get rid of the duplicates.

Using the “Unique Function” feature

  1. Select the cell where you want to display the unique values.
  2. Enter the formula =UNIQUE(range), where “range” is the range of cells that you want to extract unique values from.
  3. Press Enter to apply the formula.
  4. Look at the results.

 

Here is a visual example of how 5 Ways to Remove Duplicates in Excel.

 

 

Conclusion – 5 Ways to Remove Unwanted Duplicates in Excel

Which of these methods will work best depends on the user’s needs as well as their Excel skills set. Of the five methods mentioned here, in our eyes it comes down to just two methods.   One method works on the table itself, the other method works on an in-memory version result set.

The “Remove Duplicates” method will delete the duplicates from the table while the “Unique” function will give you a query result without deleting your data.

One of the nice things about the Unique Function method is that it preserves the original data set, providing a read-only subset of the original table.

The Remove Duplicates button is a powerful way to quickly remove duplicates, on the fly, without the need to know anything about Excel programming.  If you are new to Excel, and want to remove duplicates, this might be the method for you. We actively use this method, we love it.

 

Best Method to Remove Unwanted Duplicates:

Overall, using the Dynamic Array Unique Spill Function can be a powerful method for ensuring that the data is free of duplicates. It is quick, easy to do, and best of all, it is dynamic. This is our favorite method of removing duplicates from our data.

 

Experts Advice:  Before removing duplicates, you may want to make a backup of your data or create a new sheet before removing duplicates.  Sometimes things do not work out as you expected, and this will save you a lot of time and frustration if things do go wrong.  ALWAYS backup your Excel files before working in them.

 

 

 


 

Other Posts on how to use the new Excel Spill Array Functions for Business.

 

Microsoft Excel Spill Array Functions for Business – what to do with them.

1.Related Post:  Using ChooseCols Function in Excel Reports – Excel Reporting Just Got Easier.

2. Related Post:  Using VStack Function in Excel VStack Function to Consolidate Excel Data

3. Related Post:  Using Unique Function with Sort Function in Validation Controls – Users Love Drop Down Lists

4. Related Post:  Using Transpose Function in Excel – Transposing Excel Then and Now

5. Related Post:  Using Unique Function in Excel to remove duplicates – 5 Ways to Remove Duplicates in Excel

 

 

 

 


 

Contact us to discuss 5 Ways to Remove Duplicates in Excel

Would you like to discuss 5 Ways to Remove Duplicates in Excel?

If you need help with anything Excel, Access, SQL Server, or Azure, our team of Excel Consultants are here to provide both programming and training solutions.

Operators are standing by to take your call.     Get a Smarter Microsoft Solution.

 

Toll-Free: 877.392.3539Irvine, California: 949.612.3366Manhattan, New York: 646.205.3261