A common problem when working with lists is the occurrence of duplicates within the list. In business its commonly required to “clean up” lists, so that there’s exactly one of each entry within the lists or to say it another way, that the values in the list are unique. (Database people might call this value a Unique or Primary Key). If the list has 1000’s of items, this is a very tedious and time-consuming process. Here’s one method to do this fast in Excel. The method will work for other spreadsheet types as well, but the exact formula may be different.
Step 1. Use Excels Sort function to sort the list (a-z or smallest to largest) based on the column you want to be unique.
Step 2. Create a new column to compare the first and second value from your list. Just use an Excel “equals” formula to do this: Cell1=Cell2
Step 2b. Copy this formula down the entire column. (Tip: Double click the tiny solid square in the bottom right corner of a cell to quickly copy down a column)
Step 3. Select the entire second column and Copy the True/False results and do Paste as Values in the same place. (Use your PasteValues shortcut if you are a worksheet ninja!).
Step 4. Now sort the entire list again, this time using the True/False column as the sort order (a-z or Smallest to Largest). When done, all the values with False next to them should be at the top of the list. These are the unique values.
Step 5. Delete the Values at the bottom of the list, starting from the first True entry. Now only unique values are remaining in the list. You can also delete the True/False column, which is no longer needed.
What could go wrong?
Assuming you have done this process and still see duplicates in your list there are a few common reasons why this might not have worked as expected.
- The list could contain values that look the same but aren’t. For example, there might be blank space, or other invisible characters after a name. Solution: Examine the fields closely in the formula editor. Use Find & Replace, or TRIM() to clean up your input list.
- Identical items in the list could be differently formatted, for example as number or text. Excel wont recognise these as equal. Solution: Check the format in your input column (text is left-justified by default, numbers are right-justified). Make sure that the input list is uniformly formatted. I recommend to completely convert to text, unless there is a reason not to.
- You might have resorted the list with some of the equals formulas still in place. Solution: Make sure you copy-paste values on the entire second column before re-sorting.