Duplicate rows are one of the most common data quality problems in spreadsheets. They inflate totals, skew averages and counts, and carry those errors into every report built on the data. This guide covers several approaches, from Excel's built-in button to a more powerful browser-based tool that lets you inspect duplicates before removing them.
Excel's Built-In 'Remove Duplicates' — and Its Limitations
Excel has a built-in Remove Duplicates feature (Data → Data Tools → Remove Duplicates). It works well for simple cases, but has significant gaps:
- It deletes duplicates immediately — you can't review what will be removed before confirming
- It always keeps the first occurrence and deletes the rest, reporting only how many rows were removed, not which ones
- You can choose which columns to compare, but it never shows whether the removed rows differed in the other columns
- There is no way to count how many duplicates exist for each unique value
- Large files can be slow to process
Always back up your file before using Excel's Remove Duplicates: you can undo it with Ctrl+Z during the same session, but not once you save and close the workbook.
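To make the keep-first behavior concrete, here is a minimal Python sketch that splits rows into those a keep-first pass would keep and those it would delete, so the deleted rows can be reviewed first. The function name `split_keep_first` and the sample rows are hypothetical, chosen only for illustration; this mimics the behavior described above and is not Excel's actual implementation.

```python
def split_keep_first(rows, key_columns):
    """Split rows into (kept, removed) using keep-first semantics.

    rows: list of dicts, one per spreadsheet row.
    key_columns: column names that define what counts as a duplicate.
    """
    seen = set()
    kept, removed = [], []
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        if key in seen:
            removed.append(row)  # a later occurrence: this is what gets deleted
        else:
            seen.add(key)
            kept.append(row)     # the first occurrence always survives
    return kept, removed


# Hypothetical sample data for illustration.
rows = [
    {"id": "1", "email": "a@example.com"},
    {"id": "2", "email": "b@example.com"},
    {"id": "1", "email": "a@example.com"},
]
kept, removed = split_keep_first(rows, ["id", "email"])
```

Inspecting `removed` before writing `kept` back to a file gives the review step that the built-in button skips.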
When You Need More Control: Group and Count
Often, the question is not just 'are there duplicates?' but 'how many times does each value appear, and are the duplicates truly identical or do they differ in some columns?' The Deduplicate tool handles this with a group-and-count approach:
- Upload your CSV or Excel file to the Deduplicate tool
- Select which columns to use as the 'duplicate key' — rows matching on these columns are treated as duplicates
- The tool groups rows and shows a count for each unique combination
- Review the grouped results — you can see exactly which rows are duplicated and how many times
- Download the deduplicated output (one row per unique group) as CSV
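The group-and-count steps above can be sketched in a few lines of Python using only the standard library. This illustrates the general technique, not the Deduplicate tool's actual code: the function name `group_and_count`, the sample data, and the extra `identical` column (which flags whether every row in a group matches on all columns) are assumptions made for this example.

```python
import csv
import io
from collections import defaultdict


def group_and_count(rows, key_columns):
    """Group rows by the key columns and return one summary row per group."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        groups[key].append(row)

    summary = []
    for members in groups.values():
        first = members[0]
        summary.append({
            **first,                     # representative row: the first occurrence
            "count": len(members),       # how many rows share this key
            "identical": all(m == first for m in members),
        })
    return summary


# Hypothetical sample file; in practice this would be open("data.csv", newline="").
sample = io.StringIO(
    "id,name,city\n"
    "1,Ana,Lisbon\n"
    "2,Ben,Oslo\n"
    "1,Ana,Porto\n"
    "2,Ben,Oslo\n"
)
rows = list(csv.DictReader(sample))
summary = group_and_count(rows, ["id"])

# Write the deduplicated output (one row per unique group) as CSV.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(summary[0].keys()))
writer.writeheader()
writer.writerows(summary)
deduplicated_csv = out.getvalue()
```

Here the two rows with id 1 differ in `city`, so the group is reported with a count of 2 and `identical` set to `False`, which is the kind of case worth reviewing by hand before anything is discarded.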
Choosing Your Deduplication Strategy
| Scenario | Recommended Approach |
|---|---|
| Simple exact duplicates across all columns | Excel Remove Duplicates or the Deduplicate tool |
| Duplicates based on a specific ID column only | Deduplicate tool — select only the ID column as the key |
| Need to count occurrences before removing | Deduplicate tool — shows count column in output |
| Complex deduplication with business rules | Power Query or SQL |
| Very large files (1M+ rows) | SQL, Python (pandas), or Power Query |
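As an example of the "business rules" row, the sketch below uses SQL through Python's built-in `sqlite3` module to keep only the most recently updated row per ID. The table name `customers`, its columns, the sample rows, and the rule itself are all hypothetical; a real rule would depend on your data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id TEXT, email TEXT, updated TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("1", "old@example.com", "2023-01-05"),
        ("1", "new@example.com", "2024-03-10"),
        ("2", "b@example.com", "2023-07-01"),
    ],
)

# Assumed business rule: for each id, keep the row with the latest
# 'updated' value. ISO dates (YYYY-MM-DD) sort correctly as text.
latest = conn.execute(
    """
    SELECT id, email, updated
    FROM customers AS c
    WHERE updated = (
        SELECT MAX(updated) FROM customers AS u WHERE u.id = c.id
    )
    ORDER BY id
    """
).fetchall()
conn.close()
```

If two rows for the same ID share the same latest date, this query returns both, so a tie-breaking rule would still be needed in that case.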
Finding Duplicates Without Removing Them
Sometimes you want to flag duplicates for review rather than delete them. In Excel, you can do this with conditional formatting: Home → Conditional Formatting → Highlight Cells Rules → Duplicate Values. This highlights cells in a column that appear more than once, but does not remove anything.
The COUNTIF formula is useful for duplicate detection: =COUNTIF($A:$A, A2)>1 returns TRUE for any row where the value in column A appears more than once. Add this as a helper column to filter duplicates manually.
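For data that lives outside Excel, the same helper-column idea can be sketched in Python. The function name `flag_duplicates` and the sample values are hypothetical. COUNTIF matches text without regard to case, so the sketch ignores case by default to give comparable results; pass `ignore_case=False` for exact matching.

```python
from collections import Counter


def flag_duplicates(values, ignore_case=True):
    """Return one boolean per value: True if the value appears more than once."""
    normalized = [v.casefold() if ignore_case else v for v in values]
    counts = Counter(normalized)
    return [counts[v] > 1 for v in normalized]


# Hypothetical column A values.
column_a = [
    "alice@example.com",
    "bob@example.com",
    "Alice@example.com",
    "carol@example.com",
]
flags = flag_duplicates(column_a)
```

Nothing is removed here: `flags` lines up row by row with the input, so it can be written out as an extra column and filtered later.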