Duplicate rows are one of the most common data quality problems in spreadsheets. They inflate totals, skew analysis, and carry errors into every report built on the data. This guide covers several approaches, from Excel's built-in button to a browser-based tool that lets you inspect duplicates before removing them.

Excel's Built-In 'Remove Duplicates' — and Its Limitations

Excel has a built-in Remove Duplicates feature (Data → Data Tools → Remove Duplicates). It works well for simple cases, but has significant gaps:

  • It deletes duplicates immediately — you can't review what will be removed before confirming
  • It always keeps the first occurrence and silently deletes the rest
  • You can choose which columns to compare, but you cannot see how the removed rows differed in the remaining columns
  • There is no way to count how many duplicates exist for each unique value
  • Large files can be slow to process

Always back up your file before using Excel's Remove Duplicates — there is no undo once you save and close.

When You Need More Control: Group and Count

Often, the question is not just 'are there duplicates?' but 'how many times does each value appear, and are the duplicates truly identical or do they differ in some columns?' The Deduplicate tool handles this with a group-and-count approach:

  1. Upload your CSV or Excel file to the Deduplicate tool
  2. Select which columns to use as the 'duplicate key' — rows matching on these columns are treated as duplicates
  3. The tool groups rows and shows a count for each unique combination
  4. Review the grouped results — you can see exactly which rows are duplicated and how many times
  5. Download the deduplicated output (one row per unique group) as CSV
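
The group-and-count idea in the steps above can be sketched in a few lines of Python. This is only an illustration of the technique, not the Deduplicate tool's actual implementation. The sample data, the column names, and the `group_and_count` helper are all hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data standing in for an uploaded CSV file.
SAMPLE_CSV = """customer_id,email,city
101,ana@example.com,Lisbon
102,ben@example.com,Leeds
101,ana@example.com,Porto
103,cy@example.com,Lyon
102,ben@example.com,Leeds
"""


def group_and_count(rows, key_columns):
    """Group rows by the values in key_columns.

    Returns (counts, first_rows): how many rows share each key, and the
    first row seen for each key.
    """
    counts = Counter()
    first_rows = {}
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        counts[key] += 1
        # Keep the first row seen as the representative of its group.
        first_rows.setdefault(key, row)
    return counts, first_rows


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
counts, first_rows = group_and_count(rows, ["customer_id", "email"])

# One output row per unique key, plus a count column for review.
deduplicated = [{**first_rows[key], "count": counts[key]} for key in counts]
for row in deduplicated:
    print(row)
```

In this sample, customer 101 appears twice with different cities. A count above 1 shows that rows matched on the key, but not whether they were fully identical, which is why reviewing the groups before downloading matters.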

Choosing Your Deduplication Strategy

Scenario | Recommended Approach
Simple exact duplicates across all columns | Excel Remove Duplicates or the Deduplicate tool
Duplicates based on a specific ID column only | Deduplicate tool: select only the ID column as the key
Need to count occurrences before removing | Deduplicate tool: shows a count column in the output
Complex deduplication with business rules | Power Query or SQL
Very large files (1M+ rows) | SQL, Python (pandas), or Power Query
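
As an illustration of what a "business rule" can look like, the sketch below keeps only the most recently updated row for each ID instead of the first occurrence. The records, the field names, and the `keep_latest` helper are assumptions made for this example. The same rule could be written in SQL or Power Query.

```python
from datetime import date

# Hypothetical records; in practice these would come from a file or database.
records = [
    {"customer_id": 1, "updated_on": date(2024, 1, 5), "plan": "basic"},
    {"customer_id": 2, "updated_on": date(2024, 2, 1), "plan": "pro"},
    {"customer_id": 1, "updated_on": date(2024, 3, 9), "plan": "pro"},
]


def keep_latest(rows, key, order_by):
    """Keep one row per key value: the one with the largest order_by value."""
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        # Replace the stored row only when this one is strictly newer.
        if current is None or row[order_by] > current[order_by]:
            latest[row[key]] = row
    return list(latest.values())


result = keep_latest(records, "customer_id", "updated_on")
for row in result:
    print(row)
```

Because the comparison is strict, ties on the date keep the earlier row. Whether that is the right choice depends on the rule you need.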

Finding Duplicates Without Removing Them

Sometimes you want to flag duplicates for review rather than delete them. In Excel, you can do this with conditional formatting: Home → Conditional Formatting → Highlight Cells Rules → Duplicate Values. This highlights every cell in the selected range whose value appears more than once, but does not remove anything.

The COUNTIF formula is useful for duplicate detection: =COUNTIF($A:$A, A2)>1 returns TRUE for any row where the value in column A appears more than once. Add this as a helper column to filter duplicates manually.
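
For readers working outside Excel, the same flagging logic can be sketched in Python: count every value once, then mark each row whose value occurs more than once. The sample values are made up for illustration.

```python
from collections import Counter

# Hypothetical values from column A, in row order.
column_a = ["A-100", "B-200", "A-100", "C-300", "B-200", "D-400"]

# Count each value once, then flag every row whose value repeats.
# This mirrors =COUNTIF($A:$A, A2)>1 applied down a helper column.
occurrences = Counter(column_a)
is_duplicate = [occurrences[value] > 1 for value in column_a]

for value, flag in zip(column_a, is_duplicate):
    print(value, flag)
```

As with the COUNTIF helper column, every occurrence of a repeated value is flagged, including the first one, and no rows are removed.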

Try it free

Upload a CSV or Excel file, group by any columns, and download clean deduplicated data instantly.
