Does SELECT DISTINCT apply to one column or the whole row?

It applies to the entire set of selected columns. SELECT DISTINCT city, country returns unique (city, country) pairs, not unique cities. To keep one row per value of a single column, use PostgreSQL’s DISTINCT ON.

What is the difference between DISTINCT and GROUP BY?

Both return unique values. Use DISTINCT when you only need to remove duplicate rows, and GROUP BY when you also want to compute aggregates (COUNT, SUM, AVG) for each group. For plain de-duplication they are equivalent.

Does COUNT(DISTINCT column) count NULLs?

No. COUNT(DISTINCT column) counts only the distinct non-NULL values. If you need NULL to count as its own value you must handle it explicitly, for example with COALESCE.

Beginner

SELECT DISTINCT in SQL: Removing Duplicate Rows

DISTINCT collapses duplicate rows into one. Learn how it works across multiple columns, how it differs from GROUP BY, the PostgreSQL-only DISTINCT ON, and when it hides a real problem.

DISTINCT removes duplicate rows from a result set, returning only unique combinations. It is one of the first keywords beginners reach for — and one of the most over-used, because a duplicate is often a sign of a JOIN problem rather than something to paper over. Keep the SQL cheat sheet open as you read.

What DISTINCT Does

Place DISTINCT right after SELECT. The database compares the selected columns row by row and keeps only the first of each identical group.

-- Every city we ship to, once each
SELECT DISTINCT city
FROM users;

DISTINCT Across Columns

With more than one column, DISTINCT returns unique combinations. Two rows are duplicates only if every selected column matches.

-- Unique (city, country) pairs — not unique cities
SELECT DISTINCT city, country
FROM users;

COUNT(DISTINCT)

COUNT(DISTINCT column) counts how many unique values a column has, ignoring NULLs. This is the standard way to answer "how many different customers ordered?" without listing them.

-- How many distinct customers placed an order?
SELECT COUNT(DISTINCT user_id) AS customers
FROM orders;

DISTINCT vs GROUP BY

SELECT DISTINCT col and SELECT col ... GROUP BY col return the same unique values. The difference is intent: use DISTINCT when you only want unique rows, and GROUP BY when you also want to aggregate within each group.

-- These return the same list of statuses
SELECT DISTINCT status FROM orders;
SELECT status FROM orders GROUP BY status;

-- But GROUP BY can do more — count per status
SELECT status, COUNT(*) FROM orders GROUP BY status;

DISTINCT ON (PostgreSQL)

PostgreSQL’s DISTINCT ON keeps the first row for each value of the listed column(s). Pair it with ORDER BY to control which row "first" means — this is a clean way to get the latest order per customer.

-- The most recent order for each customer
SELECT DISTINCT ON (user_id)
  user_id, id, created_at
FROM orders
ORDER BY user_id, created_at DESC;

Common Mistakes & Tips

Expecting DISTINCT to apply to a single column — it always applies to the whole selected row
Using DISTINCT to hide duplicate rows from a JOIN instead of fixing the join
Forgetting COUNT(DISTINCT col) ignores NULLs — they are never counted
Reaching for DISTINCT when GROUP BY with an aggregate is what you actually need
Assuming DISTINCT is free — it requires a sort or hash and can be slow on large sets

DISTINCT is simple but easy to misuse. Practice on real data to learn when it is the right tool and when a JOIN or GROUP BY is the real fix.