Intermediate

SQL Self Join Explained with Real Examples

A self join is just a table joined to itself. It is how you compare rows within one table — employees to managers, prices to previous prices, finding duplicates and pairs.

A self join sounds exotic but it is ordinary: you join a table to itself. There is no special SELF JOIN keyword — you write a normal JOIN where both sides are the same table, distinguished by aliases. It is the standard way to compare rows within a single table. Keep the SQL cheat sheet open as you read.

What Is a Self Join?

Whenever a table references itself — an employee row pointing at a manager row, a category pointing at a parent category — the relationship lives inside one table. To follow it, you join the table to a second copy of itself and match the foreign key to the primary key.

Aliases Make It Work

Because both sides are the same table, you must give each one a distinct alias so the database (and you) can tell them apart. Every column reference then specifies which copy it comes from.

-- Two aliases for the same table
SELECT e.name AS employee, m.name AS manager
FROM employees e
JOIN employees m ON e.manager_id = m.id;

A table joined to itself — Two aliases turn one table into two logical copies, so each employee row can be matched to its manager row in the same table.

Employee–Manager Hierarchies

The classic self join pairs each employee with their manager. Use an INNER JOIN to list only employees who have a manager, or a LEFT JOIN to also include the top of the chain (whose manager_id is NULL).

-- Include the CEO, whose manager_id is NULL
SELECT e.name AS employee,
       COALESCE(m.name, '— top level —') AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id
ORDER BY manager, employee;

Finding Pairs

Self joins generate pairs of related rows — two users in the same city, two products at the same price. The trick is the join condition: match on the shared attribute, and use an inequality on the id to avoid pairing a row with itself and to keep each pair only once.

-- Pairs of users in the same city, each pair once
SELECT a.name AS user_a, b.name AS user_b, a.city
FROM users a
JOIN users b ON a.city = b.city
            AND a.id < b.id;

Detecting Duplicates

A self join can surface duplicate data — two rows that should be unique but share an email or phone. Match on the duplicated column and require different ids.

-- Accounts sharing an email address
SELECT a.id, b.id, a.email
FROM users a
JOIN users b ON a.email = b.email
            AND a.id < b.id;

Common Mistakes & Tips

  • Forgetting aliases — a self join is ambiguous and invalid without them
  • Using a.id <> b.id instead of a.id < b.id, which returns every pair twice
  • Using INNER JOIN when you meant to keep the top of a hierarchy — use LEFT JOIN for that
  • Expecting one self join to traverse a whole tree — it only follows a single level
  • Joining on a non-indexed column on large tables, which can be slow

Once the alias trick clicks, self joins feel natural. Practice them on real tables to build hierarchies and find pairs with confidence.