[SQL] 쿼리테스트 3

Hyunjun Kim·2026년 2월 9일

SQL

목록 보기
93/98

Problem

Problem — Reporting across customers and orders

You maintain a simple e-commerce data model with two tables: customers and orders. Over time some customer records have been removed (for example, due to privacy requests), but historical order rows might still reference those now-deleted customer IDs.

Your manager needs a single report that contains all columns from both tables and must satisfy these two requirements:

  1. Every customer currently present in the customers table must appear in the output. If a customer never placed an order, the order-related columns for that row should be NULL.
  2. Every order in the orders table must appear in the output. If an order references a customer ID that no longer exists, the customer-related columns for that row should be NULL.

You first tried using an INNER JOIN, but it did not meet the requirements. Explain why INNER JOIN is insufficient and describe which JOIN type best satisfies the two requirements. If the target RDBMS does not support FULL OUTER JOIN (for example, MySQL), explain a practical alternative that produces the same result.

Example schemas (for illustration only):

-- customers
id INT PRIMARY KEY
first_name VARCHAR(50)
email VARCHAR(100)

-- orders
id INT PRIMARY KEY
customer_id INT       -- FK when customer exists
order_date DATE
total_amount INT

Model answer

Short answer

INNER JOIN returns only rows that match in both tables, so it excludes customers without orders and orders with missing customer records; the correct join type to include all rows from both sides is FULL OUTER JOIN. If the RDBMS doesn’t support FULL OUTER JOIN, emulate it using a combination of LEFT JOIN and RIGHT JOIN (or LEFT JOIN + UNION) to achieve the same effect.


Explanation

Why INNER JOIN fails

  • INNER JOIN returns only rows where the join condition matches on both sides.
  • Requirement (1) asks to include customers who have never placed an order. These customers have no matching orders rows, so they would be omitted by an INNER JOIN.
  • Requirement (2) asks to include orders whose referenced customer record has been deleted. Those orders have no matching customers row and would also be omitted by an INNER JOIN.
    Therefore, INNER JOIN does not satisfy either requirement.

Which JOIN to use and why

  • FULL OUTER JOIN returns all rows from both tables. For rows that do not have a match, the missing side’s columns are filled with NULL.

  • Using FULL OUTER JOIN satisfies both requirements:

    • customers without orders → customer columns populated, order columns NULL;
    • orphan orders (orders whose customer row is missing) → order columns populated, customer columns NULL.

SQL examples

1) If your RDBMS supports FULL OUTER JOIN (PostgreSQL, SQL Server, etc.)

SELECT
  c.id   AS customer_id,
  c.first_name,
  c.email,
  o.id   AS order_id,
  o.order_date,
  o.total_amount
FROM customers c
FULL OUTER JOIN orders o
  ON c.id = o.customer_id;

2) If your RDBMS does not support FULL OUTER JOIN (MySQL) — UNION approach

Use a LEFT JOIN to get all customers (including those without orders) and then add orders that have no matching customer.

-- Step 1: all customers with their orders (orders may be NULL)
SELECT
  c.id   AS customer_id,
  c.first_name,
  c.email,
  o.id   AS order_id,
  o.order_date,
  o.total_amount
FROM customers c
LEFT JOIN orders o
  ON c.id = o.customer_id

UNION ALL

-- Step 2: orders that do not match any customer (to include orphan orders)
SELECT
  NULL           AS customer_id,
  NULL           AS first_name,
  NULL           AS email,
  o.id           AS order_id,
  o.order_date,
  o.total_amount
FROM orders o
LEFT JOIN customers c
  ON o.customer_id = c.id
WHERE c.id IS NULL;

Notes about the MySQL pattern

  • The first query returns all customers and their orders (if any).
  • The second query finds orders with no matching customer and adds them.
  • UNION ALL is used for performance (no deduplication needed because we deliberately select non-overlapping rows in the second part).
  • If you prefer to preserve the original customer columns for orphan rows as NULL, you can select NULL placeholders as shown; if you want to show the orphan customer_id value from the order row, include o.customer_id in the customer_id column (but be careful: that ID does not refer to an existing customer row).
profile
Data Analytics Engineer 가 되

0개의 댓글