SQL-5

Eddie·2025년 5월 14일
0
post-thumbnail

📘 SQL 정리 - 5주차: 예외값 처리부터 Pivot, Window Function까지

🎯 학습 키워드

  • Subquery, Join 복습
  • Null 처리 및 이상치 정제
  • SQL로 Pivot Table 만들기
  • Window Function (RANK, SUM)
  • 날짜 포맷 가공

🧠 나만의 정리

✅ Subquery & JOIN 복습

  • Subquery: 쿼리 안에 쿼리를 넣어 결과를 재활용
  • JOIN: 테이블 간 관계 결합 (LEFT JOIN, INNER JOIN 등)
SELECT *
FROM table1 a
LEFT JOIN table2 b ON a.id = b.id;

✅ Null 처리 & 이상치 정제

1. 값이 없을 때 (NULL)

  • avg(if(rating<>'Not given', rating, null))‘Not given’ 제거
  • coalesce(age, 20) 으로 null → 20으로 대체
SELECT name, coalesce(age, 20) AS filled_age
FROM customers;

2. 이상한 값이 있을 때 (ex. 2세, 1970년)

  • CASE WHEN age<15 THEN 15 WHEN age>80 THEN 80 ELSE age END 로 정제

🔄 Pivot Table 만들기

1. 음식점별 시간대별 주문건수

SELECT restaurant_name,
       MAX(IF(hh='15', cnt_order, 0)) AS "15",
       ...
FROM (
    SELECT a.restaurant_name, SUBSTRING(b.time, 1, 2) AS hh, COUNT(*) AS cnt_order
    FROM food_orders a
    INNER JOIN payments b ON a.order_id=b.order_id
    WHERE SUBSTRING(b.time, 1, 2) BETWEEN 15 AND 20
    GROUP BY 1, 2
) a
GROUP BY 1
ORDER BY "20" DESC;

2. 성별/연령별 주문건수

SELECT age_group,
       MAX(IF(gender='male', order_count, 0)) AS male,
       MAX(IF(gender='female', order_count, 0)) AS female
FROM (
    SELECT gender,
           CASE 
             WHEN age BETWEEN 10 AND 19 THEN 10
             ...
           END AS age_group,
           COUNT(*) AS order_count
    FROM food_orders a
    INNER JOIN customers b ON a.customer_id = b.customer_id
    WHERE age BETWEEN 10 AND 59
    GROUP BY 1, 2
) t
GROUP BY 1
ORDER BY age_group;

🪄 Window Function (RANK, SUM)

1. 음식 타입별 상위 3개 음식점

SELECT cuisine_type, restaurant_name, order_count,
       RANK() OVER (PARTITION BY cuisine_type ORDER BY order_count DESC) AS rn
FROM (
    SELECT cuisine_type, restaurant_name, COUNT(*) AS order_count
    FROM food_orders
    GROUP BY 1, 2
) a
WHERE rn <= 3
ORDER BY cuisine_type, rn;

2. 비율 & 누적합

SELECT cuisine_type, restaurant_name, order_count,
       SUM(order_count) OVER (PARTITION BY cuisine_type) AS total,
       SUM(order_count) OVER (PARTITION BY cuisine_type ORDER BY order_count) AS cum_sum
FROM (
    SELECT cuisine_type, restaurant_name, COUNT(*) AS order_count
    FROM food_orders
    GROUP BY 1, 2
) a;

📅 날짜 포맷 변환

1. 날짜 데이터 가공

SELECT DATE(date) AS date_type,
       DATE_FORMAT(DATE(date), '%Y') AS "년",
       DATE_FORMAT(DATE(date), '%m') AS "월",
       DATE_FORMAT(DATE(date), '%d') AS "일",
       DATE_FORMAT(DATE(date), '%w') AS "요일"
FROM payments;

2. 3월 주문건수만 보기

SELECT DATE_FORMAT(DATE(date), '%Y') AS y,
       DATE_FORMAT(DATE(date), '%m') AS m,
       COUNT(*) AS order_count
FROM food_orders a
INNER JOIN payments b ON a.order_id = b.order_id
WHERE DATE_FORMAT(DATE(date), '%m') = '03'
GROUP BY 1, 2
ORDER BY 1;

⚠️ 문제점과 에러

  • NULL, 이상값이 섞여 있으면 AVG, JOIN 결과가 왜곡됨
  • Pivot Table은 직접 가공이 필요한 경우가 많아 조건문 연습 필수

🔍 새롭게 알게 된 점

  • SQL에서도 Pivot, RANK, 누적합이 가능하다는 점
  • COALESCE, IF, CASE, DATE_FORMAT이 실무에서 매우 유용함

📌 다음 학습 계획

  • Window Function 더 연습해보기 (ROW_NUMBER, LEAD, LAG)
  • 날짜 데이터 기준의 월별, 주별, 요일별 트렌드 분석 연습

📌 숙제


select cuisine_type,
max(if(age=10, order_count, 0)) "10대",
max(if(age=20, order_count, 0)) "20대",
max(if(age=30, order_count, 0)) "30대",
max(if(age=40, order_count, 0)) "40대",
max(if(age=50, order_count, 0)) "50대"
from
(
select a.cuisine_type,
case when age between 10 and 19 then 10
when age between 20 and 29 then 20
when age between 30 and 39 then 30
when age between 40 and 49 then 40
when age between 50 and 59 then 50 end age,
count(1) order_count
from food_orders a inner join customers b on a.customer_id=b.customer_id
where age between 10 and 59
group by 1, 2
) t
group by 1

profile
잘 부탁드립니다

0개의 댓글