특정 범위(윈도우) 내에서 행 단위 연산 수행
행을 그룹으로 묶지 않고 개별 행 유지한 채 계산 수행
일반 집계 함수와의 차이
| 구분 | 일반 집계 함수 | 윈도우 함수 |
|---|---|---|
| 처리 단위 | 그룹 단위 | 행 단위 |
| 결과 | 그룹당 1행 | 각 행별 결과 유지 |
| 대표 예시 | SUM(), AVG() | SUM() OVER(), RANK(), LAG() |
SELECT column,
SUM(value) OVER (
PARTITION BY category
ORDER BY date
) AS running_sum
FROM table;
구성 요소
PARTITION BY: 분석 범위(그룹) 지정ORDER BY: 계산 순서 지정ROWS BETWEEN: 윈도우 프레임(범위) 제어| 함수 | 설명 |
|---|---|
ROW_NUMBER() | 각 행에 고유한 순번 부여 |
RANK() | 동일 값은 같은 순위, 다음 순위 건너뜀 |
DENSE_RANK() | 동일 값은 같은 순위, 다음 순위 연속됨 |
NTILE(n) | 파티션을 n개 구간으로 분할 |
SELECT amount,
ROW_NUMBER() OVER(ORDER BY amount),
RANK() OVER(ORDER BY amount),
DENSE_RANK() OVER(ORDER BY amount),
NTILE(10) OVER(ORDER BY amount)
FROM payment;
SUM(), AVG(), COUNT(), MIN(), MAX()
GROUP BY와 달리 각 행별 결과 유지SELECT customer_id, amount,
SUM(amount) OVER (PARTITION BY customer_id ORDER BY payment_date) AS running_total,
AVG(amount) OVER (PARTITION BY customer_id) AS avg_amount,
MIN(amount) OVER (PARTITION BY customer_id) AS min_amount,
MAX(amount) OVER (PARTITION BY customer_id) AS max_amount,
COUNT(*) OVER (PARTITION BY customer_id) AS payment_count
FROM payment;
| 함수 | 설명 |
|---|---|
LAG(col, offset, default) | 이전 행 값 참조 |
LEAD(col, offset, default) | 다음 행 값 참조 |
FIRST_VALUE(col) | 첫 번째 값 반환 |
LAST_VALUE(col) | 마지막 값 반환 |
NTH_VALUE(col, n) | n번째 값 반환 |
SELECT customer_id, payment_date, amount,
LAG(amount) OVER (PARTITION BY customer_id ORDER BY payment_date) AS prev_amount,
LEAD(amount) OVER (PARTITION BY customer_id ORDER BY payment_date) AS next_amount,
FIRST_VALUE(amount) OVER (PARTITION BY customer_id ORDER BY payment_date) AS first_amount,
LAST_VALUE(amount) OVER (
PARTITION BY customer_id ORDER BY payment_date
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS last_amount,
NTH_VALUE(amount, 2) OVER (PARTITION BY customer_id ORDER BY payment_date) AS second_amount
FROM payment
WHERE customer_id IN (1, 2);
| 함수 | 설명 |
|---|---|
CUME_DIST() | 누적 분포값 (0~1) |
PERCENT_RANK() | 백분위 순위 (0~1) |
PERCENTILE_CONT(p) | 연속형 백분위수 |
PERCENTILE_DISC(p) | 실제 값 중 가장 가까운 백분위수 |
SELECT customer_id, amount,
CUME_DIST() OVER (PARTITION BY customer_id ORDER BY amount) AS cume_dist,
PERCENT_RANK() OVER (PARTITION BY customer_id ORDER BY amount) AS percent_rank
FROM payment;
SELECT customer_id,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY amount) AS median_cont,
PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY amount) AS median_disc
FROM payment
GROUP BY customer_id;
| 구문 | 의미 |
|---|---|
ROWS BETWEEN n PRECEDING AND CURRENT ROW | 현재 행 포함 n행 이전까지 |
ROWS BETWEEN CURRENT ROW AND n FOLLOWING | 현재 행부터 n행 이후까지 |
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING | 전체 범위 |
일반 집계 vs 윈도우 함수 비교
-- 일반 집계
SELECT customer_id, SUM(amount)
FROM payment
GROUP BY customer_id;
-- 윈도우 함수
SELECT customer_id, amount,
SUM(amount) OVER (PARTITION BY customer_id ORDER BY payment_date) AS running_total
FROM payment;
➡️ OVER() 절 사용 시, 행 단위 결과 유지
SELECT
customer_id, payment_date, amount,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY payment_date) AS row_num,
SUM(amount) OVER (PARTITION BY customer_id ORDER BY payment_date) AS running_total,
LAG(amount, 1, 0) OVER (PARTITION BY customer_id ORDER BY payment_date) AS prev_amount,
LEAD(amount, 1, 0) OVER (PARTITION BY customer_id ORDER BY payment_date) AS next_amount,
FIRST_VALUE(amount) OVER (PARTITION BY customer_id ORDER BY payment_date) AS first_amount,
RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS amount_rank
FROM payment
ORDER BY customer_id, payment_date;
분석 항목
ROW_NUMBER)SUM)LAG, LEAD)FIRST_VALUE)RANK)SELECT
customer_id, payment_month, total_amount,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY payment_month) AS row_num,
LAG(total_amount) OVER (PARTITION BY customer_id) AS prev_amount,
LEAD(total_amount) OVER (PARTITION BY customer_id) AS next_amount,
total_amount - LAG(total_amount) OVER (PARTITION BY customer_id) AS growth_amount
FROM (
SELECT
customer_id,
DATE_TRUNC('month', payment_date) AS payment_month,
SUM(amount) AS total_amount
FROM payment
GROUP BY customer_id, DATE_TRUNC('month', payment_date)
) a
ORDER BY customer_id, payment_month;
분석 항목
| 구분 | 주요 함수 | 목적 |
|---|---|---|
| 순위 관련 | ROW_NUMBER, RANK, DENSE_RANK, NTILE | 순서 및 그룹화 |
| 집계 관련 | SUM, AVG, COUNT, MIN, MAX | 누적·평균·집계 분석 |
| 값 참조 | LAG, LEAD, FIRST_VALUE, LAST_VALUE, NTH_VALUE | 이전·다음·특정 행 참조 |
| 통계 | CUME_DIST, PERCENT_RANK, PERCENTILE_CONT, PERCENTILE_DISC | 백분위 및 분포 계산 |