윈도우 함수

김규원·2025년 10월 25일

DB

목록 보기
4/22
post-thumbnail

윈도우 함수란

특정 범위(윈도우) 내에서 행 단위 연산 수행
행을 그룹으로 묶지 않고 개별 행 유지한 채 계산 수행

일반 집계 함수와의 차이

구분일반 집계 함수윈도우 함수
처리 단위그룹 단위행 단위
결과그룹당 1행각 행별 결과 유지
대표 예시SUM(), AVG()SUM() OVER(), RANK(), LAG()

기본 문법

SELECT column,
       SUM(value) OVER (
         PARTITION BY category
         ORDER BY date
       ) AS running_sum
FROM table;

구성 요소

  • PARTITION BY: 분석 범위(그룹) 지정
  • ORDER BY: 계산 순서 지정
  • ROWS BETWEEN: 윈도우 프레임(범위) 제어

윈도우 함수의 중요성

  • 개별 행 정보를 유지하며 분석 가능
  • SQL 레벨에서 누적, 비교, 순위, 변화량 계산 가능
  • BI 및 데이터 분석 핵심 기능

윈도우 함수 종류

순위 관련 함수 (Ranking Functions)

함수설명
ROW_NUMBER()각 행에 고유한 순번 부여
RANK()동일 값은 같은 순위, 다음 순위 건너뜀
DENSE_RANK()동일 값은 같은 순위, 다음 순위 연속됨
NTILE(n)파티션을 n개 구간으로 분할
SELECT amount,
       ROW_NUMBER() OVER(ORDER BY amount),
       RANK() OVER(ORDER BY amount),
       DENSE_RANK() OVER(ORDER BY amount),
       NTILE(10) OVER(ORDER BY amount)
FROM payment;

집계 관련 함수 (Aggregate as Window)

SUM(), AVG(), COUNT(), MIN(), MAX()

  • GROUP BY와 달리 각 행별 결과 유지
  • 누적 합, 평균, 최소·최대값, 개수 등 계산 가능
SELECT customer_id, amount,
       SUM(amount) OVER (PARTITION BY customer_id ORDER BY payment_date) AS running_total,
       AVG(amount) OVER (PARTITION BY customer_id) AS avg_amount,
       MIN(amount) OVER (PARTITION BY customer_id) AS min_amount,
       MAX(amount) OVER (PARTITION BY customer_id) AS max_amount,
       COUNT(*) OVER (PARTITION BY customer_id) AS payment_count
FROM payment;

값 참조 함수 (Value Functions)

함수설명
LAG(col, offset, default)이전 행 값 참조
LEAD(col, offset, default)다음 행 값 참조
FIRST_VALUE(col)첫 번째 값 반환
LAST_VALUE(col)마지막 값 반환
NTH_VALUE(col, n)n번째 값 반환
SELECT customer_id, payment_date, amount,
       LAG(amount) OVER (PARTITION BY customer_id ORDER BY payment_date) AS prev_amount,
       LEAD(amount) OVER (PARTITION BY customer_id ORDER BY payment_date) AS next_amount,
       FIRST_VALUE(amount) OVER (PARTITION BY customer_id ORDER BY payment_date) AS first_amount,
       LAST_VALUE(amount) OVER (
         PARTITION BY customer_id ORDER BY payment_date
         ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS last_amount,
       NTH_VALUE(amount, 2) OVER (PARTITION BY customer_id ORDER BY payment_date) AS second_amount
FROM payment
WHERE customer_id IN (1, 2);

통계 함수 (Distribution Functions)

함수설명
CUME_DIST()누적 분포값 (0~1)
PERCENT_RANK()백분위 순위 (0~1)
PERCENTILE_CONT(p)연속형 백분위수
PERCENTILE_DISC(p)실제 값 중 가장 가까운 백분위수
SELECT customer_id, amount,
       CUME_DIST() OVER (PARTITION BY customer_id ORDER BY amount) AS cume_dist,
       PERCENT_RANK() OVER (PARTITION BY customer_id ORDER BY amount) AS percent_rank
FROM payment;

SELECT customer_id,
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY amount) AS median_cont,
       PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY amount) AS median_disc
FROM payment
GROUP BY customer_id;

윈도우 프레임 제어 함수

구문의미
ROWS BETWEEN n PRECEDING AND CURRENT ROW현재 행 포함 n행 이전까지
ROWS BETWEEN CURRENT ROW AND n FOLLOWING현재 행부터 n행 이후까지
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING전체 범위

OVER() 절 이해

일반 집계 vs 윈도우 함수 비교

-- 일반 집계
SELECT customer_id, SUM(amount)
FROM payment
GROUP BY customer_id;

-- 윈도우 함수
SELECT customer_id, amount,
       SUM(amount) OVER (PARTITION BY customer_id ORDER BY payment_date) AS running_total
FROM payment;

➡️ OVER() 절 사용 시, 행 단위 결과 유지


복합 예제 ① 고객 결제 히스토리 분석

SELECT 
  customer_id, payment_date, amount,
  ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY payment_date) AS row_num,
  SUM(amount) OVER (PARTITION BY customer_id ORDER BY payment_date) AS running_total,
  LAG(amount, 1, 0) OVER (PARTITION BY customer_id ORDER BY payment_date) AS prev_amount,
  LEAD(amount, 1, 0) OVER (PARTITION BY customer_id ORDER BY payment_date) AS next_amount,
  FIRST_VALUE(amount) OVER (PARTITION BY customer_id ORDER BY payment_date) AS first_amount,
  RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS amount_rank
FROM payment
ORDER BY customer_id, payment_date;

분석 항목

  • 결제 순번 (ROW_NUMBER)
  • 누적 결제액 (SUM)
  • 이전·다음 결제액 (LAG, LEAD)
  • 첫 결제액 (FIRST_VALUE)
  • 결제 순위 (RANK)

복합 예제 ② 월별 렌탈비 증감 분석

SELECT 
  customer_id, payment_month, total_amount,
  ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY payment_month) AS row_num,
  LAG(total_amount) OVER (PARTITION BY customer_id) AS prev_amount,
  LEAD(total_amount) OVER (PARTITION BY customer_id) AS next_amount,
  total_amount - LAG(total_amount) OVER (PARTITION BY customer_id) AS growth_amount
FROM (
  SELECT 
    customer_id,
    DATE_TRUNC('month', payment_date) AS payment_month,
    SUM(amount) AS total_amount
  FROM payment
  GROUP BY customer_id, DATE_TRUNC('month', payment_date)
) a
ORDER BY customer_id, payment_month;

분석 항목

  • 월별 렌탈비 추이
  • 이전·다음 달 대비 증감 금액
  • 고객별 월별 성장 분석

정리

구분주요 함수목적
순위 관련ROW_NUMBER, RANK, DENSE_RANK, NTILE순서 및 그룹화
집계 관련SUM, AVG, COUNT, MIN, MAX누적·평균·집계 분석
값 참조LAG, LEAD, FIRST_VALUE, LAST_VALUE, NTH_VALUE이전·다음·특정 행 참조
통계CUME_DIST, PERCENT_RANK, PERCENTILE_CONT, PERCENTILE_DISC백분위 및 분포 계산
profile
행복한 하루 보내세요

0개의 댓글