[ORACLE] 행렬 변환 SQL, PIVOT

privatekim·2024년 6월 19일

oracle sql

ORACLE

목록 보기

11/38

데이터를 출력할 때 세로로 출력하는 것이 아니라, 가로로 출력을 원할 때도 있다.

일반적으로 데이터는 세로로 출력되지만 집계함수와 조건함수를 사용하면 가로로도 출력할 수 있다. 단, 현재 단계에서는 집계하고자 하는 정보(department_id)에 대한 결과를 알고 있어야 한다.

각 부서에 해당하는 인원들에 대한 정보를 가로로 출력하시오.

일반적으로 세로로 출력한 결과는 다음과 같다.

select department_id, count(*) cnt
from hr.employees
group by department_id;

이를 단순히 집계함수와 조건함수를 사용하면 다음과 같이 표현할 수 있다.

select
    sum(decode(department_id,10,1)) as "10",
    sum(decode(department_id,20,1)) as "20",
    sum(decode(department_id,30,1)) as "30",
    sum(decode(department_id,40,1)) as "40",
    sum(decode(department_id,50,1)) as "50",
    sum(decode(department_id,60,1)) as "60",
    sum(decode(department_id,70,1)) as "70",
    count(decode(department_id,80,0)) as "80"   
from hr.employees;

이렇게 표현할 경우 hr.employees의 모든 데이터(107개)가 모든 deocde 문장(8개)을 거쳐 decode가 과하게 반복되는 단점이 있다.(107*8번)

이를 우리는 inner view를 사용하면 좀더 개선하여 표현할 수 있다.

select
    MAX(decode(department_id,10,cnt)) as "10" ,
    MAX(decode(department_id,20,cnt)) as "20" ,
    MAX(decode(department_id,30,cnt)) as "30" ,
    MAX(decode(department_id,40,cnt)) as "40" ,
    MAX(decode(department_id,50,cnt)) as "50" ,
    MAX(decode(department_id,60,cnt)) as "60" ,
    MAX(decode(department_id,70,cnt)) as "70" ,
    MAX(decode(department_id,80,cnt)) as "80" ,
    MAX(decode(department_id,90,cnt)) as "90" ,
    MAX(decode(department_id,100,cnt)) as "100",
    MAX(decode(department_id,NULL,cnt)) as "부서가 없는 사원",
    sum(cnt) as "전체사원수"
from(
select department_id, count(*) cnt
from hr.employees
group by department_id);

다음과 같은 부분에서 성능이 개선되었다.

innerView에서 계산된 결과를 바탕으로 select문을 진행하기 때문에 비교하는 데이터의 수가 크게 줄었다.(107개 -> 11개) 그 결과 총 계산하는 과정이 매우 단축되었다. (11*12)
사용된 group by함수의 결과를 한 줄로 표현하기 위해 집계함수인 MAX를 사용하였고 이미 count(*)한 결과인 cnt의 데이터를 받아오기 때문에 문제가 없다.

PIVOT

위와같은 기능을 지원하는 함수로는 PIVOT이 있다.
행(세로) 데이터를 열(가로)로 변환하는 기능을 수행한다.

PVIOT 문법

PIVOT( 집계함수() for 기준칼럼 IN (기준 칼럼의 values as 별칭))

pivot 안에는 반드시 그룹함수를 포함해야하며 별칭은 생략할 수 있다.

또한 문자형으로 표현할 경우 칼럼명에 ''가 포함된다는 단점, 여전히 상수값을 사용한다는 단점이 눈에 보인다...

각 부서별로 급여여의 분포를 알고 싶다.

select *
from (select salary, department_id
	from hr.employees e) -- pivot할 데이터 미리 추출
PIVOT(count(*) for department_id in (10,20,30,40,50,60,70,80,90,100,110,null as "부서가 없는 사원")
)
order by 1;
-- 칼럼이 2개가 생기면서
-- null데이터에 대한 corsstable을 진행할 경우 문제가 발생하는 것을 확인
-- 부서가 없는 사원의 데이터가 누락이 된다. 따라서 nvl 함수 사용하여 극복한다.

UNPIVOT

PIVOT과 반대로 열 데이터를 행 데이터로 변환하는 함수이다.

UNPIVOT 문법

UNPIVOT (기준1, 기준2 in (기준 2의 칼럼명));

UNPIVOT 예시

pviot을 통해 행렬변환을 진행한 결과를 다음과 같이 UNPIVOT을 이용해 변환시킬 수 있다.

select *
from (select*
from    (select to_char(hire_date, 'day') as hire_day, count(*) as cnt
    from hr.employees
    group by to_char(hire_date, 'day'))
PIVOT( sum(cnt) for hire_day in ( '월요일' 월,'화요일' 화,'수요일' 수,'목요일' 목,'금요일' 금,'토요일' 토,'일요일' 일))
)
UNPIVOT(인원수 for 요일 in (월,화,수,목,금,토,일));