SQL 8일차

한희수·2023년 3월 26일
0

빅데이터 분석 SQL

목록 보기
8/17

20230323 SQL

파이팅!!

[문제57] Executive 부서이름에 소속된 모든 사원에 대한 employee_id, last_name, job_id 조인을 이용해서 출력해주세요.
1)
SELECT *
FROM departments
WHERE department_name = 'Executive';

2)
SELECT *
FROM employees
WHERE department_id = 90;

1)+2)
SELECT e.employee_id, e.last_name, e.job_id, d.department_name
FROM employees e JOIN departments d
ON e.department_id = d.department_id
WHERE d.department_name = 'Executive';

SELECT e.employee_id, e.last_name, e.job_id, d.department_name
FROM employees e JOIN departments d
USING(department_id)
WHERE d.department_name = 'Executive';

[문제58] Executive 부서이름에 소속된 모든 사원에 대한 employee_id, last_name, job_id 서브쿼리를 이용해서 출력해주세요.

SELECT e.employee_id, e.last_name, e.job_id
FROM (SELECT *
FROM departments
WHERE department_name = 'Executive') dept, employees e

WHERE dept.department_id = e.department_id;

SELECT employee_id, last_name, job_id
FROM employees
WHERE department_id = (SELECT department_id
FROM departments
WHERE department_name = 'Executive');

■ 중첩 서브쿼리(Nested Subquery)
1. inner query(sub query)를 먼저 수행(상수값으로 키핑되어 있기 때문)
2. inner query에서 수행한 값을 가지고 outer query(main query)를 수행

(예시) 관리자 사원 정보 출력
SELECT *
FROM employees
WHERE employee_id IN (SELECT manager_id
FROM departments);

SELECT m.*
FROM employees w, employees m
WHERE w.manager_id = m.employee_id; => 관리자 11개 데이터 정보가 106개 행 나옴
M쪽 1쪽 (조인은 중복 데이터 생략 안 함)

SELECT m.*
FROM (SELECT distinct manager_id => 1쪽 집합 만들기 위한 inline view
FROM employees) w, employees m 내부적으로 sort 발생(부하 심해짐)
WHERE w.manager_id = m.employee_id; (결과 나오지만 이렇게 하면 안 됨)

■ 상호관련 서브쿼리(Correlated Subquery)
1. outer query(main query)를 먼저 수행
2. 첫 번째 행을 후보 행으로 잡고 후보 행 값을 inner query(subquery)에 전달
3. 후보 행 값을 이용하여 inner query 수행
4. inner query 결과 값 사용하여 후보 행과 비교; 참(T)일 경우 별도 메모리에 후보 행 정보 입력, 거짓(F)일 경우 버림
5. 다음 행을 후보 행으로 잡고 2-4번 반복 수행

(예시) 자신의 부서 평균 급여보다 많이 받는 사원
SELECT *
FROM employees e
WHERE salary > (SELECT avg(salary)
                      FROM employees
                      WHERE department_id = e.department_id);

▽▽▽ (INLINE VIEW 사용)

SELECT e2.*
FROM (SELECT department_id dept_id, avg(salary) avgsal
          FROM employees
          GROUP BY department_id) e1, employees e2
WHERE e1.dept_id = e2.department_id
AND e2.salary > e1.avgsal;

■ EXISTS 연산자 ★★★★★

  • 상호관련 서브쿼리절에서 사용하는 연산자
  • 후보 행 값이 서브쿼리에 존재하는지 여부를 찾는 연산자
  • 후보 행 값이 서브쿼리에 존재하면 TRUE, 우리가 찾는 데이터로 별도 메모리에 저장
    (검색 종료!)
  • 후보 행 값이 서브쿼리에 존재하지 않으면 FALSE, 우리가 찾는 데이터 아님
    (찾을 때까지 끝까지 검색!)

(예시)
SELECT *
FROM employees
WHERE employee_id IN (SELECT manager_id
                                FROM employees);

▽▽▽ 

hint : /*+NO_UNREST*/

SELECT *
FROM employees
WHERE employee_id IN (SELECT /*+NO_UNREST*/ manager_id
                                FROM employees);

▽▽▽ 

SELECT *
FROM employees e
WHERE EXISTS(SELECT 'x'
                     FROM employees
                     WHERE manager_id = e.employee_id);

■ NOT EXISTS 연산자 ★★★★★

  • 후보행 값이 서브쿼리에 존재하지 않는 데이터를 찾는 연산자
  • 후보행 값이 서브쿼리에 존재하지 않으면 TRUE, 우리가 찾는 데이터로 별도 메모리에 저장
  • 후보행 값이 서브쿼리에 존재하면 FALSE, 우리가 찾는 데이터 아니므로 검색 종료

(예시)
SELECT *
FROM employees
WHERE employee_id NOT IN (SELECT manager_id
                                       FROM employees
                                       WHERE manager_id IS NOT NULL);

▽▽▽ 

SELECT *
FROM employees e
WHERE NOT EXISTS(SELECT 'x'
                           FROM employees
                           WHERE manager_id = e.employee_id);

[문제59] 소속사원이 있는 부서정보를 출력하세요.

SELECT *
FROM employees e
WHERE EXISTS(SELECT 'X'
                    FROM employees
                    WHERE department_id = e.department_id);

SELECT *
FROM departments
WHERE department_id IN (SELECT department_id
                                   FROM employees);

SELECT *
FROM departments o
WHERE EXISTS(SELECT 'X'
                     FROM employees
                     WHERE department_id = o.department_id);

[문제60] 소속사원이 없는 부서정보를 출력하세요.

SELECT *
FROM employees e
WHERE NOT EXISTS(SELECT 'X'
                           FROM employees
                           WHERE department_id = e.department_id);

SELECT *
FROM departments
WHERE department_id NOT IN (SELECT department_id
                                          FROM employees
                                          WHERE department_id IS NOT NULL);

SELECT *
FROM departments o
WHERE NOT EXISTS(SELECT 'X'
                           FROM employees
                           WHERE department_id = o.department_id);

[문제61] 사원들의 급여 등급에 포함된 등급정보를 출력해주세요.

SELECT *
FROM job_grades
WHERE grade_level IN (SELECT j.grade_level
                              FROM employees e, job_grades j
                              WHERE e.salary BETWEEN j.lowest_sal AND j.highest_sal);

SELECT *
FROM job_grades j
WHERE EXISTS (SELECT 'X'
                     FROM employees
                     WHERE salary BETWEEN j.lowest_sal AND j.highest_sal);

[문제62] 사원들의 급여 등급에 포함되지 않은 등급정보를 출력해주세요.

SELECT *
FROM job_grades
WHERE grade_level NOT IN (SELECT j.grade_level
                     FROM employees e, job_grades j
                     WHERE e.salary BETWEEN j.lowest_sal AND j.highest_sal
AND j.grade_level IS NOT NULL);

SELECT *
FROM job_grades j
WHERE NOT EXISTS (SELECT 'X'
                     FROM employees
                     WHERE salary BETWEEN j.lowest_sal AND j.highest_sal);

[문제63] 사원 수가 가장 많은 부서이름, 도시, 인원수를 출력해주세요.

(과정)
SELECT department_id, count(*)
FROM employees
GROUP BY department_id;

SELECT max(count(*))
FROM employees
GROUP BY department_id;

(최종)
SELECT d.department_name, l.city, count()
FROM departments d, locations l, employees e
WHERE e.department_id = d.department_id
AND d.location_id = l.location_id
GROUP BY d.department_name, l.city
HAVING count(*) = (SELECT max(count(
))
                     FROM employees
                     GROUP BY department_id);

=> 문제가 있음(한 줄의 결과 데이터 출력을 위해 조인 다 하고 그룹핑도 함)

● 조인의 일 양을 줄이는 작업으로 성능을 개선해보자!(INLINE VIEW 이용)

(과정)
SELECT department_id, count(*)
FROM employees
GROUP BY department_id;

(최종)
SELECT d.department_name, l.city, e.cnt
FROM(SELECT department_id, count() cnt
FROM employees
GROUP BY department_id
HAVING count(
) = (SELECT max(count(*))
FROM employees
GROUP BY department_id)) e, departments d, locations l
WHERE e.department_id = d.department_id
AND d.location_id = l.location_id;

[문제64] 부서별로 인원수를 출력해주세요.

102030...부서가 없는 사원
122...1

SELECT department_id, count(*)
FROM employees
GROUP BY department_id
ORDER BY 1;

SELECT count(*) as TOTAL,
COUNT(case when department_id = '10' THEN '01' END) as "10",
COUNT(case when department_id = '20' THEN '01' END) as "20",
COUNT(case when department_id = '30' THEN '01' END) as "30",
COUNT(case when department_id = '40' THEN '01' END) as "40",
COUNT(case when department_id = '50' THEN '01' END) as "50",
COUNT(case when department_id = '60' THEN '01' END) as "60",
COUNT(case when department_id = '70' THEN '01' END) as "70",
COUNT(case when department_id = '80' THEN '01' END) as "80",
COUNT(case when department_id = '90' THEN '01' END) as "90",
COUNT(case when department_id = '100' THEN '01' END) as "100",
COUNT(case when department_id = '110' THEN '01' END) as "110",
COUNT(case when department_id IS NULL THEN '01' END) as "부서가 없는 사원"
FROM employees;

SELECT
count(decode(department_id, 10, 'x')) "10",
count(decode(department_id, 20, 'x')) "20",
count(decode(department_id, 30, 'x')) "30",
count(decode(department_id, 40, 'x')) "40",
count(decode(department_id, 50, 'x')) "50",
count(decode(department_id, 60, 'x')) "60",
count(decode(department_id, 70, 'x')) "70",
count(decode(department_id, 80, 'x')) "80",
count(decode(department_id, 90, 'x')) "90",
count(decode(department_id, 100, 'x')) "100",
count(decode(department_id, 110, 'x')) "110",
count(decode(department_id, null, 'x')) "부서가 없는 사원"
FROM employees;

=> count 한 번마다 107번 돌려야 한다는 것, 성능문제가 위 query 문장의 문제점!

▽▽▽(성능 좀 개선)

SELECT
max(decode(department_id,10,cnt))"10", => 아무 그룹함수나 쓰면 됨
max(decode(department_id,20,cnt))"20", count는 1 나와서 안 됨
max(decode(department_id,30,cnt))"30",
max(decode(department_id,40,cnt))"40",
max(decode(department_id,50,cnt))"50",
max(decode(department_id,60,cnt))"60",
max(decode(department_id,70,cnt))"70",
max(decode(department_id,80,cnt))"80",
max(decode(department_id,90,cnt))"90",
max(decode(department_id,100,cnt))"100",
max(decode(department_id,110,cnt))"110",
max(decode(department_id,null,cnt))"부서가 없는 사원"
FROM (SELECT department_id, count(*) cnt
FROM employees
GROUP BY department_id);

■ PIVOT 함수

  • 행(세로) 방향을 열(가로) 방향으로 변경하는 함수
  • PIVOT 함수 안에는 그룹함수가 들어있어야 함

SELECT *
FROM (SELECT department_id
          FROM employees)
PIVOT(COUNT(*) FOR department_id IN (10,20,30,40,50,60,70,80,90,100,110,null as "부서가 없는 사원"));

(예시) 다음 쿼리문을 가로 방향으로 변경하기

SELECT department_id, sum(salary)
FROM employees
GROUP BY department_id;

▽▽▽

SELECT *
FROM (SELECT department_id, salary
           FROM employees)
PIVOT(SUM(salary) FOR department_id IN (10,20,30,40,50,60,70,80,90,100,110,NULL));

▽▽▽(동일함)

SELECT *
FROM (SELECT department_id, sum(salary) sumsal
           FROM employees
           GROUP BY department_id)
PIVOT(sumsal FOR department_id IN (10,20,30,40,50,60,70,80,90,100,110,NULL));

=> 그룹함수 부분: max, min 등 값에 영향을 주지 않는(count 제외한) 그룹함수 넣으면 됨

[문제65] 연도별 입사한 인원수를 가로방향으로 출력해주세요.

(과정)
SELECT to_char(hire_date, 'yyyy') YEARS, count(*)
FROM employees
GROUP BY to_char(hire_date, 'yyyy');

(최종)
SELECT *
FROM (SELECT hire_date
         FROM employees)
PIVOT(count(*) FOR to_char(hire_date, 'yyyy') IN (2003,2005,2001,2008,2007,2002,2004,2006));

▽▽▽

1)
SELECT
FROM (SELECT to_char(hire_date, 'yyyy') years
FROM employees)
PIVOT(count(
) FOR years IN (2001,2002,2003,2004,2005,2006,2007,2008));

SELECT
FROM (SELECT to_char(hire_date, 'yyyy') YEARS, count(
) cnt
FROM employees
GROUP BY to_char(hire_date, 'yyyy'))
PIVOT(max(cnt) FOR to_char(hire_date, 'yyyy') IN (2001,2002,2003,2004,2005,2006,2007,2008));

▽▽▽

SELECT
FROM (SELECT to_char(hire_date, 'yyyy') YEARS, count(
) cnt
FROM employees
GROUP BY to_char(hire_date, 'yyyy'))
PIVOT(max(cnt) FOR YEARS IN (2001,2002,2003,2004,2005,2006,2007,2008));
=> 이 부분 작따 붙이면 컬럼에 보임, as 붙이기!
‘2001’ as “2001” ???? 안되는데

2)
SELECT count(*) 총인원수 FROM employees;

1)+2)
SELECT *
FROM (SELECT to_char(hire_date, 'yyyy') YEARS, count(*) cnt
           FROM employees
           GROUP BY to_char(hire_date, 'yyyy'))
PIVOT(max(cnt) FOR YEARS IN (2001,2002,2003,2004,2005,2006,2007,2008)),
(SELECT count(*) 총인원수 FROM employees);

1)+2)+α
SELECT *
FROM (SELECT to_char(hire_date, 'yyyy') YEARS, count(*) cnt
           FROM employees
           GROUP BY to_char(hire_date, 'yyyy'))
PIVOT(max(cnt) FOR YEARS IN (2001,2002,2003,2004,2005,2006,2007,2008)),
(SELECT count(*) 총인원수 FROM employees),
(SELECT sum(salary) 총급여 FROM employees);

1)+2)+α+α
SELECT
FROM (SELECT to_char(hire_date, 'yyyy') YEARS, count(
) cnt
           FROM employees
           GROUP BY to_char(hire_date, 'yyyy'))
PIVOT(max(cnt) FOR YEARS IN (2001,2002,2003,2004,2005,2006,2007,2008)),
(SELECT count(*) 총인원수 FROM employees),
(SELECT sum(salary) 총급여 FROM employees),
(SELECT to_char(sum(salary), 'l999,999') 총급여 FROM employees);

[문제66] 요일별 입사 인원수를 가로 방향으로 출력해주세요.
SELECT *
FROM (SELECT to_char(hire_date, 'mm') Month, count(*) cnt
           FROM employees
           GROUP BY to_char(hire_date, 'mm'))
PIVOT(max(cnt) FOR Month IN ('01'as"01",'02','03','04','05','06','07','08','09',10,11,12));
=> 월별 입사 인원수를 출력했음ㅋㅋㅜ

SELECT *
FROM (SELECT to_char(hire_date, 'dy') day, count(*) cnt
           FROM employees
           GROUP BY to_char(hire_date, 'dy'))
PIVOT(max(cnt) FOR day IN ('월'as"월",'화'as"화",'수'as"수",'목'as"목",'금'as"금",'토'as"토",'일'as"일"));
SELECT
FROM (SELECT to_char(hire_date, 'dy') day
           FROM employees)
PIVOT(count(
) FOR day IN ('월'as"월",'화'as"화",'수'as"수",'목'as"목",'금'as"금",'토'as"토",'일'as"일"))

■ UNPIVOT 함수

  • 열(가로) 방향을 행(세로) 방향으로 변경하는 함수

SELECT *
FROM (SELECT *
FROM (SELECT to_char(hire_date, 'dy') day
FROM employees)
PIVOT(count(*) FOR day IN ('월'as"월",'화'as"화",'수'as"수",'목'as"목",'금'as"금",'토'as"토",'일'as"일")))
UNPIVOT(인원수 FOR 요일 IN (월,화,수,목,금,토,일));

=> 컬럼 이름에 작따 있으면 후에 가공 어려우므로 특수문자 사용 자제!

SELECT 요일||'요일' 요일, 인원수
FROM (SELECT
           FROM (SELECT to_char(hire_date, 'dy') day
                     FROM employees)
PIVOT(count(
) FOR day IN ('월'as"월",'화'as"화",'수'as"수",'목'as"목",'금'as"금",'토'as"토",'일'as"일")))
UNPIVOT(인원수 FOR 요일 IN (월,화,수,목,금,토,일));

[문제67] 연도, 분기별 급여의 총액을 구하세요. (교차표 만들기)
년도 1분기 2분기 ...
2001
2002
...

(풀이 과정)
SELECT to_char(hire_date, 'YYYY'), sum(salary)
FROM employees
GROUP BY to_char(hire_date, 'YYYY');

SELECT to_char(hire_date, 'q"분기"'), sum(salary)
FROM employees
GROUP BY to_char(hire_date, 'q"분기"');

SELECT *
FROM (SELECT to_char(hire_date, 'q"분기"') 분기, sum(salary) sumsal
FROM employees
GROUP BY to_char(hire_date, 'q"분기"'))
PIVOT(max(sumsal) FOR 분기 IN ('1분기'as"1분기",'2분기','3분기','4분기')),
(SELECT to_char(hire_date, 'YYYY') 연도
FROM employees
GROUP BY to_char(hire_date, 'YYYY'));

(최종 답안; 옛 방식)
SELECT to_char(hire_date, 'yyyy'), to_char(hire_date, 'q'), sum(salary)
FROM employees
GROUP BY to_char(hire_date, 'yyyy'), to_char(hire_date, 'q')
ORDER BY 1;

▽▽▽

SELECT to_char(hire_date, 'yyyy') 연도,
sum(decode(to_char(hire_date,'q'),1,salary))"1분기",
sum(decode(to_char(hire_date,'q'),2,salary))"2분기",
sum(decode(to_char(hire_date,'q'),3,salary))"3분기",
sum(decode(to_char(hire_date,'q'),4,salary))"4분기"
FROM employees
GROUP BY to_char(hire_date, 'yyyy')
ORDER BY 1;

(최종 답안; 피벗 함수)
SELECT to_char(hire_date, 'yyyy'), to_char(hire_date, 'q'), salary
FROM employees;

SELECT *
FROM (SELECT to_char(hire_date, 'yyyy') 연도, to_char(hire_date, 'q') 분기, salary
           FROM employees)
PIVOT(sum(salary) FOR 분기 IN (1"1분기",2"2분기",3"3분기",4"4분기"))
ORDER BY 1;

SELECT *
FROM (SELECT to_char(hire_date, 'yyyy') 연도, to_char(hire_date, 'q') 분기, sum(salary) 총급여
           FROM employees
           GROUP BY to_char(hire_date, 'yyyy'), to_char(hire_date, 'q'))
PIVOT(max(총급여) FOR 분기 IN (1"1분기",2"2분기",3"3분기",4"4분기"))
ORDER BY 1;

0개의 댓글