20230322 SQL
♥ 인공지능 수학 최고 중요!!
♡ 결과 데이터 행 수 검증하기!!
● 의미분석 오류(Semantic Error): 테이블에 없는 컬럼 이름을 select에 적은 경우 발생
[문제44] 2007년 입사한 사원들의 도시이름별 급여의 총액, 평균을 출력해주세요.
단, 부서 배치를 받지 않은 사원들의 정보도 출력해주세요.
1) 오라클 전용
SELECT l.city, sum(e.salary), round(avg(e.salary))
FROM employees e, departments d, locations l
WHERE hire_date >= to_date('2007/01/01', 'yyyy/mm/dd')
AND hire_date < to_date('2008/01/01', 'yyyy/mm/dd')
AND e.department_id = d.department_id(+)
AND d.location_id = l.location_id(+)
GROUP BY l.city;
2) ANSI 표준
SELECT l.city, sum(e.salary), round(avg(e.salary))
FROM employees e LEFT OUTER JOIN departments d
ON e.department_id = d.department_id
LEFT OUTER JOIN locations l
ON d.location_id = l.location_id => 둘 중 무엇을 먼저 돌려야?
WHERE hire_date >= to_date('2007/01/01', 'yyyy/mm/dd')
AND hire_date < to_date('2008/01/01', 'yyyy/mm/dd') => 해당되는 19개 데이터 먼저!
GROUP BY l.city;
=> 따라서, index를 hire_date쪽에 걸어서 먼저 처리되도록 할 것임!
SELECT l.city, sum(e.salary), round(avg(e.salary))
FROM employees e LEFT OUTER JOIN departments d
USING(department_id)
LEFT OUTER JOIN locations l
USING(location_id)
WHERE hire_date >= to_date('2007/01/01', 'yyyy/mm/dd')
AND hire_date < to_date('2008/01/01', 'yyyy/mm/dd')
GROUP BY l.city; => 내가 한 것
[문제45] 사원들의 last_name, salary, grade_level, department_name을 출력하는데 last_name에 a 문자가 2개 이상 포함되어 있는 사원들을 출력해주세요.
1) 오라클 전용
SELECT e.last_name, e.salary, j.grade_level, d.department_name
FROM employees e, job_grades j, departments d
WHERE e.department_id = d.department_id
AND e.salary BETWEEN j.lowest_sal AND j.highest_sal
AND instr(last_name, 'a', 1, 2) > 0; => 내가 한 것(괜춘)
SELECT *
FROM employees
WHERE instr(last_name, 'a', 1, 2) >= 2; => 10개 나옴
SELECT e.last_name, e.salary, j.grade_level, d.department_name
FROM employees e, job_grades j, departments d
WHERE instr(last_name, 'a', 1, 2) >= 2
AND e.salary BETWEEN j.lowest_sal AND j.highest_sal
AND e.department_id = d.department_id;
2) ANSI 표준
SELECT e.last_name, e.salary, j.grade_level, d.department_name
FROM employees e JOIN departments d
ON e.department_id = d.department_id
JOIN job_grades j
ON e.salary BETWEEN j.lowest_sal AND j.highest_sal
AND instr(last_name, 'a', 1, 2) > 0;
SELECT e.last_name, e.salary, j.grade_level, d.department_name
FROM employees e JOIN departments d
USING(department_id)
JOIN job_grades j
ON e.salary BETWEEN j.lowest_sal AND j.highest_sal
AND instr(last_name, 'a', 1, 2) > 0; => 내가 한 것, 가능한지?
[문제46] 담당 관리자보다(직속상관) 먼저 입사한 사원의 이름과 입사일 및 해당 관리자의 이름과 입사일을 출력해주세요. (SELF-JOIN)
1) 오라클 전용
SELECT m.last_name, m.hire_date, w.last_name, w.hire_date
FROM employees m, employees w, departments d
WHERE w.employee_id = d.manager_id
AND m.hire_date < w.hire_date; => 내가 한 것
SELECT w.employee_id, w.last_name, w.hire_date, w.manager_id
FROM employees w;
SELECT m.employee_id, m.last_name, m.hire_date
FROM employees m; => 이 두 개를 합침!
SELECT w.employee_id, w.last_name, w.hire_date, w.manager_id, m.employee_id, m.last_name, m.hire_date
FROM employees w, employees m
WHERE w.manager_id = m.employee_id
AND w.hire_date < m.hire_date;
2) ANSI 표준
SELECT m.last_name, m.hire_date, w.last_name, w.hire_date
FROM employees m JOIN employees w
ON m.hire_date < w.hire_date
JOIN departments d
ON w.employee_id = d.manager_id; => 내가 한 것
SELECT w.employee_id, w.last_name, w.hire_date, w.manager_id, m.employee_id, m.last_name, m.hire_date
FROM employees w JOIN employees m
ON w.manager_id = m.employee_id
WHERE w.hire_date < m.hire_date;
■ Subquery (서브쿼리)
1 중첩 서브쿼리
단일행 서브쿼리(단일 비교 연산자)
HAVING절 서브쿼리
다중 행 서브쿼리(IN, ANY, ALL)
2 OR, AND 진리표
3 상호관련 서브쿼리
자아분열(내 생각)
INLINE VIEW
(예시) 110번 사원보다 더 많은 급여를 받는 사원?
SELECT *
FROM employees
WHERE salary > 110번 사원의 급여;
1)
SELECT salary
FROM employees
WHERE employee_id = 110;
2)
SELECT *
FROM employees
WHERE salary > 8200;
1) + 2)
SELECT *
FROM employees
WHERE salary > (SELECT salary
FROM employees
WHERE employee_id = 110;)
■ 중첩 서브쿼리(Nested Subquery)
1. sub query(inner query) 먼저 수행 – 한 번만 수행 가능(결과 데이터가 한 개여야 함)
2. 1번에서 수행한 값을 가지고 main query(outer query) 수행
main query(outer query)
SELECT *
FROM employees
WHERE salary > (SELECT salary
FROM employees
WHERE employee_id = 110;)
-------------------------------
sub query(inner query)
SELECT *
FROM employees
WHERE salary > (SELECT salary
FROM employees
WHERE last_name = 'King');
=> 오류 나옴; 단일 비교 연산자를 이용할때는 단일값을 가지고 수행해야 함
“single-row subquery returns more than one row”
● 단일행 서브쿼리
[문제47] 110번 사원의 job_id와 동일한 사원들 중에 110번 사원의 급여보다 더 많이 받는 사원들의 정보를 추출하세요.
SELECT *
FROM employees
WHERE job_id = 110번 사원의 job_id
AND salary > 110번 사원의 salary
SELECT *
FROM employees
WHERE job_id = (SELECT job_id
FROM employees
WHERE employee_id = 110)
AND salary > (SELECT salary
FROM employees
WHERE employee_id = 110);
[문제48] 최고 급여를 받는 사원들의 정보를 출력해주세요.
SELECT *
FROM employees;
WHERE salary = 최고급여;
SELECT *
FROM employees
WHERE salary = (SELECT max(salary)
FROM employees);
◆ having절
● 40번 부서의 최소급여보다 큰 급여총액을 가진 부서번호와 급여총액
SELECT department_id, sum(salary)
FROM employees
GROUP BY department_id
HAVING sum(salary) > (40번 부서의 최소 급여)
SELECT department_id, sum(salary)
FROM employees
GROUP BY department_id
HAVING sum(salary) > (SELECT salary
FROM employees
WHERE department_id = 40); => 오류: 단일 연산자 단일행 필요
SELECT department_id, sum(salary)
FROM employees
GROUP BY department_id
HAVING sum(salary) > (SELECT min(salary)
FROM employees
WHERE department_id = 40);
● 50번 부서의 최고급여보다 큰 급여총액을 가진 부서번호와 급여총액
SELECT department_id, sum(salary)
FROM employees
GROUP BY department_id
HAVING sum(salary) > (SELECT max(salary)
FROM employees
WHERE department_id = 50);
● haivng절 이용 그룹함수 “not a single-group group function” 해결 서브쿼리
SELECT department_id, min(avg(salary))
FROM employees
GROUP BY department_id; => 오류
[문제49] 급여 최소 평균값을 가지고 있는 부서 번호의 평균을 출력해주세요.
SELECT department_id, avg(salary)
FROM employees
GROUP BY department_id
HAVING avg(salary) = (SELECT min(avg(salary))
FROM employees
GROUP BY department_id);
[문제50] 급여 최대 평균값을 가지고 있는 job_id, 평균을 출력해주세요.
SELECT job_id, round(avg(salary))
FROM employees
GROUP BY job_id
HAVING avg(salary) = (SELECT max(avg(salary))
FROM employees
GROUP BY job_id);
■ 다중 행 서브쿼리
■ IN
SELECT *
FROM employees
WHERE salary = (SELECT min(salary)
FROM employees
GROUP BY department_id);
=> 오류: 서브쿼리 결과값이 여러 개 이므로 단일행 비교 연산자 사용 불가
▽▽(해결)
SELECT *
FROM employees
WHERE salary IN (SELECT min(salary) => 각 목록과 일치되는 값 출력
FROM employees
GROUP BY department_id);
■ ANY
SELECT *
FROM employees
WHERE salary > (SELECT salary
FROM employees
WHERE job_id = 'IT_PROG');
=> 오류: 서브쿼리 결과값이 여러 개 이므로 단일행 비교 연산자 사용 불가
▽▽(해결 – max, min으로 제한 주기)
SELECT *
FROM employees
WHERE salary > (SELECT min(salary)
FROM employees
WHERE job_id = 'IT_PROG');
SELECT *
FROM employees
WHERE salary > (SELECT max(salary)
FROM employees
WHERE job_id = 'IT_PROG');
▽▽(해결 – ANY 사용)
SELECT *
FROM employees
WHERE salary > ANY(SELECT salary => “최소값보다 크다”와 같은 의미
FROM employees => OR의 범주를 가짐
WHERE job_id = 'IT_PROG');
( > ANY는 아래 과정을 내포함)
SELECT salary
FROM employees
WHERE job_id = 'IT_PROG';
SELECT *
FROM employees
WHERE salary > 9000
OR salary > 6000
OR salary > 4800
OR salary > 4800
OR salary > 4200;
SELECT *
FROM employees
WHERE salary < ANY(SELECT salary => “최대값보다 작다”와 같은 의미
FROM employees => OR의 범주를 가짐
WHERE job_id = 'IT_PROG');
( < ANY는 아래 과정을 내포함)
SELECT salary
FROM employees
WHERE job_id = 'IT_PROG';
SELECT *
FROM employees
WHERE salary < 9000
OR salary < 6000
OR salary < 4800
OR salary < 4800
OR salary < 4200;
▽▽(동일)
SELECT *
FROM employees
WHERE salary < (SELECT max(salary)
FROM employees
WHERE job_id = 'IT_PROG');
SELECT *
FROM employees
WHERE salary = ANY(SELECT salary => ‘= ANY’ : IN 의미
FROM employees
WHERE job_id = 'IT_PROG');
SELECT *
FROM employees
WHERE salary = 9000
OR salary = 6000
OR salary = 4800
OR salary = 4800
OR salary = 4200;
■ ALL
SELECT *
FROM employees
WHERE salary > ALL(SELECT salary => “최대값보다 크다”와 같은 의미
FROM employees
WHERE job_id = 'IT_PROG');
SELECT *
FROM employees
WHERE salary > 9000
AND salary > 6000
AND salary > 4800
AND salary > 4800
AND salary > 4200;
▽▽(동일)
SELECT *
FROM employees
WHERE salary > (SELECT max(salary)
FROM employees
WHERE job_id = 'IT_PROG');
SELECT *
FROM employees
WHERE salary < ALL(SELECT salary => “최소값보다 작다”와 같은 의미
FROM employees
WHERE job_id = 'IT_PROG');
SELECT *
FROM employees
WHERE salary < 9000
AND salary < 6000
AND salary < 4800
AND salary < 4800
AND salary < 4200;
▽▽(동일)
SELECT *
FROM employees
WHERE salary < (SELECT min(salary)
FROM employees
WHERE job_id = 'IT_PROG');
SELECT *
FROM employees
WHERE salary = ALL(SELECT salary => 공집합, 모든 액수를 만족하는 값 無
FROM employees
WHERE job_id = 'IT_PROG');
SELECT *
FROM employees
WHERE salary = 9000
AND salary = 6000
AND salary = 4800
AND salary = 4800
AND salary = 4200;
[문제51] 2006년도에 입사한 사원들의 job_id,와 동일한 사원들의 job_id별 급여의 총액 중에 50000 이상인 값만 출력해주세요.
(풀이 과정)
SELECT employee_id, hire_date, job_id
FROM employees
WHERE hire_date >= to_date('2006/01/01','yyyy/mm/dd')
AND hire_date < to_date('2007/01/01','yyyy/mm/dd');
SELECT job_id, sum(salary)
FROM employees
GROUP BY job_id;
SELECT job_id, sum(salary)
FROM employees
GROUP BY job_id
HAVING sum(salary)>=50000
(최종)
SELECT job_id, sum(salary)
FROM employees
WHERE job_id = ANY(SELECT job_id
FROM employees
WHERE hire_date >= to_date('2006/01/01','yyyy/mm/dd')
AND hire_date < to_date('2007/01/01','yyyy/mm/dd'))
GROUP BY job_id
HAVING sum(salary)>=50000;
(답안)
SELECT job_id, sum(salary)
FROM employees
WHERE job_id IN (SELECT job_id
FROM employees
WHERE hire_date >= to_date('2006/01/01','yyyy/mm/dd')
AND hire_date < to_date('2007/01/01','yyyy/mm/dd'))
GROUP BY job_id
HAVING sum(salary)>=50000;
[문제52] location_id가 1700인 모든 사원들의 last_name, department_id, job_id를 출력해주세요.
1) 조인
SELECT e.last_name, e.department_id, e.job_id
FROM employees e, departments d
WHERE d.location_id = 1700
AND e.department_id = d.department_id;
SELECT e.last_name, e.department_id, e.job_id
FROM employees e, departments d
WHERE e.department_id = d.department_id
AND d.location_id = 1700;
SELECT e.last_name, e.department_id, e.job_id
FROM employees e JOIN departments d
ON e.department_id = d.department_id
WHERE d.location_id = 1700;
SELECT e.last_name, e.department_id, e.job_id
FROM employees e JOIN departments d
USING(department_id)
WHERE d.location_id = 1700;
2) 서브쿼리 (실무 : 서브쿼리 더 선호하는 개발자 多, 오라클 : 조인절 선호)
SELECT last_name, department_id, job_id
FROM employees
WHERE department_id IN (SELECT department_id
FROM departments
WHERE location_id = 1700);
[문제53] 60번 부서 사원들의 급여보다 더 많은 급여를 받는 사원들의 정보를 출력해주세요.
SELECT *
FROM employees
WHERE salary > ALL(SELECT salary
FROM employees
WHERE department_id = 60);
SELECT *
FROM employees
WHERE salary > (SELECT max(salary)
FROM employees
WHERE department_id = 60);
[문제54] 관리자 사원들의 정보를 출력해주세요.
SELECT *
FROM employees e, departments d
WHERE e.employee_id = d.manager_id;
SELECT *
FROM employees
WHERE employee_id IN (SELECT manager_id
FROM departments);
SELECT *
FROM employees
WHERE employee_id IN (SELECT manager_id
FROM employees);
SELECT distinct manager_id
FROM employees
ORDER BY 1; => 18개 맞음
[문제55] 관리자가 아닌 사원들의 정보를 출력해주세요.
SELECT *
FROM employees
WHERE employee_id NOT IN (SELECT manager_id
FROM employees); => 안 나옴!!!!
■ OR 진리표(truth table)
★ TRUE OR TRUE : TRUE
★ TRUE OR FALSE : TRUE
★ TRUE OR NULL : TRUE
★ FALSE OR NULL : NULL (T 들어오면 T, F 들어오면 F 됨)
(전제 : 현재 null이지만 언젠가 데이터가 입력되어서 T/F될 수 있음)
SELECT *
FROM employees
WHERE employee_id = 1000
OR employee_id = null;
=> false or null로 null임
SELECT *
FROM employees
WHERE employee_id IN (null, 100, 101, 102);
SELECT *
FROM employees
WHERE employee_id = NULL
OR employee_id = 100
OR employee_id = 101
OR employee_id = 102;
=> 쿼리문 결과값 100-102 출력됨 (OR 진리표에서 true or null은 true이기 때문)
■ AND 진리표(truth table)
★ TRUE AND TRUE : TRUE
★ TRUE AND FALSE : FALSE
★ TRUE AND NULL : NULL
★ FALSE AND NULL : FALSE
(전제 : 현재 null이지만 언젠가 데이터가 입력되어서 T/F될 수 있음)
SELECT *
FROM employees
WHERE employee_id NOT IN (null, 100, 101, 102);
SELECT *
FROM employees
WHERE employee_id != NULL
AND employee_id != 100
AND employee_id != 101
AND employee_id != 102;
=> 결과값 출력되지 않음(AND 진리표에서 true and null은 null이기 때문)
(재)
SELECT *
FROM employees
WHERE employee_id NOT IN (SELECT manager_id
FROM employees); => 안 나옴!!!!
▽▽(해결)
SELECT *
FROM employees
WHERE employee_id NOT IN (SELECT manager_id
FROM employees
WHERE manager_id IS NOT NULL);
◎ NOT IN 연산자를 이용할때는 서브쿼리에 NULL 값이 있으면 조회가 안 됨
그래서 NULL값을 제외한 후 수행함
■ 상호관련 서브쿼리(Correlated Subquery) => 자아분열 같은 느낌(내 생각)
존재유무 파악시 매우 자주 사용함
<수행 방식>
1 메인쿼리(outer query)절 먼저 수행
2 첫 번째 행을 후보 행으로 잡고 후보 행 값을 서브쿼리절에 전달함
3 후보 행 값을 사용하여 서브쿼리 수행
4 서브쿼리 결과값을 후보 행 값과 비교하여 참이면 그 행을 결과 집합에 저장(메모리)
5 다음 행을 후보 행으로 잡고 후보 행 값을 서브쿼리절에 전달함
(2-4번 과정을 row 개수만큼 반복 수행)
[문제56] 자신의 부서 평균 급여보다 더 많이 받는 사원들의 정보를 출력해주세요.
(풀이 과정)
SELECT *
FROM employees
WHERE salary > (자신의 부서 평균 급여)
SELECT employee_id, salary, department_id
FROM employees
ORDER BY 3;
SELECT avg(salary)
FROM employees
WHERE department_id = 20; => 전체 사원과 비교
SELECT
FROM employees
WHERE salary > (SELECT avg(salary)
FROM employees
WHERE department_id = 자신의 부서 코드);
SELECT
FROM employees e
WHERE salary > (SELECT avg(salary)
FROM employees
WHERE department_id = e.department_id);
(미지수, 변수, 후보행 값)
=> 서브쿼리가 employees 테이블의 행의 수만큼 돌아감(순번 고정, 내용 T/F 비교)
● 상호관련 서브쿼리의 문제점
SELECT department_id, avg(salary)
FROM employees
GROUP BY department_id
ORDER BY 1;
=> 가상의 테이블을 만들어서 참조하면 효율적
■ INLINE VIEW
(과정)
SELECT *
FROM (SELECT department_id as dept_id, avg(salary) as avgsal
FROM employees
GROUP BY department_id) e1; => 가상 집합 만듦
SELECT *
FROM (SELECT department_id as dept_id, avg(salary) as avgsal
FROM employees
GROUP BY department_id) e1, employees e2
WHERE e1.dept_id = e2.department_id;
SELECT *
FROM (SELECT department_id as dept_id, avg(salary) as avgsal
FROM employees
GROUP BY department_id) e1, employees e2
WHERE e1.dept_id = e2.department_id
AND e2.salary > e1.avgsal;
(최종)
SELECT e2.*, round(e1.avgsal) 부서평균
FROM (SELECT department_id as dept_id, avg(salary) as avgsal
FROM employees
GROUP BY department_id) e1, employees e2
WHERE e1.dept_id = e2.department_id
AND e2.salary > e1.avgsal;