해커랭크 SQL 공부 정리 1

hi_life·2023년 1월 16일

Solve SQL

위 사이트는 해커랭크에서 SQL 문제를 풀어볼 수 있는 사이트다.
문제를 풀면서 조금 헷갈리거나 어려웠던 문제나 기억하고 싶은 내용 위주로 정리를 하려고 한다.

Basic Select

Weather Observation Station 3

Query a list of CITY names from STATION for cities that have an even ID number. Print the results in any order, but exclude duplicates from the answer.

SELECT DISTINCT CITY
FROM STATION
WHERE ID % 2 = 0

연산자 산술 연산자 %: 나머지 연산으로, 왼쪽의 값을 오른쪽 값으로 나눈 나머지를 반환

Query the two cities in STATION with the shortest and longest CITY names, as well as their respective lengths (i.e.: number of characters in the name). If there is more than one smallest or largest city, choose the one that comes first when ordered alphabetically.

(SELECT CITY, LENGTH(CITY) 
FROM STATION 
ORDER BY LENGTH(CITY), CITY
LIMIT 1)
UNION
(SELECT CITY, LENGTH(CITY)
FROM STATION 
ORDER BY LENGTH(CITY) DESC, CITY 
LIMIT 1)

(SELECT CITY, LENGTH(CITY) 
FROM STATION 
WHERE LENGTH(CITY) = (SELECT MIN(LENGTH(CITY)) FROM STATION)
ORDER BY 1
LIMIT 1)
UNION
(SELECT CITY, LENGTH(CITY) 
FROM STATION 
WHERE LENGTH(CITY) = (SELECT MAX(LENGTH(CITY)) FROM STATION)
ORDER BY 1
LIMIT 1)

UNION: 2개 이상 테이블에 존재하는 같은 성격의 값을 하나의 쿼리로 추출
참고
1> CITY 문자 길이와 이름 순으로 정렬해 가장 짧은 문자 길이의 이름 및 길이 추출, 또 문자 길이(내림차순)와 이름 순으로 정렬해 가장 긴 문자 길이의 이름 및 길이 추출한 뒤 UNION으로 합치기
2> WHERE 조건에 CITY 문자 길이가 MIN인 값을 넣어 가장 짧은 문자 길이의 이름 및 길이 추출, 또 WHERE 조건에 CITY 문자 길이가 MAX인 값을 넣어 가장 긴 문자 길이의 이름 및 길이 추출한 뒤 UNION으로 합치기

Weather Observation Station 6

Query the list of CITY names starting with vowels (i.e., a, e, i, o, or u) from STATION. Your result cannot contain duplicates.

SELECT DISTINCT CITY
FROM STATION
WHERE LEFT(CITY, 1) IN ('a','e','i','o','u')

SELECT DISTINCT CITY
FROM STATION
WHERE REGEXP_LIKE(CITY, '^[aeiou]')

SELECT DISTINCT CITY
FROM STATION
WHERE CITY REGEXP '^a|^e|^i|^o|^u'

SELECT DISTINCT CITY
FROM STATION
WHERE CITY REGEXP '^[aeiou]'

LEFT(문자열, 길이): 문자열을 받아서 왼쪽부터 원하는 길이만큼 자르는 함수
REGEXP_LIKE(문자열, 정규식 패턴): 정규식을 비교하여 일치할 경우 추출해주는 함수
REGEXP: 정규 표현식(Regular Expression)을 활용해 기본 연산자보다 더 복잡한 문자열 조건을 걸어 데이터 검색 가능
- |(수직선): |로 구분된 문자에 해당하는 문자열 찾기 (ex> "하이|안녕" -> '하이' 또는 '안녕'에 해당하는 문자열 찾기)
- [] : 안에 나열된 패턴에 해당하는 문자열 찾기 (ex> "[abc]d" -> 'ad' 또는 'bd' 또는 'cd'인 문자열 찾기)
- ^ : 시작하는 문자열을 찾기 (ex> "^폰" -> '폰'으로 시작하는 문자열 찾기)
- $ : 끝나는 문자열을 찾기 (ex> "고기$"-> '고기'로 끝나는 문자열을 찾음)

Weather Observation Station 7

Query the list of CITY names ending with vowels (a, e, i, o, u) from STATION. Your result cannot contain duplicates.

SELECT DISTINCT CITY
FROM STATION
WHERE RIGHT(CITY, 1) IN ('a','e','i','o','u')

SELECT DISTINCT CITY
FROM STATION
WHERE CITY REGEXP '[aeiou]$'

SELECT DISTINCT CITY
FROM STATION
WHERE CITY REGEXP 'a$|e$|i$|o$|u$'

SELECT DISTINCT CITY
FROM STATION
WHERE REGEXP_LIKE(CITY, '[aeiou]$')

RIGHT(문자열, 길이): 문자열을 받아서 오른쪽부터 원하는 길이만큼 자르는 함수

Weather Observation Station 8

Query the list of CITY names from STATION which have vowels (i.e., a, e, i, o, and u) as both their first and last characters. Your result cannot contain duplicates.

SELECT CITY
FROM STATION
WHERE LEFT(CITY, 1) IN ('a', 'e', 'i', 'o', 'u') and RIGHT(CITY, 1) IN ('a', 'e', 'i', 'o', 'u')

SELECT CITY
FROM STATION
WHERE CITY REGEXP '^[aeiou]' and CITY REGEXP '[aeiou]$'

SELECT CITY
FROM STATION
WHERE REGEXP_LIKE (CITY, '^[aeiou]') and REGEXP_LIKE (CITY, '[aeiou]$')

Weather Observation Station 11

Query the list of CITY names from STATION that either do not start with vowels or do not end with vowels. Your result cannot contain duplicates.

SELECT DISTINCT CITY
FROM STATION
WHERE LEFT(CITY, 1) NOT IN ('a','e','i','o','u') or RIGHT(CITY, 1) NOT IN ('a','e','i','o','u')

SELECT DISTINCT CITY
FROM STATION
WHERE CITY REGEXP '^[^aeiou]' or CITY REGEXP '[^aeiou]$'

참고
- ^x: x로 시작하는 문자열을 찾기
- [^xy] : not의 의미로, x및 y를 제외한 문자 의미함
- 위 예시에서 '^[^aeiou]'의 뜻은 aeiou를 제외한 문자로 시작하는 문자열 찾기

Higher Than 75 Marks

Query the Name of any student in STUDENTS who scored higher than 75 Marks. Order your output by the last three characters of each name. If two or more students both have names ending in the same last three characters (i.e.: Bobby, Robby, etc.), secondary sort them by ascending ID.

SELECT NAME
FROM STUDENTS
WHERE MARKS > 75
ORDER BY RIGHT(NAME, 3), ID

Advanced Select

Type of Triangle

Write a query identifying the type of each record in the TRIANGLES table using its three side lengths. Output one of the following statements for each record in the table:

Equilateral: It's a triangle with 3 sides of equal length.
Isosceles: It's a triangle with 2 sides of equal length.
Scalene: It's a triangle with 3 sides of differing lengths.
Not A Triangle: The given values of A, B, and C don't form a triangle.

SELECT CASE WHEN A + B <= C OR B + C <= A OR A + C <= B THEN 'Not A Triangle'
            WHEN A = B AND A = C THEN 'Equilateral'
            WHEN A = B OR B = C OR A = C THEN 'Isosceles'
            WHEN A <> B OR B <> C THEN 'Scalene'
            END
FROM TRIANGLES;

참고: CASE WHEN 구문 사용해 해당 조건에 맞는 값 출력

The PADS

Generate the following two result sets:
1. Query an alphabetically ordered list of all names in OCCUPATIONS, immediately followed by the first letter of each profession as a parenthetical (i.e.: enclosed in parentheses). For example: AnActorName(A), ADoctorName(D), AProfessorName(P), and ASingerName(S).
2. Query the number of ocurrences of each occupation in OCCUPATIONS. Sort the occurrences in ascending order, and output them in the following format:

There are a total of [occupation_count] [occupation]s.

where occupation_count is the number of occurrences of an occupation in OCCUPATIONS and occupation is the lowercase occupation name. If more than one Occupation has the same occupation_count, they should be ordered alphabetically.

SELECT CONCAT(Name, '(', LEFT(OCCUPATION, 1), ')')
FROM OCCUPATIONS
ORDER BY NAME;

SELECT CONCAT('There are a total of ', COUNT(*), ' ', LOWER(OCCUPATION), 's.')
FROM OCCUPATIONS
GROUP BY OCCUPATION
ORDER BY COUNT(*), OCCUPATION;

CONCAT(문자열1, 문자열2,...): 여러 문자열을 하나로 합칠 때 CONCAT 함수 사용, 파라미터를 두 개 이상 사용 가능
LOWER(문자열): LOWER 함수는 모든 문자열을 소문자로 통일시켜주는 함수이고, 이와 반대되는 함수는 UPPER 함수 (즉, UPPER 함수는 컬럼의 문자열을 모두 대문자 알파벳으로 변경)

New Companies

Given the table schemas below, write a query to print the company_code, founder name, total number of lead managers, total number of senior managers, total number of managers, and total number of employees. Order your output by ascending company_code.

Note: The tables may contain duplicate records.
The company_code is string, so the sorting should not be numeric. For example, if the company_codes are C_1, C_2, and C_10, then the ascending company_codes will be C_1, C_10, and C_2.

SELECT C.COMPANY_CODE
    , FOUNDER
    , COUNT(DISTINCT L.LEAD_MANAGER_CODE)
    , COUNT(DISTINCT S.SENIOR_MANAGER_CODE)
    , COUNT(DISTINCT M.MANAGER_CODE)
    , COUNT(DISTINCT E.EMPLOYEE_CODE)
FROM Company C
LEFT JOIN LEAD_MANAGER L
    ON C.COMPANY_CODE = L.COMPANY_CODE
LEFT JOIN SENIOR_MANAGER S 
    ON L.LEAD_MANAGER_CODE = S.LEAD_MANAGER_CODE
LEFT JOIN MANAGER M
    ON M.SENIOR_MANAGER_CODE = S.SENIOR_MANAGER_CODE
LEFT JOIN EMPLOYEE AS E
    ON E.MANAGER_CODE = M.MANAGER_CODE
GROUP BY 1, 2
ORDER BY 1

Binary Tree Nodes

You are given a table, BST, containing two columns: N and P, where N represents the value of a node in Binary Tree, and P is the parent of N.
Write a query to find the node type of Binary Tree ordered by the value of the node. Output one of the following for each node:

	Root: If node is root node.
	Leaf: If node is leaf node.
	Inner: If node is neither root nor leaf node.

SELECT N,
      (CASE WHEN P is null then 'Root'
            WHEN N NOT IN (SELECT DISTINCT P FROM BST WHERE P is not null) THEN 'Leaf'
            ELSE 'Inner' END) Node
FROM BST
ORDER BY 1

참고: Node의 경우 P가 null값이면 Root이고, P 리스트 안에 없으면 Leaf고, 다른 경우는 Inner라고 할 수 있기에 위와 같이 작성

Occupations

Pivot the Occupation column in OCCUPATIONS so that each Name is sorted alphabetically and displayed underneath its corresponding Occupation. The output column headers should be Doctor, Professor, Singer, and Actor, respectively.

Note: Print NULL when there are no more names corresponding to an occupation.

SELECT MIN(CASE WHEN OCCUPATION = 'DOCTOR' THEN NAME END) AS DOCTOR
     , MIN(CASE WHEN OCCUPATION = 'PROFESSOR' THEN NAME END) AS PROFESSOR
     , MIN(CASE WHEN OCCUPATION = 'SINGER' THEN NAME END) AS SINGER
     , MIN(CASE WHEN OCCUPATION = 'ACTOR' THEN NAME END) AS ACTOR
FROM (SELECT NAME, OCCUPATION, RANK() OVER (PARTITION BY OCCUPATION ORDER BY NAME ASC) RK
      FROM OCCUPATIONS) RANKLISTS
GROUP BY RK

참고: OCCUPATION을 기준으로 그룹핑해주어 OCCUPATION별 NAME 순으로 RANK 매겨주는 서브 쿼리 작성 후, 그 서브쿼리에서 DOCTOR, PROFESSOR, SINGER, ACTOR 출력 (이때 MIN을 사용하고 RANK를 GROUP BY 해주어야 함, MAX()를 이용해도 동일한 결과 <- 마지막에 GROUP BY를 사용하기 위해 집계 함수 사용)

Aggregation

The Blunder

Samantha was tasked with calculating the average monthly salaries for all employees in the EMPLOYEES table, but did not realize her keyboard's key was broken until after completing the calculation. She wants your help finding the difference between her miscalculation (using salaries with any zeros removed), and the actual average salary.

Write a query calculating the amount of error, and round it up to the next integer.

SELECT CEILING(AVG(SALARY) - AVG(REPLACE(SALARY, 0, ''))) 
FROM EMPLOYEES

CEILING: 해당 숫자보다 크거나 같은 최소 정수 반환
참고: 반대로 FLOOR 함수는 해당 숫자보다 작거나 같은 최대 정수 반환
- 예시: 11.92 숫자에 대해 각각 CEILING은 12, FLOOR는 11를 반환
REPLACE(Column,'해당문자','바꿀문자'): 어떠한 문자를 다른 문자로 변환하고 싶을 때 사용, 위 문제에서는 SALARY 컬럼에서 0을 ''로 변환(즉, SALARY 컬럼에서 0을 없앤다는 의미)

Top Earners

We define an employee's total earnings to be their SALARY*MONTHS monthly worked, and the maximum total earnings to be the maximum total earnings for any employee in the Employee table. Write a query to find the maximum total earnings for all employees as well as the total number of employees who have maximum total earnings. Then print these values as 2 space-separated integers.

SELECT MAX(SALARY*MONTHS)
	, COUNT(EMPLOYEE_ID) 
FROM EMPLOYEE
WHERE SALARY*MONTHS = (SELECT MAX(SALARY*MONTHS) FROM EMPLOYEE)

SELECT SALARY*MONTHS   
    , COUNT(SALARY*MONTHS) 
FROM EMPLOYEE
GROUP BY 1
ORDER BY 1 DESC
LIMIT 1

Weather Observation Station 15

Query the Western Longitude (LONG_W) for the largest Northern Latitude (LAT_N) in STATION that is less than 137.2345. Round your answer to 4 decimal places.

SELECT ROUND(LONG_W, 4)
FROM STATION
WHERE LAT_N = (SELECT MAX(LAT_N) FROM STATION WHERE LAT_N < 137.2345)

SELECT ROUND(LONG_W, 4) FROM STATION
WHERE LAT_N < 137.2345
ORDER BY LAT_N DESC LIMIT 1;

Weather Observation Station 18

Consider P1(a, b) and P2(c, d) to be two points on a 2D plane.

a happens to equal the minimum value in Northern Latitude (LAT_N in STATION).
b happens to equal the minimum value in Western Longitude (LONG_W in STATION).
c happens to equal the maximum value in Northern Latitude (LAT_N in STATION).
d happens to equal the maximum value in Western Longitude (LONG_W in STATION).
Query the Manhattan Distance between points P1 and P2 and round it to a scale of 4 decimal places.

SELECT ROUND(ABS(MAX(LAT_N)-MIN(LAT_N)) + ABS(MAX(LONG_W)-MIN(LONG_W)), 4)
FROM STATION

ABS: 절대값 구할 때 사용하는 함수로, 위 문제에서는 LAT_N의 최대값과 LAT_N의 최솟값의 차이의 절대값과 LONG_W의 최대값과 LAT_N의 최솟값의 차이의 절댓값을 구할 때 사용

Weather Observation Station 19

Consider P1(a, c) and P2(b, d) to be two points on a 2D plane where (a, b) are the respective minimum and maximum values of Northern Latitude (LAT_N) and (c, d) are the respective minimum and maximum values of Western Longitude (LONG_W) in STATION.

Query the Euclidean Distance between points P1 and P2 and format your answer to display 4 decimal digits.

SELECT ROUND(SQRT(POWER(MAX(LAT_N)-MIN(LAT_N), 2) + POWER(MAX(LONG_W)-MIN(LONG_W), 2)), 4)
FROM STATION

POWER(숫자, 거듭제곱 횟수): 숫자를 제곱 횟수만큼 제곱하고 싶을 때 사용하는 함수
SQRT: 제곱근을 구할 때 사용하는 함수

Weather Observation Station 20

A median is defined as a number separating the higher half of a data set from the lower half. Query the median of the Northern Latitudes (LAT_N) from STATION and round your answer to 4 decimal places.

SELECT ROUND(LAT_N, 4)
FROM (SELECT *, PERCENT_RANK () OVER (ORDER BY LAT_N) RK
      FROM STATION) TBL
WHERE RK = 0.5

PERCENT_RANK: 행의 PERCENTILE RANK를 구하고 싶을 때 사용하는 함수로, 위 문제에서는 MEDIAN값을 구하기 위해 LAT_N 순으로 정렬하고 PERCENTILE RANK를 매기는 서브쿼리 작성, 그 후 그 서브쿼리에서 RANK가 0.5(즉, 중앙에 위치한)인 값을 추출

hi_life

성장 일기

이전 포스트

프로그래머스 SQL 공부 정리 2

다음 포스트