SQL 집계 함수 문제 풀이

robin·2021년 8월 18일
0

SQL 공부

목록 보기
5/6
post-thumbnail

집계 함수 문제 풀이

<예제> HackerRank: Revising Aggregations - The Count Function

[문제]

Query a count of the number of cities in CITY having a Population larger than 100,000.

[풀이]

SELECT COUNT(*)
FROM city
WHERE population > 100000;

출처: https://www.hackerrank.com/challenges/revising-aggregations-the-count-function/problem

<예제> HackerRank: Revising Aggregations - Averages

[문제]

Query the average population of all cities in CITY where District is California.

[풀이]

SELECT AVG(population)
FROM city
WHERE district = 'California';

출처: https://www.hackerrank.com/challenges/revising-aggregations-the-average-function/problem

<예제> HackerRank: Average Population

[문제]

Query the average population for all cities in CITY, rounded down to the nearest integer.

[풀이]

SELECT FLOOR(AVG(population))
FROM city;

출처: https://www.hackerrank.com/challenges/average-population/problem

<예제> HackerRank: Revising Aggregations - The Sum Function

[문제]
Query the total population of all cities in CITY where District is California.

[풀이]

SELECT SUM(population)
FROM city
WHERE district = 'California';

출처: https://www.hackerrank.com/challenges/revising-aggregations-sum/problem

<예제> HackerRank: Weather Observation Station 15

[문제]

Query the Western Longitude (LONG_W) for the largest Northern Latitude (LAT_N) in STATION that is less than 137.2345. Round your answer to decimal places.

[풀이]

SELECT ROUND(LONG_W, 4)
FROM station
WHERE LAT_N = (SELECT MAX(LAT_N) FROM station WHERE LAT_N < 137.2345)

⚠️ 주의점: WHERE절에 바로 집계함수 사용할 수 없음. GROUP BY 하고 HAVING에 집계함수를 쓰거나 아니면 WHERE절의 서브쿼리에 집계함수를 사용할 수 있음.

출처: https://www.hackerrank.com/challenges/weather-observation-station-15/problem

<예제> TestDome: Regional Sales Comparison

[문제]

An insurance company maintains records of sales made by its employees. Each employee is assigned to a state. States are grouped under regions. The following tables contain the data:

TABLE regions
  id INTEGER PRIMARY KEY
  name VARCHAR(50) NOT NULL

TABLE states
  id INTEGER PRIMARY KEY
  name VARCHAR(50) NOT NULL
  regionId INTEGER NOT NULL REFERENCES regions(id)

TABLE employees
  id INTEGER PRIMARY KEY
  name VARCHAR(50) NOT NULL
  stateId INTEGER NOT NULL REFERENCES states(id)

TABLE sales
  id INTEGER PRIMARY KEY
  amount INTEGER NOT NULL
  employeeId INTEGER NOT NULL REFERENCES employees(id)  

Management requires a comparative region sales analysis report.

Write a query that returns:

  • The region name.
  • Average sales per employee for the region (Average sales = Total sales made for the region / Number of employees in the region).
  • The difference between the average sales of the region with the highest average sales, and the average sales per employee for the region (average sales to be calculated as explained above).

A region with no sales should be also returned. Use 0 for average sales per employee for such a region when calculating the 2nd and the 3rd column.

[풀이]

WITH c AS (SELECT r.name AS regionName
      ,(CASE WHEN SUM(s.amount) IS NOT NULL THEN SUM(s.amount) / COUNT(DISTINCT e.id) ELSE 0 END) AS averageSales
FROM regions AS r
LEFT JOIN states AS st ON r.id = st.regionId
LEFT JOIN employees AS e ON st.id = e.stateId
LEFT JOIN sales AS s ON e.id = s.employeeId
GROUP BY r.name)
SELECT regionName
      , averageSales
      , (SELECT MAX(averageSales) FROM c) - averageSales AS difference
FROM c

[결과]

regionNameaverageSalesdifference
East12002800
Midwest04000
North25001500
South40000
West24001600

<주의할 점>

  • Average sales = Total sales made for the region / Number of employees in the region 으로 정의된다. AVG 함수를 사용하면 안되고 지역의 sales의 총합을 구하고 해당 지역의 총 직원 수로 나눠야 한다. 직원의 경우, 중복되어 나타날 수 있기 때문에 직원 수를 셀 때 COUNT DISTINCT를 사용해야 한다.
  • 전체 지역 중 averageSales의 최댓값에서 각 지역의 averageSales를 뺀 값을 구해야한다. 이를 위해 CTE에 averageSales를 컬럼을 만들어 두고, 참조하여 사용한다.
  • average sales의 최댓값을 구할 때는 SELECT내에 다시 SELECT를 사용해서 SELECT MAX(averageSales) FROM c 으로 가져와야 한다. 만약 SELECT 내에 다시 SELECT를 사용하지 않고, MAX(averageSales)를 바로 사용한다면 결과가 하나의 row로 리턴된다.

** 참고
[코드]

WITH c AS (SELECT r.name AS regionName
      ,(CASE WHEN SUM(s.amount) IS NOT NULL THEN SUM(s.amount) / COUNT(DISTINCT e.id) ELSE 0 END) AS averageSales
FROM regions AS r
LEFT JOIN states AS st ON r.id = st.regionId
LEFT JOIN employees AS e ON st.id = e.stateId
LEFT JOIN sales AS s ON e.id = s.employeeId
GROUP BY r.name)
SELECT regionName
      , averageSales
      , MAX(averageSales) - averageSales AS difference
FROM c

[결과]

regionNameaverageSalesdifference
South40000

출처: https://www.testdome.com/questions/sql/regional-sales-comparison/36141

profile
데이터 분석을 공부하는 🌱

0개의 댓글