User's Third Transaction
Write a query to obtain the third transaction of every user. Output the user id, spend and transaction date.
with t as (
SELECT user_id, spend, transaction_date
, row_number() over (partition by user_id order by transaction_date) as rank
FROM transactions
)
select user_id, spend, transaction_date
from t
where rank=3
;
Given a table of tweet data over a specified time period, calculate the 3-day rolling average of tweets for each user. Output the user ID, tweet date, and rolling averages rounded to 2 decimal places.
SELECT user_id, tweet_date
, ROUND(
avg(tweet_count) over (
partition by user_id order by tweet_date
rows between 2 preceding and current row
), 2) as rolling_avg_3d
FROM tweets
;
- avg도 window함수처럼 사용
- rows between으로 행간 계산 - preceding, following 기억
Highest-Grossing Items
write a query to identify the top two highest-grossing products within each category in the year 2022. The output should include the category, product, and total spend.
with p as(
SELECT category, product, sum(spend) as total_spend
, rank() over (partition by category order by sum(spend) desc)
FROM product_spend
where extract(year from transaction_date) = 2022
group by category, product
)
select category, product, total_spend
from p
where rank = 1 or rank = 2
;
- YEAR(컬럼) 안될 때는 EXTRACT(YEAR FROM 컬럼)
Top 5 Artists
Write a query to find the top 5 artists whose songs appear most frequently in the Top 10 of the global_song_rank table. Display the top 5 artist names in ascending order, along with their song appearance ranking.
If two or more artists have the same number of song appearances, they should be assigned the same ranking, and the rank numbers should be continuous (i.e. 1, 2, 2, 3, 4, 5).
with a as (
select a.artist_name
, dense_rank() over (order by count(*) desc) artist_rank
from global_song_rank r
inner join songs s
on r.song_id = s.song_id
inner join artists a
on s.artist_id = a.artist_id
where r.rank <= 10
group by a.artist_name
)
select artist_name, artist_rank
from a
where artist_rank <=5
order by 2, 1
;
Supercloud Customer
Write a query that identifies the customer IDs of these Supercloud customers.
with cust_cate as (
SELECT c.customer_id
, count(distinct p.product_category) cate_counts
from customer_contracts c
inner join products p
on c.product_id = p.product_id
group by c.customer_id
)
select distinct customer_id
from cust_cate
where cate_counts = (select count(distinct product_category) from products)
;
Histogram of Users and Purchases
write a query that retrieve the users along with the number of products they bought.
Output the user's most recent transaction date, user ID, and the number of products, sorted in chronological order by the transaction date.
with max_date as (
select user_id, max(transaction_date) max_date
from user_transactions
group by user_id
)
select t.transaction_date
, t.user_id
, count(t.product_id) purchase_count
from user_transactions t
left join max_date d
on t.user_id = d.user_id
where t.transaction_date = d.max_date
group by t.user_id, t.transaction_date
order by 1
;
Compressed Mode
Write a query to retrieve the mode of the order occurrences. Additionally, if there are multiple item counts with the same mode, the results should be sorted in ascending order.
with rank as (
select item_count, order_occurrences
, rank() over (order by order_occurrences desc) rank
from items_per_order
)
SELECT item_count as MODE
from rank
where rank = 1
;
SELECT item_count AS mode
FROM items_per_order
WHERE order_occurrences = (
SELECT MODE() WITHIN GROUP (ORDER BY order_occurrences DESC)
FROM items_per_order
)
ORDER BY item_count;
- mode : 최빈값
International Call Percentage
What percentage of phone calls are international? Round the result to 1 decimal.
with p as (
SELECT c.caller_id
, c.country_id as caller_con
, r.receiver_id
, r.country_id as receiver_con
from phone_calls p
left join phone_info c
on p.caller_id = c.caller_id
left join phone_info r
on p.receiver_id = r.caller_id
where c.country_id != r.country_id
)
select round(
100 * count(*) /
(select count(*) from phone_calls), 1) international_calls_pct
from p
;
SELECT
ROUND(
100.0 * SUM(CASE
WHEN caller.country_id <> receiver.country_id THEN 1 ELSE NULL END)
/COUNT(*) ,1) AS international_call_pct
FROM phone_calls AS calls
LEFT JOIN phone_info AS caller
ON calls.caller_id = caller.caller_id
LEFT JOIN phone_info AS receiver
ON calls.receiver_id = receiver.caller_id;
- round( 100 * count(*) / 서브쿼리, 1) 값과
round( count(*) / 서브쿼리 * 100, 1) 값이 왜 다르지 ??
-> count(*) / 서브쿼리 값이 정수로 처리되는 경우가 있음 (그러면 0이 되어버리는 것)
- 100을 곱한 것과 100.0을 곱한 것과 결과가 다르다
-> 100과 같이 연산하는 count(*)가 정수이므로 결과가 정수로 나옴
100.0은 소수가 나올 수 있음