[SQL] 실무 데이터 분석 예시

WOOK JONG KIM·2022년 12월 16일
0

mysql기초

목록 보기
11/13
post-thumbnail
  1. 2020년 7월의 총 Revenue를 구하기
select sum(price) as revenue
from fastcampus.tbl_purchase
where purchased_at >= '2020-07-01' # 00시 00분 00초로 인식
	and purchased_at < '2020-08-01';

-> 날짜로 필터 거는 경우 많음

SQL문 실행 순서: 구분 분석 > FROM > WHERE > SELECT 순

WHERE 절에 사용하는 것이 속도 향상(여기서 필요한 데이터를 필터링)

  1. 2020년 7월의 MAU(Monthly Active User)를 구해주세요
select count(distinct customer_id)
from fastcampus.tbl_visit
where visited_at >= '2020-07-01' 
	and visited_at < '2020-08-01';
  1. 2020년 7월에 우리 Active 유저의 구매율(Paying Rate)은 어떻게 되나요?
select count(distinct customer_id)
from fastcampus.tbl_purchase
where purchased_at >= '2020-07-01'
	and purchased_at < '2020-08-01'; ##  11174

select count(distinct customer_id)
from fastcampus.tbl_visit
where visited_at >= '2020-07-01'
	and visited_at < '2020-08-01'; ## 16414
    
select round(11174/16414*100, 2);
  1. 2020년 7월에 구매 유저의 월 평균 구매액은 어떻게 되나요?

    객단가
    ARPPU(Average Revenue per Paying user)

select avg(revenue)
from (select customer_id , sum(price) as revenue
	from fastcampus.tbl_purchase
    where purchased_at >= '2020-07-01'
		and purchased_at < '2020-08-01'
	group by 1) AS A_table;
  1. 2020년 7월에 가장 많이 구매한 Top3 고객과 Top10~15 고객을 뽑아주세요
select customer_id, sum(price) as revenue
from fastcampus.tbl_purchase
where purchased_at >= '2020-07-01'
		and purchased_at < '2020-08-01'
group by 1
order by 2 desc
limit 3;

select customer_id, sum(price) as revenue
from fastcampus.tbl_purchase
where purchased_at >= '2020-07-01'
		and purchased_at < '2020-08-01'
group by 1
order by 2 desc
limit 6 offset 9;

DATE FUNCTION

SELECT NOW();SELECT CURRENT_DATE();SELECT EXTRACT(MONTH FROM ʻ2021-01-01);SELECT DAY(ʻ2021-01-01);SELECT DATE_ADD(ʻ2021-01-01, INTERVAL 7 DAY);SELECT DATE_SUB(ʻ2017-06-15, INTERVAL 7 DAY);SELECT DATEDIFF("2017-06-25", "2017-06-15");SELECT TIMEDIFF("2021-01-25 12:10:00", "2021-01-25 10:10:00");SELECT DATE_FORMAT(NOW(),%Y-%m-%d”);


  1. 2020년 7월의 평균 DAU(Daily Active User)를 구해주세요, Active User 수가 증가하는 추세인가요?

표준시 기준으로 9시간이 자동으로 더해지는 것 같음

SELECT avg(cnt)
FROM (SELECT date_format(visited_at - interval 9 hour, '%Y-%m-%d') as date_at, count(distinct customer_id) as cnt
FROM fastcampus.tbl_visit
WHERE visited_at >= '2020-07-01'
	and visited_at < '2020-08-01'
GROUP BY 1
ORDER BY 1 asc) as foo
  1. 2020년 7월의 평균 WAU를 구해주세요

일주일이 완전히 지나가는 주만 사용 ( 5 ~ 25)

SELECT avg(cnt)
FROM (SELECT date_format(visited_at - interval 9 hour, '%Y-%m-%U') as date_at, count(distinct customer_id) as cnt
FROM fastcampus.tbl_visit
WHERE visited_at >= '2020-07-05'
	and visited_at < '2020-07-26'
GROUP BY 1
ORDER BY 1 asc) as foo;
  1. 2020년 7월의 Daily Revenue는 증가하는 추세인가요? 평균 Daily Revenue도 구해주세요
SELECT date_format(purchased_at - interval 9 hour, '%y-%m-%d') as date_at,
		SUM(price)
FROM fastcampus.tbl_purchase
WHERE purchased_at >= '2020-07-01'
	and purchased_at < '2020-08-01'
GROUP BY 1
ORDER BY 1 asc;

SELECT round(avg(price), 0)
FROM 
(SELECT SUM(price) as price
FROM fastcampus.tbl_purchase
WHERE purchased_at >= '2020-07-01'
	and purchased_at < '2020-08-01'
GROUP BY date_format(purchased_at - interval 9 hour, '%y-%m-%d')) AS foo;
  1. 2020년 7월 평균 Weekly Revenue를 구해주세요
SELECT round(avg(price), 0)
FROM (SELECT date_format(purchased_at - interval 9 hour, '%y-%m-%U') as date_at
			, SUM(price) as price
	FROM fastcampus.tbl_purchase
	WHERE purchased_at >= '2020-07-05'
		and purchased_at < '2020-07-26'
	group by date_at) AS foo;
  1. 2020년 7월 요일별 Daily Revenue를 구해주세요. 어느 요일이 revenue가 가장 높나요?
SELECT date_format(date_at, '%w') as day_of_week
	, date_format(date_at, '%W') as day_name
    , avg(revenue) as daily_revenue
FROM ( SELECT date_format(purchased_at - interval 9 hour, '%Y-%m-%d') as date_at,
		SUM(price) as revenue
        FROM fastcampus.tbl_purchase
        WHERE purchased_at >= '2020-07-01'
			and purchased_at < '2020-08-01'
		group by 1) as foo
GROUP BY 1,2
order by 1 asc;
  1. 2020년 7월 시간대별 시간당 Revenue를 구해주세요, 어느 시간대가 Revenue가 가장 높고 낮나요?
SELECT hour_at, AVG(revenue)
FROM 
	(SELECT date_format(purchased_at - interval 9 hour, '%Y-%m-%d') as date_at  
    ,date_format(purchased_at - interval 9 hour, '%H') as hour_at
	,SUM(price) as revenue
	FROM fastcampus.tbl_purchase
    WHERE purchased_at >= '2020-07-01'
    and purchased_at < '2020-08-01'
    GROUP BY 1,2 ) as foo
GROUP BY 1
ORDER BY 2 asc;
  1. 2020년 7월 요일 및 시간대별 Revenue를 구해주세요
SELECT day_of_week_at, hour_at, avg(revenue)
FROM(select date_format(purchased_at - interval 9 hour, '%y-%m-%d') as date_at
	, date_format(purchased_at - interval 9 hour, '%W') as day_of_week_at
    , date_format(purchased_at - interval 9 hour, '%H') as hour_at
    , SUM(price) as revenue
    FROM fastcampus.tbl_purchase
    WHERE purchased_at >= '2020-07-01'
    and purchased_at < '2020-08-01'
    GROUP BY 1,2,3) as foo
GROUP BY 1,2
order by 3 desc;
  1. 요일 및 시간대별 Active User 수 계산해보기
SELECT day_of_week_at, hour_at, avg(cnt)
FROM
(select date_format(visited_at - interval 9 hour, '%y-%m-%d') as date_at,
	date_format(visited_at - interval 9 hour, '%W') as day_of_week_at,
    date_format(visited_at - interval 9 hour, '%H') as hour_at,
    COUNT(customer_id) as cnt
    FROM fastcampus.tbl_visit
    WHERE visited_at >= '2020-07-01'
    and visited_at < '2020-08-01'
    GROUP BY 1,2,3) as foo
GROUP BY 1,2
order by 3 desc;    
profile
Journey for Backend Developer

0개의 댓글