TIL 25

이예인·2023년 11월 19일
0

TIL

목록 보기
6/10

지난시간 숙제

내가 짠 코드

SELECT 
	A.userid, 
	A.channel, 
	B.ts, ROW_NUMBER() OVER (PARTITION BY A.userid ORDER BY B.ts)
FROM raw_data.user_session_channel A
JOIN raw_data.session_timestamp B ON A.sessionid = B.sessionid
  • 사용차의 첫 번째와 마지막 채널이 아닌, 모든 채널을 시간 순서로 보여주고 있다.

모범 답안

WITH first AS (
   SELECT userid, ts, channel, ROW_NUMBER() OVER(PARTITION BY userid ORDER BY ts) seq
   FROM raw_data.user_session_channel usc
   JOIN raw_data.session_timestamp st ON usc.sessionid = st.sessionid
), last AS (
   SELECT userid, ts, channel, ROW_NUMBER() OVER(PARTITION BY userid ORDER BY ts DESC) seq
   FROM raw_data.user_session_channel usc
   JOIN raw_data.session_timestamp st ON usc.sessionid = st.sessionid
)
SELECT first.userid AS userid, first.channel AS first_channel, last.channel AS last_channel
FROM first
JOIN last ON first.userid = last.userid and last.seq = 1
WHERE first.seq = 1;
SELECT userid,
 MAX(CASE WHEN rn1 = 1 THEN channel END) first_touch,
 MAX(CASE WHEN rn2 = 1 THEN channel END) last_touch
FROM (
  SELECT userid,
    channel,
    (ROW_NUMBER() OVER (PARTITION BY usc.userid ORDER BY  st.ts asc)) AS rn1,
    (ROW_NUMBER() OVER (PARTITION BY usc.userid ORDER BY  st.ts desc)) AS rn2
  FROM raw_data.user_session_channel usc
  JOIN raw_data.session_timestamp st ON usc.sessionid = st.sessionid
)
GROUP BY 1;
SELECT DISTINCT
    A.userid,
    FIRST_VALUE(A.channel) over(partition by A.userid order by B.ts
rows between unbounded preceding and unbounded following) AS First_Channel,
    LAST_VALUE(A.channel) over(partition by A.userid order by B.ts
rows between unbounded preceding and unbounded following) AS Last_Channel
FROM raw_data.user_session_channel A
LEFT JOIN raw_data.session_timestamp B ON A.sessionid = B.sessionid
ORDER BY 1;

내가 짠 코드

SELECT A.userid, SUM(C.amount)
FROM raw_data.user_session_channel A
JOIN raw_data.session_transaction B ON A.sessionid = B.sessionid
GROUP BY A.userid
ORDER BY SUM(C.amount) DESC
LIMIT 10;

모범 답안

SELECT
    userID,
    SUM(amount)
FROM raw_data.session_transaction st
LEFT JOIN raw_data.user_session_channel usc ON st.sessionid = usc.sessionid
GROUP BY 1
ORDER BY 2 DESC
LIMIT 10;
SELECT DISTINCT
    usc.userid,
    SUM(amount) OVER(PARTITION BY usc.userid)
FROM raw_data.user_session_channel AS usc
JOIN raw_data.session_transaction AS revenue ON revenue.sessionid = usc.sessionid
ORDER BY 2 DESC
LIMIT 10;

내가 짠 코드

SELECT 
    TO_CHAR(created_at, 'YYYY-MM'), 
    ((COUNT(CASE WHEN score >= 9 THEN 1 END) - COUNT(CASE WHEN score <= 6 THEN 1 END))::FLOAT / COUNT(1))*100 AS NPS
FROM raw_data.nps
GROUP BY 1;

모범 답안

SELECT month,
  ROUND((promoters-detractors)::float/total_count*100, 2) AS overall_nps
FROM (
  SELECT LEFT(created, 7) AS month,
    COUNT(CASE WHEN score >= 9 THEN 1 END) AS promoters,
    COUNT(CASE WHEN score <= 6 THEN 1 END) AS detractors,
    COUNT(CASE WHEN score > 6 AND score < 9 THEN 1 END) As passives,
    COUNT(1) AS total_count
  FROM raw_data.nps
  GROUP BY 1
  ORDER BY 1
);
SELECT LEFT(created, 7) AS month,
  ROUND(SUM(CASE
    WHEN score >= 9 THEN 1
    WHEN score <= 6 THEN -1 END)::float*100/COUNT(1), 2)
FROM raw_data.nps
GROUP BY 1
ORDER BY 1;

트랜잭션

DELETE FROM vs TRUNCATE

DELETE FROM

  • 테이블에서 모든 레코드 삭제
  • WHERE 지원
  • 속도가 느리다.

TRUNCATE

  • WHERE 지원 x
  • TRANSACTION 지원 x

기타 고급 문법 소개

UNION (합집합)

  • UNION은 중복을 제거
  • UNION ALL은 중복 제거 x
  • 데이터 타입이 같아야 한다.

EXCEPT (MINUS)

  • 하나의 SELECT 결과에서 다른 SELECT 결과를 뺴준다.

INTERSECT (교집합)

  • 여러 개의 SELECT문에서 같은 레코드들만 찾아준다.
  • JOIN가 뭐가 다름?

COALESCE(Expression1, Expression2, ..)

  • 첫 번째 Expression부터 값이 Null이 아닌 것이 나오면 그 값을 return
  • 모두 NULL일 때 NULL return

NULLIF(Expression1, Expression2)

  • Expression1, Expression2의 값이 같으면 NULL return

LISTAGG

  • GROUP BY에서 사용되는 Aggregate 함수 중 하나
  • 모든 값들을 붙인다.
SELECT userid, LISTAGG(channel) WITHIN GROUP (ORDER BY ts) channels
FROM raw_data.user_session_channel usc
JOIN raw_data.session_timestamp st ON usc.sessionid = st.sessionid
GROUP BY 1
LIMIT 10;

SELECT userid, LISTAGG(channel, '->') WITHIN GROUP (ORDER BY ts) channels
FROM raw_data.user_session_channel usc
JOIN raw_data.session_timestamp st ON usc.sessionid = st.sessionid
GROUP BY 1
LIMIT 10;

WINDOW 함수

테이블 행 집합에 대해 계산을 수행하고, 각 행에 대해 결과를 반환하는 함수이다.

function(expression) OVER([PARTITION BY expression] [ORDER BY expression])

LAG

SELECT usc.*, st.ts, LAG(channel, 1) OVER (PARTITION BY userId ORDER BY ts) prev_channel
FROM raw_data.user_session_channel usc
JOIN raw_data.session_timestamp st ON usc.sessionid = st.sessionid
ORDER BY usc.userid, st.ts
LIMIT 100;

JSON PARSING 함수

  • JSON 포맷을 이미 아는 상황에서만 사용가능한 함수
  • JSON String을 입력으로 받아 특정 필드의 값만 추출할 수 있다.

0개의 댓글

관련 채용 정보