๐Ÿ“’ SQL(5)

Kimdongkiยท2024๋…„ 4์›” 26์ผ

DB

๋ชฉ๋ก ๋ณด๊ธฐ
8/33

๐Ÿ“Œ ํŠธ๋žœ์žญ์…˜

  • SQL๋“ค์„ ๋ฌถ์–ด์„œ ํ•˜๋‚˜์˜ ์ž‘์—…์ฒ˜๋Ÿผ ์ฒ˜๋ฆฌํ•˜๋Š” ๋ฐฉ๋ฒ•
  • ๋” ์ด์ƒ ์ชผ๊ฐค ์ˆ˜ ์—†๋Š” ์—…๋ฌด ์ฒ˜๋ฆฌ์˜ ์ตœ์†Œ ๋‹จ์œ„์ด๋‹ค.
  • ๊ฑฐ๋ž˜๋‚ด์—ญ์ด๋ผ๊ณ ๋„ ํ•œ๋‹ค.
    • ์ด๋Š” DDL์ด๋‚˜ DML์ค‘์—์„œ ๋ ˆ์ฝ”๋“œ๋ฅผ ์ˆ˜์ •/์ถ”๊ฐ€/์‚ญ์ œํ•œ ๊ฒƒ์—๋งŒ ์˜๋ฏธ๊ฐ€ ์žˆ๋‹ค.
    • SELECT์—๋Š” ํŠธ๋žœ์žญ์…˜์„ ์‚ฌ์šฉํ•  ์ด์œ ๊ฐ€ ์—†๋‹ค.
    • BEGIN, END ํ˜น์€ BEGIN, COMMIT ์‚ฌ์ด์— ํ•ด๋‹น SQL๋“ค์„ ์‚ฌ์šฉํ•œ๋‹ค.
    • ROLLBACK

๐Ÿ“Œ ํŠธ๋žœ์žญ์…˜(์ปค๋ฐ‹ ๋ชจ๋“œ) - AUTOCOMMIT

  • autocommit = True
    • ๋ชจ๋“  ๋ ˆ์ฝ”๋“œ ์ˆ˜์ •/์ถ”๊ฐ€/์‚ญ์ œ ์ž‘์—…์ด ๊ธฐ๋ณธ์ ์œผ๋กœ ๋ฐ”๋กœ DB์— ์“ฐ์—ฌ์ง„๋‹ค.
    • ์ด๋ฅผ ์ปค๋ฐ‹๋œ๋‹ค๊ณ  ํ•œ๋‹ค.
    • ๋งŒ์•ฝ ํŠน์ • ์ž‘์—…์„ ํŠธ๋žœ์žญ์…˜์œผ๋กœ ๋ฌถ๊ณ  ์‹ถ๋‹ค๋ฉด BEGIN๊ณผ END(COMMIT)/ROLLBACK์œผ๋กœ ์ฒ˜๋ฆฌํ•œ๋‹ค.
  • autocommit = False
    • ๋ชจ๋“  ๋ ˆ์ฝ”๋“œ ์ˆ˜์ •/์‚ญ์ œ/์ถ”๊ฐ€ ์ž‘์—…์ด COMMIT ํ˜ธ์ถœ๋  ๋•Œ๊นŒ์ง€ ์ปค๋ฐ‹๋˜์ง€ ์•Š๋Š”๋‹ค.

๐Ÿ“Œ ํŠธ๋žœ์žญ์…˜ ๋ฐฉ์‹

  • Google Colab์˜ ํŠธ๋žœ์žญ์…˜
    • ๊ธฐ๋ณธ์ ์œผ๋กœ ๋ชจ๋“  SQL statement๊ฐ€ ๋ฐ”๋กœ ์ปค๋ฐ‹๋œ๋‹ค.(autocommit = True)
    • ์ด๋ฅผ ๋ฐ”๊พธ๊ณ  ์‹ถ๋‹ค๋ฉด BEGIN;END; ํ˜น์€ BEGIN;COMMIT;์„ ์‚ฌ์šฉํ•œ๋‹ค.(๋˜๋Š” ROLLBACK)
  • psycopg2์˜ ํŠธ๋žœ์žญ์…˜
    • autocommit์ด๋ผ๋Š” ํŒŒ๋ผ๋ฏธํ„ฐ๋กœ ์กฐ์ ˆ๊ฐ€๋Šฅํ•˜๋‹ค.
    • autocommit=True๊ฐ€ ๋˜๋ฉด ๊ธฐ๋ณธ์ ์œผ๋กœ PostgreSQL์˜ ์ปค๋ฐ‹ ๋ชจ๋“œ์™€ ๋™์ผํ•˜๋‹ค.
    • autocommit=False๊ฐ€ ๋˜๋ฉด ์ปค๋„ฅ์…˜ ๊ฐ์ฒด์˜ .commit()๊ณผ .rollback()ํ•จ์ˆ˜๋กœ ํŠธ๋žœ์žญ์…˜ ์กฐ์ ˆ์ด ๊ฐ€๋Šฅํ•˜๋‹ค.
    • ์–ด๋–ค๊ฒƒ์„ ์‚ฌ์šฉํ• ์ง€๋Š” ๊ฐœ์ธ์˜ ์ทจํ–ฅ์ด๋‹ค.

๐Ÿ“Œ DELETE FROM vs. TRUNCATE

  • DELETE FROM table_name (not DELETE* FROM)
    • ํ…Œ์ด๋ธ”์—์„œ ๋ชจ๋“  ๋ ˆ์ฝ”๋“œ๋ฅผ ์‚ญ์ œ
    • vs. DROP TABLE table_name
    • WHERE์„ ์‚ฌ์šฉํ•˜์—ฌ ํŠน์ • ๋ ˆ์ฝ”๋“œ๋งŒ ์‚ญ์ œ๊ฐ€ ๊ฐ€๋Šฅํ•˜๋‹ค.
      • DELETE FROM raw_data.user_session_channel WHERE channel = 'Google'
  • TURNCATE table_name
    • ํ…Œ์ด๋ธ”์—์„œ ๋ชจ๋“  ๋ ˆ์ฝ”๋“œ๋ฅผ ์‚ญ์ œํ•œ๋‹ค.
    • DELETE FROM์€ ์†๋„๊ฐ€ ๋А๋ฆฌ๋‹ค.
    • TRUNCATE๋Š” ์ „์ฒด ํ…Œ์ด๋ธ”์˜ ๋‚ด์šฉ ์‚ญ์ œ์‹œ์—๋Š” ์—ฌ๋Ÿฌ๋ชจ๋กœ ์œ ๋ฆฌํ•˜๋‹ค.
    • ๋‹จ์ 
      • WHERE์„ ์ง€์›ํ•˜์ง€ ์•Š๋Š”๋‹ค.
      • Transction์„ ์ง€์›ํ•˜์ง€ ์•Š๋Š”๋‹ค.

๐Ÿ“Œ UNION, EXCEPT, INTERSECT

  • UNION(ํ•ฉ์ง‘ํ•ฉ)
    • ์—ฌ๋Ÿฌ๊ฐœ์˜ ํ…Œ์ด๋ธ”๋“ค์ด๋‚˜ SELECT ๊ฒฐ๊ณผ๋ฅผ ํ•˜๋‚˜์˜ ๊ฒฐ๊ณผ๋กœ ํ•ฉ์ณ์ค€๋‹ค.
    • UNION vs. UNION ALL
      • UNION์€ ์ค‘๋ณต์„ ์ œ๊ฑฐํ•œ๋‹ค.
  • EXCEPT(MINUS)
    • ํ•˜๋‚˜์˜ SELECT ๊ฒฐ๊ณผ์—์„œ ๋‹ค๋ฅธ SELECT ๊ฒฐ๊ณผ๋ฅผ ๋นผ์ฃผ๋Š” ๊ฒƒ์ด ๊ฐ€๋Šฅํ•˜๋‹ค.
  • INTERSECT(๊ต์ง‘ํ•ฉ)
    • ์—ฌ๋Ÿฌ ๊ฐœ์˜ SELECT๋ฌธ์—์„œ ๊ฐ™์€ ๋ ˆ์ฝ”๋“œ๋“ค๋งŒ ์ฐพ์•„์ค€๋‹ค.

๐Ÿ“ŒCOALESCE, NULLIF

  • COALESCE(Expression1, Expression2,...)
    • ์ฒซ ๋ฒˆ์งธ Expression๋ถ€ํ„ฐ ๊ฐ’์ด NULL์ด ์•„๋‹Œ ๊ฒƒ์ด ๋‚˜์˜ค๋ฉด ๊ทธ ๊ฐ’์„ ๋ฆฌํ„ดํ•˜๊ณ  ๋ชจ๋‘ NULL์ด๋ฉด NULL์„ ๋ฆฌํ„ดํ•œ๋‹ค.
    • NULL๊ฐ’์„ ๋‹ค๋ฅธ ๊ฐ’์œผ๋กœ ๋ฐ”๊พธ๊ณ  ์‹ถ์„ ๋•Œ ์‚ฌ์šฉํ•œ๋‹ค.
  • NULLIF(Expression1, Expression2)
    • Expression1, Expression2์˜ ๊ฐ’์ด ๊ฐ™์œผ๋ฉด NULL์„ ๋ฆฌํ„ดํ•œ๋‹ค.

๐Ÿ“ŒLISTAGG

  • GROUP BY์—์„œ ์‚ฌ์šฉ๋˜๋Š” Aggregateํ•จ์ˆ˜ ์ค‘์˜ ํ•˜๋‚˜์ด๋‹ค.
  • ์‚ฌ์šฉ์ž ID๋ณ„๋กœ ์ฑ„๋„์„ ์ˆœ์„œ๋Œ€๋กœ ๋‚˜์—ด
SELECT
	userid,
    LISTAGG(channel) WITHIN GROUP (ORDER BY ts) channels
FROM raw_data.user_session_channel usc
JOIN raw_data.session_timestamp st ON usc.sessionid = st.sessionid
GROUP BY 1
LIMIT 10;
  • ๋ฆฌ์ŠคํŠธ๋ฅผ ,๋กœ ๋‚˜์—ดํ•˜๊ธฐ
SELECT
	userid,
    LISTAGG(channel, ',') WITHIN GROUP (ORDER BY ts) channels
FROM raw_data.user_session_channel usc
JOIN raw_data.session_timestamp st ON usc.sessionid = st.sessionid
GROUP BY 1
LIMIT 10;

๐Ÿ“Œ WINDOW

  • Syntax
    • function(expression) OVER([PARTITION BY expression][ORDER BY expression])
  • Useful functions
    • ROW_NUMBER, FIRST_VALUE, LAST_VALUE, LAG
    • Math functions : AVG, SUM, COUNT, MAX, MIN, MEDIAN, NTH_VALUE

๐Ÿ“Œ WINDOW - LAG

  • ์–ด๋–ค ์‚ฌ์šฉ์ž ์„ธ์…˜์—์„œ ์‹œ๊ฐ„์ˆœ์œผ๋กœ ๋ดค์„ ๋•Œ
    • ์•ž ์„ธ์…˜์˜ ์ฑ„๋„์ด ๋ฌด์—‡์ธ์ง€ ์•Œ๊ณ  ์‹ถ๋‹ค๋ฉด?
    • ํ˜น์€ ๋‹ค์Œ ์„ธ์…˜์˜ ์ฑ„๋„์ด ๋ฌด์—‡์ธ์ง€ ์•Œ๊ณ  ์‹ถ๋‹ค๋ฉด?
SELECT 
	usc.*,
    st.ts,
	LAG(channel,1) OVER(PARTITION BY userid ORDER BY ts) prev_channel
FROM raw_data.user_session_channel usc
JOIN raw_data.session_timestamp st ON usc.sessionid = st.sessionid
ORDER BY usc.userid, st.ts

๐Ÿ“Œ JSON Parsing Functions

  • JSON์˜ ํฌ๋งท์„ ์ด๋ฏธ ์•„๋Š” ์ƒํ™ฉ์—์„œ๋งŒ ์‚ฌ์šฉ๊ฐ€๋Šฅํ•œ ํ•จ์ˆ˜
SELECT JSON_EXTRACT_PATH_TEXT('{"f2":{"f3":"1"},"f4":{"f5":"99","f6":"star"}}','f4','f6');
{
  	"f2":{
      	"f3":"1"
    },
    "f4":{
      	"f5":"99",
        "f6":"star"
    }
}

0๊ฐœ์˜ ๋Œ“๊ธ€