🧱 Databricks Unity Catalog (UC): A Complete Guide

NewNewDaddy · 5 days ago


0. INTRO

As a data platform grows, the first thing to become complicated is not the data itself but the way that data is managed.

Once several teams share the same data lake and countless tables, files, and models start to pile up,
questions like "Who created this data?", "Who can access it?", and "Where is it stored?" become hard to answer clearly. 😅

Databricks Unity Catalog is a unified data governance layer introduced to solve these problems.
It goes beyond simply managing table permissions:
👉 it provides centralized access control, auditing, and metadata management across data, files, and models.

In legacy Hive Metastore environments, each workspace often had its own metastore, and cloud storage permissions were frequently managed separately from Databricks permissions. Unity Catalog consolidates this into a single account-level metastore (in practice, one per region) shared across workspaces, which greatly reduces security and operational complexity.

In addition, Unity Catalog offers

  • A consistent, SQL-based permission model
  • A design that is independent of the major cloud vendors (AWS / Azure / GCP)
  • Governance not only for tables but also for External Locations, Volumes, and ML models

For these reasons,
it can be seen as the core component that extends Databricks from a simple analytics tool into an enterprise data platform.

This post walks through the core concepts and structure of Unity Catalog step by step:
why it is needed and how it differs from the previous approach.


1. Unity Catalog Hierarchy

UC has a four-level hierarchy: Metastore > Catalog > Schema > Table/Volume. In particular, the storage location is inherited from upper levels down to lower levels.
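
To make the hierarchy concrete, here is a minimal sketch of how the three-level name (catalog.schema.object) shows up in everyday SQL. prod_catalog is the catalog created later in this post; the sales schema and orders table are hypothetical names used only for illustration.

```sql
-- List the catalogs visible to the current user in the attached metastore.
SHOW CATALOGS;

-- Fully qualified three-level name: catalog.schema.table.
-- (sales and orders are hypothetical.)
SELECT * FROM prod_catalog.sales.orders LIMIT 10;

-- Set a default catalog and schema so shorter names resolve.
USE CATALOG prod_catalog;
USE SCHEMA sales;

-- Resolves to prod_catalog.sales.orders because of the defaults above.
SELECT * FROM orders LIMIT 10;
```

The metastore itself never appears in the name: it is determined by the workspace the query runs in.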

1-1) Metastore

  • The top-level container in UC, and the center of all permission management and metadata.
  • Created at the Account Console level; it uses AWS/GCP/Azure object storage as its default location.
  • A single Metastore can be attached to multiple Workspaces, which enables unified governance across the organization.
  • You can check the attached storage under Workspace > Catalog Explorer > External Data > External Locations.
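
The same information can also be looked up from a SQL editor or notebook. The sketch below assumes you have permission to see these objects; my_ext_location is a hypothetical external location name.

```sql
-- ID of the metastore attached to the current workspace.
SELECT current_metastore();

-- External locations registered in this metastore (name, URL, credential).
SHOW EXTERNAL LOCATIONS;

-- Details of one external location (hypothetical name).
DESCRIBE EXTERNAL LOCATION my_ext_location;

-- Storage credentials that external locations are built on.
SHOW STORAGE CREDENTIALS;
```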

1-2) Catalog

  • The first unit for grouping data assets.
  • When creating a catalog, you can specify a MANAGED LOCATION where the actual data will be stored.
  • If you do not specify one, data is stored under the Metastore's path.

1-3) Schema (Database)

  • A sub-unit within a catalog that contains tables, views, and volumes.
  • If you specify a MANAGED LOCATION when creating a schema, that path becomes the default path for the Managed objects beneath it.

1-4) Objects (Table / Volume)

  • The final unit that actually holds data.
  • A leaf object either follows the location set on its parent schema or catalog (Managed), or is given its own LOCATION directly (External).
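
To see which location an object actually inherited, you can inspect each level with DESCRIBE. This is a rough sketch that reuses the placeholder names from the DDL reference later in this post (dev_catalog, catalog.external_schema, man_tbl, man_vol); the exact labels in the output may differ between runtime versions, so treat the comments as what I would expect to see rather than guaranteed output.

```sql
-- Catalog level: the extended output is expected to include the storage root, if one was set.
DESCRIBE CATALOG EXTENDED dev_catalog;

-- Schema level: shows the schema's own managed location, if one was set.
DESCRIBE SCHEMA EXTENDED catalog.external_schema;

-- Table level: look for the Type (MANAGED or EXTERNAL) and Location rows.
DESCRIBE TABLE EXTENDED catalog.schema.man_tbl;

-- Volume level: shows the volume type and its storage location.
DESCRIBE VOLUME catalog.schema.man_vol;
```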

2. Managed vs External: Core Concepts

The most important distinction in UC is who owns the data, that is, who manages its lifecycle.

Category | Managed | External
Definition | UC manages both the location and the lifecycle of the data | The user specifies where the data is stored
Storage location | Default path set on the Metastore/Catalog/Schema | LOCATION path given explicitly in the DDL
On DROP | Metadata and physical data are both deleted | Only metadata is deleted (the actual data remains)
On UNDROP | Metadata and physical data are both restored (within 7 days) | Metadata is restored and reattached to the existing data
Use case | General database workloads | Attaching existing data or sharing with external systems
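
The DROP and UNDROP rows can be tried out with the sketch below. It assumes the managed table catalog.schema.man_tbl from the DDL reference later in this post exists and that you have sufficient privileges on it; UNDROP only works inside the retention window mentioned in the table.

```sql
-- Drop the managed table: its metadata goes away and its data is scheduled for deletion.
DROP TABLE catalog.schema.man_tbl;

-- List tables in the schema that were dropped and can still be restored.
SHOW TABLES DROPPED IN catalog.schema;

-- Restore the table (metadata plus data for a managed table).
UNDROP TABLE catalog.schema.man_tbl;
```

For an external table the same commands apply, but DROP never touches the files at its LOCATION.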

3. Physical Storage & Deletion Rules by Scenario

Even when a schema is "External" (created with its own location), if an object inside it is Managed, UC's management rules take precedence.

# | Schema type | Object | Object type | Physical storage location (path logic) | Data deleted on DROP?
1 | Managed | Table | Managed | s3://root/catalog/schema/table_name/ | Deleted (O)
2 | Managed | Table | External | gs://external/path/to/table/ | Kept (X)
3 | Managed | Volume | Managed | s3://root/catalog/schema/volume_name/ | Deleted (O)
4 | Managed | Volume | External | gs://external/path/to/volume/ | Kept (X)
5 | External | Table | Managed | Schema location or managed storage | Deleted (O)
6 | External | Table | External | LOCATION specified in the table DDL | Kept (X)
7 | External | Volume | Managed | Schema location or managed storage | Deleted (O)
8 | External | Volume | External | LOCATION specified in the volume DDL | Kept (X)

💡 Key points:
1. If a table is declared Managed, its data is deleted on DROP no matter where the underlying storage is.
2. A Managed Table inside an External Schema is created under the schema's location, but because it is Managed, UC still has the authority to delete its data.
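
Scenario 5 is the one that tends to surprise people, so here is a small sketch of it. It reuses the catalog.external_schema placeholder and bucket path from the DDL reference in this post; scenario5_tbl is a hypothetical table name, and the path must be covered by a registered External Location for the first statement to succeed.

```sql
-- Schema with its own managed location (the "External Schema" of this post).
CREATE SCHEMA IF NOT EXISTS catalog.external_schema
MANAGED LOCATION 'gs://my-bucket/external-schema-path/';

-- No LOCATION clause, so this is a Managed table (hypothetical name).
CREATE TABLE catalog.external_schema.scenario5_tbl (id INT) USING DELTA;

-- The location column should point somewhere under the schema's managed location.
-- UC generates the directory name, so do not expect a literal .../scenario5_tbl/ path.
DESCRIBE DETAIL catalog.external_schema.scenario5_tbl;

-- Because the table is Managed, DROP removes its data as well as its metadata.
DROP TABLE catalog.external_schema.scenario5_tbl;
```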


4. DDL Command Reference

4-1) Creating a Catalog

Use MANAGED LOCATION when you want isolated storage at the catalog level.

-- 1. Basic catalog (inherits the metastore's storage)
CREATE CATALOG IF NOT EXISTS prod_catalog;

-- 2. Catalog with its own managed location (uses a separate bucket)
CREATE CATALOG IF NOT EXISTS dev_catalog
MANAGED LOCATION 's3://my-dev-bucket/uc-managed/';

4-2) Creating a Schema

  • A Schema is created under a Catalog and can determine the default storage path for the Managed objects created in it.

-- Managed Schema (inherits the parent Catalog's path)
CREATE SCHEMA IF NOT EXISTS catalog.managed_schema;

-- External Schema (specifies a dedicated path where its Managed objects are stored)
CREATE SCHEMA IF NOT EXISTS catalog.external_schema
MANAGED LOCATION 'gs://my-bucket/external-schema-path/';

4-3) Creating a Table

  • Managed Table: if you do not specify LOCATION, the data is stored under the parent Schema/Catalog path.
  • External Table: LOCATION is required, and the data can live anywhere under a path registered as an External Location. (It does not have to be under the parent schema's path.)

-- Managed Table
CREATE TABLE catalog.schema.man_tbl (id INT, name STRING) USING DELTA;

-- External Table (the External Location must be registered first)
CREATE TABLE catalog.schema.ext_tbl (id INT)
LOCATION 'gs://my-bucket/data/ext_tbl/';
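
The "External Location must be registered first" prerequisite could look roughly like the following. Both my_bucket_data and my_gcs_credential are hypothetical names: the storage credential has to be created beforehand by an admin, and the URL is simply the parent of the table path used above.

```sql
-- Register the parent path as an External Location (hypothetical names).
CREATE EXTERNAL LOCATION IF NOT EXISTS my_bucket_data
URL 'gs://my-bucket/data/'
WITH (STORAGE CREDENTIAL my_gcs_credential)
COMMENT 'Parent path for external tables';

-- Confirm that it is registered and which URL it covers.
DESCRIBE EXTERNAL LOCATION my_bucket_data;
```

Once this exists, any path under gs://my-bucket/data/ can be used as the LOCATION of an external table, provided the creator also holds the CREATE EXTERNAL TABLE privilege on the location.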

4-4) Creating a Volume

  • Managed Volume: files are stored under the default path managed by UC.
  • External Volume: references a registered external path directly to manage unstructured data.

-- Managed Volume (for unstructured data)
CREATE VOLUME catalog.schema.man_vol;

-- External Volume (LOCATION is required, and the External Location must be registered)
CREATE EXTERNAL VOLUME catalog.schema.ext_vol
LOCATION 'gs://my-bucket/files/ext_vol/';
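
Files in a volume are addressed through a /Volumes/<catalog>/<schema>/<volume>/ path rather than the raw cloud URL. A small sketch, assuming the volumes above exist; the raw/ subdirectory and the CSV files in it are hypothetical:

```sql
-- List the files stored in the managed volume.
LIST '/Volumes/catalog/schema/man_vol/';

-- Read CSV files from a (hypothetical) subdirectory of the volume.
SELECT *
FROM read_files(
  '/Volumes/catalog/schema/man_vol/raw/',
  format => 'csv',
  header => true
)
LIMIT 10;

-- Check whether a volume is MANAGED or EXTERNAL and where it is stored.
DESCRIBE VOLUME catalog.schema.ext_vol;
```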

5. Managed Volumes vs Tables

Category | Tables | Volumes
Data shape | Tabular (rows/columns) | Files (any format)
Main formats | Delta, Parquet, CSV, etc. | Logs, images, subdirectories, etc.

💡 Recommended workloads:

  • Use a Table for structured data and analytical metrics.
  • Use a Volume for machine learning model files, log files, and raw source files.

6. Summary and Caveats

  1. Register the External Location: before creating an External Table/Volume, the cloud storage path must be registered in UC as an External Location.
  2. Deletion rule for Managed: the Managed type means "UC is responsible for the data lifecycle", so the data is physically deleted on DROP. However, within 7 days it can be recovered with the UNDROP command.
  3. Recovery for External: an External table keeps its data on DROP, so it can be reattached at any time, but using UNDROP also restores metadata such as table permissions.
  4. Schema location inheritance: if you give a schema a MANAGED LOCATION, the Managed objects beneath it use the parent schema's path as their default.

7. Design Recommendations (Best Practices)

  1. Separate environments: create dev, staging, and prod catalogs and give each a different MANAGED LOCATION (for example, separate S3 buckets) to physically isolate the data.
  2. Prefer Managed: unless there is a specific reason (sharing with external systems, pre-existing data, and so on), use Managed Tables (Delta) for performance and ease of management.
  3. Use Volumes: manage unstructured files such as raw .csv and .json files, machine learning models, and logs as Volumes rather than Tables to gain security and traceability.
  4. Minimize privileges: grant direct access to an EXTERNAL LOCATION only to administrators such as data engineers, and design the platform so that general analysts reach data only through Managed Tables.
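
The privilege split in item 4 might be expressed with grants like these. The group names (analysts, data_engineers), the sales.orders table, and the my_bucket_data external location are all hypothetical; adjust them to your own account groups and objects.

```sql
-- Analysts: read access through the catalog hierarchy only.
GRANT USE CATALOG ON CATALOG prod_catalog TO `analysts`;
GRANT USE SCHEMA ON SCHEMA prod_catalog.sales TO `analysts`;
GRANT SELECT ON TABLE prod_catalog.sales.orders TO `analysts`;

-- Data engineers: direct access to the external location.
GRANT READ FILES, CREATE EXTERNAL TABLE
ON EXTERNAL LOCATION my_bucket_data TO `data_engineers`;

-- Review who can touch the external location.
SHOW GRANTS ON EXTERNAL LOCATION my_bucket_data;
```

Analysts never receive a privilege on the external location itself, so they cannot bypass table-level access control by reading the files directly.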