🐘 Cluster Management and Data Serving

okorion · October 29, 2025

1️⃣ The Core Structure of Hadoop Cluster Management

Hadoop is not a single server but a distributed cluster in which many nodes work in parallel.
To operate it reliably, a range of resource-management, coordination, automation, and streaming components have to work together.

📊 Operational Layers

Layer               | Components    | Role
Resource management | YARN, Mesos   | CPU/memory scheduling and task distribution
Coordination        | ZooKeeper     | Cluster state management, leader election
Workflow automation | Oozie         | Batch jobs, dependencies, and schedules
Analysis interfaces | Zeppelin, Hue | SQL/notebook and visualization environments
Data pipelines      | Kafka, Flume  | Real-time data stream collection and delivery

2️⃣ YARN: The Core Manager of Cluster Resources

YARN (Yet Another Resource Negotiator) is Hadoop's resource-management framework.
Spark, MapReduce, Hive, and Tez jobs all run on top of YARN.

📘 YARN Architecture at a Glance

Component              | Role
ResourceManager (RM)   | Cluster-wide resource scheduling and queue management
NodeManager (NM)       | Monitors CPU/memory on each node and runs containers
ApplicationMaster (AM) | Manages execution and recovery of a single application
Container              | Logical resource unit in which a task actually runs

💡 How a Job Runs
1️⃣ A client submits a job
2️⃣ The RM allocates resources and launches an ApplicationMaster
3️⃣ The AM distributes tasks (containers) across the NodeManagers
4️⃣ On completion, the result is reported back to the RM

📈 Operational Notes

  • Web UI: http://localhost:8088
  • Queue- and priority-based scheduling (Capacity/Fair Scheduler)
  • Spark, Hive, Tez, and MR all run on YARN (a quick CLI sketch follows below)
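
As a rough illustration of the flow above, here is a minimal sketch using the standard YARN CLI. It assumes a working Hadoop installation; the examples-jar path varies by distribution and is only illustrative.

# Submit a sample MapReduce job (jar path differs per distribution)
yarn jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 10 100

# List the applications currently tracked by the ResourceManager
yarn application -list

# List the NodeManagers and their resource usage
yarn node -list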

3️⃣ Tez: The DAG Engine That Speeds Up Hive and Pig

Apache Tez is a DAG (Directed Acyclic Graph)-based execution engine that runs on YARN.
It optimizes the internal execution of Hive and Pig so that jobs can run tens of times faster than on plain MapReduce.

📊 What Tez Does

  • Merges multi-stage MapReduce jobs into a single DAG
  • Eliminates unnecessary intermediate file I/O
  • In-memory shuffle and DAG-level parallelization

💡 How to Enable It

  • In Hive, run set hive.execution.engine=tez; (see the sketch below)
  • Ambari → Hive service → Config → Execution Engine: select Tez
  • Inspect the DAG execution flow in the Tez View UI
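
A minimal sketch of switching engines for a single session from the command line, assuming a local Hive client and the movielens.ratings table used later in this post (both are assumptions about the environment):

# Run one aggregate query on the Tez engine for this session only
hive -e "SET hive.execution.engine=tez;
         SELECT movieId, AVG(rating) AS avg_rating
         FROM movielens.ratings
         GROUP BY movieId
         ORDER BY avg_rating DESC
         LIMIT 10;"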

📈 Example of the Performance Gain

Engine    | Processing time (example) | Improvement
MapReduce | 120 s                     | baseline
Tez       | 15 s                      | ~8x faster

4️⃣ Mesos: A General-Purpose Cluster Resource Manager

Apache Mesos is a platform that manages Hadoop, Spark, Kafka, and other frameworks
as a single shared pool of cluster resources.

📘 Key Characteristics

  • Similar to YARN, but supports services beyond the Hadoop ecosystem
  • Runs diverse workloads such as Docker, Spark, and Elasticsearch
  • The Mesos master offers resources to each framework's scheduler, which accepts or declines them (two-level scheduling)

📎 Comparison at a Glance

Aspect          | YARN                      | Mesos
Primary use     | Hadoop ecosystem only     | General-purpose cluster management
Deployment unit | ApplicationMaster-centric | Framework-centric
Integrations    | MapReduce, Spark, Tez     | Spark, Kafka, Docker, etc.

💡 Example Uses

  • Run Spark Streaming + Kafka + Hadoop together on a single cluster
  • Maximize resource utilization

5️⃣ ZooKeeper: The Cluster's "Brain"

Apache ZooKeeper handles synchronization, leader election, and configuration management across servers in a distributed environment.
Many Hadoop components, including HBase, Kafka, and Oozie, depend on ZooKeeper.

📘 Core Features

Feature                  | Description
Leader Election          | Automatically elects a new leader when the master fails
Configuration Management | Stores node state and metadata
Synchronization          | Controls concurrency across nodes
Watch Mechanism          | Event-driven monitoring of state changes
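
A quick way to get a feel for znodes and watches is the bundled zkCli.sh shell. A minimal sketch, assuming a ZooKeeper server on localhost:2181 and the illustrative path /demo; the watch syntax shown is the ZooKeeper 3.4 style (newer releases use get -w /demo):

zkCli.sh -server localhost:2181
# Inside the shell:
create /demo "v1"      # store a small piece of configuration as a znode
get /demo watch        # read it and register a one-shot watch
set /demo "v2"         # updating the znode fires the watch event on the watcher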

📎 Hands-On: Simulating a Master Failure
1️⃣ Run a ZooKeeper ensemble (3 or more nodes)
2️⃣ Forcibly stop the leader node
3️⃣ Confirm that a new leader is elected automatically within a second or two (a command-line sketch follows below)
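
A minimal sketch of this simulation, assuming a three-node ensemble on hosts zk1, zk2, and zk3 (illustrative names) with ZooKeeper's standard scripts installed on each:

# Check each node's role: one reports "Mode: leader", the others "Mode: follower"
for h in zk1 zk2 zk3; do ssh $h "zkServer.sh status"; done

# Stop ZooKeeper on the current leader (assumed here to be zk2)
ssh zk2 "zkServer.sh stop"

# Re-check: within moments one of the remaining nodes reports "Mode: leader"
for h in zk1 zk3; do ssh $h "zkServer.sh status"; done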

📈 Outcome

  • The distributed system avoids a single point of failure (SPOF)
  • Kafka, HBase, and Oozie all rely on ZooKeeper to operate

6️⃣ Oozie: The Workflow Automation Engine

Apache Oozie is a workflow manager that runs Hadoop jobs (Hive, Pig, Sqoop, Spark, Shell, and so on)
automatically based on time, dependencies, and events.

📘 Oozie Structure

Component    | Description
Coordinator  | Periodic execution (e.g., every day at midnight)
Workflow     | XML-based job definition (ordering between actions)
Action Node  | Execution unit for Hive/Pig/Shell/MapReduce
Control Node | Flow control: Start, End, Kill, Decision, etc.

💻 A Simple Workflow Example

<workflow-app name="movielens_etl" xmlns="uri:oozie:workflow:0.5">
  <global>
    <!-- Cluster endpoints are supplied via job.properties -->
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
  </global>
  <start to="hive-load"/>
  <action name="hive-load">
    <hive xmlns="uri:oozie:hive-action:0.5">
      <script>load_movielens.hql</script>
    </hive>
    <ok to="spark-train"/>
    <error to="fail"/>
  </action>
  <action name="spark-train">
    <spark xmlns="uri:oozie:spark-action:0.2">
      <master>yarn-cluster</master>
      <name>ALSModelTrain</name>
      <class>com.recommender.TrainALS</class>
      <!-- application jar in HDFS (placeholder path) -->
      <jar>${nameNode}/apps/recommender/recommender.jar</jar>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail"><message>Workflow failed</message></kill>
  <end name="end"/>
</workflow-app>
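
To actually run a workflow like this, you upload the definition to HDFS and submit it with the Oozie CLI. A rough sketch, assuming an Oozie server at localhost:11000 and illustrative HDFS paths and cluster endpoints:

# Upload the workflow definition and the Hive script to HDFS
hdfs dfs -mkdir -p /user/okorion/apps/movielens_etl
hdfs dfs -put workflow.xml load_movielens.hql /user/okorion/apps/movielens_etl/

# job.properties points Oozie at the cluster and the workflow directory
cat > job.properties <<'EOF'
nameNode=hdfs://localhost:8020
jobTracker=localhost:8032
oozie.wf.application.path=${nameNode}/user/okorion/apps/movielens_etl
EOF

# Submit and start the workflow
oozie job -oozie http://localhost:11000/oozie -config job.properties -run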

💡 Use Cases

  • Automating periodic ETL (Hive → Spark → Export)
  • Ambari provides a UI for the Oozie service

7️⃣ Zeppelin & Hue: Data Analysis Interfaces

🪶 Zeppelin

  • Notebook-based interactive data analysis tool
  • Supports many interpreters: Python, Spark, Hive, JDBC, and more
  • Visualization and live code execution

💻 Analyzing Movie Ratings in Zeppelin

%hive
SELECT movieId, AVG(rating) AS avg_rating
FROM movielens.ratings
GROUP BY movieId
ORDER BY avg_rating DESC
LIMIT 10;

  • Results can be visualized immediately as a table, bar chart, pie chart, and so on

🌈 Hue

  • Integrated web UI portal for Hadoop
  • HDFS browsing, Hive queries, Spark job management, and workflow visualization
  • Built for collaboration between data engineers and analysts

📊 Use Cases

  • Browsing data
  • Running Hive queries in the SQL Editor
  • Monitoring Oozie workflows

8️⃣ Kafka: The Streaming Data Hub

Apache Kafka is a system that handles real-time logs and events as a distributed message queue.
Large numbers of data producers and consumers communicate through Kafka topics.

📘 Kafka Components

Component | Description
Producer  | Creates messages (web logs, IoT devices, applications, etc.)
Broker    | Stores and delivers messages (manages distributed partitions)
Consumer  | Subscribes to and processes messages
ZooKeeper | Manages broker metadata

💡 Example Pipelines

  • Web logs → Kafka → Spark Streaming → stored in Hive
  • IoT sensors → Kafka → Flink → Cassandra

📎 Kafka Command Examples

# Create a topic
kafka-topics.sh --create --topic weblogs --bootstrap-server localhost:9092

# Publish messages
kafka-console-producer.sh --topic weblogs --bootstrap-server localhost:9092

# Consume messages
kafka-console-consumer.sh --topic weblogs --bootstrap-server localhost:9092 --from-beginning
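
Once messages are flowing, it helps to inspect how the topic is partitioned and how consumers track their offsets. A brief sketch using the standard Kafka tooling; the consumer-group name weblog-readers is illustrative:

# Show partition and replication details for the topic
kafka-topics.sh --describe --topic weblogs --bootstrap-server localhost:9092

# Consume as part of a named consumer group
kafka-console-consumer.sh --topic weblogs --group weblog-readers --bootstrap-server localhost:9092

# Inspect the group's committed offsets and lag
kafka-consumer-groups.sh --describe --group weblog-readers --bootstrap-server localhost:9092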

9️⃣ Flume: Data Collection and Delivery

Apache Flume ships data generated in log files or directories
to HDFS, HBase, Kafka, and other destinations in near real time.

📘 Flume Components

Component | Role
Source    | Collects input (log files, TCP, syslog, etc.)
Channel   | Buffers events (memory- or file-backed)
Sink      | Output destination (HDFS, Kafka, etc.)

💡 Example Configuration (Flume → HDFS)

# Name the agent's source, sink, and channel
agent.sources = src
agent.sinks = sink
agent.channels = ch

# Source: watch a spool directory for newly completed log files
agent.sources.src.type = spooldir
agent.sources.src.spoolDir = /var/logs/web

# Channel: buffer events in memory between source and sink
agent.channels.ch.type = memory

# Sink: write the events into HDFS
agent.sinks.sink.type = hdfs
agent.sinks.sink.hdfs.path = /data/weblogs

# Wire the source and the sink to the channel
agent.sources.src.channels = ch
agent.sinks.sink.channel = ch

Flume watches the directory and automatically writes new files into HDFS as they appear.
Combined with Kafka, it can form a fully real-time ingestion pipeline.
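
A minimal way to try the configuration above, assuming it is saved as flume.conf and that Flume and an HDFS client are available on the machine (file names and paths are illustrative):

# Start the agent defined above (its name in the config file is "agent")
flume-ng agent --conf-file flume.conf --name agent -Dflume.root.logger=INFO,console

# In another terminal, drop a log file into the watched directory...
cp access.log /var/logs/web/

# ...and confirm that Flume has written it into HDFS
hdfs dfs -ls /data/weblogs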


🔟 Summary: A Cluster-Based Data Serving Architecture

📊 End-to-End Data Pipeline at a Glance

 [Flume/Kafka]  →  [YARN + HDFS/HBase]  →  [Hive/Tez/Spark]  →  [Oozie Scheduler]
        ↓                                                              ↓
 (real-time log ingestion)                              (results served via Zeppelin/Hue)

Stage        | Role                            | Technologies
Ingestion    | Collect logs and events         | Flume, Kafka
Storage      | Distributed files and databases | HDFS, HBase
Processing   | Distributed computation         | Spark, Hive, Tez
Management   | Resource scheduling             | YARN, Mesos
Coordination | Failover and state management   | ZooKeeper
Automation   | Workflow control                | Oozie
Serving      | Analytics and visualization     | Zeppelin, Hue

💬 Conclusion

This installment covered how a Hadoop cluster grows beyond simple storage into a complete
data-pipeline ecosystem spanning data ingestion → distributed processing → automation → visualization.

YARN manages all of the resources,
Tez accelerates the computation,
ZooKeeper keeps the cluster stable,
and Oozie automates the workflows.
Meanwhile, Kafka and Flume keep feeding in data.

Once you understand this structure, you can design the operation of a large-scale data system
and the control of its data flows in an integrated, end-to-end way.

๋‹ค์Œ ํŽธ์—์„œ๋Š” ์ด ๋ชจ๋“  ๊ตฌ์„ฑ์š”์†Œ๋ฅผ ํ™œ์šฉํ•ด
์‹ค์‹œ๊ฐ„ ์ŠคํŠธ๋ฆฌ๋ฐ ๋ถ„์„ ๋ฐ Hadoop ์•„ํ‚คํ…์ฒ˜ ์„ค๊ณ„๋ฅผ ๋‹ค๋ฃฌ๋‹ค.
