GPT vs Gemini vs Claude를 구조 레벨로 더 깊게 (block diagram + pseudo code) 까지 ...

Mujung Kim·2026년 5월 5일

LLM + RAG 시스템

목록 보기

9/11

1. 🟦 GPT (ChatGPT / GPT-4o / GPT-5 class)

🔧 Core idea

Dense (or partially sparse) Transformer
Strong tool + agent orchestration layer
Iterative reasoning loop (hidden chain-of-thought)

🧱 Block Diagram

User Input
   ↓
Tokenizer
   ↓
Embedding + Positional Encoding
   ↓
┌─────────────────────────────┐
│  Transformer Stack (N)      │
│  - Self Attention           │
│  - MLP (Dense / MoE hybrid) │
└─────────────────────────────┘
   ↓
Logits → Sampling
   ↓
┌─────────────────────────────┐
│ Agent Layer (critical)      │
│ - Tool calling              │
│ - Function execution        │
│ - Memory (short-term)       │
│ - Self-reflection loop      │
└─────────────────────────────┘
   ↓
Final Output

⚙️ Pseudo Code (simplified)

def GPT_Inference(input_text):

    tokens = tokenize(input_text)
    x = embed(tokens)

    # Transformer forward
    for layer in transformer_layers:
        x = layer.self_attention(x)
        x = layer.mlp(x)

    logits = lm_head(x)

    # sampling
    output = sample(logits)

    # --- Agent loop ---
    while needs_tool(output):
        tool_result = call_tool(output)

        # re-inject context
        tokens = tokens + tokenize(tool_result)
        x = embed(tokens)

        for layer in transformer_layers:
            x = layer(x)

        output = sample(lm_head(x))

    return output

🧠 Key structural 특징

Dense 중심 + 일부 MoE 가능
Agent loop가 “외부 orchestration” 느낌
Tool use는 후처리 루프에서 발생

👉 핵심:

LLM + Agent wrapper 구조

2. 🟨 Gemini (Google)

🔧 Core idea

Sparse MoE Transformer (native)
멀티모달 = 모델 내부에서 통합
Tool / planning이 모델 내부에 더 깊게 결합

🧱 Block Diagram

Multi-Modal Input (Text / Image / Audio)
        ↓
Unified Tokenizer
        ↓
Modality Encoder (Vision / Audio shared space)
        ↓
┌──────────────────────────────────────┐
│ Sparse Transformer (MoE)             │
│                                      │
│  Router → Expert 선택                 │
│           ↓                          │
│     Expert FFN 실행 (Top-k)           │
│                                      │
│  + Cross-modal Attention             │
└──────────────────────────────────────┘
        ↓
Planner / Tool Reasoner (내장)
        ↓
Tool Execution (API / Search / Code)
        ↓
Response Decoder

⚙️ Pseudo Code

def Gemini_Inference(multimodal_input):

    tokens = multimodal_tokenize(multimodal_input)

    x = embed(tokens)

    for layer in moe_transformer_layers:

        # Router decides experts
        expert_ids = router(x)

        # Sparse activation
        expert_outputs = []
        for e in expert_ids:
            expert_outputs.append(experts[e](x))

        x = combine(expert_outputs)

        x = self_attention(x)
        x = cross_modal_attention(x)

    # --- Built-in planning ---
    plan = internal_planner(x)

    if plan.requires_tool:
        tool_result = execute_tool(plan)

        x = integrate(x, tool_result)

    return decode(x)

🧠 Key structural 특징

MoE = 기본 구조 (not optional)
멀티모달 attention이 core
Tool/Planning이 모델 내부에 있음

👉 핵심:

LLM + Planner + Tools = 하나의 모델

3. 🟩 Claude (Anthropic)

🔧 Core idea

Dense Transformer 기반
Constitutional AI (alignment layer가 핵심 구조)
매우 강한 long-context reasoning

🧱 Block Diagram

User Input
   ↓
Tokenizer
   ↓
Embedding
   ↓
┌─────────────────────────────┐
│ Transformer Stack           │
│ (Long Context optimized)    │
└─────────────────────────────┘
   ↓
Initial Output
   ↓
┌─────────────────────────────┐
│ Alignment Layer             │
│ (Constitutional AI)         │
│ - Rule checking             │
│ - Self critique             │
│ - Revision loop             │
└─────────────────────────────┘
   ↓
Final Output

⚙️ Pseudo Code

def Claude_Inference(input_text):

    tokens = tokenize(input_text)
    x = embed(tokens)

    for layer in transformer_layers:
        x = layer(x)

    draft = decode(x)

    # --- Constitutional AI loop ---
    critique = evaluate_with_rules(draft)

    if critique.has_issues:
        revised = revise(draft, critique)
        return revised

    return draft

🧠 Key structural 특징

Dense Transformer (안정성 중심)
Alignment loop가 구조적으로 포함
Tool보다는 “reasoning integrity”에 집중

👉 핵심:

LLM + Self-critique system

4. 🔥 구조 비교 핵심 요약

요소	GPT	Gemini	Claude
Transformer	Dense 중심	MoE 중심	Dense
MoE	일부	핵심	거의 없음
Multimodal	통합 (후기)	native	제한적
Tool 사용	외부 loop	내부 통합	제한적
Agent 구조	강함	매우 강함	약함
Alignment	RLHF + system	RLHF + planning	Constitutional AI
Long Context	강함	매우 강함	매우 강함

5. 한 줄 구조 차이

GPT
→ Transformer + Agent Wrapper
Gemini
→ MoE Transformer + Built-in Planner + Multimodal
Claude
→ Transformer + Alignment Loop

6. 엔지니어 관점 핵심 통찰

중요한 포인트 하나 짚고 가면:

👉 “모델 성능 차이”는 이제
Transformer 구조 차이 때문이 아니라

1. MoE routing quality
2. Tool integration depth
3. Inference orchestration
4. Alignment loop sophistication

에서 갈립니다.

Mujung Kim

천천히 고민하면서 걷는 개발자

이전 포스트

Vectorstore 구현

다음 포스트

GPT vs Gemini vs Claude를 구조 레벨로 더 깊게 (block diagram + pseudo code) 까지 ...

LLM + RAG 시스템

1. 🟦 GPT (ChatGPT / GPT-4o / GPT-5 class)

🧱 Block Diagram

🧠 Key structural 특징

2. 🟨 Gemini (Google)

🔧 Core idea

🧱 Block Diagram

⚙️ Pseudo Code

🧠 Key structural 특징

3. 🟩 Claude (Anthropic)

🔧 Core idea

🧱 Block Diagram

⚙️ Pseudo Code

🧠 Key structural 특징

4. 🔥 구조 비교 핵심 요약

5. 한 줄 구조 차이

6. 엔지니어 관점 핵심 통찰

Vectorstore 구현

LLM Architecture

0개의 댓글