GPT vs Gemini vs Claude를 구조 레벨로 더 깊게 (block diagram + pseudo code) 까지 ...

Mujung Kim·2026년 5월 5일

LLM + RAG 시스템

목록 보기
9/11

1. 🟦 GPT (ChatGPT / GPT-4o / GPT-5 class)

🔧 Core idea

  • Dense (or partially sparse) Transformer
  • Strong tool + agent orchestration layer
  • Iterative reasoning loop (hidden chain-of-thought)

🧱 Block Diagram

User Input
   ↓
Tokenizer
   ↓
Embedding + Positional Encoding
   ↓
┌─────────────────────────────┐
│  Transformer Stack (N)      │
│  - Self Attention           │
│  - MLP (Dense / MoE hybrid) │
└─────────────────────────────┘
   ↓
Logits → Sampling
   ↓
┌─────────────────────────────┐
│ Agent Layer (critical)      │
│ - Tool calling              │
│ - Function execution        │
│ - Memory (short-term)       │
│ - Self-reflection loop      │
└─────────────────────────────┘
   ↓
Final Output

⚙️ Pseudo Code (simplified)

def GPT_Inference(input_text):

    tokens = tokenize(input_text)
    x = embed(tokens)

    # Transformer forward
    for layer in transformer_layers:
        x = layer.self_attention(x)
        x = layer.mlp(x)

    logits = lm_head(x)

    # sampling
    output = sample(logits)

    # --- Agent loop ---
    while needs_tool(output):
        tool_result = call_tool(output)

        # re-inject context
        tokens = tokens + tokenize(tool_result)
        x = embed(tokens)

        for layer in transformer_layers:
            x = layer(x)

        output = sample(lm_head(x))

    return output

🧠 Key structural 특징

  • Dense 중심 + 일부 MoE 가능
  • Agent loop가 “외부 orchestration” 느낌
  • Tool use는 후처리 루프에서 발생

👉 핵심:

  • LLM + Agent wrapper 구조

2. 🟨 Gemini (Google)

🔧 Core idea

  • Sparse MoE Transformer (native)
  • 멀티모달 = 모델 내부에서 통합
  • Tool / planning이 모델 내부에 더 깊게 결합

🧱 Block Diagram

Multi-Modal Input (Text / Image / Audio)
        ↓
Unified Tokenizer
        ↓
Modality Encoder (Vision / Audio shared space)
        ↓
┌──────────────────────────────────────┐
│ Sparse Transformer (MoE)             │
│                                      │
│  Router → Expert 선택                 │
│           ↓                          │
│     Expert FFN 실행 (Top-k)           │
│                                      │
│  + Cross-modal Attention             │
└──────────────────────────────────────┘
        ↓
Planner / Tool Reasoner (내장)
        ↓
Tool Execution (API / Search / Code)
        ↓
Response Decoder

⚙️ Pseudo Code

def Gemini_Inference(multimodal_input):

    tokens = multimodal_tokenize(multimodal_input)

    x = embed(tokens)

    for layer in moe_transformer_layers:

        # Router decides experts
        expert_ids = router(x)

        # Sparse activation
        expert_outputs = []
        for e in expert_ids:
            expert_outputs.append(experts[e](x))

        x = combine(expert_outputs)

        x = self_attention(x)
        x = cross_modal_attention(x)

    # --- Built-in planning ---
    plan = internal_planner(x)

    if plan.requires_tool:
        tool_result = execute_tool(plan)

        x = integrate(x, tool_result)

    return decode(x)

🧠 Key structural 특징

  • MoE = 기본 구조 (not optional)
  • 멀티모달 attention이 core
  • Tool/Planning이 모델 내부에 있음

👉 핵심:

LLM + Planner + Tools = 하나의 모델


3. 🟩 Claude (Anthropic)

🔧 Core idea

  • Dense Transformer 기반
  • Constitutional AI (alignment layer가 핵심 구조)
  • 매우 강한 long-context reasoning

🧱 Block Diagram

User Input
   ↓
Tokenizer
   ↓
Embedding
   ↓
┌─────────────────────────────┐
│ Transformer Stack           │
│ (Long Context optimized)    │
└─────────────────────────────┘
   ↓
Initial Output
   ↓
┌─────────────────────────────┐
│ Alignment Layer             │
│ (Constitutional AI)         │
│ - Rule checking             │
│ - Self critique             │
│ - Revision loop             │
└─────────────────────────────┘
   ↓
Final Output

⚙️ Pseudo Code

def Claude_Inference(input_text):

    tokens = tokenize(input_text)
    x = embed(tokens)

    for layer in transformer_layers:
        x = layer(x)

    draft = decode(x)

    # --- Constitutional AI loop ---
    critique = evaluate_with_rules(draft)

    if critique.has_issues:
        revised = revise(draft, critique)
        return revised

    return draft

🧠 Key structural 특징

  • Dense Transformer (안정성 중심)
  • Alignment loop가 구조적으로 포함
  • Tool보다는 “reasoning integrity”에 집중

👉 핵심:

LLM + Self-critique system


4. 🔥 구조 비교 핵심 요약

요소GPTGeminiClaude
TransformerDense 중심MoE 중심Dense
MoE일부핵심거의 없음
Multimodal통합 (후기)native제한적
Tool 사용외부 loop내부 통합제한적
Agent 구조강함매우 강함약함
AlignmentRLHF + systemRLHF + planningConstitutional AI
Long Context강함매우 강함매우 강함

5. 한 줄 구조 차이

  • GPT
    → Transformer + Agent Wrapper
  • Gemini
    → MoE Transformer + Built-in Planner + Multimodal
  • Claude
    → Transformer + Alignment Loop

6. 엔지니어 관점 핵심 통찰

중요한 포인트 하나 짚고 가면:

👉 “모델 성능 차이”는 이제
Transformer 구조 차이 때문이 아니라

1. MoE routing quality
2. Tool integration depth
3. Inference orchestration
4. Alignment loop sophistication

에서 갈립니다.

profile
천천히 고민하면서 걷는 개발자

0개의 댓글