# Claw Code Deep Dive #4: Conversation Runtime and Session Management

조현상 · April 2, 2026

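An AI agent does not work through a single exchange; it works through repetition. This post dissects the engine that drives that repetition.

Introduction: what is the Agentic Loop?

An ordinary conversation with ChatGPT or Claude is simple. The user sends a message and the AI responds. One round trip is all there is.

An AI agent is different. On receiving a request, the agent calls the tools it needs, inspects the results, and decides on its own whether to call another tool, repeating as necessary. Reading a file, editing code, running tests, and checking the outcome all happen inside a single "turn".

This repetition is called the Agentic Loop. Claw Code's ConversationRuntime implements this loop and sits at the core of the runtime crate. This post analyzes, at the code level, how the loop works, how sessions are stored, and how the session is compacted when the token budget is exceeded.

1. Session structure: the data model of a conversation

Everything starts with Session. Every message in the conversation accumulates in this struct.

Core type hierarchy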

#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq, Eq)]
#[serde(rename_all = "snake_case")]
pub enum MessageRole {
    System,
    User,
    Assistant,
    Tool,
}

#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum ContentBlock {
    Text { text: String },
    ToolUse { id: String, name: String, input: String },
    ToolResult {
        tool_use_id: String,
        tool_name: String,
        output: String,
        is_error: bool,
    },
}

pub struct ConversationMessage {
    pub role: MessageRole,
    pub blocks: Vec<ContentBlock>,
    pub usage: Option<TokenUsage>,
}

pub struct Session {
    pub version: u32,
    pub messages: Vec<ConversationMessage>,
}

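Note that this type hierarchy mirrors the message structure of the Anthropic API almost one-to-one. The three ContentBlock variants (Text, ToolUse, ToolResult) correspond to the API's text, tool_use, and tool_result content block types.

ConversationMessage holds blocks: Vec<ContentBlock> because a single message can mix several kinds of content. A typical example is an assistant message that carries the text "I'll edit the code" together with an edit_file tool call: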

{
  "role": "assistant",
  "blocks": [
    { "type": "text", "text": "I'll edit the file." },
    { "type": "tool_use", "id": "toolu_01...", "name": "edit_file",
      "input": "{\"path\": \"src/main.rs\", ...}" }
  ],
  "usage": { "input_tokens": 1200, "output_tokens": 45 }
}

usage: Option<TokenUsage> is an Option because only assistant messages carry usage information. User messages and tool results have no token usage.
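To make the mixed-content case concrete, here is a minimal sketch that rebuilds the assistant message from the JSON above. It uses simplified copies of the types (the serde derives and the usage field are left out), and `example_assistant_message` and `tool_use_names` are hypothetical helpers for illustration, not part of Claw Code.

```rust
// Simplified copies of the article's types; serde derives and `usage` omitted.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum MessageRole { System, User, Assistant, Tool }

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ContentBlock {
    Text { text: String },
    ToolUse { id: String, name: String, input: String },
    ToolResult { tool_use_id: String, tool_name: String, output: String, is_error: bool },
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ConversationMessage {
    pub role: MessageRole,
    pub blocks: Vec<ContentBlock>,
}

/// Builds an assistant message that mixes prose with one tool call.
pub fn example_assistant_message() -> ConversationMessage {
    ConversationMessage {
        role: MessageRole::Assistant,
        blocks: vec![
            ContentBlock::Text { text: "I'll edit the file.".to_string() },
            ContentBlock::ToolUse {
                id: "toolu_example".to_string(), // placeholder id
                name: "edit_file".to_string(),
                // The tool input is kept as a serialized JSON string.
                input: r#"{"path": "src/main.rs"}"#.to_string(),
            },
        ],
    }
}

/// Hypothetical helper: names of the tools requested in a message.
pub fn tool_use_names(message: &ConversationMessage) -> Vec<&str> {
    message
        .blocks
        .iter()
        .filter_map(|block| match block {
            ContentBlock::ToolUse { name, .. } => Some(name.as_str()),
            _ => None,
        })
        .collect()
}
```

Calling `tool_use_names(&example_assistant_message())` yields a single entry, `edit_file`, while the text block stays alongside it in the same message.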

Session persistence

pub fn save_to_path(&self, path: impl AsRef<Path>) -> Result<(), SessionError> {
    fs::write(path, self.to_json().render())?;
    Ok(())
}

pub fn load_from_path(path: impl AsRef<Path>) -> Result<Self, SessionError> {
    let contents = fs::read_to_string(path)?;
    Self::from_json(&JsonValue::parse(&contents)?)
}

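Sessions are stored as JSON in the .claude/sessions/ directory. The version: u32 field is a schema version that makes migration possible if the session format changes in the future.

Message construction helpers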

impl ConversationMessage {
    pub fn user_text(text: impl Into<String>) -> Self {
        Self {
            role: MessageRole::User,
            blocks: vec![ContentBlock::Text { text: text.into() }],
            usage: None,
        }
    }

    pub fn tool_result(
        tool_use_id: impl Into<String>,
        tool_name: impl Into<String>,
        output: impl Into<String>,
        is_error: bool,
    ) -> Self {
        Self {
            role: MessageRole::Tool,
            blocks: vec![ContentBlock::ToolResult {
                tool_use_id: tool_use_id.into(),
                tool_name: tool_name.into(),
                output: output.into(),
                is_error,
            }],
            usage: None,
        }
    }
}

Accepting impl Into<String> is a Rust idiom that takes both &str and String. Callers can pass a literal directly, without a .to_string() conversion.
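The idiom is easy to try in isolation. The sketch below uses a hypothetical free function `user_text` that only returns the converted string, to show that both argument types are accepted.

```rust
/// Accepts anything convertible into a String, as the helpers above do.
pub fn user_text(text: impl Into<String>) -> String {
    text.into()
}

pub fn demo() -> (String, String) {
    let from_literal = user_text("hello"); // &str: allocates a new String
    let from_owned = user_text(String::from("hello")); // String: moved in as-is
    (from_literal, from_owned)
}
```

Both calls compile without an explicit conversion at the call site, and an owned String is passed through without an extra allocation.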

2. ConversationRuntime: anatomy of the Agentic Loop

Struct definition

pub struct ConversationRuntime<C, T> {
    session: Session,
    api_client: C,
    tool_executor: T,
    permission_policy: PermissionPolicy,
    system_prompt: Vec<String>,
    max_iterations: usize,
    usage_tracker: UsageTracker,
    hook_runner: HookRunner,
}

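As covered in part 2, the generic parameters C: ApiClient and T: ToolExecutor are the heart of this struct. This part focuses on the struct's behavior, namely the run_turn() method.

run_turn(): the full lifecycle of one turn

run_turn() is a method of roughly 150 lines that manages one entire turn of the AI agent. Let's break the flow down step by step.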

pub fn run_turn(
    &mut self,
    user_input: impl Into<String>,
    mut prompter: Option<&mut dyn PermissionPrompter>,
) -> Result<TurnSummary, RuntimeError>

Step 1: record the user message

self.session.messages.push(
    ConversationMessage::user_text(user_input.into())
);

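The user's input is appended to the session. From this point on, the message is included in every subsequent API call.

Step 2: enter the main loop

iteration = 0
LOOP:
  iteration += 1
  if iteration > max_iterations → error

max_iterations is the safeguard against an infinite loop. The cycle in which the AI calls a tool, looks at the result, and calls another tool could otherwise repeat without end. The default is usize::MAX (effectively unlimited), so production deployments should set a sensible limit.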

Step 3: API call and event assembly

let request = ApiRequest {
    system_prompt: self.system_prompt.clone(),
    messages: self.session.messages.clone(),
};
let events = self.api_client.stream(request)?;
let (assistant_message, usage) = build_assistant_message(events)?;

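The system prompt and the entire conversation history are sent to the API. This is the core cost structure of the Agentic Loop: with each iteration the message list grows and the input token count rises. That is why the session compaction described later is needed.

build_assistant_message() assembles the streaming events into a single message: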

fn build_assistant_message(events: Vec<AssistantEvent>)
    -> Result<(ConversationMessage, Option<TokenUsage>), RuntimeError> {
    let mut text = String::new();
    let mut blocks = Vec::new();
    let mut usage = None;

    for event in events {
        match event {
            AssistantEvent::TextDelta(delta) => text.push_str(&delta),
            AssistantEvent::ToolUse { id, name, input } => {
                flush_text_block(&mut text, &mut blocks);
                blocks.push(ContentBlock::ToolUse { id, name, input });
            }
            AssistantEvent::Usage(value) => usage = Some(value),
            AssistantEvent::MessageStop => { /* done */ }
        }
    }
    flush_text_block(&mut text, &mut blocks);
    // ...
}

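flush_text_block() is the interesting part. Text deltas accumulate until a ToolUse event arrives, at which point the text gathered so far is converted into a Text block. This handles interleaved sequences such as "text → tool call → text → tool call" correctly.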
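The body of flush_text_block() is not shown in the excerpt, so the following is only a sketch of how such a helper might behave, under the assumption that an empty buffer produces no block. `Block`, `Event`, and `assemble` are simplified stand-ins for the real types.

```rust
#[derive(Debug, PartialEq)]
pub enum Block {
    Text(String),
    ToolUse { name: String },
}

pub enum Event {
    TextDelta(String),
    ToolUse { name: String },
}

/// Assumed behavior: move buffered text into a Text block, skipping empty buffers.
fn flush_text_block(text: &mut String, blocks: &mut Vec<Block>) {
    if !text.is_empty() {
        // mem::take leaves an empty String behind, ready for the next deltas.
        blocks.push(Block::Text(std::mem::take(text)));
    }
}

pub fn assemble(events: Vec<Event>) -> Vec<Block> {
    let mut text = String::new();
    let mut blocks = Vec::new();
    for event in events {
        match event {
            Event::TextDelta(delta) => text.push_str(&delta),
            Event::ToolUse { name } => {
                flush_text_block(&mut text, &mut blocks);
                blocks.push(Block::ToolUse { name });
            }
        }
    }
    flush_text_block(&mut text, &mut blocks); // trailing text after the last tool call
    blocks
}
```

Feeding the deltas "Read" and "ing.", then a tool call, then "Done." produces three blocks in order: one merged text block, the tool call, and a final text block.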

Step 4: extract tool uses

let pending_tool_uses = assistant_message
    .blocks
    .iter()
    .filter_map(|block| match block {
        ContentBlock::ToolUse { id, name, input } =>
            Some((id.clone(), name.clone(), input.clone())),
        _ => None,
    })
    .collect::<Vec<_>>();

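Only the ToolUse blocks are extracted from the assistant message. If there are none, the loop ends. This is the key termination condition of the Agentic Loop: once the AI stops calling tools, the turn is over.

Step 5: the tool execution pipeline

A five-stage pipeline runs for each tool call:

Permission check → PreToolUse hook → tool execution → PostToolUse hook → record result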

for (tool_use_id, tool_name, tool_input) in &pending_tool_uses {
    // (a) Permission check
    let outcome = self.permission_policy.authorize(
        tool_name, tool_input, prompter.as_deref_mut()
    );
    if let PermissionOutcome::Deny { reason } = outcome {
        // On denial, record an error result
        self.session.messages.push(
            ConversationMessage::tool_result(tool_use_id, tool_name,
                format!("Permission denied: {reason}"), true)
        );
        continue;
    }

    // (b) PreToolUse hook
    let pre_hook = self.hook_runner.run_pre_tool_use(tool_name, tool_input);
    if pre_hook.is_denied() {
        // If the hook denies, skip tool execution
        continue;
    }

    // (c) Tool execution
    let result = self.tool_executor.execute(tool_name, tool_input);

    // (d) PostToolUse hook
    let post_hook = self.hook_runner.run_post_tool_use(
        tool_name, tool_input,
        result.as_deref().ok(), result.is_err()
    );

    // (e) Record the result
    let (output, is_error) = match result {
        Ok(output) => (output, false),
        Err(error) => (error.to_string(), true),
    };
    self.session.messages.push(
        ConversationMessage::tool_result(tool_use_id, tool_name, output, is_error)
    );
}

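What stands out in this pipeline's design is that a permission denial and a hook denial are handled differently. On a permission denial, a tool result with is_error: true is recorded in the session, so the AI learns that the tool could not run because permission was missing. For a hook denial, by contrast, the excerpt above only shows execution being skipped; what is reported back depends on the hook's message.

Step 6: continue or exit the loop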

if pending_tool_uses.is_empty() {
    break;  // no tool calls = end of turn
}
// Loop back to the top with the tool results now in the session
// → the next iteration sends the full conversation, including tool results, to the API

Full data-flow diagram

User input: "Fix the bug in src/main.rs"
    │
    ▼
Append User message to session
    │
    ▼
┌─── Iteration 1 ──────────────────────────────────────────┐
│ API call → assistant: "Let me read the file first"       │
│            + ToolUse(read_file, "src/main.rs")           │
│                                                          │
│ Permission check (ReadOnly ≤ ReadOnly) → approved        │
│ PreToolUse hook → allowed                                │
│ read_file runs → returns file contents                   │
│ PostToolUse hook → allowed                               │
│ ToolResult(file contents) appended to session            │
└──────────────────────────────────────────────────────────┘
    │ (a tool was used, so continue)
    ▼
┌─── Iteration 2 ──────────────────────────────────────────┐
│ API call → assistant: "Found the bug. Fixing it"         │
│            + ToolUse(edit_file, {...})                   │
│                                                          │
│ Permission check (WorkspaceWrite required) → approved    │
│ edit_file runs → edit applied                            │
│ ToolResult(edit result) appended to session              │
└──────────────────────────────────────────────────────────┘
    │ (a tool was used, so continue)
    ▼
┌─── Iteration 3 ──────────────────────────────────────────┐
│ API call → assistant: "Fix complete. [explanation...]"   │
│            (no tool calls)                               │
│                                                          │
│ pending_tool_uses.is_empty() → BREAK                     │
└──────────────────────────────────────────────────────────┘
    │
    ▼
TurnSummary returned (3 iterations, total usage)
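The three-iteration trace can be reproduced with a small scripted loop. This is a sketch only: `Reply`, the free function `run_turn`, and `three_step_script` are hypothetical stand-ins that replace the API client and tool executor with a fixed script, and permissions and hooks are left out.

```rust
/// One scripted model reply: some text plus zero or more requested tool names.
pub struct Reply {
    pub text: &'static str,
    pub tool_calls: Vec<&'static str>,
}

/// Runs the loop against a fixed script and returns (iterations, transcript).
pub fn run_turn(
    script: Vec<Reply>,
    max_iterations: usize,
) -> Result<(usize, Vec<String>), String> {
    let mut transcript = vec!["user: fix the bug".to_string()];
    let mut replies = script.into_iter();
    let mut iterations = 0;
    loop {
        iterations += 1;
        if iterations > max_iterations {
            return Err("max iterations exceeded".to_string());
        }
        // Stand-in for the API call: take the next scripted reply.
        let reply = replies.next().ok_or("script exhausted")?;
        transcript.push(format!("assistant: {}", reply.text));
        if reply.tool_calls.is_empty() {
            break; // no tool calls: the turn is over
        }
        for tool in reply.tool_calls {
            // Stand-in for tool execution; the result is fed back via the transcript.
            transcript.push(format!("tool: {tool} ok"));
        }
    }
    Ok((iterations, transcript))
}

pub fn three_step_script() -> Vec<Reply> {
    vec![
        Reply { text: "Let me read the file first", tool_calls: vec!["read_file"] },
        Reply { text: "Found the bug. Fixing it", tool_calls: vec!["edit_file"] },
        Reply { text: "Fix complete", tool_calls: vec![] },
    ]
}
```

With a limit of 10 the script finishes after three iterations; with a limit of 2 the same script returns an error, which is the role max_iterations plays as a safeguard.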

TurnSummary: the result of a turn

pub struct TurnSummary {
    pub assistant_messages: Vec<ConversationMessage>,
    pub tool_results: Vec<ConversationMessage>,
    pub iterations: usize,
    pub usage: TokenUsage,
}

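It contains every assistant message and tool result produced during the turn, the iteration count, and the cumulative token usage. The CLI uses this summary to show cost and token information in the status bar.

3. The hook system: the gateway to tool execution

How HookRunner works

Hooks are an extension mechanism that runs external commands as subprocesses before and after tool execution.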

pub struct HookRunner {
    config: RuntimeHookConfig,
}

impl HookRunner {
    pub fn run_pre_tool_use(&self, tool_name: &str, tool_input: &str)
        -> HookRunResult {
        self.run_commands(HookEvent::PreToolUse, ...)
    }

    pub fn run_post_tool_use(&self, tool_name: &str, tool_input: &str,
        tool_output: Option<&str>, is_error: bool)
        -> HookRunResult {
        self.run_commands(HookEvent::PostToolUse, ...)
    }
}

Hook execution protocol

Hook commands run as shell processes and receive a JSON payload on stdin:

let payload = json!({
    "hook_event_name": event.as_str(),    // "PreToolUse" or "PostToolUse"
    "tool_name": tool_name,
    "tool_input": parse_tool_input(tool_input),
    "tool_input_json": tool_input,
    "tool_output": tool_output,           // PostToolUse only
    "tool_result_is_error": is_error,
});

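Environment variables are set as well: HOOK_EVENT, HOOK_TOOL_NAME, HOOK_TOOL_INPUT, HOOK_TOOL_IS_ERROR. The dual interface (stdin JSON plus environment variables) lets a hook script pick whichever is more convenient.

Meaning of exit codes

Exit code	Meaning	Behavior
0	Allow	stdout is collected as a message
2	Deny	Tool execution is blocked; stdout is used as the denial reason
Other	Warning	Tool execution is allowed; a warning is logged

Using exit code 2 for "deny" keeps a deliberate denial distinct from exit code 1, which most programs return for a generic failure, so a hook script that merely crashes is treated as a warning rather than a block. This is a convention of the hook protocol rather than a general Unix rule; grep, for instance, exits with 1 when nothing matches and 2 on an error.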
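The exit-code table translates directly into a match expression. The sketch below uses a hypothetical `HookDecision` enum and `classify_exit_code` function; treating a missing exit code (for example, a process killed by a signal) as a warning is an assumption of this sketch, not something the source states.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum HookDecision {
    Allow,
    Deny,
    Warn,
}

/// Maps a hook's exit code to a decision, following the table above.
/// `None` models a process that ended without an exit code; mapping it
/// to Warn is an assumption of this sketch.
pub fn classify_exit_code(code: Option<i32>) -> HookDecision {
    match code {
        Some(0) => HookDecision::Allow,
        Some(2) => HookDecision::Deny,
        _ => HookDecision::Warn,
    }
}
```

The `Option<i32>` parameter matches what `std::process::ExitStatus::code()` returns, so the function could be fed the status of a spawned hook command directly.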

Platform-specific shell invocation

#[cfg(windows)]
// cmd /C <command>

#[cfg(not(windows))]
// sh -lc <command>

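The -l flag in sh -lc matters. It asks the shell to run as a login shell, so the user's profile is loaded and the PATH and environment settings defined there are available. This lets hook scripts use tools the user has installed.

4. Session compaction: managing the token budget

As a conversation grows, the messages accumulated in the session eventually exceed the context window. compact.rs addresses this with summary-based compaction.

Compaction settings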

pub struct CompactionConfig {
    pub preserve_recent_messages: usize,  // default: 4
    pub max_estimated_tokens: usize,      // default: 10,000
}

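Consider what preserve_recent_messages = 4 means. In the data model above, a typical tool-using cycle consists of [user message, assistant message with a tool call, tool result, assistant reply]. Preserving 4 messages therefore keeps roughly the most recent complete cycle.

Deciding when to compact: should_compact()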

pub fn should_compact(session: &Session, config: CompactionConfig) -> bool {
    let start = compacted_summary_prefix_len(session);
    let compactable = &session.messages[start..];

    compactable.len() > config.preserve_recent_messages
        && compactable.iter()
            .map(estimate_message_tokens)
            .sum::<usize>() >= config.max_estimated_tokens
}

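The two conditions are combined with AND: compaction happens only when (1) there are more messages than the number to preserve and (2) the estimated token count reaches the threshold (the comparison is >=, so hitting it exactly also counts). Even with many messages, there is no need to compact if the token count is low.

compacted_summary_prefix_len() checks whether an already-compacted summary message is present. On recompaction, the existing summary is skipped and only the newer messages are evaluated.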
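A few concrete inputs show how the two conditions interact. In this sketch the session is reduced to a slice of per-message token estimates (an assumption made to keep the example self-contained); the comparison logic is the same as in should_compact() above.

```rust
pub struct CompactionConfig {
    pub preserve_recent_messages: usize,
    pub max_estimated_tokens: usize,
}

/// `estimates` holds one token estimate per message eligible for compaction,
/// i.e. after any existing summary prefix has been skipped.
pub fn should_compact(estimates: &[usize], config: &CompactionConfig) -> bool {
    estimates.len() > config.preserve_recent_messages
        && estimates.iter().sum::<usize>() >= config.max_estimated_tokens
}
```

With the default settings (4 messages, 10,000 tokens), three messages of 5,000 tokens each do not trigger compaction because there are too few messages, eight messages of 100 tokens do not trigger it because the total is too small, and five messages of 2,000 tokens do trigger it because the total equals the threshold.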

Token estimation: divide by 4

fn estimate_message_tokens(message: &ConversationMessage) -> usize {
    message.blocks.iter()
        .map(|block| match block {
            ContentBlock::Text { text } => text.len() / 4 + 1,
            ContentBlock::ToolUse { name, input, .. } =>
                (name.len() + input.len()) / 4 + 1,
            ContentBlock::ToolResult { tool_name, output, .. } =>
                (tool_name.len() + output.len()) / 4 + 1,
        })
        .sum()
}

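The formula is a modest one: length / 4 + 1. Exact tokenization requires a model-specific tokenizer and has a cost, but an approximation of this quality is enough for deciding when to compact. It rests on the empirical observation that the average token in English text is about 4 characters long. Note that len() on a Rust string counts UTF-8 bytes rather than characters, so non-ASCII text such as Korean gets a higher estimate per character.

The + 1 ensures that even an empty block counts as at least 1 token.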
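The formula is simple enough to check by hand. The sketch below isolates the Text arm into a hypothetical function `estimate_text_tokens`.

```rust
/// Same formula as the Text arm above: UTF-8 byte length / 4, plus 1.
pub fn estimate_text_tokens(text: &str) -> usize {
    text.len() / 4 + 1
}
```

A 100-character ASCII string gives 100 / 4 + 1 = 26, an empty string gives 1, and the two-character string "한글" gives 6 / 4 + 1 = 2 because each of its characters occupies 3 bytes in UTF-8.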

compact_session(): the core compaction algorithm

pub fn compact_session(session: &Session, config: CompactionConfig)
    -> CompactionResult {
    if !should_compact(session, config) {
        return CompactionResult { /* unchanged */ };
    }

    // Check whether an earlier compaction summary exists
    let existing_summary = session.messages.first()
        .and_then(extract_existing_compacted_summary);
    let compacted_prefix_len = usize::from(existing_summary.is_some());

    // Split into messages to preserve and messages to remove
    let keep_from = session.messages.len()
        .saturating_sub(config.preserve_recent_messages);
    let removed = &session.messages[compacted_prefix_len..keep_from];
    let preserved = session.messages[keep_from..].to_vec();

    // Build a summary of the removed messages
    let summary = merge_compact_summaries(
        existing_summary.as_deref(),
        &summarize_messages(removed)
    );

    // Assemble the compacted session
    let continuation = get_compact_continuation_message(&summary, true, !preserved.is_empty());
    let mut compacted_messages = vec![ConversationMessage {
        role: MessageRole::System,
        blocks: vec![ContentBlock::Text { text: continuation }],
        usage: None,
    }];
    compacted_messages.extend(preserved);

    CompactionResult {
        compacted_session: Session {
            version: session.version,
            messages: compacted_messages,
        },
        removed_message_count: removed.len(),
        // ...
    }
}

Visualizing the session structure after compaction:

Before compaction:
[msg1, msg2, msg3, msg4, msg5, msg6, msg7, msg8]
 ^^^^^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^
 messages removed        messages kept (last 4)

After compaction:
[System(summary), msg5, msg6, msg7, msg8]
 ^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^
 summary          originals kept
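The index arithmetic behind this picture can be exercised on plain slices. `split_for_compaction` is a hypothetical helper that mirrors the `compacted_prefix_len..keep_from` slicing in compact_session(); the `.max(prefix_len)` clamp is an addition of this sketch so that it cannot panic when called without the should_compact() check.

```rust
/// Returns (removed, preserved) for a message list.
/// `has_summary` says whether messages[0] is an earlier compaction summary.
pub fn split_for_compaction<T>(
    messages: &[T],
    has_summary: bool,
    preserve_recent: usize,
) -> (&[T], &[T]) {
    let prefix_len = usize::from(has_summary);
    let keep_from = messages
        .len()
        .saturating_sub(preserve_recent)
        .max(prefix_len); // clamp added in this sketch; the original relies on should_compact()
    (&messages[prefix_len..keep_from], &messages[keep_from..])
}
```

For eight messages with no earlier summary and four preserved, msg1 to msg4 are removed and msg5 to msg8 are kept. On a later pass over [summary, msg5, ..., msg10], the summary at index 0 is skipped, msg5 and msg6 are removed, and msg7 to msg10 are kept.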

summarize_messages(): building a structured summary

Extracting meaningful information from the removed messages is an elaborate process:

fn summarize_messages(messages: &[ConversationMessage]) -> String {
    // 1. Message statistics
    let user_count = messages.iter()
        .filter(|m| m.role == MessageRole::User).count();
    let assistant_count = messages.iter()
        .filter(|m| m.role == MessageRole::Assistant).count();

    // 2. Tools used (deduplicated)
    let mut tool_names: Vec<&str> = messages.iter()
        .flat_map(|m| m.blocks.iter())
        .filter_map(|block| match block {
            ContentBlock::ToolUse { name, .. } => Some(name.as_str()),
            ContentBlock::ToolResult { tool_name, .. } => Some(tool_name.as_str()),
            _ => None,
        })
        .collect();
    tool_names.sort_unstable();
    tool_names.dedup();

    // 3. Recent user requests (up to 3, truncated to 160 characters)
    // 4. Pending work ("todo", "next", "pending" keywords)
    // 5. Key file references (up to 8, filtered by extension)
    // 6. Timeline (in "role: summary" form)
    // ...
}

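The items included in the summary are:

Item	Extraction method	Purpose
Message statistics	Count per role	Gauge the size of the conversation
Tool list	Extracted from ToolUse/ToolResult, deduplicated	Types of work performed
Recent requests	Last 3 User messages, limited to 160 characters	Task context
Pending work	Search for the keywords "todo", "next", "pending"	Unfinished work
Key files	Paths ending in .rs, .ts, .js, .json, .md, up to 8	Targets of the work
Timeline	Summary of each message	Flow of the conversation

sort_unstable() + dedup() is the Rust idiom for sorting and then removing adjacent duplicates. sort_unstable is typically faster than sort, and a stable sort is unnecessary when the goal is deduplication.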
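The reason the sort must come first is that `Vec::dedup` only removes consecutive duplicates. A minimal sketch, with `unique_sorted` as a hypothetical helper name:

```rust
/// Sorts the names, then drops repeated neighbours.
pub fn unique_sorted(mut names: Vec<&str>) -> Vec<&str> {
    names.sort_unstable();
    names.dedup(); // removes consecutive duplicates only, hence the sort first
    names
}
```

Given read_file, edit_file, read_file, bash, the result is bash, edit_file, read_file. Calling dedup() alone on that input would leave both read_file entries in place, because they are not adjacent.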

Recompaction: merge_compact_summaries()

When a session that has already been compacted is compacted again, the existing summary must not be thrown away:

fn merge_compact_summaries(existing: Option<&str>, new: &str) -> String {
    let Some(existing) = existing else {
        return new.to_string();  // first compaction: use as-is
    };

    // Extract highlights from the existing summary
    let previous_highlights = extract_summary_highlights(existing);
    // Extract highlights from the new summary
    let new_highlights = extract_summary_highlights(&format_compact_summary(new));

    let mut lines = vec!["<summary>".to_string()];
    if !previous_highlights.is_empty() {
        lines.push("- Previously compacted context:".to_string());
        lines.extend(previous_highlights.into_iter()
            .map(|l| format!("  {l}")));
    }
    if !new_highlights.is_empty() {
        lines.push("- Newly compacted context:".to_string());
        lines.extend(new_highlights.into_iter()
            .map(|l| format!("  {l}")));
    }
    lines.push("</summary>".to_string());
    lines.join("\n")
}

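The key to this design is hierarchical summarization. The summary from the earlier compaction is labeled "Previously compacted context" and the summary from the latest compaction is labeled "Newly compacted context". However long the conversation grows, the AI keeps a condensed trace of the whole conversation instead of losing the early context outright.

Continuation prompt: get_compact_continuation_message()

The first message of a compacted session includes an instruction telling the AI to continue the previous conversation: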

const COMPACT_CONTINUATION_PREAMBLE: &str =
    "This session is being continued from a previous conversation...";

pub fn get_compact_continuation_message(
    summary: &str,
    suppress_follow_up_questions: bool,
    recent_messages_preserved: bool,
) -> String {
    let mut base = format!("{COMPACT_CONTINUATION_PREAMBLE}{}",
        format_compact_summary(summary));

    if recent_messages_preserved {
        base.push_str("\n\nRecent messages are preserved verbatim.");
    }
    if suppress_follow_up_questions {
        base.push_str("\nContinue the conversation from where it left off...");
    }
    base
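}

"Recent messages are preserved verbatim" is an important phrase. It tells the AI explicitly that the recent messages are the originals, not summaries, which prevents confusion between the summarized content and the original messages.

5. Token usage tracking and cost estimation

UsageTracker: cumulative statistics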

pub struct UsageTracker {
    latest_turn: TokenUsage,
    cumulative: TokenUsage,
    turns: u32,
}

impl UsageTracker {
    pub fn record(&mut self, usage: TokenUsage) {
        self.latest_turn = usage;
        self.cumulative.input_tokens += usage.input_tokens;
        self.cumulative.output_tokens += usage.output_tokens;
        self.cumulative.cache_creation_input_tokens +=
            usage.cache_creation_input_tokens;
        self.cumulative.cache_read_input_tokens +=
            usage.cache_read_input_tokens;
        self.turns += 1;
    }
}

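cumulative only ever increases. Session compaction does not reduce the cumulative usage, so that "how much has this session cost in total" can be tracked accurately.

from_session() restores a UsageTracker from an existing session. When a session is loaded, the usage of every assistant message is summed to rebuild the cumulative total.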
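The source of from_session() is not shown, so the following is a sketch of one way the restoration could work: replay each stored usage value through record(). `from_usages` and the reduced two-field `TokenUsage` are assumptions made for this example; the slice of `Option<TokenUsage>` stands in for the `usage` field of each message in a session.

```rust
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct TokenUsage {
    pub input_tokens: u32,
    pub output_tokens: u32,
}

#[derive(Debug, Default)]
pub struct UsageTracker {
    pub latest_turn: TokenUsage,
    pub cumulative: TokenUsage,
    pub turns: u32,
}

impl UsageTracker {
    pub fn record(&mut self, usage: TokenUsage) {
        self.latest_turn = usage;
        self.cumulative.input_tokens += usage.input_tokens;
        self.cumulative.output_tokens += usage.output_tokens;
        self.turns += 1;
    }

    /// Assumed shape of the restoration: replay every stored usage value in order.
    /// Messages without usage (user messages, tool results) are skipped.
    pub fn from_usages(usages: &[Option<TokenUsage>]) -> Self {
        let mut tracker = Self::default();
        for usage in usages.iter().flatten() {
            tracker.record(*usage);
        }
        tracker
    }
}
```

For a session whose two assistant messages recorded (1200 in, 45 out) and (1500 in, 80 out), the restored tracker reports 2 turns and a cumulative total of 2700 input and 125 output tokens.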

Per-model pricing

pub fn pricing_for_model(model: &str) -> Option<ModelPricing> {
    let normalized = model.to_ascii_lowercase();
    if normalized.contains("haiku") {
        return Some(ModelPricing {
            input_cost_per_million: 1.0,
            output_cost_per_million: 5.0,
            cache_creation_cost_per_million: 1.25,
            cache_read_cost_per_million: 0.1,
        });
    }
    if normalized.contains("opus") {
        return Some(ModelPricing {
            input_cost_per_million: 15.0,
            output_cost_per_million: 75.0,
            cache_creation_cost_per_million: 18.75,
            cache_read_cost_per_million: 1.5,
        });
    }
    // sonnet is the default tier
    // ...
}

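contains() is used on the model name so that both a full model ID such as claude-3-5-haiku-20241022 and a short name such as haiku match.

It matters that cache tokens are priced differently from regular tokens. In the Haiku entry above, cache reads cost $0.10/M, one tenth of regular input ($1.00/M). Making good use of prompt caching can therefore cut costs substantially.

Cost formula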

fn cost_for_tokens(tokens: u32, usd_per_million_tokens: f64) -> f64 {
    f64::from(tokens) / 1_000_000.0 * usd_per_million_tokens
}

A simple formula: (token count ÷ 1,000,000) × price per million tokens. The result is in US dollars (USD).
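As a worked example, the usage from the sample assistant message earlier (1,200 input tokens, 45 output tokens) can be priced with the Haiku rates listed above. `estimate_cost_usd` is a hypothetical helper, and cache token costs are left out to keep the sketch short.

```rust
pub struct ModelPricing {
    pub input_cost_per_million: f64,
    pub output_cost_per_million: f64,
}

fn cost_for_tokens(tokens: u32, usd_per_million_tokens: f64) -> f64 {
    f64::from(tokens) / 1_000_000.0 * usd_per_million_tokens
}

/// Input plus output cost in USD; cache token costs are omitted in this sketch.
pub fn estimate_cost_usd(input_tokens: u32, output_tokens: u32, pricing: &ModelPricing) -> f64 {
    cost_for_tokens(input_tokens, pricing.input_cost_per_million)
        + cost_for_tokens(output_tokens, pricing.output_cost_per_million)
}
```

With input at $1.00/M and output at $5.00/M, the arithmetic is 1,200 / 1,000,000 × 1.0 = $0.0012 for input and 45 / 1,000,000 × 5.0 = $0.000225 for output, about $0.001425 in total.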

6. System prompt builder: assembling the context

Build order

SystemPromptBuilder.build() assembles the system prompt from 10 sections:

pub fn build(&self) -> Vec<String> {
    let mut sections = Vec::new();
    sections.push(get_simple_intro_section(...));     // 1. Intro
    // 2. Output style (optional)
    sections.push(get_simple_system_section());       // 3. System guidance
    sections.push(get_simple_doing_tasks_section());  // 4. Task execution guide
    sections.push(get_actions_section());             // 5. Action constraints
    sections.push(SYSTEM_PROMPT_DYNAMIC_BOUNDARY);    // 6. Dynamic boundary
    sections.push(self.environment_section());        // 7. Environment info
    // 8. Project context (optional)
    // 9. Instruction files (CLAW.md etc.)
    // 10. Runtime settings (optional)
    sections
}

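SYSTEM_PROMPT_DYNAMIC_BOUNDARY is interesting. Everything above this marker is static guidance that is identical across conversations; everything below it is dynamic context that varies with the project and environment. With Anthropic's prompt caching, this boundary determines how far the cacheable prefix extends.

Instruction file (CLAW.md) discovery algorithm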

fn discover_instruction_files(cwd: &Path) -> io::Result<Vec<ContextFile>> {
    // Collect the parent chain from the current directory up to the root
    let mut directories = Vec::new();
    let mut cursor = Some(cwd);
    while let Some(dir) = cursor {
        directories.push(dir.to_path_buf());
        cursor = dir.parent();
    }
    directories.reverse();  // start from the root

    let mut files = Vec::new();
    for dir in directories {
        for candidate in [
            dir.join("CLAW.md"),
            dir.join("CLAW.local.md"),
            dir.join(".claw").join("CLAW.md"),
            dir.join(".claw").join("instructions.md"),
        ] {
            push_context_file(&mut files, candidate)?;
        }
    }
    Ok(dedupe_instruction_files(files))
}

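Searching from the root down to the current directory is the key point. Because of this order, instructions from higher directories (for example, company-wide rules) are applied first and instructions from lower directories (for example, project-specific rules) are applied later. Later instructions can supplement or override earlier ones.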
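The same root-first ordering can be obtained with `Path::ancestors()`, which is a swapped-in alternative to the manual `parent()` loop shown above rather than what the source uses. `root_to_cwd` is a hypothetical helper name, and the example path assumes a Unix-style filesystem.

```rust
use std::path::{Path, PathBuf};

/// Directories from the filesystem root down to `cwd`, i.e. the search order above.
pub fn root_to_cwd(cwd: &Path) -> Vec<PathBuf> {
    // ancestors() yields cwd first and the root last, so reverse it.
    let mut directories: Vec<PathBuf> = cwd.ancestors().map(Path::to_path_buf).collect();
    directories.reverse();
    directories
}
```

For /work/acme/app the result is /, /work, /work/acme, /work/acme/app, so an instruction file in /work would be read before one in /work/acme/app.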

Deduplication is based on a content hash:

fn dedupe_instruction_files(files: Vec<ContextFile>) -> Vec<ContextFile> {
    let mut deduped = Vec::new();
    let mut seen_hashes = Vec::new();
    for file in files {
        let hash = stable_content_hash(&normalize_instruction_content(&file.content));
        if seen_hashes.contains(&hash) { continue; }
        seen_hashes.push(hash);
        deduped.push(file);
    }
    deduped
}

Duplicates are judged by content, not by file path. Even if instruction files with identical content exist at several paths, the content is included only once.
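Content-based deduplication can be sketched with the standard library alone. The helpers normalize_instruction_content() and stable_content_hash() are not shown in the source, so this example substitutes a hypothetical `content_hash` that trims whitespace and uses `DefaultHasher`, and it swaps the Vec lookup for a `HashSet`. The reduced `ContextFile` struct is likewise an assumption.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

pub struct ContextFile {
    pub path: String,
    pub content: String,
}

/// Stand-in for normalize_instruction_content + stable_content_hash:
/// trims surrounding whitespace, then hashes with the std default hasher.
/// DefaultHasher output is not guaranteed stable across Rust releases,
/// which is fine for in-process deduplication.
fn content_hash(content: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    content.trim().hash(&mut hasher);
    hasher.finish()
}

/// Keeps the first file for each distinct (normalized) content.
pub fn dedupe_by_content(files: Vec<ContextFile>) -> Vec<ContextFile> {
    let mut seen = HashSet::new();
    files
        .into_iter()
        // HashSet::insert returns false when the hash was already present.
        .filter(|file| seen.insert(content_hash(&file.content)))
        .collect()
}
```

Two files at different paths whose contents differ only by a trailing newline collapse into one entry, and the file found first (the one closest to the root, given the discovery order) is the one kept.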

Instruction file budget

const MAX_INSTRUCTION_FILE_CHARS: usize = 4_000;   // per-file maximum
const MAX_TOTAL_INSTRUCTION_CHARS: usize = 12_000;  // overall maximum

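If the system prompt grows too long, the AI's performance degrades and token cost rises. The limits of 4,000 characters per file and 12,000 characters overall are a practical choice that balances the two.

7. Bootstrap: the 12 phases of startup

Claw Code goes through a 12-phase bootstrap at startup: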

pub enum BootstrapPhase {
    CliEntry,                    // CLI entry point
    FastPathVersion,             // version fast path
    StartupProfiler,             // startup profiling
    SystemPromptFastPath,        // prompt caching
    ChromeMcpFastPath,           // Chrome MCP initialization
    DaemonWorkerFastPath,        // daemon worker
    BridgeFastPath,              // bridge setup
    DaemonFastPath,              // daemon readiness
    BackgroundSessionFastPath,   // background session
    TemplateFastPath,            // template loading
    EnvironmentRunnerFastPath,   // environment runner
    MainRuntime,                 // main runtime start
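}

The "FastPath" label appears in most of the phase names. It reflects the startup optimization strategy of the original Claude Code: load only the essential information quickly and initialize the rest lazily.

BootstrapPlan stores the phase list with duplicates removed: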

pub fn from_phases(phases: Vec<BootstrapPhase>) -> Self {
    let mut deduped = Vec::new();
    for phase in phases {
        if !deduped.contains(&phase) {
            deduped.push(phase);
        }
    }
    Self { phases: deduped }
}

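Even if the same phase is registered several times, it runs only once. This is defensive programming against configuration errors.

Closing: design principles of the runtime

The design principles that run through the runtime crate can be summarized as follows: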

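The termination condition is the heart of the Agentic Loop. The simple rule that the loop ends when there are no tool calls creates the balance between the agent's autonomous behavior and human control. max_iterations is the safeguard for that balance.

The session is the source of truth. Every message, tool call, and tool result is recorded in the session. API calls always send the entire session. Thanks to this design, the conversation can be reconstructed from the session alone.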

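Compaction transforms information rather than simply discarding it. Instead of dropping old messages, it converts them into a structured summary. Information that matters to the agent, such as the tool list, key files, and pending work, is preserved first.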

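Hooks are a subprocess-based extension point. They do not run in the same process as the runtime; each runs as an independent shell process. This separation provides language-independent extensibility together with process-level isolation.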

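The next part takes a deeper look at the tool system and permission model that run on top of this runtime. It covers how the 18 built-in tools are defined and how the 3 permission levels are actually applied.

Hands-on guide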

# 1. Inspect the session JSON structure directly
find ~/.claude/sessions -name "*.json" | head -1 | xargs cat | jq .

# 2. Trace the compact algorithm
cd rust && cargo test -p runtime compact -- --nocapture

# 3. Verify the token estimate
# Estimate for a 100-character ASCII string: 100/4+1 = 26 tokens
# Compare with the actual token count

# 4. Check the instruction file discovery paths
find / -name "CLAW.md" -o -name "CLAUDE.md" 2>/dev/null

# 5. Check SystemPromptBuilder behavior
cargo test -p runtime prompt -- --nocapture

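This post is part 4 of the [Claw Code Deep Dive] series.

Series contents:
1. Project background and architecture overview
2. In-depth analysis of the Rust workspace architecture
3. API communication and SSE streaming implementation
4. Conversation runtime and session management ← this post
5. Tool system and permission model
6. Analysis of the Python porting workspace
7. Testing strategy and parity tracking
8. TUI improvement roadmap and future direction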

Tags: #ClawCode #Rust #AgenticLoop #SessionManagement #TokenCompaction #ConversationRuntime #AIAgent #SystemPrompt
