[Tool Calling] Tool Calling과 MCP, Agent

kimjayhyun·2025년 6월 29일

Chat Template LLM MCP Speical Token agent jsonrpc

[LLM Special Token] Tool Calling, MCP Agent 이해하기

목록 보기

3/3

📝 TL;DR

📖 읽기 정보
⏱️ 소요 시간: 약 10분 | 🎯 난이도: 중급 | 📚 사전 지식: 1편, 2편 권장

이 글은 Llama 3.1의 Tool Calling 흐름을 이해하고 이를 통해 MCP 사용 흐름과 Agent의 정의에 대해 알아볼 수 있습니다.

이 글의 핵심 3줄 요약

Agent = Executor → Tool Calling에서 도구를 실제로 실행하는 주체가 바로 Agent입니다
MCP = JSON 기반 Tool Calling의 표준화 → Llama의 JSON 방식이 MCP의 동작 원리와 동일합니다
Special Token = 각 단계별 제어 신호 → <|eom_id|>, <|eot_id|>, ipython 등으로 흐름을 제어합니다

LLM 모델마다 Speical Token은 다르지만 흐름은 같습니다.

왜 이게 중요한가요?

🤖 Agent 개념 이해: AI 시스템에서 "실행 주체"가 무엇인지 명확히 정의
🔄 MCP 동작 원리: 최신 AI 도구 연결 표준의 내부 구조 파악
⚡ 실무 활용: Special Token 지식으로 디버깅과 커스터마이징 능력 향상

들어가며

이 포스팅은 아래 문서를 참고하여 정리하고 번역한 글입니다.

공식 문서

🔄 이전 글에서

1편에서는 LLM의 Special Token과 Chat Template 기본 개념을 알아봤고,
2편에서는 Llama 모델의 구체적인 Special Token 구현을 살펴봤습니다.

💡 MCP와의 연관성

MCP(Model Context Protocol)는 LLM과 외부 도구를 연결하는 표준화된 프로토콜입니다.

Llama의 Tool Calling 흐름이 MCP의 동작 방식과 아주 비슷하므로,
이 포스팅을 통해 MCP가 어떻게 동작하는지도 함께 이해할 수 있습니다!

이 포스팅을 통해서 Agent의 정의를 어떻게 내릴 수 있을지 알 수 있습니다.

📋 이번 글에서 다룰 내용

<|python_tag|>, <|eom_id|>, <|eot_id|> 등이 Tool Calling에서 수행하는 역할
Chat Template이 Tool Calling 워크플로우에서 메시지를 어떻게 구조화하는지
왜 Special Token 이해가 Tool Calling 디버깅에 필수적인지

⚠️ 주의사항

Llama 8B-Instruct는 Tool Calling과 대화 유지를 동시에 잘 하지 못하므로
대화 + 도구 호출에는 Llama 70B 또는 405B 모델을 권장.
시스템 프롬프트 또는 사용자 프롬프트에 여러 개의 도구를 정의하면,
모델이 여러 도구 호출을 생성할 수 있으므
프롬프트 조정을 하며 결과를 적절히 파싱(parse) 해야함.

🔧 Llama 3.1 Tool Calling

⚙️ Tool Calling Workflow

다른 LLM 모델도 위와 같은 흐름으로 MCP를 사용합니다.

눈 여겨 볼 점은 2편에서 언급한 것처럼 모델 자체는 코드를 실행하지 않는다는 것입니다.

즉, 우리는 Executor를 Agent라고 정의할 수 있습니다!!

✅ Tool Calling 방식

Llama 3.1 모델은 외부 도구 호출 기능을 지원하며, 총 세 가지 방식이 있습니다.

내장 도구 호출 (Built-in Tools)
JSON 기반 도구 호출 (JSON-based Tool Calling)
- MCP가 사용하는 방식
사용자 정의 도구 호출 (User-defined Custom Tool Calling)

1. 📡 Built-in Tools (내장 도구)

내장 도구 호출은 2 편에서 설명한 방식으로 동작을 합니다.

🛠️ 내장된 도구 종류

brave_search: 웹 검색
wolfram_alpha: 수학 계산 및 지식 질의

📌 예시

1. 사용자 및 시스템템 프롬프트

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Environment: ipython
Tools: brave_search, wolfram_alpha
Cutting Knowledge Date: December 2023
Today Date: 23 July 2024

You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

Can you help me solve this equation: x^3 - 4x^2 + 6x - 24 = 0<|eot_id|><|start_header_id|>assistant<|end_header_id|>

2. LLM이 호출할 도구 결정

<|python_tag|>wolfram_alpha.call(query="solve x^3 - 4x^2 + 6x - 24 = 0")<|eom_id|>

🔎 중요 포인트

LLM 응답이 특수 태그 <|python_tag|> 로 시작합니다.
모델이 일반적인 <|eot_id|> 대신 <|eom_id|> 태그를 생성할 수 있습니다.
- <|eot_id|>는 턴이 종료되었음을 나타내지만,
- <|eom_id|>는 다중 단계 추론이 계속됨을 의미합니다.
즉, 도구 호출의 실행 결과를 포함한 후속 메시지가 올 것으로 모델이 기대하고 있다는 뜻입니다.
내장 도구(built-in tool)는 JSON 형식이 아니라 Python 문법을 사용하여 호출됩니다.
모델이 각 도구에 대해 생성하는 Python 호출 구문은 다음과 같은 형태입니다

# for Search
<|python_tag|>
brave_search.call(query="...")
<|eom_id|>

# for Wolfram
<|python_tag|>
wolfram_alpha.call(query="...")
<|eom_id|>

3. 도구(Wolfram Alpha)가 생성한 응답

{
  "queryresult": {
    "success": true,
    "inputstring": "solve x^3 - 4x^2 + 6x - 24 = 0",
    "pods": [
      {
        "title": "Input interpretation",
        "subpods": [
          {
            "title": "",
            "plaintext": "solve x^3 - 4 x^2 + 6 x - 24 = 0"
          }
        ]
      },
      {
        "title": "Results",
        "primary": true,
        "subpods": [
          {
            "title": "",
            "plaintext": "x = 4"
          },
          {
            "title": "",
            "plaintext": "x = ± (i sqrt(6))"
          }
        ]
      },
      ...
    ]
  }
}

4. 도구 응답과 함께 모델 재요청

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Environment: ipython
Tools: brave_search, wolfram_alpha
Cutting Knowledge Date: December 2023
Today Date: 23 July 2024

You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>


Can you help me solve this equation: x^3 - 4x^2 + 6x - 24 = 0<|eot_id|><|start_header_id|>assistant<|end_header_id|>

<|python_tag|>wolfram_alpha.call(query="solve x^3 - 4x^2 + 6x - 24 = 0")<|eom_id|><|start_header_id|>ipython<|end_header_id|>

{"queryresult": {"success": true, "inputstring": "solve x^3 - 4x^2 + 6x - 24 = 0", "pods": [{"title": "Input interpretation", "subpods": [{"title": "", "plaintext": "solve x^3 - 4 x^2 + 6 x - 24 = 0"}]}, {"title": "Results", "primary": true, "subpods": [{"title": "", "plaintext": "x = 4"}, {"title": "", "plaintext": "x = \u00b1 (i sqrt(6))"}]}, ... ]}}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

5. 에이전트에서 사용자로의 최종 응답

The solutions to the equation x^3 - 4x^2 + 6x - 24 = 0 are x = 4 and x = ±(i√6).<|eot_id|>

💡 내장 도구 호출 요약

<|python_tag|> → 도구 호출 시작
<|eom_id|> → 추가 처리 대기 (턴 종료 아님)
<|start_header_id|>ipython<|end_header_id|> → 도구 실행 결과
<|eot_id|> → 최종 응답 완료

2. 📦 JSON 기반 Tool Calling

📒 Notes

⚙️ 시스템 프롬프트 조정
모델이 도구 호출 결과를 어떻게 처리할지 알려주는 지침이 포함되어야 합니다.
🔧 도구 정의 위치
모델 학습 시에는 user prompt에 도구 정의를 포함시켰으므로,
사용자 프롬프트에 도구 정의를 제공하는 것이 일반적입니다.
🔄 유연한 위치 설정
도구 정의는 시스템 프롬프트에 포함해도 작동 가능하며,
개발자는 각 상황에 맞게 테스트해보는 것이 중요합니다.

📌 예시

1. 사용자 정의 도구 정보가 포함된 사용자 프롬프트

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 23 July 2024

When you receive a tool call response, use the output to format an answer to the original user question.

You are a helpful assistant with tool calling capabilities.<|eot_id|><|start_header_id|>user<|end_header_id|>

Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.

Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.

{
    "type": "function",
    "function": {
    "name": "get_current_conditions",
    "description": "Get the current weather conditions for a specific location",
    "parameters": {
        "type": "object",
        "properties": {
        "location": {
            "type": "string",
            "description": "The city and state, e.g., San Francisco, CA"
        },
        "unit": {
            "type": "string",
            "enum": ["Celsius", "Fahrenheit"],
            "description": "The temperature unit to use. Infer this from the user's location."
        }
        },
        "required": ["location", "unit"]
    }
    }
}

Question: what is the weather like in San Fransisco?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

2. 모델이 호출할 도구 결정

{"name": "get_current_conditions", "parameters": {"location": "San Francisco, CA", "unit": "Fahrenheit"}}<|eot_id|>

💡 중요 포인트

시스템 프롬프트에 Environment: ipython 설정이 없으므로,

모델은 <|eom_id|> 대신 <|eot_id|> 토큰으로 응답을 마무리합니다.

내장 도구: Environment: ipython → <|eom_id|> (계속 진행)
JSON 기반: 환경 설정 없음 → <|eot_id|> (턴 종료)

3. 도구 호출 결과를 모델에 다시 전달

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful assistant with tool calling capabilities. When you receive a tool call response, use the output to format an answer to the original use question.<|eot_id|><|start_header_id|>user<|end_header_id|>

Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.

Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.

{
    "type": "function",
    "function": {
    "name": "get_current_conditions",
    "description": "Get the current weather conditions for a specific location",
    "parameters": {
        "type": "object",
        "properties": {
        "location": {
            "type": "string",
            "description": "The city and state, e.g., San Francisco, CA"
        },
        "unit": {
            "type": "string",
            "enum": ["Celsius", "Fahrenheit"],
            "description": "The temperature unit to use. Infer this from the user's location."
        }
        },
        "required": ["location", "unit"]
    }
    }
}

Question: what is the weather like in San Fransisco?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{"name": "get_current_conditions", "parameters": {"location": "San Francisco, CA", "unit": "Fahrenheit"}}<|eot_id|>

## 도구 실행 결과
<|start_header_id|>ipython<|end_header_id|>

{"output": "Clouds giving way to sun Hi: 76° Tonight: Mainly clear early, then areas of low clouds forming Lo: 56°"}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

💡 중요 포인트

assistant가 선택한 도구를 Executor가 실행합니다.

도구 호출의 결과를 모델에 다시 보낼 때는 헤더에 새로운 역할인 ipython을 사용합니다.

4. 모델이 사용자에게 최종 응답 생성

The weather in Menlo Park is currently cloudy with a high of 76° and a low of 56°, with clear skies expected tonight.<eot_id>

🔄 MCP와의 연관성

위에서 본 복잡한 Tool Calling 과정을 MCP(Model Context Protocol)가 표준화해줍니다.

JSON-RPC 2.0 기반: MCP는 JSON-RPC 2.0을 통신 프로토콜로 사용하여 클라이언트-서버 간 표준화된 메시지 교환
도구 정의 표준화: 모든 도구를 동일한 JSON 스키마로 정의
통신 프로토콜 통일: 클라이언트-서버 간 일관된 메시지 형식
재사용성 향상: 한 번 만든 MCP 서버를 여러 AI 앱에서 사용 가능

즉, JSON 기반 Tool Calling으로 MCP의 내부 동작을 이해할 수 있습니다.

여전히 구현해야 하는 부분과 Agent

MCP 클라이언트 개발 (도구 호출을 중계하는 역할, Agent)
모델별 Special Token 처리 (<|eot_id|>, ipython 역할 등)
도구 실행 결과를 LLM에 전달하는 로직

💡 핵심: Tool Calling 과정을 수행하는 Executor를 Agent라고 할 수 있습니다.

🎯 마무리

3. Custom Tool Calling가 궁금하시다면 이 포스팅의 맨 아래를 참고하세요

이번 글에서는 Llama 3.1의 Tool Calling 구현을 통해 Special Token이 실제로 어떻게 활용되는지 살펴봤습니다.

🔑 핵심 내용 정리

1. Agent의 정의 발견 🤖

Executor = Agent: Tool Calling 과정을 수행하는 주체
LLM은 도구를 "선택"하고, Agent는 도구를 "실행"

2. MCP의 실체 이해 🔄

JSON 기반 Tool Calling이 MCP의 핵심 동작 방식
JSON-RPC 2.0 기반의 표준화된 통신 프로토콜
표준화를 통한 개발 복잡성 해결

3. Special Token의 실전 활용 ⚡
LLM 마다 Speical Token은 다르지만 흐름은 같습니다.

<|python_tag|> + <|eom_id|>: 내장 도구의 연속 처리
<|eot_id|> + ipython 역할: JSON 기반의 명확한 턴 구분
각 방식별 서로 다른 Special Token 전략

🚀 실무 관점에서의 의미

MCP 클라이언트 구현으로 표준화된 도구 연결 가능
Special Token 이해로 디버깅과 커스터마이징 능력 향상
JSON 기반 Tool Calling으로 MCP 생태계 활용
재사용 가능한 도구 서버로 개발 효율성 증대

💡 다음 단계

이제 Special Token → Tool Calling → MCP → Agent의 전체 흐름을 이해했습니다.

이 시리즈를 통해서 MCP를 활용하여 Agent를 구현할 때

MCP의 필요성 및 Agent에 대한 이해
MCP 클라이언트 구현 시 동작 흐름 이해
Speical Token 이해로 디버깅, 커스터마이징 능력 향상

이제 나만의 Agent를 개발해서 서비스를 준비해보는 건 어떨까요?

📚 시리즈 전체 보기

1. LLM 기초: Messages & Special Token - 기본 개념과 Chat Template
2. Llama Special Token 완전 분석 - 구체적인 구현과 활용법
3. Tool Calling과 MCP, Agent - 실전 활용과 MCP 연결 (현재 글)

3. (참고 자료) 사용자 정의 도구 호출 (Custom Tool Calling)

사실상 MCP가 표준이 되어가는 지금 이 방법은 많이 사용하지 않는 방법입니다.

아래 프롬프트들은 시스템 프롬프트에 사용자 정의 형식(custom format) 을 정의하고,

이를 모델과 상호작용할 때 어떻게 활용할 수 있는지를 보여주는 예시입니다.

이 예시는 새로운 <function> 태그를 사용하여, 함수의 파라미터를 담은 JSON 객체를 감쌉니다.

📒 Notes

시스템 프롬프트는 사용자 정의 형식(custom format) 과
모델이 사용할 수 있는 도구 목록을 정의합니다.

📌 예시

1. 사용자 정의 도구를 포함한 시스템 프롬프트

<|begin_of_text|><|start_header_id|>system<|end_header_id|>


Environment: ipython
Tools: brave_search, wolfram_alpha
Cutting Knowledge Date: December 2023
Today Date: 23 July 2024

# Tool Instructions
- Always execute python code in messages that you share.
- When looking for real time information use relevant functions if available else fallback to brave_search



You have access to the following functions:

Use the function 'spotify_trending_songs' to: Get top trending songs on Spotify
{
  "name": "spotify_trending_songs",
  "description": "Get top trending songs on Spotify",
  "parameters": {
    "n": {
      "param_type": "int",
      "description": "Number of trending songs to get",
      "required": true
    }
  }
}



If a you choose to call a function ONLY reply in the following format:
<{start_tag}={function_name}>{parameters}{end_tag}
where

start_tag => `<function`
parameters => a JSON dict with the function argument name as key and function argument value as value.
end_tag => `</function>`

Here is an example,
<function=example_function_name>{"example_name": "example_value"}</function>

Reminder:
- Function calls MUST follow the specified format
- Required parameters MUST be specified
- Only call one function at a time
- Put the entire function call reply on one line
- Always add your sources when using search results to answer the user query

You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

Can you check the top 5 trending songs on spotify?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

2. 호출할 Tool 선택

이제 모델은 시스템 프롬프트에 정의된 형식에 따라 응답합니다.

응답은 시작 태그에 함수 이름, 그리고 그 사이에는 JSON 객체 형태의 파라미터가 포함됩니다.

시스템 프롬프트에 Environment: ipython 지시어를 사용하고 있으므로,

모델은 응답을 <|eom_id|> 토큰으로 종료합니다.

<function=spotify_trending_songs>{"n": "5"}</function><eom_id>

3. Tool 결과를 LLM에 전달달

<|begin_of_text|><|start_header_id|>system<|end_header_id|>


Environment: ipython
Tools: brave_search, wolfram_alpha
Cutting Knowledge Date: December 2023
Today Date: 23 July 2024

# Tool Instructions
- Always execute python code in messages that you share.
- When looking for real time information use relevant functions if available else fallback to brave_search



You have access to the following functions:

Use the function 'spotify_trending_songs' to: Get top trending songs on Spotify
{"name": "spotify_trending_songs", "description": "Get top trending songs on Spotify", "parameters": {"n": {"param_type": "int", "description": "Number of trending songs to get", "required": true}}}


If a you choose to call a function ONLY reply in the following format:
<{start_tag}={function_name}>{parameters}{end_tag}
where

start_tag => `<function`
parameters => a JSON dict with the function argument name as key and function argument value as value.
end_tag => `</function>`

Here is an example,
<function=example_function_name>{"example_name": "example_value"}</function>

Reminder:
- Function calls MUST follow the specified format
- Required parameters MUST be specified
- Only call one function at a time
- Put the entire function call reply on one line
- Always add your sources when using search results to answer the user query

You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

Can you check the top 5 trending songs on spotify?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

<function=spotify_trending_songs>{"n": "5"}</function><|eom_id|><|start_header_id|>ipython<|end_header_id|>

["1. BIRDS OF A FEATHER by Billie Eilish", "2. Espresso by Sabrina Carpenter", "3. Please Please Please by Sabrina Carpenter", "4. Not Like Us by Kendrick Lamar", "5. Gata Only by FloyyMenor, Cris Mj"]<|eot_id|><|start_header_id|>assistant<|end_header_id|>

4. 최종 결과 전달

The top 5 trending songs on Spotify are:

1. BIRDS OF A FEATHER by Billie Eilish
2. Espresso by Sabrina Carpenter
3. Please Please Please by Sabrina Carpenter
4. Not Like Us by Kendrick Lamar
5. Gata Only by FloyyMenor, Cris Mj<|eot_id|>

kimjayhyun

이전 포스트

[Tool Calling] Tool Calling과 MCP, Agent

[LLM Special Token] Tool Calling, MCP Agent 이해하기

📝 TL;DR

들어가며

🔄 이전 글에서

💡 MCP와의 연관성

📋 이번 글에서 다룰 내용

⚠️ 주의사항

🔧 Llama 3.1 Tool Calling

⚙️ Tool Calling Workflow

✅ Tool Calling 방식

1. 📡 Built-in Tools (내장 도구)

🛠️ 내장된 도구 종류

📌 예시

1. 사용자 및 시스템템 프롬프트

2. LLM이 호출할 도구 결정

3. 도구(Wolfram Alpha)가 생성한 응답

4. 도구 응답과 함께 모델 재요청

5. 에이전트에서 사용자로의 최종 응답

💡 내장 도구 호출 요약

2. 📦 JSON 기반 Tool Calling

📒 Notes

📌 예시

1. 사용자 정의 도구 정보가 포함된 사용자 프롬프트

2. 모델이 호출할 도구 결정

3. 도구 호출 결과를 모델에 다시 전달

4. 모델이 사용자에게 최종 응답 생성

🔄 MCP와의 연관성

여전히 구현해야 하는 부분과 Agent

🎯 마무리

🔑 핵심 내용 정리

🚀 실무 관점에서의 의미

💡 다음 단계

📚 시리즈 전체 보기

3. (참고 자료) 사용자 정의 도구 호출 (Custom Tool Calling)

📒 Notes

📌 예시

1. 사용자 정의 도구를 포함한 시스템 프롬프트

2. 호출할 Tool 선택

3. Tool 결과를 LLM에 전달달

4. 최종 결과 전달

[Llama] Special Token 정리

0개의 댓글