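🔍 Implementing the Deep Research Pattern with the OpenAI SDK

twonezero · February 2, 2026

📝 Introduction

This post summarizes how to build a Deep Research system by combining asynchronous programming with AI agents. It walks through a working implementation of a three-stage Plan → Search → Report pipeline built on the OpenAI SDK's Agent framework.

The two key ingredients are concurrent execution with Python's asyncio and response control through Pydantic-based Structured Output.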

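🧩 Understanding AsyncIO and Coroutines

AsyncIO Basics

A function defined with async def is a coroutine function, and inside it the await keyword can be used to wait on asynchronous operations. Calling such a function creates a coroutine.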

async def example_async_function():
    result = await some_async_operation()
    return result

What Is a Coroutine?

It resembles a regular function, but it can pause at an await and resume later.
Calling it returns a coroutine object rather than running the body.
The event loop runs the coroutine.
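
The three points above can be seen in a few lines. This is a minimal sketch using only the standard library; `fetch_answer` is a hypothetical stand-in for real asynchronous work, not part of the original project.

```python
import asyncio
import inspect


async def fetch_answer() -> int:
    # asyncio.sleep stands in for real I/O such as a network call.
    await asyncio.sleep(0.01)
    return 42


# Calling the function does not run its body; it only creates a coroutine object.
coro = fetch_answer()
is_coroutine = inspect.iscoroutine(coro)

# asyncio.run starts an event loop, runs the coroutine to completion,
# and returns its result.
answer = asyncio.run(coro)
print(is_coroutine, answer)
```

Nothing inside `fetch_answer` executes until the event loop drives the coroutine, which is why a forgotten `await` usually shows up as a "coroutine was never awaited" warning.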

The Role of the Event Loop

The event loop schedules and runs asynchronous work. When one coroutine is waiting, the loop runs another, which gives efficient concurrency.

💡
The event loop is the core mechanism that lets a single thread make progress on many tasks concurrently. It is especially effective for I/O-bound work.
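
To make the single-thread interleaving concrete, here is a small sketch with two hypothetical workers. The names `worker`, `events`, and `thread_ids` are illustrative only, and the sleeps stand in for I/O waits.

```python
import asyncio
import threading

events: list[str] = []
thread_ids: set[int] = set()


async def worker(name: str, delay: float) -> None:
    thread_ids.add(threading.get_ident())
    events.append(f"{name} start")
    # While this coroutine waits, the event loop is free to run the other one.
    await asyncio.sleep(delay)
    events.append(f"{name} done")


async def main() -> None:
    # Both workers are scheduled on the same loop and the same thread.
    await asyncio.gather(worker("slow", 0.2), worker("fast", 0.05))


asyncio.run(main())
print(events)
print(len(thread_ids))
```

The "fast" worker finishes while "slow" is still waiting, yet both ran on one thread. The loop switches between coroutines only at an `await`, so this helps with waiting on I/O but not with CPU-heavy work.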

🤖 The OpenAI SDK Agent Framework

Registering Agents and Tools

The OpenAI SDK provides a library that makes it easy to build AI agents. The @function_tool decorator registers a Python function as a tool.

from agents import Agent, function_tool

@function_tool
def my_tool_function(param: str) -> str:
    """Tool description"""
    # The docstring above is used as the tool's description.
    return f"Result: {param}"

agent = Agent(
    name="MyAgent",
    instructions="Your role is...",
    tools=[my_tool_function],
    model="gpt-4o-mini"
)

Registering an Agent as a Tool

An Agent can itself be registered as a tool of another Agent, using the as_tool() method.

agent1.as_tool(
    tool_name="sales_agent1",
    tool_description="An agent that answers sales-related questions"
)

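Handoffs: Delegating Work Between Agents

Handoffs manage the transfer of work between Agents. A task can be handed to an Agent with a specific role, and once that Agent is done, the work can be passed on to another Agent.

The handoff_description parameter describes the agent's role or mission, so that other agents can decide when to hand work off to it.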

🛡️ The Guardrails Pattern

Implementing a Guardrail as an Agent

Implementing the guardrail itself as an Agent gives the input-validation logic a clear structure.

from pydantic import BaseModel
from agents import Agent, input_guardrail, Runner, GuardrailFunctionOutput

class NameCheckOutput(BaseModel):
    is_name_in_message: bool
    name: str

guardrail_agent = Agent( 
    name="Name check",
    instructions="Check if the user is including someone's personal name in what they want you to do.",
    output_type=NameCheckOutput,  # Structured output
    model="gpt-4o-mini"
)

@input_guardrail
async def guardrail_against_name(ctx, agent, message):
    result = await Runner.run(guardrail_agent, message, context=ctx.context)
    is_name_in_message = result.final_output.is_name_in_message
    return GuardrailFunctionOutput(
        output_info={"found_name": result.final_output},
        tripwire_triggered=is_name_in_message
    )

💡
Setting a Pydantic model as output_type constrains the Agent to respond in that fixed structure rather than in free-form text. This acts as a strong guardrail.

🔬 The Deep Research Implementation Pattern

Deep Research is a pattern that implements in-depth search by combining the OpenAI SDK's built-in WebSearchTool with an Agent's Structured Output.

Overall Architecture: Plan → Search → Report

graph LR
    A[User query] --> B[Planner Agent]
    B -->|Search plan| C[Search Agent]
    C -->|Parallel searches| D[Search results]
    D --> E[Writer Agent]
    E --> F[Final report]

📋 Step 1: Plan - Search Planning with Structured Outputs

Defining the Pydantic Models

Pydantic models give the search plan a structure.

from pydantic import BaseModel, Field

class WebSearchItem(BaseModel):
    reason: str = Field(description="Why this search is needed")
    query: str = Field(description="The search term")

class WebSearchPlan(BaseModel):
    searches: list[WebSearchItem] = Field(description="List of web searches to perform")

Configuring the Planner Agent

from agents import Agent

HOW_MANY_SEARCHES = 5

planner_agent = Agent(
    name="PlannerAgent",
    instructions=f"You are a helpful research assistant. Given a query, come up with a set of web searches to perform to best answer the query. Output {HOW_MANY_SEARCHES} terms to query for.",
    model="gpt-4o-mini",
    output_type=WebSearchPlan,  # ← Structured Output (Format Guardrail)
)

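💡
With output_type set, the LLM is constrained to return JSON that matches the schema, and the SDK parses it into the model for you. This removes most of the parsing errors that come with free-form responses.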
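
To show what schema enforcement buys on the Python side, here is a small sketch that is independent of the Agent SDK. It assumes Pydantic v2 (`model_validate_json`), and both JSON strings are made-up examples rather than real model output.

```python
from pydantic import BaseModel, Field, ValidationError


class WebSearchItem(BaseModel):
    reason: str = Field(description="Why this search is needed")
    query: str = Field(description="The search term")


class WebSearchPlan(BaseModel):
    searches: list[WebSearchItem] = Field(description="Web searches to perform")


# A made-up response that matches the schema parses into typed objects.
valid_json = '{"searches": [{"reason": "background", "query": "asyncio gather"}]}'
plan = WebSearchPlan.model_validate_json(valid_json)
first_query = plan.searches[0].query

# A made-up response that is missing a required field is rejected.
invalid_json = '{"searches": [{"reason": "background"}]}'
try:
    WebSearchPlan.model_validate_json(invalid_json)
    rejected = False
except ValidationError:
    rejected = True

print(first_query, rejected)
```

Downstream code can then rely on attributes such as `plan.searches[0].query` instead of defensive dictionary lookups.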

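Usage Example

from agents import Runner

# Run inside an async function; `query` is the user's research question.
result = await Runner.run(planner_agent, f"Query: {query}")
search_plan = result.final_output_as(WebSearchPlan)
print(f"Will perform {len(search_plan.searches)} searches")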

🌐 Step 2: Search - Running Searches in Parallel

Using OpenAI's WebSearchTool

The Search Agent uses OpenAI's built-in WebSearchTool to perform the actual web searches.

from agents import Agent, WebSearchTool

search_agent = Agent(
    name="Search agent",
    instructions="Summarize the results for the search term in 2-3 paragraphs.",
    tools=[WebSearchTool(search_context_size="low")],  # OpenAI's built-in search tool
    model="gpt-4o-mini",
)

Parallel Search with AsyncIO

Running the searches concurrently rather than one after another is the heart of Deep Research. This is done with asyncio.create_task and asyncio.gather.

import asyncio

from agents import Runner

async def perform_searches(search_plan: WebSearchPlan) -> list[str]:
    """Run every search in the plan concurrently."""
    # Create a Task for each search item
    tasks = [
        asyncio.create_task(search(item))
        for item in search_plan.searches
    ]
    # Run them concurrently and collect the results in plan order
    results = await asyncio.gather(*tasks)
    return results

async def search(item: WebSearchItem) -> str:
    """Perform a single search."""
    input_text = f"Search term: {item.query}\nReason for searching: {item.reason}"
    result = await Runner.run(search_agent, input_text)
    return str(result.final_output)

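💡
asyncio.gather(*tasks) runs all the Tasks concurrently and waits until every one of them has finished. Run one after another, five searches take roughly the sum of their individual latencies; run concurrently, the total is closer to the latency of the slowest search.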
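
The difference is easy to measure without calling any API. In this sketch, `fake_search` is a hypothetical stand-in for the real search call, and its 0.1-second sleep simulates network latency.

```python
import asyncio
import time


async def fake_search(query: str, delay: float = 0.1) -> str:
    # Hypothetical stand-in for a real search; the sleep simulates latency.
    await asyncio.sleep(delay)
    return f"summary of {query}"


async def run_sequential(queries: list[str]) -> list[str]:
    # Each search starts only after the previous one has finished.
    return [await fake_search(q) for q in queries]


async def run_concurrent(queries: list[str]) -> list[str]:
    # All searches wait at the same time; gather returns results in input order.
    return await asyncio.gather(*(fake_search(q) for q in queries))


async def timed(coro) -> tuple[list[str], float]:
    start = time.perf_counter()
    results = await coro
    return results, time.perf_counter() - start


async def main():
    queries = [f"q{i}" for i in range(5)]
    seq_results, seq_time = await timed(run_sequential(queries))
    con_results, con_time = await timed(run_concurrent(queries))
    return seq_results, seq_time, con_results, con_time


seq_results, seq_time, con_results, con_time = asyncio.run(main())
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

Both versions return the same results in the same order; only the waiting overlaps. Real searches vary in latency, so the actual gain depends on the slowest request and on any rate limits.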

In Practice: Tracking Progress

In a real project, asyncio.as_completed lets you handle each search as soon as it finishes and track progress along the way.

async def perform_searches(self, search_plan: WebSearchPlan) -> list[str]:
    """Run the searches concurrently and track progress."""
    num_completed = 0
    tasks = [
        asyncio.create_task(self.search(item))
        for item in search_plan.searches
    ]

    results = []
    # as_completed yields results in completion order, not plan order
    for task in asyncio.as_completed(tasks):
        result = await task
        # Skip searches that returned None (assumed here to mean the search failed)
        if result is not None:
            results.append(result)
        num_completed += 1
        print(f"Searching... {num_completed}/{len(tasks)} completed")

    return results
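
The `result is not None` check suggests that a failed search returns None instead of raising. The sketch below shows that idea together with the completion-order behavior of `asyncio.as_completed`. `fake_search` and its `fail` flag are hypothetical, and returning None on failure is an assumption about the original `search` method.

```python
import asyncio
from typing import Optional


async def fake_search(query: str, delay: float, fail: bool = False) -> Optional[str]:
    # Hypothetical stand-in for the real search; returns None instead of raising.
    try:
        await asyncio.sleep(delay)
        if fail:
            raise RuntimeError(f"search failed: {query}")
        return f"summary of {query}"
    except RuntimeError:
        return None


async def perform_searches(plan: list[tuple[str, float, bool]]) -> list[str]:
    tasks = [asyncio.create_task(fake_search(q, d, f)) for q, d, f in plan]
    results: list[str] = []
    # as_completed yields awaitables in the order the tasks finish,
    # not the order in which they were submitted.
    for finished in asyncio.as_completed(tasks):
        result = await finished
        if result is not None:
            results.append(result)
    return results


plan = [("slow", 0.15, False), ("broken", 0.05, True), ("fast", 0.01, False)]
results = asyncio.run(perform_searches(plan))
print(results)
```

One failed search no longer aborts the whole batch, at the cost of a results list that may be shorter than the plan and ordered by completion time.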

✍️ Step 3: Report - Writing the Final Report

Configuring the Writer Agent

This Agent synthesizes the search results into the final report.

from pydantic import BaseModel, Field

class ReportData(BaseModel):
    short_summary: str = Field(
        description="A short 2-3 sentence summary of the findings."
    )
    markdown_report: str = Field(description="The final report")
    follow_up_questions: list[str] = Field(
        description="Suggested topics to research further"
    )

writer_agent = Agent(
    name="WriterAgent",
    instructions=(
        "You are a senior researcher tasked with writing a cohesive report for a research query. "
        "You will be provided with the original query, and some initial research done by a research assistant.\n"
        "You should first come up with an outline for the report that describes the structure and "
        "flow of the report. Then, generate the report and return that as your final output.\n"
        "The final output should be in markdown format, and it should be lengthy and detailed. Aim "
        "for 5-10 pages of content, at least 1000 words.\n"
        "Always write the report in Korean."
    ),
    model="gpt-4o-mini",
    output_type=ReportData,
)

Generating the Report

async def write_report(query: str, search_results: list[str]) -> ReportData:
    """Write the final report from the search results."""
    input_text = f"Original query: {query}\nSummarized search results: {search_results}"
    result = await Runner.run(writer_agent, input_text)
    return result.final_output_as(ReportData)

🏗️ Putting It All Together: ResearchManager

The ResearchManager class ties all the stages together.

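class ResearchManager:
    def __init__(self, api_key: str):
        """Initialize the ResearchManager with an OpenAI API key"""
        self.api_key = api_key

    async def run(self, query: str):
        """Run the deep research process, yielding status updates and then the final report"""
        # Async generator, so callers consume it with `async for`.
        # The status strings below are illustrative placeholders.
        # 1. Planning phase
        yield "Planning searches..."
        search_plan = await self.plan_searches(query)

        # 2. Searching phase
        yield "Searching..."
        search_results = await self.perform_searches(search_plan)

        # 3. Writing phase
        yield "Writing report..."
        report = await self.write_report(query, search_results)

        # Final chunk: a "---" separator followed by the report heading the UI splits on
        yield f"Research complete\n\n---\n\n## 📊 Research Report\n\n{report.markdown_report}"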
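
The docstring of `run` mentions yielding status updates, and the Gradio handler later in this post consumes it with `async for`. Below is a minimal runnable sketch of that async-generator shape. `SketchManager` and its stubbed phases are hypothetical and make no API calls.

```python
import asyncio
from typing import AsyncIterator


class SketchManager:
    # Hypothetical stand-in for ResearchManager; each phase is a stub.
    async def plan_searches(self, query: str) -> list[str]:
        await asyncio.sleep(0)
        return [f"{query} overview", f"{query} examples"]

    async def perform_searches(self, plan: list[str]) -> list[str]:
        await asyncio.sleep(0)
        return [f"summary of {item}" for item in plan]

    async def write_report(self, query: str, results: list[str]) -> str:
        await asyncio.sleep(0)
        return f"Report on {query} from {len(results)} summaries"

    async def run(self, query: str) -> AsyncIterator[str]:
        # A `yield` inside `async def` makes this an async generator.
        yield "Planning searches..."
        plan = await self.plan_searches(query)
        yield f"Running {len(plan)} searches..."
        results = await self.perform_searches(plan)
        yield "Writing report..."
        yield await self.write_report(query, results)


async def collect(query: str) -> list[str]:
    # The caller iterates with `async for`, receiving each update as it is produced.
    return [chunk async for chunk in SketchManager().run(query)]


chunks = asyncio.run(collect("asyncio"))
print(chunks)
```

Because each status message is yielded before the slow phase it announces, a UI can show progress while the work is still running.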

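API Key Management Pattern

The Gradio app is deployed to Hugging Face, so the API key has to be supplied at run time rather than configured in advance. The pattern used here sets the environment variable temporarily.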

import os

async def plan_searches(self, query: str) -> WebSearchPlan:
    """Plan the searches to perform for the query"""
    original_key = os.environ.get("OPENAI_API_KEY")
    os.environ["OPENAI_API_KEY"] = self.api_key
    try:
        result = await Runner.run(planner_agent, f"Query: {query}")
        return result.final_output_as(WebSearchPlan)
    finally:
        # Restore the original key (compare with None so an empty value is restored too)
        if original_key is not None:
            os.environ["OPENAI_API_KEY"] = original_key
        else:
            os.environ.pop("OPENAI_API_KEY", None)

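💡
When handling the API key this way, always restore the original environment variable in a try-finally block. Otherwise other code may end up using the wrong API key. Note that os.environ is shared by the whole process, so concurrent requests with different keys can still overlap while one of them is awaiting.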
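
The set-then-restore logic can be factored into a reusable context manager so that it is not repeated in every method. This is a sketch, not the original project's code. `temporary_env` is a hypothetical helper, and the demo uses a made-up `DEMO_API_KEY` variable so that it does not touch a real key.

```python
import os
from contextlib import contextmanager
from typing import Iterator, Optional


@contextmanager
def temporary_env(name: str, value: str) -> Iterator[None]:
    # Hypothetical helper: set an environment variable only for the duration of the block.
    original: Optional[str] = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        # Compare against None so that an originally empty string is restored too.
        if original is not None:
            os.environ[name] = original
        else:
            os.environ.pop(name, None)


os.environ.pop("DEMO_API_KEY", None)
with temporary_env("DEMO_API_KEY", "user-supplied-key"):
    inside = os.environ["DEMO_API_KEY"]
after = os.environ.get("DEMO_API_KEY")
print(inside, after)
```

A `with` block works inside `async def` as well, so a method body could wrap its `await Runner.run(...)` call in `with temporary_env("OPENAI_API_KEY", self.api_key):`.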

🖥️ Gradio UI Integration

The Gradio UI provides real-time status updates.

import gradio as gr

async def run(query: str, api_key: str):
    """Run research and yield status updates"""
    # Validate API key
    if not api_key or not api_key.strip():
        yield ("", "❌ **Error: Please provide a valid OpenAI API key**", "")
        return

    # Clear query box and show initial status
    yield ("", "🚀 **Starting research...**", "")

    status_text = ""
    final_report = ""

    async for chunk in ResearchManager(api_key).run(query):
        # Check if this chunk contains the final report
        if "---" in chunk:
            parts = chunk.split("## 📊 Research Report")
            if len(parts) == 2:
                status_text = parts[0]
                final_report = "## 📊 Research Report" + parts[1]
                yield ("", status_text, final_report)
        else:
            status_text = chunk
            yield ("", status_text, final_report)

CSS Animation

.status-container {
    animation: fadeBlur 1.5s ease-in-out infinite;
}

@keyframes fadeBlur {
    0%, 100% {
        opacity: 1;
        filter: blur(0px);
    }
    50% {
        opacity: 0.6;
        filter: blur(0.5px);
    }
}

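🎯 Core Design Principles

1. Separation of Concerns

The Plan, Search, and Report stages are each handled by an independent Agent, which keeps the concerns clearly separated.

2. Reliability Through Structured Output

Using Pydantic models to enforce each Agent's output format guards against parsing errors and unpredictable responses.

3. Performance Optimization with AsyncIO

Running the searches concurrently cuts the total run time substantially. With N searches of similar latency, the search stage can approach an N-fold speedup over sequential execution.

4. Error Handling and Resilience

API key management, exception handling, and restoring environment variables keep the system stable.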

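💡 Conclusion

Combining the OpenAI SDK's Agent framework with Python's AsyncIO produced an efficient Deep Research system.
Three things are central:

Control through Structured Output: Pydantic models give structure to LLM responses

Performance through concurrency: AsyncIO runs multiple searches at once

Clear stage separation: the three-stage Plan → Search → Report pipeline

This pattern goes beyond simple Q&A and is effective for building AI systems that deliver logical, systematic, in-depth research results.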
