🧭 LangGraph Complete Guide ⑥: Graph-Based RAG (Retrieval + Generation System)

okorion · October 4, 2025

“Going beyond the limits of the LLM: building an intelligent graph strengthened by retrieval.”
We implement a RAG (retrieve-generate-evaluate-correct) loop with LangGraph.


🧩 1. What is RAG?

RAG (Retrieval-Augmented Generation) is an architecture that combines retrieval with generation.

A plain LLM is limited to what it learned during training. RAG searches external knowledge (for example documents, a database, or a vector DB) and feeds the results into the LLM's input, which improves accuracy.
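
The idea can be sketched without any framework. The snippet below is only a toy illustration: `KNOWLEDGE`, `retrieve_top_k`, and `build_prompt` are hypothetical names, and simple word overlap stands in for a real vector search.

```python
# Toy retrieval-augmented prompting: word overlap instead of embeddings.
KNOWLEDGE = [
    "LangGraph is a graph workflow engine in the LangChain ecosystem.",
    "Pinecone and Chroma are vector databases.",
    "RAG feeds retrieved documents into the LLM prompt.",
]

def tokenize(text: str) -> set[str]:
    # Lowercase words with surrounding punctuation stripped
    return {w.strip(".,?!").lower() for w in text.split()}

def retrieve_top_k(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank documents by how many words they share with the query
    q = tokenize(query)
    ranked = sorted(docs, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # The retrieved text becomes part of the LLM input
    context = "\n".join(f"- {d}" for d in docs)
    return f"Documents:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = retrieve_top_k("What is LangGraph?", KNOWLEDGE, k=1)
prompt = build_prompt("What is LangGraph?", docs)
print(prompt)
```

In a real system the ranking step would be a similarity query against a vector DB, and `prompt` would be passed to the LLM.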


⚙️ 2. How LangGraph handles RAG

LangGraph expresses the conventional sequential query → retrieval → generation pipeline as a graph, so conditional branches, feedback, and retry loops are controlled explicitly.

📊 General structure: retrieve → generate → evaluate, then either finish or loop back to retrieve.

This structure maps directly onto code.


💡 3. Key concepts at a glance

| Concept | Role | LangGraph representation |
| --- | --- | --- |
| Retrieval | Search external knowledge | Node (VectorDB query) |
| Generation | Generate a response from the retrieved results | Node (LLM call) |
| Evaluation | Assess response quality | Conditional Edge |
| Correction | Re-retrieve / regenerate when quality is low | Feedback Loop |
| HITL | Human-in-the-Loop (human intervention) | SubGraph / Manual Node |

🧠 4. A basic RAG implemented with LangGraph

Below is an example that implements a three-step graph: retrieve → generate → evaluate.

from typing import TypedDict

from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, END

# State definition: LangGraph passes state as a dict, so declare it as a TypedDict
class RAGState(TypedDict, total=False):
    query: str
    docs: list[str]
    answer: str
    score: float

# Retrieval node
def retrieve(state: RAGState):
    # In practice, call a VectorDB here (e.g. Pinecone, Chroma)
    fake_docs = [
        "LangGraph is a graph workflow engine in the LangChain ecosystem.",
        "LangGraph controls the LLM execution flow with nodes and edges.",
    ]
    print(f"[Retrieve] {len(fake_docs)} documents retrieved")
    return {"docs": fake_docs}  # nodes return partial state updates

# Generation node
def generate(state: RAGState):
    llm = ChatOpenAI(model="gpt-4o-mini")
    context = "\n".join(state["docs"])
    prompt = f"Documents:\n{context}\n\nQuestion: {state['query']}\nAnswer:"
    result = llm.invoke(prompt)
    print("[Generate] Answer generated")
    return {"answer": result.content}

# Evaluation node: only stores the score; routing is done by the conditional edge
def evaluate(state: RAGState):
    llm = ChatOpenAI(model="gpt-4o-mini")
    eval_prompt = (
        "Rate the accuracy of the following answer with a score between 0 and 1. "
        f"Reply with the number only.\n{state['answer']}"
    )
    result = llm.invoke(eval_prompt)
    try:
        score = float(result.content.strip())
    except ValueError:
        score = 0.7  # default; below 0.8, so an unparsable reply triggers a retry
    print(f"[Evaluate] Quality score: {score}")
    return {"score": score}

# Routing function: returns a label that the conditional edge maps to a node
def route(state: RAGState):
    return "finish" if state["score"] >= 0.8 else "retry"

# Build the graph
graph = StateGraph(RAGState)
graph.add_node("retrieve", retrieve)
graph.add_node("generate", generate)
graph.add_node("evaluate", evaluate)

graph.set_entry_point("retrieve")
graph.add_edge("retrieve", "generate")
graph.add_edge("generate", "evaluate")
# "finish" ends the run; "retry" loops back to retrieval.
# The loop is bounded only by the graph's recursion limit.
graph.add_conditional_edges("evaluate", route, {"finish": END, "retry": "retrieve"})

app = graph.compile()
app.invoke({"query": "What is LangGraph?"})

🧩 Example output:

[Retrieve] 2 documents retrieved
[Generate] Answer generated
[Evaluate] Quality score: 0.85
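
Because the sample documents never change, a score below 0.8 sends the graph around the same loop again. One way to bound this is to let the routing function give up after a fixed number of tries. This is only a sketch: it assumes the state carries an extra `attempts` counter that the example above does not have, and `MAX_ATTEMPTS` and `route_after_evaluate` are hypothetical names.

```python
MAX_ATTEMPTS = 3        # hypothetical cap on retrieve/generate/evaluate rounds
SCORE_THRESHOLD = 0.8   # same threshold as the example above

def route_after_evaluate(state: dict) -> str:
    """Return "finish" when the answer is good enough or retries are exhausted."""
    if state.get("score", 0.0) >= SCORE_THRESHOLD:
        return "finish"
    if state.get("attempts", 0) >= MAX_ATTEMPTS:
        return "finish"  # give up instead of looping forever
    return "retry"

print(route_after_evaluate({"score": 0.85, "attempts": 1}))
print(route_after_evaluate({"score": 0.50, "attempts": 1}))
print(route_after_evaluate({"score": 0.50, "attempts": 3}))
```

To use it, the retrieval node would also have to increment `attempts` in the state update it returns, and this function would be the routing function given to `add_conditional_edges`.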

🔁 5. Adaptive / Self / Corrective RAG structures

With LangGraph, RAG can evolve from a single-pass pipeline into an intelligent loop.

| Type | Description | Graph characteristic |
| --- | --- | --- |
| Adaptive RAG | Adjusts retrieval depth to the difficulty of the question | “Conditional branch based on question length or difficulty” |
| Self-RAG | The LLM evaluates and revises its own answer | “Self-evaluation loop” |
| Corrective RAG | Revises using the LLM plus user feedback | “HITL + SubGraph structure” |

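🧩 Adaptive RAG example (retrieval depth based on question difficulty)

from typing import TypedDict
from langgraph.graph import StateGraph

class AdaptiveState(TypedDict, total=False):
    query: str
    docs: list[str]
    depth: str  # "shallow" or "deep"

def choose_strategy(state: AdaptiveState):
    # Question length is used as a crude proxy for difficulty
    depth = "shallow" if len(state["query"]) < 20 else "deep"
    return {"depth": depth}

def retrieve_shallow(state: AdaptiveState): ...
def retrieve_deep(state: AdaptiveState): ...

adaptive_graph = StateGraph(AdaptiveState)
adaptive_graph.add_node("strategy", choose_strategy)
adaptive_graph.add_node("shallow", retrieve_shallow)
adaptive_graph.add_node("deep", retrieve_deep)
adaptive_graph.set_entry_point("strategy")
# The router returns the chosen depth, which is mapped to the node of the same name
adaptive_graph.add_conditional_edges(
    "strategy", lambda s: s["depth"], {"shallow": "shallow", "deep": "deep"}
)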

📊 Visualization: strategy → shallow | deep


🧩 Corrective RAG (with a feedback loop)

def human_feedback(state):
    # Blocks on stdin, which is fine for a local demo; the state schema needs a "feedback" key
    feedback = input("Enter your feedback on the answer: ")
    return {"feedback": feedback}

Adding the human_feedback node to the graph lets the user step in directly in the middle of the loop (HITL, Human-in-the-Loop).
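
What happens once the feedback is collected is left open. The sketch below shows one option, under the assumption that an empty reply means the user accepts the answer. `route_on_feedback` and `apply_feedback` are hypothetical helpers, not LangGraph APIs.

```python
def route_on_feedback(state: dict) -> str:
    """Pick the next step: accept the answer, or revise it using the feedback."""
    feedback = state.get("feedback", "").strip()
    return "revise" if feedback else "accept"

def apply_feedback(state: dict) -> dict:
    """Fold the feedback into the query so the next retrieval round can use it."""
    return {"query": f"{state['query']}\n(User feedback: {state['feedback']})"}

state = {"query": "What is LangGraph?", "feedback": "Mention nodes and edges."}
if route_on_feedback(state) == "revise":
    state.update(apply_feedback(state))
print(state["query"])
```

In a graph, `route_on_feedback` would serve as the routing function of a conditional edge after `human_feedback`, with "revise" mapped back to the retrieval node and "accept" mapped to the end of the graph.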


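🔁 6. Modularizing with a SubGraph

In LangGraph a compiled graph can itself be added as a node of another graph (a subgraph), so the entire RAG pipeline can be bundled into one module and reused elsewhere.

from langgraph.graph import StateGraph, START, END

# LangGraph has no SubGraph class: a compiled graph can be added as a node directly
rag_module = graph.compile()
main_graph = StateGraph(RAGState)  # parent graph sharing the RAGState keys
main_graph.add_node("RAGPipeline", rag_module)
main_graph.add_edge(START, "RAGPipeline")
main_graph.add_edge("RAGPipeline", END)

✅ This way the “retrieve-generate-evaluate” RAG flow can be reused as a unit in other agents or systems.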


📈 7. Performance management checklist

When operating RAG at scale, be sure to consider the following:

| Point | Description |
| --- | --- |
| Caching | Store responses to repeated questions |
| Async processing | Run retrieval and generation calls without blocking (asyncio) |
| Log tracing | Record latency per node |
| Error recovery | Fall back when a VectorDB / LLM call fails |
| Parallel retrieval | Query multiple sources at the same time |

Because every step is a node, each of these controls can be attached at the graph level.
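
Three of these points (parallel retrieval, error fallback, per-call latency logging) can be sketched with the standard library alone. Everything here is illustrative: `search_source` is a hypothetical stand-in for a VectorDB or API call, and the source names and delays are made up.

```python
import asyncio
import time

async def search_source(name: str, query: str, delay: float, fail: bool = False) -> list[str]:
    """Hypothetical stand-in for a VectorDB or API call."""
    await asyncio.sleep(delay)  # simulate network latency
    if fail:
        raise ConnectionError(f"{name} unavailable")
    return [f"{name}: result for {query!r}"]

async def search_with_fallback(name: str, query: str, delay: float, fail: bool = False) -> list[str]:
    start = time.perf_counter()
    try:
        docs = await search_source(name, query, delay, fail)
    except ConnectionError:
        docs = []  # fallback: treat a failed source as empty
    latency = time.perf_counter() - start
    print(f"[{name}] {latency:.3f}s, {len(docs)} docs")  # per-call latency log
    return docs

async def parallel_retrieve(query: str) -> list[str]:
    # gather() runs the searches concurrently and keeps the argument order
    results = await asyncio.gather(
        search_with_fallback("docs", query, 0.05),
        search_with_fallback("wiki", query, 0.05),
        search_with_fallback("db", query, 0.01, fail=True),
    )
    return [doc for docs in results for doc in docs]

docs = asyncio.run(parallel_retrieve("What is LangGraph?"))
print(docs)
```

The failing source costs nothing beyond its own latency, and the total wait is close to the slowest source rather than the sum of all three.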


🧭 8. Next episode preview

👉 Part 7: LangGraph + UI integration (Gradio, Streamlit)
There we visualize the RAG graph built here and connect it to an interactive interface.
We will build a dashboard-style LangGraph environment where users can enter queries and check results in real time.


🎓 9. Advanced study guide for going deeper

| Topic | Why study it | Suggested direction |
| --- | --- | --- |
| VectorDB optimization | Improve retrieval speed and accuracy | Indexing practice with Pinecone, Chroma, FAISS |
| Evaluation Loop design | Core of implementing Self-RAG | Model evaluation / scoring loops |
| SubGraph modularization | Maintainability of large graphs | Graph Composition, Nested Graph |
| LangGraph HITL design | Human-in-the-loop evaluation systems | Human-in-the-Loop workflow |
| LangGraph + LangServe deployment | Exposing the RAG system as an API | LangServe or FastAPI endpoint |
| RAG metric design | Quality management | Recall@k, F1, Faithfulness metric |
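
As a starting point for the metrics row, Recall@k can be computed in a few lines. This is a minimal sketch with made-up document IDs: it measures the fraction of relevant documents that appear in the top k retrieved results.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant documents found among the top-k retrieved ones."""
    if not relevant:
        return 0.0  # convention chosen here for an empty ground-truth set
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)

retrieved = ["d1", "d3", "d2"]   # ranked retriever output (hypothetical IDs)
relevant = {"d1", "d2"}          # ground-truth relevant documents

print(recall_at_k(retrieved, relevant, k=2))
print(recall_at_k(retrieved, relevant, k=3))
```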

📚 Key takeaways

  • LangGraph lets you control RAG's retrieve-generate-evaluate loop at the graph level.
  • Extending it to Adaptive / Self / Corrective RAG lets the pipeline evaluate and correct its own answers, with user feedback where needed.
  • SubGraph, Feedback, and HITL make it possible to design a production-grade RAG pipeline.

💡 RAG is not just an auxiliary feature; in LangGraph it is “the core of the intelligent loop.”
