Building Agentic RAG with LlamaIndex - 4. Building a Multi-Document Agent

jihyelee · June 1, 2024

retrieval-augmented-generation


Multi-Document Agent

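When the number of documents is small (e.g., 3)

Create a vector tool and a summary tool for each of the 3 documents
- This gives the agent 6 tools in total
The agent can use several of these tools together to answer a user's question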

from pathlib import Path

from llama_index.core.agent import AgentRunner, FunctionCallingAgentWorker
from llama_index.llms.openai import OpenAI  # needed for the llm argument below

from utils import get_doc_tools  # local helper module (not shown in this post)

urls = [
    "https://openreview.net/pdf?id=VtmBAGCN7o",
    "https://openreview.net/pdf?id=6PmJoRfdaK",
    "https://openreview.net/pdf?id=hSyW5go0v8",
]
 
papers = [
    "metagpt.pdf",
    "longlora.pdf",
    "selfrag.pdf",
]
 
paper_to_tools_dict = {}
for paper in papers:
    print(f"Getting tools for paper: {paper}")
    vector_tool, summary_tool = get_doc_tools(paper, Path(paper).stem)
    paper_to_tools_dict[paper] = [vector_tool, summary_tool]
 
# Flatten into one list: a vector tool and a summary tool for each of the 3 documents
initial_tools = [t for paper in papers for t in paper_to_tools_dict[paper]]
 
agent_worker = FunctionCallingAgentWorker.from_tools(
    initial_tools,
    llm=OpenAI(model="gpt-3.5-turbo"),
    verbose=True
)
agent = AgentRunner(agent_worker)  # the agent can call several tools while answering
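
The list comprehension that builds `initial_tools` flattens the per-paper dictionary into a single list, two tools per paper. Below is a minimal sketch with stand-in objects so it runs without LlamaIndex. `FakeTool`, `fake_get_doc_tools`, and the `vector_tool_<name>` / `summary_tool_<name>` naming scheme are assumptions made for illustration; the real `get_doc_tools` helper is not shown in this post and may behave differently.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class FakeTool:
    """Hypothetical stand-in for a LlamaIndex tool; it only carries a name."""
    name: str


def fake_get_doc_tools(file_path: str, name: str) -> tuple[FakeTool, FakeTool]:
    # Assumed naming scheme, for illustration only.
    return FakeTool(f"vector_tool_{name}"), FakeTool(f"summary_tool_{name}")


papers = ["metagpt.pdf", "longlora.pdf", "selfrag.pdf"]

# One entry per paper, each holding [vector_tool, summary_tool].
paper_to_tools = {
    paper: list(fake_get_doc_tools(paper, Path(paper).stem)) for paper in papers
}

# Flatten in paper order: two tools per paper.
initial_tools = [tool for paper in papers for tool in paper_to_tools[paper]]
tool_names = [tool.name for tool in initial_tools]
print(len(initial_tools), tool_names)
```

Keeping the dictionary keyed by paper makes it easy to rebuild the flat list for any subset of papers later.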

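When the number of documents is large (e.g., 11)

Creating tools for every document as above can cause the following problems
- Not all tools fit in the prompt (context length limit)
- Higher cost and latency (more prompt tokens per call)
- The LLM gets confused and may fail to pick the right tool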
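
To make the cost argument concrete, here is a rough back-of-the-envelope sketch of how the prompt overhead grows with the number of documents. The figure of 100 tokens per tool definition is a purely hypothetical placeholder, not a measured value; actual sizes depend on each tool's name, description, and schema.

```python
def tool_prompt_overhead(
    num_docs: int,
    tools_per_doc: int = 2,      # one vector tool + one summary tool per document
    tokens_per_tool: int = 100,  # hypothetical average size of one tool definition
) -> int:
    """Rough number of prompt tokens spent on tool definitions in every LLM call."""
    return num_docs * tools_per_doc * tokens_per_tool


for n in (3, 11, 100):
    print(f"{n} docs -> {n * 2} tools, ~{tool_prompt_overhead(n)} tokens per call")
```

The overhead is paid on every call in the agent's reasoning loop, which is why it affects both cost and latency.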

Tool Retrieval

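Performing retrieval augmentation at the tool level, rather than the text level, addresses these problems
- Retrieve the tools most relevant to the query
- Include only those selected tools in the inference prompt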

# define an "object" index and retriever over these tools
from llama_index.core import VectorStoreIndex
from llama_index.core.agent import AgentRunner, FunctionCallingAgentWorker
from llama_index.core.objects import ObjectIndex
from llama_index.llms.openai import OpenAI

# Assumption: paper_to_tools_dict was built as before, but over the full list of 11 papers
all_tools = [t for paper in papers for t in paper_to_tools_dict[paper]]

# Tools are Python objects, so they are converted to a string representation before indexing
obj_index = ObjectIndex.from_objects(
    all_tools,
    index_cls=VectorStoreIndex,
)

# Retrieve the 3 most relevant tools for each query
obj_retriever = obj_index.as_retriever(similarity_top_k=3)

# Assumption: same model as in the 3-document example
llm = OpenAI(model="gpt-3.5-turbo")

# A system prompt can be added when the agent needs extra guidance
agent_worker = FunctionCallingAgentWorker.from_tools(
    tool_retriever=obj_retriever,
    llm=llm,
    system_prompt="""\
You are an agent designed to answer queries over a set of given papers.
Please always use the tools provided to answer a question. Do not rely on prior knowledge.
""",
    verbose=True,
)
agent = AgentRunner(agent_worker)
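
The idea behind tool retrieval can be illustrated without LlamaIndex or an embedding model. The sketch below is a toy stand-in, not how `ObjectIndex` is implemented: it represents each tool by its name and description, scores it against the query with bag-of-words cosine similarity, and keeps the top-k. The tool names and descriptions in `TOOLS` are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical tool names and descriptions, for illustration only.
TOOLS = {
    "vector_tool_metagpt": "Answer specific questions about the MetaGPT paper on multi-agent collaboration",
    "summary_tool_metagpt": "Summarize the MetaGPT paper on multi-agent collaboration",
    "vector_tool_longlora": "Answer specific questions about the LongLoRA paper on long-context fine-tuning",
    "summary_tool_longlora": "Summarize the LongLoRA paper on long-context fine-tuning",
    "vector_tool_selfrag": "Answer specific questions about the Self-RAG paper on retrieval with self-reflection",
    "summary_tool_selfrag": "Summarize the Self-RAG paper on retrieval with self-reflection",
}


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real system would use an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve_tools(query: str, top_k: int = 2) -> list[str]:
    """Return the names of the top_k tools whose name + description best match the query."""
    q = embed(query)
    ranked = sorted(
        TOOLS,
        key=lambda name: cosine(q, embed(name + " " + TOOLS[name])),
        reverse=True,
    )
    return ranked[:top_k]


selected = retrieve_tools("How does LongLoRA handle long-context fine-tuning?", top_k=2)
print(selected)
```

Only the selected tools would then be passed to the LLM, so the prompt size stays roughly constant no matter how many documents are indexed.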

