Video | Summary | Lecture Link | Slides |
---|---|---|---|
Part 5 (Multi-Query) | Explains query rewriting techniques for retrieving a more diverse set of documents. | 📌 Lecture | 📖 Slides |
Part 6 (RAG Fusion) | Introduces RAG Fusion, which combines results from multiple searches to produce an improved ranking. | 📌 Lecture | 📖 Slides |
Part 7 (Decomposition) | Discusses breaking a complex question into finer-grained sub-questions to produce detailed answers. | 📌 Lecture | 📖 Slides |
Part 8 (Step-Back) | Explores step-back prompting, which generates abstract questions that elicit foundational understanding. | 📌 Lecture | 📖 Slides |
Part 9 (HyDE) | Introduces the HyDE technique, which generates hypothetical documents that better match the indexed documents. | 📌 Lecture | 📖 Slides |
This part focuses on the Multi-Query technique.
Code Demonstration
Loading the Blog Post and Preparing the Vector Store / Retriever
import bs4
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
loader = WebBaseLoader(
web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
bs_kwargs=dict(
parse_only=bs4.SoupStrainer(
class_=("post-content", "post-title", "post-header")
)
),
)
blog_docs = loader.load()
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
chunk_size=300,
chunk_overlap=50
)
splits = text_splitter.split_documents(blog_docs)
vectorstore = Chroma.from_documents(documents=splits,
embedding=OpenAIEmbeddings())
retriever = vectorstore.as_retriever()
The page is parsed with bs4, and the resulting data is split and indexed into the vector store.
from langchain.prompts import ChatPromptTemplate
template = """
You are an AI language model assistant.
Your task is to generate five different versions of the given user question to retrieve relevant documents from a vector database.
By generating multiple perspectives on the user question, your goal is to help the user overcome some of the limitations of the distance-based similarity search.
Provide these alternative questions separated by newlines.
Original question: {question}
"""
prompt_perspectives = ChatPromptTemplate.from_template(template)
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI
from langchain.load import dumps, loads
generate_queries = (
prompt_perspectives
| ChatOpenAI(temperature=0)
| StrOutputParser()
| (lambda x: x.split("\n"))
)
def get_unique_union(documents: list[list]):
    """ Unique union of retrieved docs """
    flattened_docs = [dumps(doc) for sublist in documents for doc in sublist]
    unique_docs = list(set(flattened_docs))
    return [loads(doc) for doc in unique_docs]
retrieval_chain = generate_queries | retriever.map() | get_unique_union
question = "What is task decomposition for LLM agents?"
docs = retrieval_chain.invoke({"question":question})
len(docs)
Independent retrievals are run with each of the generated questions, and the results are merged so that only unique documents are returned.
generate_queries: generates queries that view the given question from multiple perspectives.
a. The prompt is processed by the ChatOpenAI model.
b. The model output is parsed with StrOutputParser.
c. The result is split on newline characters (\n) to produce multiple queries.
retriever: retrieves documents using the generated queries.
a. retriever.map() runs a retrieval for each generated query.
get_unique_union: removes duplicates from the retrieved documents.
a. The get_unique_union function deduplicates across all retrieved documents. unique_docs = list(set(flattened_docs)): a set removes the duplicates, and the results are then converted back to their original form.
b. By using dumps and loads, Document objects are deduplicated based on their content and then restored to their original object form, as the sketch below illustrates.
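A minimal standalone sketch (not from the lecture) of why dumps and loads are used here: Document objects with identical content are still distinct Python objects and are not reliably comparable or hashable by content, so they are serialized to JSON strings before being placed into a set. The sample documents below are made up, and the snippet reuses the get_unique_union function defined above.
# Illustrative sketch: deduplicating Document objects by content
from langchain_core.documents import Document

d1 = Document(page_content="Task decomposition breaks a goal into subgoals.")
d2 = Document(page_content="Task decomposition breaks a goal into subgoals.")  # same content, different object
d3 = Document(page_content="Agents combine planning, memory, and tool use.")

# Two result lists, as if they came from two generated queries
unique = get_unique_union([[d1, d3], [d2]])
print(len(unique))  # 2 -- the duplicate content is removed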
from operator import itemgetter
from langchain_openai import ChatOpenAI
from langchain_core.runnables import RunnablePassthrough
template = """
Answer the following question based on this context:
{context}
Question:
{question}
"""
prompt = ChatPromptTemplate.from_template(template)
llm = ChatOpenAI(temperature=0)
# retrieval_chain = generate_queries | retriever.map() | get_unique_union
final_rag_chain = (
{"context": retrieval_chain,
"question": itemgetter("question")}
| prompt
| llm
| StrOutputParser()
)
final_rag_chain.invoke({"question":question})
retrieval_chain supplies the context, and itemgetter("question") extracts the question from the input.
💡 Let's take a closer look at itemgetter
The itemgetter function: itemgetter from the operator module creates a callable that extracts the value at a given key or index from a dictionary or sequence. Here, itemgetter("question") is used to extract the value of the "question" key from the input dictionary.
How itemgetter interacts with .invoke: when final_rag_chain.invoke({"question": question}) is called, itemgetter("question") extracts the value of the "question" key from that input dictionary. In {"context": retrieval_chain, "question": itemgetter("question")}, itemgetter("question") is assigned to the "question" key; when .invoke({"question": question}) runs, this dictionary becomes the chain's input, and itemgetter("question") extracts the "question" value from it. A short standalone example follows below.
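A tiny standalone illustration of itemgetter (not from the lecture code; the dictionary here is made up):
from operator import itemgetter

get_question = itemgetter("question")
example_input = {"question": "What is task decomposition?", "other": 123}
print(get_question(example_input))  # -> 'What is task decomposition?'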
Intuition behind RAG Fusion
The core idea of RAG Fusion, much like the multi-query approach, is to run several retrievals to produce a ranked list of documents for each generated question, and then use the RRF algorithm to combine each document's ranks and select the documents with the best fused ranking. This lets the independently retrieved result lists be fused into an optimized final set of documents.
Reciprocal Rank Fusion (RRF)
RRF merges search results based on "rank" rather than "score": it records each document's position (rank) within each result list and uses those ranks to decide which documents matter most overall.
In RRF, each document's score is computed as:
$$\mathrm{RRF}(d) = \sum_{q \in Q} \frac{1}{k + \mathrm{rank}_q(d)}$$
where k is a small constant (typically around 60) and rank_q(d) is the rank of document d in the result list for query q. The better a document is ranked, i.e., the closer it sits to the top of a list, the larger its contribution to the score.
Let's look at an example of how this formula works:
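For instance (an illustrative calculation with k = 60 and the 0-based positions used by the code below): suppose document A appears at position 0 in the results for one generated query and at position 2 for another, while document B shows up only once, at position 0. Then
score(A) = 1/(60+0) + 1/(60+2) ≈ 0.0167 + 0.0161 = 0.0328
score(B) = 1/(60+0) ≈ 0.0167
so A, which ranks well across multiple queries, ends up above B in the fused ranking.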
Code Demonstration
1. Defining the RAG Fusion Prompt and Generating Multiple Queries
from langchain.prompts import ChatPromptTemplate
template = """
You are a helpful assistant that generates multiple search queries based on a single input query. \n
Generate multiple search queries related to: {question} \n
Output (4 queries):
"""
prompt_rag_fusion = ChatPromptTemplate.from_template(template)
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI
generate_queries = (
prompt_rag_fusion
| ChatOpenAI(temperature=0)
| StrOutputParser()
| (lambda x: x.split("\n"))
)
The (lambda x: x.split("\n")) step splits the LLM output on newlines into a list of queries.
2. Defining the Reciprocal Rank Fusion (RRF) Function and Running Retrieval
from langchain.load import dumps, loads
def reciprocal_rank_fusion(results: list[list], k=60):
    """ Reciprocal_rank_fusion that takes multiple lists of ranked documents
    and an optional parameter k used in the RRF formula """
    # Initialize a dictionary to hold fused scores for each unique document
    fused_scores = {}
    # Iterate through each list of ranked documents
    for docs in results:
        # Iterate through each document in the list, with its rank (position in the list)
        for rank, doc in enumerate(docs):
            # Convert the document to a string format to use as a key (assumes documents can be serialized to JSON)
            doc_str = dumps(doc)
            # If the document is not yet in the fused_scores dictionary, add it with an initial score of 0
            if doc_str not in fused_scores:
                fused_scores[doc_str] = 0
            # Retrieve the current score of the document, if any
            previous_score = fused_scores[doc_str]
            # Update the score of the document using the RRF formula: 1 / (rank + k)
            fused_scores[doc_str] = previous_score + 1 / (rank + k)
    # Sort the documents based on their fused scores in descending order to get the final reranked results
    reranked_results = [
        (loads(doc), score)
        for doc, score in sorted(fused_scores.items(), key=lambda x: x[1], reverse=True)
    ]
    # Return the reranked results as a list of tuples, each containing the document and its fused score
    return reranked_results
retrieval_chain_rag_fusion = generate_queries | retriever.map() | reciprocal_rank_fusion
docs = retrieval_chain_rag_fusion.invoke({"question": question})
len(docs)
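Note that reciprocal_rank_fusion returns (document, fused score) tuples, so the top fused results can be inspected like this (an optional, illustrative snippet, not from the lecture):
# Peek at the top three fused results; each entry is a (Document, score) tuple
for doc, score in docs[:3]:
    print(round(score, 4), doc.page_content[:80])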
3. Defining the Final RAG Chain
from langchain_core.runnables import RunnablePassthrough
template = """Answer the following question based on this context:
{context}
Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
final_rag_chain = (
{"context": retrieval_chain_rag_fusion,
"question": itemgetter("question")}
| prompt
| llm
| StrOutputParser()
)
final_rag_chain.invoke({"question":question})
Summary
Interim Recap
We've now looked at the Multi-Query technique and RAG Fusion, so let's recap before moving on.
RAG Fusion
Multi-Query Technique (Multi-Query Retriever)
Problem Definition and Approach
Intuition behind the Decomposition Approach
Code Demonstration
1. Defining the Decomposition Prompt / Generating Sub-Questions with the LLM
from langchain.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser
# prompt template for decomposition
template = """
You are a helpful assistant that generates multiple sub-questions related to an input question. \n
The goal is to break down the input into a set of sub-problems / sub-questions that can be answered in isolation. \n
Generate multiple search queries related to: {question} \n
Output (3 queries):
"""
prompt_decomposition = ChatPromptTemplate.from_template(template)
llm = ChatOpenAI(temperature=0)
generate_queries_decomposition = (
prompt_decomposition
| llm
| StrOutputParser()
| (lambda x: x.split("\n"))
)
# Example question
question = "What are the main components of an LLM-powered autonomous agent system?"
questions = generate_queries_decomposition.invoke({"question":question})
In the example above, you can see that generate_queries_decomposition decomposes the question "What are the main components of an LLM-powered autonomous agent system?" into three sub-questions.
2. Answering Each Sub-Question and Processing Them Sequentially
from operator import itemgetter
from langchain_core.output_parsers import StrOutputParser
# prompt template for RAG
template = """
Here is the question you need to answer:
\n --- \n {question} \n --- \n
Here is any available background question + answer pairs:
\n --- \n {q_a_pairs} \n --- \n
Here is additional context relevant to the question:
\n --- \n {context} \n --- \n
Use the above context and any background question + answer pairs to answer the question: \n {question}
"""
decomposition_prompt = ChatPromptTemplate.from_template(template)
def format_qa_pair(question, answer):
    """
    Format question and answer pairs for inclusion in the prompt
    """
    formatted_string = ""
    formatted_string += f"Question: {question}\nAnswer: {answer}\n\n"
    return formatted_string.strip()
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
# Initialize an empty string to accumulate question-answer pairs
q_a_pairs = ""
for q in questions:
    rag_chain = (
        {
            "context": itemgetter("question") | retriever,
            "question": itemgetter("question"),
            "q_a_pairs": itemgetter("q_a_pairs")
        }
        | decomposition_prompt
        | llm
        | StrOutputParser()
    )
    answer = rag_chain.invoke({"question": q, "q_a_pairs": q_a_pairs})
    q_a_pair = format_qa_pair(q, answer)
    q_a_pairs = q_a_pairs + "\n---\n" + q_a_pair
Retrieval is run sequentially over the three sub-questions generated above, and the answer to each earlier question is fed into the next so the problem is solved incrementally.
q_a_pair = format_qa_pair(q, answer) wraps the previous answer, and passing it in via rag_chain.invoke({"question": q, "q_a_pairs": q_a_pairs}) lets the chain refer to earlier answers when producing the current one.
(Example) Input to the 2nd prompt
# To answer question 2, the chain references question 1's Question and Answer via q_a_pairs
{
"question":
"2. How do autonomous agents integrate LLMs into their architecture?",
"q_a_pairs":
"\n---\nQuestion: 1. What are the core elements of a large language model (LLM)?\n
Answer: The core elements of a large language model (LLM) include:\n\n1. **Architecture**: The foundational design of the LLM, typically involving layers of neural networks such as transformers. This architecture determines how the model processes and generates language.\n\n2. **Training Data**: The corpus of text data used to train the model. This data is crucial for the model to learn language patterns, grammar, facts, and even some reasoning capabilities.\n\n3. **Training Process**: The method by which the model learns from the training data, often involving techniques like supervised learning, unsupervised learning, or reinforcement learning. This process includes fine-tuning and adjusting the model's parameters to improve its performance.\n\n4. **Tokenization**: The process of breaking down text into smaller units (tokens) that the model can understand and process. Tokenization is essential for handling different languages, special characters, and various text structures.\n\n5. **Context Handling**: The mechanism by which the model understands and maintains the context of a conversation or text. This includes managing the finite context length and using techniques like attention mechanisms to focus on relevant parts of the input.\n\n6. **Memory**: Systems that allow the model to store and recall information beyond the immediate context window. This can involve techniques like vector stores and retrieval systems to access a larger knowledge pool.\n\n7. **Inference Mechanism**: The process by which the model generates responses based on the input it receives. This includes the model's ability to perform tasks like text generation, translation, summarization, and more.\n\n8. **Optimization and Planning**: For advanced applications, LLMs may include components for planning, breaking down tasks into subgoals, and refining actions based on self-reflection and feedback.\n\nThese elements work together to enable the LLM to perform a wide range of language-related tasks effectively."
}
Final answer: this is the answer built up by answering questions 1 → 2 → 3 in sequence and refining along the way.
Looking at the content, it seems roughly correct.
The essential technologies supporting an LLM-powered autonomous agent include:
Large Language Models (LLMs):
Planning Technologies:
Memory Systems:
Inter-Agent Communication:
Environment Interaction:
Proof-of-Concept Implementations:
Together, these technologies enable LLM-powered autonomous agents to plan, learn, adapt, and interact effectively, supporting their function as powerful general problem solvers.
3. Generating the Final Answer from the Accumulated Q&A Pairs
# After processing sub-questions and accumulating q_a_pairs
final_prompt_template = """
You are a knowledgeable assistant.
Here is the original question:
{original_question}
Here are the relevant question and answer pairs that may help you:
{q_a_pairs}
Using the information above, please provide a detailed and comprehensive answer to the original question.
"""
final_prompt = ChatPromptTemplate.from_template(final_prompt_template)
# Reuse or initialize the LLM
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
# Create the chain
final_chain = (
final_prompt
| llm
| StrOutputParser()
)
# Invoke the chain to get the final answer
final_answer = final_chain.invoke({"original_question": question, "q_a_pairs": q_a_pairs})
print("Final Answer:\n", final_answer)
Answer:
Final Answer:
# Main Components of an LLM-Powered Autonomous Agent System
- An LLM-powered autonomous agent system integrates a variety of technologies and modules to enable the agent to plan, learn, adapt, and interact with its environment and other agents effectively. Below are the main components of such a system:
## 1. Large Language Models (LLMs)
- Core Controller: The LLM acts as the brain of the system, driving the core functionalities. It understands, generates, and parses instructions and responses through natural language interactions.
## 2. Natural Language Interface
- Communication: This interface allows for natural language interactions between the LLM and external components such as memory systems and planning modules. It facilitates effective communication and information exchange within the system.
## 3. Planning
- Task Decomposition: Techniques like Chain of Thought (CoT) and Tree of Thoughts (ToT) are used to break down complex tasks into smaller, manageable subgoals. This helps in planning and executing tasks step-by-step.
- Reflection and Refinement: The agent can perform self-criticism and self-reflection over past actions, learning from mistakes, and refining its approach for future tasks. This continuous improvement enhances the quality and efficiency of the agent's outputs.
## 4. Memory Systems
- Finite Context Length Handling: Due to the finite context length limitation of LLMs, mechanisms such as vector stores and retrieval models are employed to access a larger knowledge pool.
- Retrieval Models: These models surface relevant context based on factors like recency, importance, and relevance to inform the agent's behavior and decision-making processes.
- Reflection Mechanism: This involves synthesizing memories into higher-level inferences that guide future behavior. It generates summaries of past events and uses them for better decision-making.
## 5. Inter-Agent Communication
- Natural Language Statements: The LLM generates natural language statements to facilitate communication between different agents within the system. This enables the sharing of information, triggering new actions and responses.
## 6. Environment Interaction
- Actionable Plans: The LLM translates reflections and environmental information into actionable plans. It takes into account the relationships between agents and observations to optimize both immediate and long-term actions.
## 7. Proof-of-Concept Implementations
- Examples: Implementations like AutoGPT, GPT-Engineer, and BabyAGI demonstrate the potential and capabilities of LLM-powered autonomous agents. These examples highlight the integration of LLMs with other system components to handle complex tasks and improve over time through continuous learning and refinement.
# Summary
An LLM-powered autonomous agent system is composed of several key components that work together to enable sophisticated functionalities. The Large Language Model (LLM) serves as the core controller, interfacing with other modules through a natural language interface. The planning module uses techniques like Task Decomposition and Reflection and Refinement to manage tasks efficiently. Memory systems overcome the finite context length of LLMs by employing vector stores and retrieval models, aiding in better decision-making. Inter-agent communication and environment interaction modules ensure seamless information exchange and actionable planning. Proof-of-concept implementations illustrate the practical applications and continuous improvement potential of these systems. Together, these components create a robust framework for autonomous agents capable of complex problem-solving and adaptive learning.
(Reference) Answering All Sub-Questions Individually
from langchain import hub
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough, RunnableLambda
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI
# RAG prompt
prompt_rag = hub.pull("rlm/rag-prompt")
def retrieve_and_rag(question, prompt_rag, sub_question_generator_chain):
    """Run RAG for each sub-question"""
    sub_questions = sub_question_generator_chain.invoke({"question": question})
    rag_results = []
    for sub_question in sub_questions:
        retrieved_docs = retriever.get_relevant_documents(sub_question)
        answer = (prompt_rag | llm | StrOutputParser()).invoke({"context": retrieved_docs,
                                                                "question": sub_question})
        rag_results.append(answer)
    return rag_results, sub_questions
answers, questions = retrieve_and_rag(question, prompt_rag, generate_queries_decomposition)
def format_qa_pairs(questions, answers):
    """Format the question and answer pairs"""
    formatted_string = ""
    for i, (question, answer) in enumerate(zip(questions, answers), start=1):
        formatted_string += f"Question {i}: {question}\nAnswer {i}: {answer}\n\n"
    return formatted_string.strip()
context = format_qa_pairs(questions, answers)
# Final RAG prompt
template = """Here is a set of Q+A pairs:
{context}
Use these to synthesize an answer to the question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
final_rag_chain = (
prompt
| llm
| StrOutputParser()
)
final_rag_chain.invoke({"context":context,"question":question})
Answer: it approaches things from a different angle, but I still prefer the previous version's answer 😎
Question: 'What are the main components of an LLM-powered autonomous agent system?'
Answer:
An LLM-powered autonomous agent system is composed of several key components that work together to enable the agent's functionality. These include:
1. Hardware Components: The primary hardware components consist of the large language model (LLM) itself, memory storage for managing historical data and interactions, and processing units capable of handling the intensive computational demands required for planning and task decomposition.
2. Software Frameworks: Essential software frameworks for building such a system include AutoGPT, GPT-Engineer, and BabyAGI. These frameworks are proof-of-concept demos that illustrate how LLMs can serve as the core controller of autonomous agents, handling tasks such as planning, task decomposition, and self-reflection to continually improve their performance.
3. Natural Language Processing (NLP) Modules: NLP modules act as the interface between the LLM and other external components like memory and tools. They enable the agent to parse and understand model outputs, which is crucial for effective task decomposition, planning, and interaction with other system components. However, managing the reliability of these outputs is critical, as errors can impact the agent's performance.
Together, these hardware and software components form a cohesive system that supports the complex functionalities required for an autonomous agent to operate effectively.
Summary
Problem Definition and Approach
Improved reasoning through abstraction: Step-Back Prompting has the model step back from the problem before solving it directly, first deriving an abstracted, high-level concept or principle. This abstraction step helps the model reduce low-level errors on complex problems and reason with higher accuracy.
Applicability across domains: the technique can be applied to a wide range of tasks such as physics, chemistry, temporal knowledge questions (TimeQA), and multi-hop reasoning, showing that the derived principles can be applied across many situations.
Performance comparison: compared to other techniques, notably Chain-of-Thought (CoT) and Take-a-Deep-Breath (TDB) prompting, Step-Back Prompting consistently performed better on reasoning tasks.
Figure 2 shows how Step-Back Prompting is applied to two tasks (a physics problem and a time-based question). In each example, the model generates a "step-back" question and reasons from it to solve the problem.
Physics Problem (MMLU Physics Example)
Problem: "What happens to the pressure P of an ideal gas if the temperature is doubled and the volume is increased by a factor of 8?"
Through Step-Back Prompting, the model avoids errors in the detailed calculation and derives the correct answer from the abstract principle, as sketched below.
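As a brief illustration of the abstraction step (assuming, as in the Step-Back paper's example, that the step-back question surfaces the ideal gas law):
$$PV = nRT \;\Rightarrow\; P' = \frac{nR\,(2T)}{8V} = \frac{1}{4}\cdot\frac{nRT}{V} = \frac{P}{4}$$
so the pressure falls to one quarter of its original value.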
Time-Based Question (TimeQA Example)
Problem: "Which school did Estella Leopold attend between August and November 1954?"
In this example, Step-Back Prompting helps avoid errors caused by the narrow time constraint and solves the problem from a broader perspective (the step-back question would be something like "What is Estella Leopold's education history?").
Intuition behind the Step-Back Technique
Code Demonstration
1. Creating Few-Shot Examples
from langchain_core.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate
# fewshot
examples = [
{
"input": "Could the members of The Police perform lawful arrests?",
"output": "What can the members of The Police do?",
},
{
"input": "Jan Sindel’s was born in what country?",
"output": "What is Jan Sindel’s personal history?",
},
]
example_prompt = ChatPromptTemplate.from_messages(
[
("human", "{input}"),
("ai", "{output}"),
]
)
few_shot_prompt = FewShotChatMessagePromptTemplate(
example_prompt=example_prompt,
examples=examples,
)
prompt = ChatPromptTemplate.from_messages(
[
(
"system",
"""
You are an expert at world knowledge. Your task is to step back and paraphrase a question to a more generic step-back question, which is easier to answer.
Here are a few examples:
""",
),
few_shot_prompt,
("user", "{question}"),
]
)
generate_queries_step_back = prompt | ChatOpenAI(temperature=0) | StrOutputParser()
2. Generating the Step-Back Question, Retrieving, and Answering
question = "What is task decomposition for LLM agents?"
generate_queries_step_back.invoke({"question": question})
'How do LLM agents handle complex tasks?'
response_prompt_template = """
You are an expert of world knowledge. I am going to ask you a question.
Your response should be comprehensive and not contradicted with the following context if they are relevant.
Otherwise, ignore them if they are not relevant.
# {normal_context}
# {step_back_context}
# Original Question: {question}
# Answer:
"""
response_prompt = ChatPromptTemplate.from_template(response_prompt_template)
from langchain_core.runnables import RunnableLambda  # re-imported here so this block also runs on its own
chain = (
{
# Retrieval using the original question
"normal_context": RunnableLambda(lambda x: x["question"]) | retriever,
# Retrieval using the step-back question
"step_back_context": generate_queries_step_back | retriever,
"question": lambda x: x["question"],
}
| response_prompt
| ChatOpenAI(temperature=0)
| StrOutputParser()
)
chain.invoke({"question": question})
Summary
Hypothetical document generation: given a query, the language model writes a hypothetical document that answers it. This document contains content related to the query, but it may not be factually accurate.
Document embedding: the hypothetical document is converted into an embedding vector with a contrastively trained encoder (e.g., Contriever). This embedding filters out irrelevant details in the hypothetical document and helps retrieve real documents related to the query.
Document retrieval: finally, vector similarity between this embedding and the real documents in the corpus is computed, and the most similar documents are retrieved.
Problem Definition and Approach
Code Demonstration
1. Defining the HyDE Document-Generation Prompt and Generating a Hypothetical Document
from langchain.prompts import ChatPromptTemplate
template = """
Please write a scientific paper passage to answer the question
Question: {question}
Passage:
"""
prompt_hyde = ChatPromptTemplate.from_template(template)
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI
generate_docs_for_retrieval = (
prompt_hyde | ChatOpenAI(temperature=0) | StrOutputParser()
)
# Example question
question = "What is task decomposition for LLM agents?"
generate_docs_for_retrieval.invoke({"question":question})
Title: Task Decomposition for Large Language Model (LLM) Agents
Abstract:
Task decomposition for Large Language Model (LLM) agents refers to the systematic process of breaking down complex tasks into smaller, more manageable subtasks, which can be sequentially or concurrently addressed by the model. This methodology aims to enhance the efficiency, accuracy, and overall performance of LLMs when faced with multifaceted queries or tasks. This passage explores the principles, methodologies, and implications of task decomposition in the context of LLM agents.
Introduction:
Large Language Models (LLMs), such as GPT-4, have demonstrated remarkable capabilities in natural language understanding, generation, and various other language-related tasks. However, their performance can be significantly improved through the strategic application of task decomposition. By dividing a complex task into discrete, manageable components, LLM agents can process information more effectively, reduce cognitive load, and minimize errors.
Principles of Task Decomposition:
Task decomposition is grounded in several key principles:
1. Modularity: Breaking down a task into independent or semi-independent modules allows for parallel processing and simplifies error identification and correction.
2. Hierarchy: Establishing a hierarchical structure where higher-level tasks are decomposed into lower-level subtasks ensures a coherent and organized approach to problem-solving.
3. Sequential Dependency: Understanding the dependencies between subtasks enables the LLM to process them in the correct order, ensuring that intermediate results are correctly utilized in subsequent steps.
Methodologies:
There are various methodologies for task decomposition, each tailored to specific types of tasks and LLM capabilities:
1. Top-Down Decomposition: This approach begins with the overarching task and progressively breaks it down into smaller subtasks. For example, answering a complex question might involve identifying key concepts, gathering relevant information, synthesizing data, and constructing a coherent response.
2. Bottom-Up Decomposition: Conversely, this method starts with identifying fundamental subtasks and gradually combines them to form a solution to the larger task. This can be useful in tasks where the basic components are well understood, but their integration is complex.
3. Hybrid Decomposition: Combining top-down and bottom-up approaches can provide a balanced strategy, leveraging the strengths of both methods to handle diverse tasks effectively.
Implications for LLM Performance:
The adoption of task decomposition has several implications for the performance of LLM agents:
1. Enhanced Accuracy: By focusing on smaller, more manageable subtasks, LLMs can provide more precise and accurate responses, reducing the likelihood of errors that may occur when tackling complex tasks holistically.
2. Improved Efficiency: Decomposing tasks allows for parallel processing, which can significantly speed up task completion and optimize resource utilization.
3. Scalability: Task decomposition facilitates the scaling of LLM applications to handle increasingly complex and diverse tasks, making them more versatile and robust.
Conclusion:
Task decomposition is a vital strategy for optimizing the performance of LLM agents. By breaking down complex tasks into smaller, manageable components, LLMs can improve their accuracy, efficiency, and scalability. As LLM technology continues to evolve, the principles and methodologies of task decomposition will play an increasingly important role in harnessing the full potential of these powerful models.
Keywords: Task decomposition, Large Language Models, LLM agents, modularity, hierarchical structure, sequential dependency, top-down decomposition, bottom-up decomposition, hybrid decomposition.
2. Retrieving Documents Using the Generated Hypothetical Document
# Retrieval chain
retrieval_chain = generate_docs_for_retrieval | retriever
retrieved_docs = retrieval_chain.invoke({"question":question})
retrieved_docs
3. Generating the Final Answer from the Retrieved Documents
# RAG prompt
template = """Answer the following question based on this context:
{context}
Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
final_rag_chain = (
prompt
| llm
| StrOutputParser()
)
# Run the final RAG chain
final_rag_chain.invoke({"context":retrieved_docs,"question":question})
Summary