[Paper Review - 1] Corrective Retrieval Augmented Generation

shanny · May 10, 2025

‼️ These are personal study notes and may contain errors.

Paper URL - https://arxiv.org/pdf/2401.15884

Title

Corrective Retrieval Augmented Generation
-> Corrective RAG

Abstract

Large language models (LLMs) inevitably exhibit hallucinations since the accuracy of generated texts cannot be secured solely by the parametric knowledge they encapsulate.
-> An LLM cannot guarantee the accuracy of generated text from its built-in parametric knowledge alone, so it inevitably hallucinates.
Although retrieval-augmented generation (RAG) is a practicable complement to LLMs, it relies heavily on the relevance of retrieved documents, raising concerns about how the model behaves if retrieval goes wrong.
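-> RAG is a practical way to complement LLMs, but it depends heavily on the relevance of the retrieved documents, so there is concern about how the model behaves when retrieval goes wrong.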
To this end, we propose the Corrective Retrieval Augmented Generation (CRAG) to improve the robustness of generation.
-> We propose Corrective Retrieval Augmented Generation (CRAG) to make generation more robust.
Specifically, a lightweight retrieval evaluator is designed to assess the overall quality of retrieved documents for a query, returning a confidence degree based on which different knowledge retrieval actions can be triggered.
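-> Specifically, a lightweight retrieval evaluator assesses the overall quality of the documents retrieved for a query. It returns a confidence score, and that score determines which knowledge retrieval action is triggered.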
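
A rough sketch of how a confidence score could trigger different retrieval actions. The abstract only says that "different knowledge retrieval actions" exist. The three action names below follow the full paper (Correct, Incorrect, Ambiguous) as I understand it. The function name `choose_action`, the `upper` / `lower` thresholds, and their default values are made up for illustration. The evaluator itself is not modeled here; the function simply takes per-document scores as input.

```python
from enum import Enum
from typing import List


class Action(Enum):
    # Meanings as I read the paper; treat them as a summary, not a spec.
    CORRECT = "correct"      # at least one document looks relevant: keep and refine retrieved docs
    INCORRECT = "incorrect"  # every document looks irrelevant: discard them, rely on web search
    AMBIGUOUS = "ambiguous"  # evaluator is unsure: use both sources


def choose_action(scores: List[float], upper: float = 0.5, lower: float = -0.5) -> Action:
    """Map per-document confidence scores to a single retrieval action.

    `upper` and `lower` are placeholder thresholds, not values from the paper.
    """
    if not scores:
        # Nothing was retrieved, so there is nothing to trust.
        return Action.INCORRECT
    best = max(scores)
    if best > upper:
        return Action.CORRECT
    if best < lower:
        # The best score is below `lower`, so all scores are below it.
        return Action.INCORRECT
    return Action.AMBIGUOUS


if __name__ == "__main__":
    print(choose_action([0.9, -0.8]))   # one clearly relevant document
    print(choose_action([-0.9, -0.7]))  # all documents look irrelevant
    print(choose_action([0.1, -0.9]))   # in between
```

Using two thresholds instead of one leaves a middle band where the evaluator's judgment is not trusted either way, which is where the "ambiguous" case comes from.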
Since retrieval from static and limited corpora can only return suboptimal documents, large-scale web searches are utilized as an extension for augmenting the retrieval results.
-> Retrieval from a static, limited corpus can return only suboptimal documents, so large-scale web search is used as an extension to supplement the retrieval results.
Besides, a decompose-then-recompose algorithm is designed for retrieved documents to selectively focus on key information and filter out irrelevant information in them.
-> In addition, a decompose-then-recompose algorithm is designed to focus selectively on the key information in the retrieved documents and filter out the irrelevant parts.
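
A minimal sketch of the decompose-then-recompose idea: split a document into small strips, score each strip against the query, drop the low-scoring strips, and join the rest in their original order. Everything concrete here is an assumption for illustration: the sentence-level split, the word-overlap scorer `overlap_score`, the `threshold` value, and all function names. The abstract does not specify how strips are scored, so the toy scorer only stands in for a real relevance model.

```python
import re
from typing import Callable, List


def decompose(document: str) -> List[str]:
    """Split a document into sentence-level strips (naive punctuation split)."""
    strips = re.split(r"(?<=[.!?])\s+", document.strip())
    return [s for s in strips if s]


def overlap_score(query: str, strip: str) -> float:
    """Toy relevance score: fraction of query words that also appear in the strip."""
    query_words = set(re.findall(r"\w+", query.lower()))
    strip_words = set(re.findall(r"\w+", strip.lower()))
    if not query_words:
        return 0.0
    return len(query_words & strip_words) / len(query_words)


def refine(
    query: str,
    document: str,
    score: Callable[[str, str], float] = overlap_score,
    threshold: float = 0.5,  # hypothetical cutoff
) -> str:
    """Keep only strips scoring at or above the threshold, then recompose them in order."""
    kept = [s for s in decompose(document) if score(query, s) >= threshold]
    return " ".join(kept)


if __name__ == "__main__":
    doc = (
        "Paris is the capital of France. "
        "The weather was nice yesterday. "
        "France has a long history."
    )
    print(refine("capital of France", doc))
```

With this toy scorer, only the first sentence contains all three query words, so the other two strips are filtered out.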
CRAG is plug-and-play and can be seamlessly coupled with various RAG-based approaches.
-> CRAG is plug-and-play, so it can be combined smoothly with various RAG-based approaches.
Experiments on four datasets covering short- and long-form generation tasks show that CRAG can significantly improve the performance of RAG-based approaches.
-> Experiments on four datasets, covering both short-form and long-form generation tasks, show that CRAG can significantly improve the performance of RAG-based approaches.