Lending Interaction Wings to Recommender Systems with Conversational Agents (NeurIPS 2023)

박상우 · January 12, 2024

Paper Review


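Recommender systems (RS) are very useful tools, but current research is limited to offline historical data.
Conversational tools such as Alexa and GPT have become an effective part of everyday life.
Current RL-based approaches require large amounts of data.
We connect an LLM agent and an RS in a plug-and-play fashion.
- The goal is to find an item that satisfies the user with as few interactions as possible.
Our CORE uses the RS as an offline relevance score estimator and the agent as an online relevance score checker.
We define an uncertainty measure that the agent's queries reduce, then build a decision-tree algorithm that presents the item with maximum certainty.
A user may have no clear preference for some attribute, or may have a preference only for a specific value; in those cases binary asking is likely to be more effective.
CORE places no constraints on the RS or the LLM, and achieves SOTA on 8 datasets.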

$\Psi_{RE} : U \times V \to R$
- The offline estimator, which estimates the relevance score
$\Psi_{CO} : U \times A \to R$
- The online checker, which checks whether an item estimated as relevant actually suits the user
$U_k := \sum_{v_m \in V_k} \Psi_{RE}(v_m)$
- The total uncertainty over all items in $V_k$; the closer to 0, the better
  - Once the user's preference on an attribute has been checked, the uncertainty of the affected items becomes 0
$\min_{\Psi^*_{RE}} K, \ \text{s.t.} \ U_K = 0$
- The objective is to minimize the number of turns $K$ needed for the uncertainty to reach 0
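
The definition of $U_k$ above can be illustrated with a minimal sketch. The item names and scores here are hypothetical, and the code only assumes that the offline estimator has already produced one relevance score per item.

```python
from typing import Dict, Set


def uncertainty(scores: Dict[str, float], unchecked: Set[str]) -> float:
    """U_k: sum of estimated relevance scores over the unchecked items V_k."""
    return sum(scores[v] for v in unchecked)


# Hypothetical relevance scores from an offline recommender.
scores = {"v1": 0.5, "v2": 0.25, "v3": 0.25}

# Turn 0: nothing has been checked yet, so V_0 is the whole item set.
unchecked = set(scores)
u0 = uncertainty(scores, unchecked)

# Suppose the user rejects v1: it is now checked and leaves V_k.
unchecked.discard("v1")
u1 = uncertainty(scores, unchecked)

# Once every item has been checked, U_K = 0 and the session ends.
unchecked.clear()
u_final = uncertainty(scores, unchecked)
```

Each answer from the user shrinks $V_k$, so $U_k$ can only decrease; the objective asks for the query policy that drives it to 0 in the fewest turns.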

The previous main branch of work on aligning conversation with recommender systems combines the two systematically.
That approach is constrained by costs such as long training time and high complexity.
Our general agent can query both items and attributes, and needs only relevance scores from the RS.
CORE can be applied easily to any supervised recommendation platform, and any reward function can be used.
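
To make the "needs only relevance scores" point concrete, here is a rough sketch of what such a plug-and-play boundary could look like. The `RelevanceEstimator` protocol and both toy models are hypothetical names introduced for illustration; they are not taken from the paper or from any library.

```python
from typing import Dict, Iterable, Protocol, Tuple


class RelevanceEstimator(Protocol):
    """Hypothetical interface: the only thing the agent asks of an RS."""

    def score(self, user: str, item: str) -> float: ...


class PopularityModel:
    """Toy recommender that ignores the user and scores by item popularity."""

    def __init__(self, counts: Dict[str, int]) -> None:
        self.counts = counts
        self.total = sum(counts.values())

    def score(self, user: str, item: str) -> float:
        return self.counts.get(item, 0) / self.total


class LookupModel:
    """Toy recommender backed by a precomputed (user, item) score table."""

    def __init__(self, table: Dict[Tuple[str, str], float]) -> None:
        self.table = table

    def score(self, user: str, item: str) -> float:
        return self.table.get((user, item), 0.0)


def relevance_scores(
    model: RelevanceEstimator, user: str, items: Iterable[str]
) -> Dict[str, float]:
    """Collect the scores the conversational agent would consume."""
    return {v: model.score(user, v) for v in items}


pop_scores = relevance_scores(PopularityModel({"v1": 3, "v2": 1}), "u1", ["v1", "v2"])
table_scores = relevance_scores(LookupModel({("u1", "v1"): 0.9}), "u1", ["v1", "v2"])
```

Because the agent side only calls `score`, swapping one recommender for another would not require touching the query logic, which is the property the notes describe.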

The agent's goal is to minimize uncertainty by querying items and attributes.
We introduce the expected certainty gain, which estimates how much uncertainty a query is expected to eliminate.
$a_{\text{query}} = \arg \max_{a \in V_{k-1} \cup X_{k-1}} \Psi_{CG}(\text{query}(a))$
- $\Psi_{CG}$ is the certainty gain
Attribute queries are formulated in a similar way.
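
The arg max selection can be sketched as follows. The notes do not spell out how $\Psi_{CG}$ is computed, so the gain formulas here are assumptions made for illustration: the probability of a "yes" is taken to be proportional to relevance mass, a "yes" on an item resolves all remaining uncertainty, and a yes/no on an attribute value removes the items that contradict the answer. All function and variable names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def item_gain(scores: Dict[str, float], unchecked: Set[str], v: str) -> float:
    """Assumed expected gain of asking 'is v the item you want?'."""
    total = sum(scores[u] for u in unchecked)  # assumed to be > 0
    p_yes = scores[v] / total
    # Yes: everything is resolved. No: only v's own score is removed.
    return p_yes * total + (1 - p_yes) * scores[v]


def attribute_gain(
    scores: Dict[str, float], unchecked: Set[str], holders: Set[str]
) -> float:
    """Assumed expected gain of asking 'do you want this attribute value?'."""
    total = sum(scores[u] for u in unchecked)  # assumed to be > 0
    mass = sum(scores[u] for u in unchecked & holders)
    p_yes = mass / total
    # Yes: items without the value are removed. No: items with it are removed.
    return p_yes * (total - mass) + (1 - p_yes) * mass


def select_query(
    scores: Dict[str, float],
    unchecked: Set[str],
    attr_values: Dict[str, Set[str]],
) -> Tuple[str, str, float]:
    """Return (kind, name, gain) of the candidate with the largest gain."""
    candidates: List[Tuple[str, str, float]] = [
        ("item", v, item_gain(scores, unchecked, v)) for v in sorted(unchecked)
    ]
    candidates += [
        ("attribute", w, attribute_gain(scores, unchecked, attr_values[w]))
        for w in sorted(attr_values)
    ]
    return max(candidates, key=lambda c: c[2])


# Hypothetical data: four equally likely items, split evenly by colour.
scores = {"v1": 0.25, "v2": 0.25, "v3": 0.25, "v4": 0.25}
attr_values = {"red": {"v1", "v2"}, "blue": {"v3", "v4"}}
kind, name, gain = select_query(scores, set(scores), attr_values)
single_item_gain = item_gain(scores, set(scores), "v1")
```

Under these assumptions an attribute question wins when the scores are flat, because it halves the candidate set regardless of the answer, while an item question only pays off once one item's score dominates.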