Recommender System Performance Metrics That Keep Confusing Me 🥱

기린이 · March 24, 2022

RecSys

Hit rate at k

The fraction of all users for whom the prediction succeeded.

$\text{Hit}@\ell = \frac{1}{m} \sum_{u \in \mathcal{U}} \textbf{1}(rank_{u, g_u} \le \ell)$

$m$ : total number of users

$\ell$ : cutoff (the k in "Hit rate at k")

$u$ : a particular user

$rank_{u, g_u}$ : rank of the ground truth item ( $g_u$ ) in user $u$'s recommendation list

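Hold out one purchased item for each user.
Train the model, then predict.
Sort the predictions; if the held-out item is among the top $\ell$, add 1. Divide the total by the number of users.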
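
The three steps above can be sketched as follows. This is only a minimal illustration, not the D2L implementation: the function name `hit_rate_at_l`, the dict-based inputs, and the toy data are all assumptions made up for this example, and it assumes exactly one held-out item per user.

```python
def hit_rate_at_l(ranked_lists, held_out, l):
    """Fraction of users whose held-out item appears in their top-l list.

    ranked_lists: dict user -> items sorted by predicted score (best first)
    held_out:     dict user -> the single item removed before training
    (Both structures are hypothetical, chosen for this sketch.)
    """
    hits = 0
    for user, ranked in ranked_lists.items():
        # +1 when the held-out item is among the first l recommendations
        if held_out[user] in ranked[:l]:
            hits += 1
    # divide by m, the total number of users
    return hits / len(ranked_lists)


# Toy data, made up for illustration
ranked_lists = {
    "u1": ["a", "b", "c", "d"],
    "u2": ["d", "c", "b", "a"],
    "u3": ["b", "a", "d", "c"],
}
held_out = {"u1": "b", "u2": "a", "u3": "d"}

# With l=2 only u1 finds its held-out item in the top 2: 1 hit out of 3 users
print(hit_rate_at_l(ranked_lists, held_out, l=2))
```

Raising the cutoff can only keep or increase the value, since every item that is in the top 2 is also in the top 3.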

AUC (D2L Ver.)

$\text{AUC} = \frac{1}{m} \sum_{u \in \mathcal{U}} \frac{1}{|\mathcal{I} \backslash S_u|} \sum_{j \in \mathcal{I} \backslash S_u} \textbf{1}(rank_{u, g_u} < rank_{u, j})$

$\mathcal{I}$ : item set

$S_u$ : candidate items of user $u$

$rank_{u, g_u}$ : ranking of the ground truth item $g_u$ of user $u$ (ideal ranking is 1)
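
To make the formula more concrete, here is a rough sketch that evaluates the indicator sum directly. It rests on assumptions of mine: the items being compared for user $u$ are given as one ranked Python list, the sum runs over every item in that list other than $g_u$, and the names `auc_for_user` and `mean_auc` are hypothetical.

```python
def auc_for_user(ranked, ground_truth):
    """Share of the other ranked items that sit below the ground-truth item."""
    # 1-based ranks, so the ideal rank of the ground truth is 1
    rank = {item: pos + 1 for pos, item in enumerate(ranked)}
    others = [j for j in ranked if j != ground_truth]
    # indicator 1(rank_{u,g_u} < rank_{u,j}), summed over the other items
    below = sum(1 for j in others if rank[ground_truth] < rank[j])
    return below / len(others)


def mean_auc(ranked_lists, ground_truths):
    """Average the per-user value over all m users."""
    total = sum(auc_for_user(ranked_lists[u], ground_truths[u])
                for u in ranked_lists)
    return total / len(ranked_lists)


# Toy data, made up for illustration
ranked_lists = {"u1": list("abcde"), "u2": list("abcde")}
ground_truths = {"u1": "b", "u2": "a"}

# u1: 3 of the 4 other items rank below "b"; u2: all 4 rank below "a"
print(mean_auc(ranked_lists, ground_truths))
```

Under these assumptions a ground truth at rank 1 gives 1.0 and a ground truth at the very bottom gives 0.0.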

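def hit_and_auc(rankedlist, test_matrix, k):
    """Return (number of hits in the top-k, AUC) for a single user."""
    test_items = set(test_matrix)  # build the lookup set once, not per element
    # (position, item) pairs for test items found in the top-k / in the full list
    hits_k = [(idx, val) for idx, val in enumerate(rankedlist[:k])
              if val in test_items]
    hits_all = [(idx, val) for idx, val in enumerate(rankedlist)
                if val in test_items]
    max_rank = len(rankedlist) - 1  # worst 0-based position; avoids shadowing max()
    # share of the remaining positions that lie below the best-ranked hit
    auc = 1.0 * (max_rank - hits_all[0][0]) / max_rank if len(hits_all) > 0 else 0
    return len(hits_k), auc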

A function that computes the hit count and AUC for one user at a time.
rankedlist : the recommendation results for one user, sorted by predicted score
test_matrix : list of the user's test set items

auc = (N - 1 - p) / (N - 1), where N is the number of items in the user's ranked list (len(rankedlist)) and p is the 0-based position of the highest-ranked hit item

I still don't see how the formula above maps onto the code.

A reply to a comment that raised a similar question read as follows.

Moreover, in the hit_and_auc function we try to find whether the item the net “recommend” is in the top-k list or what’s the rank of that “recommendation” in the hits_all list(which is sorted by the score given by the net), which is corresponding to the way we calculate the AUC (find out how many false recommendations rank before the ground truth)

The reply seems to be saying that the two are consistent, in that both measure how many wrong items are recommended at ranks above the GT.
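
One way to probe this reading numerically is the small check below. It is only a sketch under my own assumptions: a single ground truth item per user, and two hypothetical helpers, `auc_by_position` (the same arithmetic as the `auc` line in hit_and_auc) and `auc_by_false_before` (one minus the share of false recommendations ranked before the ground truth).

```python
def auc_by_position(rankedlist, gt):
    """Same arithmetic as hit_and_auc when there is one ground-truth item."""
    max_rank = len(rankedlist) - 1
    return (max_rank - rankedlist.index(gt)) / max_rank


def auc_by_false_before(rankedlist, gt):
    """1 minus the share of wrong items that are ranked before the ground truth."""
    # everything in front of gt is a false recommendation
    false_before = len(rankedlist[:rankedlist.index(gt)])
    return 1 - false_before / (len(rankedlist) - 1)


# Toy ranked list, made up for illustration
rankedlist = ["a", "b", "c", "d", "e"]

for gt in rankedlist:
    # both views give the same number for every possible ground-truth position
    assert abs(auc_by_position(rankedlist, gt)
               - auc_by_false_before(rankedlist, gt)) < 1e-12
    print(gt, auc_by_position(rankedlist, gt))
```

On this toy list the two numbers coincide for every position of the ground truth, which is at least compatible with the reply's explanation.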

The AUC computation above appears to be something this textbook simply chose to name AUC. Strictly speaking, it seems to differ in nature from the widely used AUC.