📄 AutoRec: Autoencoders Meet Collaborative Filtering (Paper Review)

서은서 · August 18, 2023

0. Abstract

This paper introduces AutoRec, a model for collaborative filtering (CF). AutoRec is compact and efficiently trainable, and it can outperform the existing state-of-the-art CF techniques (biased matrix factorization, RBM-CF, and LLORMA).

1. Introduction

Collaborative filtering models use information about users' preferences for items to provide personalized recommendations. This paper proposes AutoRec, a new CF model based on the autoencoder. It has advantages in representational power and computational cost over the neural network approaches to CF that existed before it. The paper also demonstrates that AutoRec outperforms the existing state-of-the-art CF techniques.

What is an autoencoder?

An autoencoder is a neural network that compresses its input as much as possible and then restores it to the original input form. The part that compresses the data is called the **Encoder**, and the part that restores it is called the **Decoder**. The meaningful representation extracted during compression is called the latent vector.

Ex) An autoencoder shrinks the input X and then reproduces it. For instance, given the digit 7 as input, it can reproduce the digit 7 as output.

( 🔗 Source: https://pebpung.github.io/autoencoder/2021/09/11/Auto-Encoder-1.html )

2. The AutoRec Model

Rating-based collaborative filtering has m users and n items, and uses a user-item rating matrix $R \in \mathbb{R}^{m \times n}$.

  • User $u \in U = \{1, \dots, m\}$

    Each user's rating vector is written as follows.
    $r^{(u)} = (R_{u1}, \dots, R_{un}) \in \mathbb{R}^n$

  • Item $i \in I = \{1, \dots, n\}$

    Each item's rating vector is written as follows.
    $r^{(i)} = (R_{1i}, \dots, R_{mi}) \in \mathbb{R}^m$
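As a minimal sketch of this notation, assume a small made-up rating matrix in which 0 stands for a missing rating (an encoding chosen only for this illustration). The user vector $r^{(u)}$ is then a row of R and the item vector $r^{(i)}$ is a column. The helper names `user_vector` and `item_vector` are hypothetical, and indices are 0-based here while the text counts from 1.

```python
import numpy as np

# Hypothetical 3-user x 4-item rating matrix; 0 marks a missing rating
# (an assumption of this sketch, not something the paper prescribes).
R = np.array([
    [5.0, 0.0, 3.0, 1.0],
    [4.0, 2.0, 0.0, 0.0],
    [0.0, 1.0, 5.0, 4.0],
])
m, n = R.shape  # m users, n items


def user_vector(R, u):
    """r^(u): row u of R, the ratings user u gave to all n items."""
    return R[u, :]


def item_vector(R, i):
    """r^(i): column i of R, the ratings item i received from all m users."""
    return R[:, i]


print(user_vector(R, 0))  # length n
print(item_vector(R, 2))  # length m
```

An item-based model consumes the columns, so its input dimension is m; a user-based model consumes the rows, so its input dimension is n.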

The goal of the paper is to design an item-based (user-based) autoencoder that takes $r^{(i)}$ ($r^{(u)}$) as input, projects it into a low-dimensional latent (hidden) space, and then reconstructs $r^{(i)}$ ($r^{(u)}$) in the output space, where the missing ratings are predicted.

Item-based AutoRec model

  • It is an auto-associative neural network with a single k-dimensional hidden layer.

Training

▶︎ Basic loss function
$\min_\theta \sum_{r \in S} \| r - h(r;\theta) \|_2^2$

S: a set of vectors in $\mathbb{R}^d$
$h(r;\theta) = f(W \cdot g(Vr + \mu) + b)$

  • $h(r;\theta)$: reconstruction of the input $r \in \mathbb{R}^d$
  • $f(\cdot), g(\cdot)$: activation functions
  • $\theta = \{W, V, \mu, b\}$
  • $W \in \mathbb{R}^{d \times k}, V \in \mathbb{R}^{k \times d}$
  • $\mu \in \mathbb{R}^k, b \in \mathbb{R}^d$
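The reconstruction $h(r;\theta)$ defined above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: f is taken to be the identity and g the sigmoid, the sizes d and k are arbitrary, and the weights are random rather than trained. The function name `reconstruct` is hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def reconstruct(r, W, V, mu, b, f=lambda x: x, g=sigmoid):
    """h(r; theta) = f(W g(V r + mu) + b).

    The defaults (identity f, sigmoid g) are assumptions of this sketch.
    """
    hidden = g(V @ r + mu)      # encoder: R^d -> R^k
    return f(W @ hidden + b)    # decoder: R^k -> R^d


d, k = 5, 2                     # hypothetical sizes
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(d, k))
V = rng.normal(scale=0.1, size=(k, d))
mu = np.zeros(k)
b = np.zeros(d)

r = np.array([5.0, 3.0, 0.0, 1.0, 4.0])
r_hat = reconstruct(r, W, V, mu, b)
print(r_hat.shape)
```

The shapes follow the parameter list directly: V maps the d-dimensional input to k hidden units, and W maps those back to d outputs.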

The key point is that only the parameters associated with observed data are learned. In the figure, the gray nodes denote observed data, and the solid lines are the connections that get updated through backpropagation.
In addition, adding a regularization term helps prevent overfitting to the observed ratings.
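To see why only the weights tied to observed ratings are updated, the sketch below computes the gradients of a squared error restricted to observed entries for a single rating vector. It assumes identity f, sigmoid g, and missing inputs fed as 0, and it leaves out the regularization term. The function name `masked_gradients` and the example data are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def masked_gradients(r, mask, W, V, mu, b):
    """Gradients of sum over observed j of (h(r)_j - r_j)^2 for one vector.

    Assumes f is the identity, g is the sigmoid, and missing inputs are 0.
    """
    r_in = np.where(mask, r, 0.0)            # unobserved inputs contribute nothing
    hidden = sigmoid(V @ r_in + mu)          # (k,)
    r_hat = W @ hidden + b                   # (d,)
    err = np.where(mask, r_hat - r, 0.0)     # error only on observed outputs
    d_out = 2.0 * err                        # dLoss / d r_hat
    grad_W = np.outer(d_out, hidden)         # rows of unobserved outputs are 0
    grad_b = d_out
    d_hidden = (W.T @ d_out) * hidden * (1.0 - hidden)
    grad_V = np.outer(d_hidden, r_in)        # columns of unobserved inputs are 0
    grad_mu = d_hidden
    return grad_W, grad_V, grad_mu, grad_b


rng = np.random.default_rng(0)
d, k = 5, 2
W = rng.normal(scale=0.1, size=(d, k))
V = rng.normal(scale=0.1, size=(k, d))
mu, b = np.zeros(k), np.zeros(d)

r = np.array([5.0, 3.0, 0.0, 1.0, 0.0])
mask = np.array([True, True, False, True, False])
grad_W, grad_V, grad_mu, grad_b = masked_gradients(r, mask, W, V, mu, b)
print(grad_W[~mask])      # all zeros
print(grad_V[:, ~mask])   # all zeros
```

The rows of W and the columns of V that belong to unobserved entries receive zero gradient, so a gradient step leaves them unchanged for this input.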

▶︎ Final loss function
$\min_\theta \sum_{i=1}^{n} \| r^{(i)} - h(r^{(i)};\theta) \|_{\mathcal{O}}^2 + \frac{\lambda}{2} \cdot (\| W \|_F^2 + \| V \|_F^2), \quad \lambda > 0$
👉🏻 The $\| \cdot \|_{\mathcal{O}}^2$ term means that only the observed ratings are taken into account.
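The objective can be evaluated for a whole rating matrix at once by treating each column as one item vector. The sketch below is an illustration, not the authors' code: it assumes identity f, sigmoid g, a boolean `mask` marking observed entries, and 0 as the placeholder for missing inputs. The function name `autorec_objective` is hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def autorec_objective(R, mask, W, V, mu, b, lam):
    """Masked squared error over all item columns plus the Frobenius penalty.

    Shapes: R, mask (m, n); W (m, k); V (k, m); mu (k,); b (m,).
    Assumes identity f, sigmoid g, and missing inputs fed as 0.
    """
    R_in = np.where(mask, R, 0.0)
    hidden = sigmoid(V @ R_in + mu[:, None])   # (k, n): one latent column per item
    R_hat = W @ hidden + b[:, None]            # (m, n): column i is h(r^(i); theta)
    err = np.where(mask, R - R_hat, 0.0)       # observed entries only
    penalty = 0.5 * lam * (np.sum(W ** 2) + np.sum(V ** 2))
    return np.sum(err ** 2) + penalty


R = np.array([[5.0, 0.0], [0.0, 3.0], [4.0, 1.0]])
mask = R > 0                                   # assumption: 0 encodes "missing"
m, n = R.shape
k = 2
W, V = np.zeros((m, k)), np.zeros((k, m))
mu, b = np.zeros(k), np.zeros(m)

loss = autorec_objective(R, mask, W, V, mu, b, lam=0.1)
print(loss)
```

With all parameters at zero the reconstruction is zero everywhere, so the loss reduces to the sum of squared observed ratings, which makes a convenient sanity check.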

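▶︎ Parameters
The number of parameters to be learned is $2mk + m + k$. In the item-based model the input dimension is d = m, so W and V contribute mk each, $\mu$ contributes k, and b contributes m.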
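The count can be checked by adding up the sizes of the four parameter blocks for the item-based model, where d = m. The sizes used below are hypothetical and chosen only for illustration.

```python
def autorec_param_count(m, k):
    """Count the entries of W (m x k), V (k x m), mu (k), and b (m)."""
    n_W = m * k
    n_V = k * m
    n_mu = k
    n_b = m
    return n_W + n_V + n_mu + n_b


# Hypothetical sizes, not taken from any dataset in the paper.
m, k = 1000, 50
total = autorec_param_count(m, k)
print(total, 2 * m * k + m + k)
```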

▶︎ Final predicted rating
$\hat{R}_{ui} = (h(r^{(i)};\hat{\theta}))_u$
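In code, the prediction rule amounts to reconstructing the column of item i and reading off entry u. The sketch below assumes identity f and sigmoid g, and it uses placeholder parameters in place of a trained $\hat{\theta}$; the function name `predict_rating` and all values are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def predict_rating(R, mask, u, i, W, V, mu, b):
    """R_hat[u, i] = (h(r^(i); theta))_u, with identity f and sigmoid g assumed."""
    r_i = np.where(mask[:, i], R[:, i], 0.0)   # item vector, missing entries fed as 0
    reconstruction = W @ sigmoid(V @ r_i + mu) + b
    return reconstruction[u]


R = np.array([[5.0, 0.0], [0.0, 3.0], [4.0, 1.0]])
mask = R > 0                                   # assumption: 0 encodes "missing"
m, k = R.shape[0], 2

# Placeholder parameters standing in for a trained theta-hat.
W = np.zeros((m, k))
V = np.zeros((k, m))
mu = np.zeros(k)
b = np.array([4.5, 3.0, 2.5])

pred = predict_rating(R, mask, u=1, i=0, W=W, V=V, mu=mu, b=b)
print(pred)
```

Because W is zero in this toy setup, the prediction falls back to the output bias for user 1; with trained weights the hidden representation of the item would shift it.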

AutoRec vs existing CF(RBM-CF)

💡 What is RBM?

1)
RBM-CF is a probabilistic model based on restricted Boltzmann machines.
AutoRec is a discriminative model based on autoencoders.

2)
RBM-CF estimates its parameters by maximizing the log likelihood.
AutoRec directly minimizes RMSE.

3)
Training RBM-CF requires contrastive divergence.
Training AutoRec requires only gradient-based backpropagation, which is comparatively faster.

4)
RBM-CF can be applied only to discrete ratings, and it estimates a separate set of parameters for each rating value.
AutoRec is agnostic to r, the number of possible rating values, and therefore requires fewer parameters.
👉🏻 Fewer parameters mean a smaller memory footprint and less risk of overfitting.
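A rough comparison of the parameter counts illustrates the last point. The figure of m·k·r weights for an item-based RBM-CF is taken from the AutoRec paper's discussion rather than from the text above, and it ignores biases; the sizes below are hypothetical.

```python
def item_autorec_params(m, k):
    """W and V (m*k each) plus the biases mu (k) and b (m)."""
    return 2 * m * k + m + k


def item_rbm_params(m, k, r):
    """Weights only: one m x k weight set per rating value (biases ignored)."""
    return m * k * r


# Hypothetical sizes; r = 5 corresponds to a 1-to-5 star scale.
m, k, r = 1000, 50, 5
autorec = item_autorec_params(m, k)
rbm = item_rbm_params(m, k, r)
print(autorec, rbm)
```

Under these assumptions the AutoRec count does not grow with r, while the RBM-CF count scales linearly with it.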

AutoRec vs MF

MF maps both users and items into a latent space, whereas the item-based AutoRec model maps only items into the latent space.
In addition, MF learns only a linear latent representation, whereas the item-based AutoRec model can learn a non-linear latent representation through the activation function $g(\cdot)$.
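To make the contrast concrete, the two predictors can be written side by side. The factor symbols $p_u$ and $q_i$ are notation assumed here for a basic MF model without bias terms; they do not come from the text above.

$$
\text{MF:}\quad \hat{R}_{ui} = p_u^\top q_i, \qquad p_u, q_i \in \mathbb{R}^k
$$

$$
\text{Item-based AutoRec:}\quad \hat{R}_{ui} = f\big(w_u^\top g(V r^{(i)} + \mu) + b_u\big)
$$

Here $w_u^\top$ is row u of W and $b_u$ is entry u of b. One way to read this is that the item representation $g(V r^{(i)} + \mu)$ is computed from the item's ratings instead of being stored as a free parameter, and that $g(\cdot)$ is where the non-linearity enters.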

3. Experimental Evaluation

(a) item-based vs user-based


For both RBM and AutoRec, the item-based variant gave better results.

(b) linear vs non-linear (AutoRec)


The results show that AutoRec performs better when it uses a non-linear activation function $g(\cdot)$.
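A short check shows what the linear variant amounts to. With identity f and identity g, the network collapses into a single affine map whose weight matrix W V has rank at most k; a non-linear g breaks that collapse. The sketch uses random, untrained weights and a sigmoid as the non-linear choice, both of which are assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def reconstruct(r, W, V, mu, b, g):
    """h(r; theta) with identity f: W g(V r + mu) + b."""
    return W @ g(V @ r + mu) + b


rng = np.random.default_rng(0)
d, k = 6, 2                      # hypothetical sizes
W = rng.normal(size=(d, k))
V = rng.normal(size=(k, d))
mu, b = rng.normal(size=k), rng.normal(size=d)
r = rng.normal(size=d)

linear_out = reconstruct(r, W, V, mu, b, g=lambda x: x)
collapsed = (W @ V) @ r + (W @ mu + b)   # one affine map, rank at most k
nonlinear_out = reconstruct(r, W, V, mu, b, g=sigmoid)

print(np.allclose(linear_out, collapsed))
print(np.linalg.matrix_rank(W @ V))
```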

Performance by number of hidden units


Performance improves as the number of hidden units increases.
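The comparisons in this section are stated only as better or worse performance. Assuming performance is measured as RMSE on held-out observed ratings, which is consistent with the earlier remark that AutoRec minimizes RMSE, the metric can be sketched as follows. The function name `masked_rmse` and the toy numbers are hypothetical and are not results from the paper.

```python
import numpy as np


def masked_rmse(R_true, R_pred, mask):
    """Root mean squared error over the entries selected by mask."""
    diff = (R_true - R_pred)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))


# Toy values for illustration only.
R_true = np.array([[5.0, 0.0], [0.0, 3.0]])
R_pred = np.array([[4.0, 2.2], [3.1, 2.0]])
test_mask = np.array([[True, False], [False, True]])  # held-out observed entries

score = masked_rmse(R_true, R_pred, test_mask)
print(score)
```

Predictions at positions outside the mask do not affect the score, which mirrors how the training loss ignores unobserved ratings.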


👉🏻 For reference!
https://github.com/supkoon/AutoRec-tf
