Graph Convolutional Neural Networks for Web-Scale Recommender Systems, KDD '18
PinSage paper
Goal :
Generate high-quality embeddings/representations of pins (items) that can be used for graph-based recommendation
Challenge :
- GCN is infeasible when the graph has billions of nodes and the structure is constantly evolving ($\because$ GCN requires operating on the full graph Laplacian during training)
- Inefficiency when generating embeddings for billions of nodes
- A limitation of GraphSAGE is that the whole graph must be stored in GPU memory
Solution :
- Improve scalability via a random-walk-based localized convolution approach
Method :
1. On-the-fly convolutions
: localized convolutions
- Learn how to aggregate information from the neighborhood
- Process of the localized convolution operation: transform > aggregate/pool > concat (aggregated neighborhood & central node) > transform (concatenated vector)
The output is a representation of the central node that incorporates both information about itself and its local graph neighborhood.
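A minimal NumPy sketch of one such localized convolution; the weight matrices Q, W, biases q, w, and the precomputed importance weights alpha are illustrative placeholders, not the paper's exact parameterization:

```python
import numpy as np

def localized_convolution(h_u, h_neighbors, alpha, Q, q, W, w):
    """One localized convolution for a central node u.
    Steps: transform neighbors -> importance-weighted pooling -> concat -> transform."""
    # 1) transform each neighbor representation (dense layer + ReLU)
    transformed = np.maximum(0.0, h_neighbors @ Q.T + q)        # shape (T, d)
    # 2) aggregate neighbors with an importance-weighted mean (alpha sums to 1)
    n_u = np.asarray(alpha) @ transformed                       # shape (d,)
    # 3) concatenate the central node's current representation with the aggregate
    concat = np.concatenate([h_u, n_u])
    # 4) transform the concatenated vector and L2-normalize the result
    z_u = np.maximum(0.0, W @ concat + w)
    return z_u / (np.linalg.norm(z_u) + 1e-12)
```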
2. Importance pooling
- Define an importance-based neighborhood
: the neighborhood of node $u$ is defined as the top $T$ nodes with the highest normalized visit counts
1) Run random walks starting from node $u$
2) Compute the $L_1$-normalized visit count of the nodes visited by the random walks (see the sketch after the advantages below)
- Advantages:
1) A fixed number of neighbors > controls the memory footprint during training
2) The importance of neighbors can be taken into account during localized convolutions (importance-based weighted mean in the aggregator/pooling step)
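A minimal sketch of the importance-based neighborhood selection, assuming the graph is given as an adjacency dict; num_walks, walk_length, and T are illustrative hyperparameters, not the paper's production settings:

```python
import random
from collections import Counter

def importance_neighborhood(adj, u, T=50, num_walks=200, walk_length=3):
    """Pick the top-T neighbors of node u by normalized random-walk visit counts.
    adj: dict mapping each node to a list of adjacent nodes."""
    visits = Counter()
    for _ in range(num_walks):
        node = u
        for _ in range(walk_length):
            if not adj[node]:
                break
            node = random.choice(adj[node])
            if node != u:
                visits[node] += 1
    # keep the T most-visited nodes; their normalized counts serve as
    # the importance weights alpha used in the weighted-mean aggregation
    top = visits.most_common(T)
    norm = sum(count for _, count in top) or 1
    neighbors = [v for v, _ in top]
    alpha = [count / norm for _, count in top]
    return neighbors, alpha
```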
3. Producer-consumer minibatch construction
- The GPU performs the model computation (the localized convolution step)
- The CPU performs 1) node feature extraction, 2) re-indexing to create a subgraph for the minibatch, and 3) negative sampling
- Multi-GPU training
A producer-consumer pattern is designed so that the GPU runs the computation for the current iteration
while the CPU runs the computation for the next iteration (see the sketch below)
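A minimal Python sketch of this producer-consumer pattern, assuming hypothetical helpers extract_features, build_subgraph, sample_negatives, and train_step (not actual PinSage code):

```python
import queue
import threading

def producer(batches, prepared, extract_features, build_subgraph, sample_negatives):
    """CPU side: prepare the next minibatch (features, re-indexed subgraph,
    negatives) while the GPU is training on the current one."""
    for node_ids in batches:
        feats = extract_features(node_ids)       # node feature extraction
        subgraph = build_subgraph(node_ids)      # re-index nodes into a small local id space
        negatives = sample_negatives(node_ids)   # negative sampling
        prepared.put((feats, subgraph, negatives))
    prepared.put(None)                           # signal end of epoch

def consumer(prepared, train_step):
    """GPU side: run the localized-convolution model computation per batch."""
    while True:
        item = prepared.get()
        if item is None:
            break
        train_step(*item)

# Usage sketch: a bounded queue overlaps CPU preparation with GPU computation.
# prepared = queue.Queue(maxsize=2)
# threading.Thread(target=producer,
#                  args=(batches, prepared, extract_features,
#                        build_subgraph, sample_negatives)).start()
# consumer(prepared, train_step)
```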
4. Max-margin-based Loss Function
- Maximize the inner product of the positive item with the query item
- Train so that the inner product of the query with a related item $i$ is larger than its inner product with an unrelated item $n_k$ ( $z_q\cdot z_{n_k} < z_q\cdot z_{i}$ )
$J_G(z_q,z_i)=\mathbb{E}_{n_k\sim P_n(q)}\max\{0,\,z_q\cdot z_{n_k}-z_q\cdot z_i+\Delta\}$
where $n_k$ is a negative sample drawn from $P_n(q)$ and $\Delta$ denotes the margin.
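A NumPy sketch of this max-margin (hinge) loss, averaged over a set of sampled negatives; the margin value below is illustrative:

```python
import numpy as np

def max_margin_loss(z_q, z_i, z_negs, delta=0.1):
    """Hinge loss: push z_q . z_i above z_q . z_nk by at least the margin delta.
    z_q, z_i: query/positive embeddings (d,); z_negs: negatives, one per row (m, d)."""
    pos = z_q @ z_i                       # inner product with the positive item
    neg = z_negs @ z_q                    # inner products with each negative sample
    return np.mean(np.maximum(0.0, neg - pos + delta))
```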
5. Curriculum training
: harder-and-harder samples
At epoch $n$ of the training, we add $n-1$ hard negative items to the set of negative items for each item.
- Hard negative items:
Because hard negative items are more similar to the query than random negative items, they force the model to learn at a finer granularity.
- How hard negative items are sampled:
1) Rank items by their Personalized PageRank scores with respect to the query
2) Randomly sample from the items ranked 2000-5000
- Because using hard negative items slows convergence, the curriculum training scheme above is applied: at epoch $n$ of training, $n-1$ hard negative items are added to the set of negative items for each item (see the sketch below).
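A minimal sketch of this curriculum negative sampling, assuming ppr_ranked_items is a hypothetical list of candidate items already sorted by Personalized PageRank score for the query; num_random is an illustrative count of easy (random) negatives:

```python
import random

def negatives_for_epoch(epoch, all_items, ppr_ranked_items, num_random=500):
    """Curriculum sketch: random negatives plus (epoch - 1) hard negatives drawn
    from the items ranked 2000-5000 by Personalized PageRank for the query."""
    random_negatives = random.sample(all_items, num_random)
    hard_pool = ppr_ranked_items[2000:5000]
    hard_negatives = random.sample(hard_pool, min(epoch - 1, len(hard_pool)))
    return random_negatives + hard_negatives
```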
6. MapReduce inference
: efficiently generate embeddings with a trained PinSage model; distribute the fully trained GCN to generate embeddings for billions of nodes.
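A toy sketch of the idea behind the MapReduce-style inference: project each node's features once (map), then join with its neighbors' projections and aggregate (reduce), avoiding repeated computation. transform and aggregate_layer are hypothetical stand-ins for the trained PinSage layers:

```python
def mapreduce_inference(nodes, adj, features, transform, aggregate_layer):
    """Toy illustration of the MapReduce-style split: every node's raw features
    are projected exactly once, then each node aggregates its neighbors'
    projections, instead of recomputing the projection per neighborhood."""
    # map stage: one projection per node
    projected = {v: transform(features[v]) for v in nodes}
    # join + reduce stage: aggregate each node's neighbors' projected vectors
    return {v: aggregate_layer(projected[v], [projected[u] for u in adj[v]])
            for v in nodes}
```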
Evaluation :
Tasks
1) Related-pin recommendation task: recommend the K nearest neighbors (the most similar pins) of the query pin
2) Homefeed recommendation task: recommend the pins most similar to the item the user has most recently pinned
- Offline Evaluation
Evaluate performance on the related-pin recommendation task
- Hit-rate: the fraction of queries $q$ for which the related item $i$ was ranked among the top $K$ of the test sample.
- Evaluation metric: Scaled MRR (Mean Reciprocal Rank), based on the rank $R_{i,q}$ of item $i$ among the items recommended for query item $q$
$MRR= {1\over n}\sum_{(q,i)\in L} {1\over \lceil R_{i,q}/100 \rceil}$
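A small sketch of these offline metrics, assuming rank_fn(q, i) is a hypothetical helper returning the rank $R_{i,q}$ of item $i$ in the test sample for query $q$; the value of K below is an illustrative cutoff:

```python
import math

def offline_metrics(labeled_pairs, rank_fn, K=500):
    """Hit-rate and scaled MRR over labeled (query, related item) pairs."""
    hits, mrr = 0, 0.0
    for q, i in labeled_pairs:
        rank = rank_fn(q, i)
        if rank <= K:
            hits += 1                        # counted as a hit if i is in the top K
        mrr += 1.0 / math.ceil(rank / 100)   # scaled reciprocal rank
    n = len(labeled_pairs)
    return hits / n, mrr / n
```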

- User Studies
Users are shown two candidate pins (retrieved by different recommendation algorithms) and asked to judge which looks more related to the query pin.

- Production A/B Test
- In homefeed recommendation, recommend pins that the user is likely to repin
- Metric: repin rate
measures the percentage of homefeed recommendations that have been saved (repinned) by users.
Source :
Graph Convolutional Neural Networks for Web-Scale Recommender Systems, KDD '18
https://arxiv.org/pdf/1806.01973.pdf