Graph Convolutional Neural Networks for Web-Scale Recommender Systems, KDD 18
PinSAGE paper
Goal :
Generate high-quality embeddings/representations of pins (items) from the graph that can be used for recommendation.
Challenge :
- GCNs are infeasible when the graph has billions of nodes and its structure is constantly evolving (∵ standard GCNs require operating on the full graph Laplacian during training).
- Generating embeddings for billions of nodes is inefficient.
- A limitation of GraphSAGE is that the whole graph must be stored in GPU memory.
Solution :
- Improve scalability via a random-walk-based localized convolution approach.
Method :
1. On-the-fly convolutions
: localized convolutions
- Learn how to aggregate information from the neighborhood.
- Process of the localized convolution operation: transform -> aggregate/pool -> concat (aggregated neighborhood & central node) -> transform (concatenated vector); see the sketch below.
The output is a representation of the central node that incorporates information about both itself and its local graph neighborhood.
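A minimal PyTorch sketch of one localized-convolution step, following the transform -> pool -> concat -> transform recipe above. The layer sizes, ReLU activations, and output normalization are illustrative assumptions; alpha stands for the importance weights introduced in step 2.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalizedConv(nn.Module):
    """One localized-convolution step: transform neighbors, pool, concat, transform."""
    def __init__(self, in_dim, hid_dim, out_dim):
        super().__init__()
        self.Q = nn.Linear(in_dim, hid_dim)             # neighbor transform
        self.W = nn.Linear(in_dim + hid_dim, out_dim)   # post-concat transform

    def forward(self, z_u, z_nbrs, alpha):
        # z_u: (d,) central node; z_nbrs: (T, d) neighbors; alpha: (T,) importance weights
        h = F.relu(self.Q(z_nbrs))                      # transform each neighbor
        n_u = (alpha.unsqueeze(1) * h).sum(dim=0)       # importance-weighted mean pooling
        z = F.relu(self.W(torch.cat([z_u, n_u])))       # concat central node & aggregate
        return z / z.norm()                             # L2-normalize the output
```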
2. Importance pooling
- Define an importance-based neighborhood
: the neighborhood of node u is defined as the top T nodes with the highest normalized visit counts
1) Run short random walks starting from node u.
2) Compute the L1-normalized visit counts of the nodes visited by the random walks.
- Advantages:
1) A fixed number of neighbors -> controls the memory footprint during training.
2) The importance of neighbors can be taken into account during localized convolutions (importance-based weighted mean in the aggregate/pool step); see the neighborhood-construction sketch below.
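A plain-Python sketch of the importance-based neighborhood construction described above, assuming an adjacency-list graph; the walk count and walk length are illustrative assumptions.

```python
import random
from collections import Counter

def importance_neighborhood(adj, u, T=50, num_walks=200, walk_len=3):
    """Top-T neighbors of u by L1-normalized visit counts of short random walks.
    adj: dict mapping each node to a non-empty list of adjacent nodes."""
    counts = Counter()
    for _ in range(num_walks):
        v = u
        for _ in range(walk_len):
            v = random.choice(adj[v])
            counts[v] += 1
    counts.pop(u, None)                      # exclude the central node itself
    total = sum(counts.values())
    # top-T visited nodes with their normalized visit counts as importance weights
    return [(v, c / total) for v, c in counts.most_common(T)]
```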
3. Producer-consumer minibatch construction
- GPU: performs the model computation (the localized convolution step).
- CPU: 1) node feature extraction, 2) re-indexing to create a sub-graph for the mini-batch, 3) negative sampling.
- Multi-GPU training.
- Via a producer-consumer pattern, the GPU runs the current iteration while the CPU performs the computation for the next iteration (see the sketch below).
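A minimal sketch of the producer-consumer pattern using a bounded queue: a CPU thread prepares the next minibatch while the GPU consumes the current one. `prepare_minibatch`, `train_step`, and `batch_iter` are hypothetical placeholders for the CPU-side preprocessing, the GPU-side training step, and the stream of raw batches.

```python
import queue
import threading

batch_q = queue.Queue(maxsize=2)  # small buffer between CPU producer and GPU consumer

def producer(raw_batches):
    # CPU side: feature extraction, sub-graph re-indexing, negative sampling
    for b in raw_batches:
        batch_q.put(prepare_minibatch(b))  # hypothetical CPU preprocessing
    batch_q.put(None)                      # sentinel: end of epoch

threading.Thread(target=producer, args=(batch_iter,), daemon=True).start()

while (batch := batch_q.get()) is not None:
    train_step(batch)  # GPU side: localized convolutions + loss for this iteration
```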
4. Max-margin-based loss function
- Maximize the inner product of the positive item with the query item.
- Train so that the dot product of the query q with a related item i is larger than its dot product with an unrelated item n_k, i.e., z_q⋅z_{n_k} < z_q⋅z_i:
J_G(z_q, z_i) = E_{n_k∼P_n(q)} max{0, z_q⋅z_{n_k} − z_q⋅z_i + Δ}
where n_k is a negative sample drawn from the negative-sampling distribution P_n(q) and Δ is the margin (a sketch follows below).
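The loss above translates directly into a batched PyTorch sketch; the margin value is an assumption.

```python
import torch

def max_margin_loss(z_q, z_i, z_neg, delta=0.1):
    """Hinge loss pushing query-positive dot products above query-negative ones.
    z_q, z_i: (B, d) query/positive embeddings; z_neg: (B, K, d) negative samples;
    delta is the margin Δ (0.1 is an illustrative value)."""
    pos = (z_q * z_i).sum(dim=-1, keepdim=True)      # (B, 1) query-positive scores
    neg = torch.einsum('bd,bkd->bk', z_q, z_neg)     # (B, K) query-negative scores
    return torch.clamp(neg - pos + delta, min=0).mean()
```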
5. Curriculum training
: harder-and-harder negative samples
At epoch n of training, n−1 hard negative items are added to the set of negative items for each item.

- Hard negative items:
Hard negative items are more similar to the query than random negative items, forcing the model to learn at a finer granularity.
- How hard negative items are sampled:
1) Rank items by their Personalized PageRank scores with respect to the query.
2) Randomly sample from the items ranked 2000-5000.
- Because hard negative items slow down convergence, the curriculum training scheme above is applied (see the sketch below).
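A sketch of the curriculum negative-sampling schedule, combining random negatives with rank-based hard negatives; the random-pool size is an illustrative assumption, and the 2000-5000 rank window follows the notes above.

```python
import random

def negatives_for_epoch(epoch, random_pool, ppr_ranked, num_random=500):
    """Curriculum negatives: random negatives plus (epoch - 1) hard negatives.
    ppr_ranked: items sorted by Personalized PageRank score w.r.t. the query."""
    negs = random.sample(random_pool, num_random)
    n_hard = epoch - 1                               # epoch 1: no hard negatives yet
    negs += random.sample(ppr_ranked[2000:5000], n_hard)
    return negs
```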
6. MapReduce inference
: Efficiently generate embeddings using the trained PinSage model by distributing the fully-trained GCN via MapReduce to produce embeddings for billions of nodes. Embeddings are computed layer by layer, so each node's intermediate representation is computed only once instead of being recomputed for every neighbor (see the sketch below).
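A sequential sketch of the layer-by-layer idea that the MapReduce pipeline distributes; `convolve_layers` is a hypothetical list of trained localized-convolution functions, and `neighborhoods` maps each node to its importance-based neighbors and weights.

```python
def layerwise_inference(nodes, features, neighborhoods, convolve_layers):
    """Compute each node's layer-k representation exactly once and reuse it for all
    of its neighbors at layer k+1, avoiding the repeated computation of a naive
    per-node recursive scheme."""
    reps = dict(features)                     # layer-0 representations = raw features
    for convolve in convolve_layers:
        next_reps = {}
        for u in nodes:                       # in MapReduce, this loop is distributed
            nbrs, weights = neighborhoods[u]  # importance-based neighborhood of u
            next_reps[u] = convolve(reps[u], [reps[v] for v in nbrs], weights)
        reps = next_reps
    return reps
```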
Evaluation :
Task types:
1) Related-pin recommendation task: recommend the K nearest neighbors of a query pin.
2) Homefeed recommendation task: recommend the pins most similar to the items the user has most recently pinned.
- Offline Evaluation
Evaluate performance on the related-pin recommendation task.
- Hit-rate: the fraction of queries q for which the positive item i was ranked among the top K of the test sample.
- Scaled MRR (Mean Reciprocal Rank): based on the rank R_{i,q} of item i among the items recommended for query item q:
MRR = (1/n) Σ_{(q,i)∈L} 1 / ⌈R_{i,q}/100⌉
The scaling factor of 100 keeps differences between large ranks (e.g., 1,000 vs. 2,000) from vanishing; see the worked example below.
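A worked example of the scaled MRR, assuming `ranks` holds the rank R_{i,q} of the positive item for each labeled query:

```python
import math

def scaled_mrr(ranks, scale=100):
    """Scaled MRR over labeled (query, item) pairs."""
    return sum(1.0 / math.ceil(r / scale) for r in ranks) / len(ranks)

# Ranks 250, 1000, 2000 contribute 1/3, 1/10, 1/20 -> mean ≈ 0.161
print(scaled_mrr([250, 1000, 2000]))
```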
- User Studies
Users are shown two candidate pins (produced by different recommendation algorithms) and asked which looks more related to the query pin.
- Production A/B Test
In homefeed recommendation, recommend pins that users are likely to re-pin.
- Metric: repin rate, which measures the percentage of homefeed recommendations that have been saved (re-pinned) by the users.
Source :
Graph Convolutional Neural Networks for Web-Scale Recommender Systems, KDD 18
https://arxiv.org/pdf/1806.01973.pdf