(23S) [CS471] Graph Machine Learning and Mining - Labs

구명규 · May 20, 2023

Lecture Notes


Lab #1: PPR (Personalized PageRank)

  • Uses the NELL995 dataset
  • Basic definitions of the Node and Graph classes
  • PageRank (global and personalized) computation functions
    • Updating each node's PPR score one by one
    • Computing via matrix-vector multiplication

$x_v^{(k+1)}=\alpha\sum_{w\in S_v}\frac{x_w^{(k)}}{|T_w|}+\frac{1-\alpha}{n_q}$ (if $v$ is in the predefined set)
$\cdots$ $S_v$: set of incoming neighbors of $v$, $T_w$: set of outgoing neighbors of $w$, $n_q$: number of predefined nodes
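The matrix-vector-multiplication variant of the update above can be sketched as follows. This is a minimal NumPy illustration, not the lab's actual code: the adjacency convention (`A[w, v] = 1` for an edge $w \to v$), function names, and the small example graph are all assumptions.

```python
import numpy as np

def personalized_pagerank(A, seeds, alpha=0.85, iters=100):
    """PPR by repeated matrix-vector products on a dense adjacency matrix."""
    n = A.shape[0]
    out_deg = A.sum(axis=1)                      # |T_w| for each node w
    out_deg[out_deg == 0] = 1                    # guard against dangling nodes
    P = A / out_deg[:, None]                     # row-stochastic transition matrix
    q = np.zeros(n)
    q[list(seeds)] = 1.0 / len(seeds)            # restart mass over the n_q seed nodes
    x = np.full(n, 1.0 / n)                      # initial scores
    for _ in range(iters):
        x = alpha * (P.T @ x) + (1 - alpha) * q  # x_v = α Σ_{w∈S_v} x_w/|T_w| + (1-α)/n_q
    return x

# tiny 3-node example: 0→1, 0→2, 1→0, 2→1, personalized to seed node 0
A = np.array([[0, 1, 1], [1, 0, 0], [0, 1, 0]], dtype=float)
scores = personalized_pagerank(A, seeds={0})
```

Since each iteration preserves total probability mass when there are no dangling nodes, the scores sum to 1 at convergence.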


Lab #2: Modularity Optimization

  • Uses the karate_club dataset from the scikit-network library
  • Builds a sparse adjacency matrix and applies the Louvain method (scikit-network Louvain) → graph clustering
  • Computing the normalized cut
    • Summing over each link one by one
    • Computing via matrix multiplication
  • Visualization via sknetwork's svg_graph

$NC(G)\equiv \sum_{c=1}^k\frac{\text{Cut}(V_c, V\setminus V_c)}{\deg(V_c)}=\sum_{c=1}^k\frac{y_c^T(D-A)y_c}{y_c^TDy_c}$
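The matrix form of the normalized cut can be sketched directly from the formula. This is an illustrative NumPy version under assumed names (`normalized_cut`, integer cluster labels, a dense symmetric adjacency matrix), not the lab's implementation:

```python
import numpy as np

def normalized_cut(A, labels):
    """Normalized cut via y^T (D - A) y / y^T D y summed over clusters."""
    D = np.diag(A.sum(axis=1))            # degree matrix
    L = D - A                             # unnormalized graph Laplacian
    nc = 0.0
    for c in np.unique(labels):
        y = (labels == c).astype(float)   # cluster indicator vector y_c
        nc += (y @ L @ y) / (y @ D @ y)
    return nc

# two triangles joined by a single bridge edge: the natural 2-way split
A = np.zeros((6, 6))
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
for i, j in edges:
    A[i, j] = A[j, i] = 1
nc = normalized_cut(A, np.array([0, 0, 0, 1, 1, 1]))
```

Here each cluster has total degree 7 and the cut crosses one edge, so the result is $\frac{1}{7}+\frac{1}{7}=\frac{2}{7}$.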


Lab #3: GraphSAGE

  • Uses the Cora dataset
  • Implements GCN / GraphSAGE models (number of layers, input dim., hidden dim., output dim., type of aggregation)
  • Training, validation, evaluation, and visualization (confusion matrix)

GCN Layer
$h_v^{(l+1)}=\sigma\left(W_l\sum_{u\in N(v)}\frac{h_u^{(l)}}{|N(v)|}+B_lh_v^{(l)}\right)$
GraphSAGE Layer
$h_v^{(l+1)}=\sigma\left(W_l\left[\text{AGG}(\{h_u^{(l)},\ \forall u\in N(v)\}),\ h_v^{(l)}\right]\right)$
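A GraphSAGE layer with mean aggregation can be sketched in a few lines of PyTorch. This is a minimal illustration assuming a dense adjacency matrix and a single linear map over the concatenation $[\text{AGG}(\cdot),\ h_v^{(l)}]$; the class and variable names are not from the lab code:

```python
import torch
import torch.nn as nn

class SAGELayer(nn.Module):
    """One GraphSAGE layer: mean-aggregate neighbors, concat with self, transform."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(2 * in_dim, out_dim)   # W_l applied to [AGG || self]

    def forward(self, h, adj):
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        h_neigh = (adj @ h) / deg                 # mean of neighbor messages
        return torch.relu(self.W(torch.cat([h_neigh, h], dim=1)))

layer = SAGELayer(in_dim=4, out_dim=2)
adj = torch.tensor([[0., 1.], [1., 0.]])          # two connected nodes
out = layer(torch.randn(2, 4), adj)               # shape: (2, 2)
```

Swapping the mean for a max or LSTM aggregator only changes the `h_neigh` line, which is what the "type of aggregation" option above controls.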


Lab #4: GAT

  • Uses the Cora dataset
  • Implements GAT / GATv2 models

GAT Layer (multi-head attention implemented by running num_head $M$ heads in parallel)
$\alpha_{ij}=\frac{\exp(\text{LeakyReLU}(a_l^T[W_lh_i^{(l)}\,||\,W_lh_j^{(l)}]))}{\sum_{k\in N_i}\exp(\text{LeakyReLU}(a_l^T[W_lh_i^{(l)}\,||\,W_lh_k^{(l)}]))}$ $\cdots$ weight factor of node $j$'s message to node $i$
($a$: shared weight vector, $W$: shared linear transformation, $W_lh_i^{(l)}$: node $i$'s message at layer $l$)
$h_i^{(l+1)}=\text{AGG}_{1\le m\le M}\left(\sigma\left(\sum_{j\in N_i}\alpha_{ij}^m W_l^m h_j^{(l)}\right)\right)$ $\cdots$ multi-head attention
*Moving the shared weight vector $a$ outside the nonlinear layer yields dynamic attention (GATv2)
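A single GAT head computing the $\alpha_{ij}$ above can be sketched as follows. This is an illustrative dense-adjacency version (the names `GATHead`, `W`, `a` mirror the formula; everything else is assumed, not taken from the lab):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATHead(nn.Module):
    """One attention head: score every neighbor pair, softmax, aggregate."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)  # shared linear map W
        self.a = nn.Linear(2 * out_dim, 1, bias=False)   # shared weight vector a

    def forward(self, h, adj):
        z = self.W(h)                                    # W h_i for every node
        n = z.size(0)
        # e_ij = LeakyReLU(a^T [W h_i || W h_j]) for all ordered pairs (i, j)
        pairs = torch.cat([z.unsqueeze(1).expand(n, n, -1),
                           z.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.a(pairs).squeeze(-1))
        e = e.masked_fill(adj == 0, float('-inf'))       # attend only within N_i
        alpha = torch.softmax(e, dim=1)                  # normalize over j ∈ N_i
        return alpha @ z                                 # Σ_j α_ij W h_j

head = GATHead(in_dim=4, out_dim=3)
adj = torch.tensor([[1., 1., 0.], [1., 1., 1.], [0., 1., 1.]])  # with self-loops
out = head(torch.randn(3, 4), adj)                       # shape: (3, 3)
```

For GATv2, the scoring line would instead apply `a` after the nonlinearity, e.g. `self.a(F.leaky_relu(self.Wpair(pairs)))`, which is what makes the attention dynamic.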


Lab #5: Knowledge Graph Embeddings

  • Uses the FB15K237 dataset
  • Implements TransE and DistMult with nn.Embedding
  • Evaluation with rank-based metrics
  • Implements a negative-sample generation function

TransE
$f_r(h,t)=-||h+r-t||$
DistMult
$f_r(h,t)=h^T\text{diag}(r)\,t$ $\cdots$ pairwise interactions
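Both scoring functions reduce to a few lines once entities and relations live in nn.Embedding tables. A minimal sketch, with assumed dimensions and class/method names (not the lab's actual hyperparameters or structure):

```python
import torch
import torch.nn as nn

class KGEModel(nn.Module):
    """Entity/relation embedding tables plus TransE and DistMult scorers."""
    def __init__(self, n_ent, n_rel, dim=32):
        super().__init__()
        self.ent = nn.Embedding(n_ent, dim)
        self.rel = nn.Embedding(n_rel, dim)

    def transe(self, h, r, t):
        # f_r(h, t) = -||h + r - t||  (higher score = more plausible triple)
        return -torch.norm(self.ent(h) + self.rel(r) - self.ent(t), dim=-1)

    def distmult(self, h, r, t):
        # f_r(h, t) = h^T diag(r) t, i.e. elementwise pairwise interactions
        return (self.ent(h) * self.rel(r) * self.ent(t)).sum(dim=-1)

model = KGEModel(n_ent=10, n_rel=3)
h, r, t = torch.tensor([0, 1]), torch.tensor([0, 2]), torch.tensor([3, 4])
s_transe, s_distmult = model.transe(h, r, t), model.distmult(h, r, t)
```

Negative samples for training would then be scored by corrupting `h` or `t` with random entity indices and feeding the corrupted triples through the same scorers.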
