[Stanford CS224W] Advanced Topics in GNNs

Jaehee Kim · May 5, 2024

Limitations of Graph Neural Networks

A "Perfect" GNN Model

  • A k-layer GNN embeds a node based on its k-hop neighborhood structure
  • A perfect GNN should build an injective function between neighborhood structures (regardless of hops) and node embeddings
  • Therefore, for a perfect GNN:
    - Observation 1: If two nodes have the same neighborhood structure, they must have the same embedding
    - Observation 2: If two nodes have different neighborhood structures, they must have different embeddings

Imperfections of Existing GNNs

  • However, Observations 1 & 2 do not always hold
  • Observation 1 can be problematic:
    - Even though two nodes may have the same neighborhood structure, we may want to assign different embeddings to them (e.g., because they sit at different positions in the graph)
  • Observation 2 often cannot be satisfied:
    - The GNNs we have introduced so far are not perfect

Our Approach

  • We use the following line of thinking:
    - Two different inputs should be labeled differently
    - A "failed" model will always assign the same embedding to them
    - A "successful" model will assign different embeddings to them
    - Embeddings are determined by GNN computational graphs

A Naive Solution Is Not Desirable

  • A naive solution: one-hot encoding
    - Encode each node with a different ID; then we can always differentiate different nodes/edges/graphs
  • Issues:
    - Not scalable: needs O(N) feature dimensions (N is the number of nodes)
    - Not inductive: cannot generalize to new nodes/graphs
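The naive scheme can be sketched in a few lines; `one_hot_features` is a hypothetical helper for illustration, not from the lecture:

```python
def one_hot_features(num_nodes):
    """Assign each node a unique one-hot ID vector (an N x N matrix)."""
    return [[1 if i == j else 0 for j in range(num_nodes)]
            for i in range(num_nodes)]

feats = one_hot_features(4)
# Every node is distinguishable, but the feature dimension grows as O(N):
assert len(feats[0]) == 4
# A new 5th node has no valid encoding in this 4-dimensional space,
# so the scheme cannot generalize to unseen nodes (not inductive).
```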

Position-aware Graph Neural Networks

Two Types of Tasks on Graphs

  • There are two types of tasks on graphs:
    1. Structure-aware tasks: nodes are labeled by their structural roles in the graph
    2. Position-aware tasks: nodes are labeled by their positions in the graph

Structure-aware Tasks

  • GNNs often work well for structure-aware tasks
    - Nodes with different structural roles get different computational graphs, so GNNs can tell them apart

Position-aware Tasks

  • GNNs will always fail for position-aware tasks
    - By symmetry, nodes at different positions can have identical computational graphs and thus identical embeddings
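This failure mode can be reproduced with a toy mean-aggregation layer (a hand-rolled sketch, not a library implementation) on a cycle graph, where every node has the same neighborhood structure:

```python
def gnn_layer(adj, h):
    """One mean-aggregation message-passing layer (no learned weights;
    weights would not change the symmetry argument)."""
    out = {}
    for v, nbrs in adj.items():
        dim = len(h[v])
        agg = [sum(h[u][k] for u in nbrs) / len(nbrs) for k in range(dim)]
        out[v] = [(x + a) / 2 for x, a in zip(h[v], agg)]
    return out

# A 4-cycle: all nodes share the same neighborhood structure,
# but nodes 0 and 2 sit at different positions relative to, say, node 1.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
h = {v: [1.0] for v in adj}           # identical input features
for _ in range(3):                    # 3 rounds of message passing
    h = gnn_layer(adj, h)
assert h[0] == h[1] == h[2] == h[3]   # every node collapses to one embedding
```

No matter how many layers we stack, the embeddings stay identical, so no position-aware labeling can be learned.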

Power of "Anchor"

  • Randomly pick a node s_1 as an anchor node
  • Represent v_1 and v_2 via their relative distances w.r.t. the anchor s_1, which are different
  • An anchor node serves as a coordinate axis: it can be used to locate nodes in the graph

Power of "Anchor-sets"

  • Generalize anchor from a single node to a set of nodes
    - We define the distance to an anchor-set as the minimum distance from the node to any node in the anchor-set
  • Observation: large anchor-sets can sometimes provide a more precise position estimate
    - This lets us keep the total number of anchors small

Anchor Set: Theory

  • Goal: embed the metric space (V, d) into the Euclidean space R^k such that the original distance metric is preserved
  • Bourgain theorem: the embedding distance produced by f is provably close to the original distance metric on (V, d)
  • P-GNN follows the theory of the Bourgain theorem
  • P-GNN maintains the inductive capability

→ Position encoding for graphs: Represent a node's position by its distance to randomly selected anchor-sets
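The anchor-set position encoding described above can be sketched with plain BFS; the anchor-sets are fixed here for reproducibility, whereas P-GNN samples them at random:

```python
from collections import deque

def bfs_dist(adj, src):
    """Hop distances from src to every reachable node (BFS)."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def position_encoding(adj, anchor_sets):
    """One encoding dimension per anchor-set:
    the minimum hop distance to any node in that set."""
    enc = {v: [] for v in adj}
    for anchors in anchor_sets:
        dists = [bfs_dist(adj, s) for s in anchors]
        for v in adj:
            enc[v].append(min(d.get(v, float("inf")) for d in dists))
    return enc

# Path graph 0-1-2-3-4; three anchor-sets: {0}, {4}, {1, 3}.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
enc = position_encoding(adj, [[0], [4], [1, 3]])
assert enc[1] == [1, 3, 0] and enc[3] == [3, 1, 0]  # same role, different position
```

Nodes 1 and 3 play the same structural role in the path, yet their position encodings differ, which is exactly what a plain GNN could not achieve.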

How to Use Position Information

  • The simple way: Use position encoding as an augmented node feature (works well in practice)
    - Issue: since each dimension of the position encoding is tied to a random anchor-set, the dimensions of the position encoding can be randomly permuted without changing its meaning
    - Imagine permuting the input dimensions of a normal NN: the output will surely change
  • The rigorous solution: requires a special NN that maintains the permutation-invariance property of the position encoding
    - Permuting the input feature dimensions will only permute the output dimensions; the value in each dimension won't change
    - The Position-aware GNN paper has more details
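A toy way to see the required property: apply one shared function to every anchor-set dimension. This is only an illustration with a fixed scalar map (`shared_transform` and `permutation_equivariant_layer` are hypothetical names); the real P-GNN uses a shared learned network per anchor-set:

```python
def shared_transform(d):
    """The same scalar map applied to every anchor-set distance; in P-GNN
    this role is played by a shared learned network, here a fixed function."""
    return 1.0 / (1.0 + d)

def permutation_equivariant_layer(pos_enc):
    """Applying one shared function per dimension means permuting the
    input anchor-set dimensions only permutes the output dimensions."""
    return [shared_transform(d) for d in pos_enc]

enc = [1, 3, 0]        # distances to three anchor-sets
perm = [0, 1, 3]       # the same distances, randomly reordered
out = permutation_equivariant_layer(enc)
out_perm = permutation_equivariant_layer(perm)
assert sorted(out) == sorted(out_perm)  # same values, only the order differs
```

A normal NN with per-dimension weights would break this property, because each weight would then be tied to one arbitrary anchor-set.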

Identity-Aware Graph Neural Networks

More Failure Cases for GNNs

  • GNNs cannot perform perfectly on structure-aware tasks either
  • GNN failure 1: node-level tasks
  • GNN failure 2: edge-level tasks
  • GNN failure 3: graph-level tasks

Idea: Inductive Node Coloring

  • Idea: We can assign a color to the node we want to embed
  • This coloring is inductive
  • Inductive node coloring can help node classification
  • Inductive node coloring can help link prediction

Identity-aware GNN

  • Utilize inductive node coloring in embedding computation
    - Idea: heterogeneous message passing
    - An ID-GNN applies different message/aggregation functions to nodes with different colorings
  • Why heterogeneous message passing works:
    - Suppose two nodes v_1, v_2 have the same computational graph structure, but different node colorings
    - Since we apply different neural networks for embedding computation, their embeddings will be different
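A minimal sketch of the idea, assuming scalar weights stand in for the two different message networks (a toy illustration, not the actual ID-GNN layer):

```python
def id_gnn_layer(adj, h, colored, w_colored=2.0, w_plain=1.0):
    """Heterogeneous message passing: messages from colored nodes go through
    a different 'network' (here just a different scalar weight)."""
    out = {}
    for v, nbrs in adj.items():
        msgs = [(w_colored if u in colored else w_plain) * h[u] for u in nbrs]
        out[v] = sum(msgs) / len(msgs)
    return out

# Triangle: every node has the same computational graph structure.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
h = {v: 1.0 for v in adj}
out = id_gnn_layer(adj, h, colored={0})  # color only node 0
assert out[1] == out[2]   # both uncolored neighbors still agree
assert out[0] != out[1]   # but the coloring separates node 0 from them
```

With identical message functions everywhere, all three embeddings would coincide; routing colored-node messages through a different function is what breaks the tie.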

Robustness of Graph Neural Networks

Adversarial Examples

  • Deep convolutional neural networks are vulnerable to adversarial attacks
  • Adversarial examples are also reported in natural language processing and audio processing domains

Robustness of GNNs

  • Adversaries have the incentive to manipulate input graphs and hack GNNs' predictions

Attack Possibilities

  • What are the attack possibilities in the real world?
    - Target node: the node whose label prediction we want to change
    - Attacker nodes: the nodes the attacker can modify

Direct Attack

  • Direct attack: the attacker node is the target node itself
    - Modify the target node's features
    - Add connections to the target
    - Remove connections from the target

Indirect Attack

  • Indirect attack: the target node is not among the attacker nodes
    - Modify attacker node features
    - Add connections to the attackers
    - Remove connections from the attackers

Formalizing Adversarial Attacks

  • Objective for the attacker: maximize the change of the target node's label prediction, subject to the constraint that the graph manipulation is small (a limited perturbation budget)
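One hedged sketch of this objective is a greedy search over single-edge flips, a toy stand-in for real attack methods such as Nettack (`greedy_direct_attack`, `toggle`, and the degree-based `score_fn` are all hypothetical illustrations):

```python
def toggle(adj, a, b):
    """Flip edge (a, b): remove it if present, otherwise add it."""
    if b in adj[a]:
        adj[a].remove(b); adj[b].remove(a)
    else:
        adj[a].add(b); adj[b].add(a)

def greedy_direct_attack(adj, target, score_fn, budget=1):
    """Greedy direct attack sketch: among single-edge flips incident to the
    target, repeatedly commit the one changing the target's prediction score
    the most, stopping at a small modification budget."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}
    flips = []
    for _ in range(budget):
        base = score_fn(adj, target)
        best, best_gain = None, 0.0
        for u in adj:
            if u == target:
                continue
            toggle(adj, target, u)                    # try the flip
            gain = abs(score_fn(adj, target) - base)
            toggle(adj, target, u)                    # undo it
            if gain > best_gain:
                best, best_gain = u, gain
        if best is None:                              # no flip changes the score
            break
        toggle(adj, target, best)                     # commit the best flip
        flips.append((target, best))
    return flips

# Toy "model": score a node by its degree.
flips = greedy_direct_attack({0: [1], 1: [0], 2: []}, target=0,
                             score_fn=lambda a, t: len(a[t]), budget=1)
assert flips == [(0, 1)]  # the first flip with maximal score change
```

The budget parameter plays the role of the "graph manipulation is small" constraint: the attack stays within a fixed number of edge modifications so it remains hard to notice.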
