ML <P, T, E>: performance, task, experience
Machine Learning
learns patterns by training on previous data, makes predictions on unseen cases
conduct feature engineering, reduce the dimensionality of the features
Process of Feature Extraction: Hand-Crafted
biased
comes from intuition
rigorous:
(cycle)
-> think of possible features -> investigate relations (data mining) -> verify & measure degree of importance (w/ ML model) ->
data mining: basic statistics, graph modeling...
Process of Feature Extraction: Automatic
use raw data
Deep Learning
Types of Graph ML Tasks
Node-level
Edge-level
Graph-level prediction, Graph generation
Community(subgraph level)
Graph ML Tasks
Node classification: Predict a property of a node
ex) Categorize online users / items
Link prediction: Predict whether there are missing links between two nodes
ex) Knowledge graph completion
Graph classification: Categorize different graphs
ex) Molecule property prediction
Others: Graph generation (drug discovery), graph evolution (physical simulation)
Node-level ML Tasks
Classify/Assign labels for each node
Example: Protein Folding
A protein chain acquires its native 3D structure
amino acids -> alpha helix + pleated sheet -> proteins
computationally predict a protein's 3D structure based solely on its amino acid sequence
Edge-level ML Tasks
Example: Recommendation
Users interact with items
• Watch movies, buy merchandise, listen to music
• Nodes: Users and items
• Edges: User-item interactions
Graph-level ML Tasks
Generates outputs from the features that characterize the structure of an entire graph
Example: Drug Discovery
Antibiotics are small molecular graphs
• Nodes: Atoms
• Edges: Chemical bonds
Convert the molecular structure into a graph
Graph classification tasks by GNN: predict promising molecules
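A molecule can be stored as a plain graph structure before any model sees it. The sketch below is only an illustration: the toy molecule (ethanol's heavy atoms, hydrogens omitted) and the helper names `build_adjacency` and `simple_graph_features` are assumptions, not from the lecture. A GNN would learn from the graph directly; the hand-made feature dict is just a stand-in for "features of an entire graph".

```python
from collections import Counter

# Hypothetical toy molecule: ethanol's heavy atoms only (C-C-O), hydrogens omitted.
atoms = {0: "C", 1: "C", 2: "O"}   # node id -> element
bonds = [(0, 1), (1, 2)]           # undirected edges (chemical bonds)

def build_adjacency(atoms, bonds):
    """Return node id -> set of bonded neighbor ids."""
    adj = {a: set() for a in atoms}
    for u, v in bonds:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def simple_graph_features(atoms, bonds):
    """Crude graph-level features: sizes plus per-element atom counts."""
    features = {"num_atoms": len(atoms), "num_bonds": len(bonds)}
    features.update(Counter(atoms.values()))
    return features

adj = build_adjacency(atoms, bonds)
print(adj[1])                               # neighbors of the middle carbon
print(simple_graph_features(atoms, bonds))  # one feature dict for the whole graph
```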

Example: Physics Simulation
• Nodes: Particles
• Edges: Interaction between particles
Task: Predict how a graph will evolve over time

Traditional Pipeline for Graph ML
• Design features for nodes/links/graphs
• Obtain features for all training data
Train an ML model: Random Forest, SVM, Neural Network, etc.
Apply the ML model: given a new node/link/graph, obtain its features and make a prediction
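The pipeline steps above (design features, obtain them for training data, train, apply to a new node) can be sketched end to end. Everything here is an assumption for illustration: the toy graph, the "hub"/"leaf" labels, the two features, and the nearest-centroid classifier, which stands in for Random Forest / SVM so the sketch needs no external library.

```python
def build_adjacency(edges):
    """Undirected edge list -> node id -> set of neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def node_features(adj, node):
    """Hand-designed features: [degree, number of edges among the neighbors]."""
    nbrs = sorted(adj[node])
    links = sum(1 for i, a in enumerate(nbrs) for b in nbrs[i + 1:] if b in adj[a])
    return [len(nbrs), links]

def train(X, y):
    """Toy stand-in for a real model: one mean feature vector per class."""
    grouped = {}
    for x, label in zip(X, y):
        grouped.setdefault(label, []).append(x)
    return {label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in grouped.items()}

def predict(model, x):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    return min(model, key=lambda label: sum((a - b) ** 2 for a, b in zip(x, model[label])))

# Two stars joined at their centers; labels are made up for illustration.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (4, 5), (4, 6), (4, 7)]
adj = build_adjacency(edges)
train_nodes, train_labels = [0, 1, 2], ["hub", "leaf", "leaf"]

# Steps 1-3: obtain features for the training nodes and fit the model.
model = train([node_features(adj, n) for n in train_nodes], train_labels)

# Step 4: given a new node, obtain its features and make a prediction.
print(predict(model, node_features(adj, 4)))  # unseen node with degree 4
print(predict(model, node_features(adj, 5)))  # unseen node with degree 1
```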
ML in Graphs
Goal: Make predictions for a set of objects
Design choices:
features: d-dimensional vectors
objects: Nodes, edges, sets of nodes, entire graphs
Objective function:?
How to Design Features from a Given Graph
Using effective features for good test performance
ML uses hand-designed features (covered later)
focus on undirected graphs
Goal: Characterize the structure and position of a node in
the network
Node Features: recap



Summary
importance-based features: capture how important a
node is in a graph
node degree, node centrality
-> predict influential nodes in a graph
structure-based features: Capture topological properties
of local neighborhood around a node.
Node degree, clustering coefficient, structural roles
-> predict a particular role a node plays in a graph
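A minimal sketch of the node features named in this summary, in plain Python. The toy graph is made up, and closeness centrality is just one assumed choice for "node centrality" (the summary does not say which centrality is meant). Closeness is taken here as 1 / (sum of shortest-path lengths to all reachable nodes).

```python
from collections import deque

def degree(adj, v):
    """Number of neighbors of v."""
    return len(adj[v])

def clustering_coefficient(adj, v):
    """Fraction of pairs of v's neighbors that are themselves connected."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k) if nbrs[j] in adj[nbrs[i]])
    return links / (k * (k - 1) / 2)

def closeness_centrality(adj, v):
    """1 / (sum of BFS shortest-path lengths from v to every reachable node)."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    total = sum(dist.values())
    return 1.0 / total if total > 0 else 0.0

# Toy undirected graph: triangle 0-1-2 with a tail node 3 attached to node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}

for v in adj:
    print(v, degree(adj, v), clustering_coefficient(adj, v), closeness_centrality(adj, v))
```

Node 2 has the highest degree and closeness (importance), while nodes 0 and 1 have the highest clustering coefficient (local structure), so the two feature families can rank nodes differently.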
Link Prediction as a Task
two formulations

If two nodes share no neighbors, the score is 0 -> resolved below
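A minimal sketch of why local neighborhood overlap breaks down when two nodes share no neighbors. The three scores used (common neighbors, Jaccard coefficient, Adamic-Adar) are assumed here as typical local overlap measures, since these notes do not list them; the path graph is a made-up example.

```python
import math

def common_neighbors(adj, u, v):
    """Number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])

def jaccard(adj, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0

def adamic_adar(adj, u, v):
    """Shared neighbors weighted by 1 / log(degree); low-degree neighbors count more."""
    # A shared neighbor is adjacent to both u and v, so its degree is >= 2 and log > 0.
    return sum(1.0 / math.log(len(adj[w])) for w in adj[u] & adj[v])

# Toy path graph 0 - 1 - 2 - 3 - 4.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}

print(common_neighbors(adj, 0, 2), jaccard(adj, 0, 2), adamic_adar(adj, 0, 2))
# Nodes 0 and 3 are connected by a path but share no neighbor: every score is 0.
print(common_neighbors(adj, 0, 3), jaccard(adj, 0, 3), adamic_adar(adj, 0, 3))
```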
3) Global neighborhood overlap
Katz index: count the number of paths of all lengths between two nodes

Computing #paths of length 2 between u and v
Step 1: Compute #paths of length 1 between each of u's neighbors and v
Step 2: Sum up these #paths across u's neighbors
Result: the (u, v) entry of the squared adjacency matrix, A^2
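The two steps amount to one row-by-column product of the adjacency matrix with itself, so powers of A count paths of each length (strictly, walks, since nodes may repeat). A minimal sketch, assuming a made-up path graph and a truncated Katz sum with discount factor `beta`; the names `matmul` and `katz_index` are illustrative only.

```python
def matmul(A, B):
    """Plain-Python matrix product of two square lists-of-lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def katz_index(A, beta, max_len):
    """Truncated Katz index: sum over l = 1..max_len of beta^l * A^l."""
    n = len(A)
    S = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in A]          # holds A^l, starting with l = 1
    for l in range(1, max_len + 1):
        for i in range(n):
            for j in range(n):
                S[i][j] += (beta ** l) * power[i][j]
        power = matmul(power, A)           # move on to A^(l+1)
    return S

# Toy path graph 0 - 1 - 2 - 3 as an adjacency matrix.
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

A2 = matmul(A, A)
print(A2[0][2])   # 1: the single length-2 path 0 -> 1 -> 2
print(A2[0][3])   # 0: no length-2 path, and no common neighbor either

S = katz_index(A, beta=0.5, max_len=3)
print(S[0][3])    # 0.125: nonzero thanks to the length-3 path 0 -> 1 -> 2 -> 3
```

Unlike local overlap, the Katz score for nodes 0 and 3 is nonzero even though they share no neighbor. The full (untruncated) sum has the closed form (I - beta*A)^-1 - I when beta is smaller than 1 over the largest eigenvalue of A.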



Graph Features: Graphlet
Key idea: Count #different graphlets in a graph
allow isolated nodes, not rooted (different from node-level features)
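A minimal sketch of graphlet counting for size k = 3, assuming a brute-force scan over all node triples (fine for toy graphs only). For three nodes, the number of induced edges (0, 1, 2, or 3) identifies the graphlet, so the count vector is indexed by edge count. The function name and the toy graph are illustrative assumptions.

```python
from itertools import combinations

def graphlet_counts_k3(nodes, edges):
    """Count induced 3-node graphlets.

    Index 0: no edges, 1: one edge plus an isolated node,
    2: path of length 2, 3: triangle.
    """
    edge_set = {frozenset(e) for e in edges}
    counts = [0, 0, 0, 0]
    for triple in combinations(nodes, 3):
        num_edges = sum(1 for pair in combinations(triple, 2) if frozenset(pair) in edge_set)
        counts[num_edges] += 1
    return counts

# Toy graph: triangle 0-1-2 with a tail node 3 attached to node 2.
nodes = [0, 1, 2, 3]
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]

print(graphlet_counts_k3(nodes, edges))  # [0, 1, 2, 1]
```

Isolated nodes are allowed inside a graphlet (index 0 and 1), and no node is singled out as a root, which matches the two differences from node-level graphlets noted above.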


Advanced: Graph Kernel
Kernels: use “similarity-like mechanism” for computing features
ex) inner product
Graph Kernels: Measure similarity between two graphs
Graphlet Kernel, Weisfeiler-Lehman Kernel,...
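A minimal sketch of the Weisfeiler-Lehman kernel idea: refine node colors from neighbor colors, collect a bag of colors per graph, and take the inner product of the two bags. This is a simplified illustration under assumptions: all nodes start with the same color, colors are kept as nested tuples instead of hashed integers, and the function names and toy graphs are made up.

```python
from collections import Counter

def wl_color_counts(adj, num_iters):
    """Bag of node colors collected over the initial round and each refinement."""
    colors = {v: 0 for v in adj}                 # every node starts with color 0
    bag = Counter(colors.values())
    for _ in range(num_iters):
        # New color = (own color, sorted multiset of neighbor colors).
        colors = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v]))) for v in adj}
        bag.update(colors.values())
    return bag

def wl_kernel(adj1, adj2, num_iters=2):
    """Inner product of the two color-count vectors."""
    f1 = wl_color_counts(adj1, num_iters)
    f2 = wl_color_counts(adj2, num_iters)
    return sum(f1[color] * f2[color] for color in f1)

triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path = {0: {1}, 1: {0, 2}, 2: {1}}

print(wl_kernel(triangle, triangle))  # 27
print(wl_kernel(triangle, path))      # 12
```

The graphlet kernel follows the same pattern with a different feature vector: the inner product of the two graphs' graphlet count vectors.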