Roles
functions of nodes in a network
measured by structural behaviors
example: centers of stars, members of cliques 파벌, peripheral nodes 주변 노드
A collection of nodes which have similar positions in a network, similarity of ties, need not in direct
vs. Communities/Groups: well-connected to each other
Structral equivalence
have the same relations to all other nodes, both nodes have tie to k
importance

RoIX: automatic discovery of structural roles
Unsupervised learning approach, no prior knowledge-required
assign mixed-membership of roles to each node, scales linearly in # edges
RoIX: Approach Overview

egonet: 자아
Recursive Feature Extraction
turns network connectivity into structural features

• Neighborhood features: What is a node’s connectivity pattern? 연결 패턴
• Recursive features: To what kinds of nodes is a node connected? 연결된 노드의 종류
Idea: Aggregate features of a node and use them to generate new recursive features
start with the base set of node features and use the set of current node features to generate additional features:
aggregate functions: mean & sum
ex) compute means and sums over all current features, including other recursive features and repeat
base set feature로 시작해서 추가적인 feature 생성
The number of recursive features grows exponentially:
reduce # of features using a pruning technique:
look for highly correlated pairs of features above a threshold, eliminate one

extract features, cluster based on features
Application: Structural Similarity
Task: Cluster nodes
use RoIX to assign each node a distribution, determine similarity between nodes

blue- 엄청 모여있는 그룹
red- bridge
gray- 일반
green- 길쭉한 클러스터 이어줌
elongated: 길쭉한
peripheral: 주변의

Why contacts were often acquaintances rather than close friends?
Granovetter's Answer
two perspectives on friendships:
structural: span different parts of the network
interpersonal: strong or weak
span: 걸쳐있다
Edge Strength in Real Data
Edge Overlap

= neighbors 교집합/합집합
Strong ties are more embedded(higher overlap)
Link Removal by Overlap

high overlap- edge 제거해갈 때 가장 큰 요소의 크기가 천천히 작아짐
Network Communities

set of nodes with lots of internal connections and few external connections
Strong ties/ Weak ties
Not homogeneous- no equal #edges
Local clusters exist
skewed to a few nodes 치우치다
Finding Network Communities
Null Network Configuration Model
real network G, n nodes, m edges, rewired network G'
Same degree distribution but uniformly random connections, multigraph
Modularity
a measure of how well a network is partitioned into communities
given (a partitioning of the network) into (groups disjoint)

Edge Betweenness: # of shortest paths passing over the edge
Divisive hierarchical clustering based on the edge betweenness 분열적 계층적 군집화
works on undirected unweighted networks

Compute Edge Betweenness
randomly selected node, conduct BFS(breath first search)
count the # of shortest paths from A to all other nodes of the network
Compute betweenness by working up the tree from the lowest layers of the tree
if multiple paths, count them fractionally
랜덤으로 선택된 노드로 다른 모든 노드와의 최소 거리 수를 센다
낮은 층부터 betweenness 계산
The algorithm: node flow = 1 + child edges, parent value 기반으로 split the flow up
repeat BFS for each starting node U

Hierachical network decomposition
decide clusters with the highest modularity

Greedy algorithm for community detection
O(nlogn)
support weighted graphs
provides hierarchical communities
large networks- fast, high modularity
greedily maximizes modularity
pase 1- Partitioning
: only local changes to node-communities memberships로 modularity is optimized
put each node into a distinct community
each node i two calculation:
Compute the modularity delta() when putting node i into the community of some neighbor j
Move i to a community of node j that yields the largest gain in
runs until no movement yields a gain- local maxima of modularity
node를 community에 놓음, 이웃 j node의 community로 옮길 때 modularity의 변화량이 가장 큰 곳으로 감, 더이상 gain이 없을 때까지 이동

: C의 between nodes의 link weights합
: C의 노드들의 모든 link weights 합
: node i, C 사이의 link weights의 합
: node i의 모든 link weights

phase 2- Reconstructing
: The identified communities are aggregated into super-nodes
super-nodes끼리 one edge만 있으면 이어진거임
두 super-nodes 사이 edge의 weight는 모든 edges의 weights의 합


Modeling Image Propagation as Tree

significant correlation between # repinners and pin's influence itself(pins, followers/original pin, pin-Tree)
Interest Graph in Pinterest
relations among users

Community Analysis in Interest Graph

Prediction on Pin Consumption
Prediction methods

NOTE: Bipartite Graph Projection

Key Insights of EDA
Similarity analysis among the members in the same community

contents in the same community 는 같은 user한테 다운받아져서 user의 common interests를 포함할지도 모른다

A가 다운받은 contents , t와 같은 community에 있는 contents
t'가 , 빈 테이블 R에 없으면 R에 넣기, C 다 쓸 때까지 반복
popularity로 sort R -> R'
return first n elements in R'
Social Support for Mental Health
mental status 드러내기 힘듦 -> online social support
anonymity, connection beyond acquaintance, interaction with less restriction on space and time
Methodology: Dataset Construction
collect support-seeking posts, replies
assign Informational Support, Emotional Support scores based on BERT-based classification model
Methodology: Graph Modeling

