Finding Critical Users for Social Network Engagement: The Collapsed k-Core Problem

Southgiri·2025년 6월 26일

Graph Paper Reivew

목록 보기
1/2

Background

K-core

  • Remaining maximal sub-graph, after iteratively pruning nodes which have less than k degrees
  • Indicates the influence of node
  • Unique in one graph

Abstract

  • In social networks, the leave of critical users may break network engagement
  • A popular model to measure social network engagement is k-core
  • To identify critical users for social network engagement, propose the collapsed k-core problem
    • Given a graph GG, a positive integer kk and a budget bb,
      aim to find bb vertices in GG such that the deletion of the bb vertices leads to the smallest k-core
  • Proposal : An efficient algorithm which significantly reduces the number of candidate vertices to speed up the computation

Introduction

k-core

  • k-core is a simple and popular model based on degree constraint used to measure the network engagement
  • A user with less than kk friends engaged will drop out, and his leave may be contagious

collapsed k-core problem

  • Given a limited budget bb, how to find bb vertices(users) so that we can get the smallest k-core by removing these bb vertices
    = then, these bb appears to be influencial
  • Aims to collapse the engagement of the network with the greatest extent for a given budget bb
  • Identifying critical users
    = Identifying most valuable users to sustain or destroy the engagement of the networks
  • Also, we can evaluate the robustness of network engagement against the vertex attack

Example

  • The extent of the collapse varies among different users
    • Although u9u_9 has 6 friends in 3-core, the departure of u9u_9 will not further lead to the leave of the other users
    • On the contrary, the leave of u11u_11 will lead to the leave of 7 members
    • In this sense, it is more cost-effecitve to give u11u_11 the incentive to ensure to his/her engagement to leave the group

Challenges and Contributions

  • First proposal to handle the collapsed k-core problem
  • Resort to greedy heuristics where the best vertex is obtained in each iteration
  • Through theoretical analyses, significantly reduce the number of candidate vertices to speed up the computation

Preliminaries

Notation

k-core

  • The k-core of a graph can be obtained with O(m)O(m)
  • The value of kk is determined by users based on their requirement for cohesiveness
    The resulting k-core will be more cohesive if the kk value becomes larger

collapsed k-core

  • In addition to the deletion of the collapsed vertices in AA, more vertices in Ck(G)C_k(G) might be deleted due to the contagious nature of the k-core
  • These vertices are called followers of the collapsed vertices AA
  • The size of the followers reflects the effectiveness of the collapsed vertices

Problem Statement

  • Given a graph GG, a degree constraint kk and a budget bb,
    Find a set AA of bb collapsed vertices in GG so that
    the size of the resulting collapsed k-core, Ck(GA)C_k(G_A) is minimized,
    that is, F(A,G)F(A,G) is maximized

Complexity

Theorem 1. The collapsed k-core problem is NP-hard for any kk

k=1k=1

  • Reduce Collapsed 1-core Problem to Maximum Independent Set Problem
  • Maximum Independent Set
    • 그래프에서 서로 연결되지 않은 정점들 중 최대한 많은 수를 찾는 문제
    • 임의의 두 정점 u,vSu,v \in S 에 대해 (u,v)E(u,v) \notin E 이어야함
  • Reduction
    1. 1-core 에서 임의의 정점 vv 를 제거하려면 vv 의 모든 이웃 제거 필요
    • \rightarrow vv의 차수 = 0
    1. SS 가 Independent Set 이라면
    • SS 내의 정점들은 서로 연결되어 있지 않음
    • = G\SG \backslash S 에는 모든 간선이 몰려 있음
    1. G\SG \backslash S 를 제거하면
    • SS 에 있는 모든 정점은 더 이상 연결된 간선이 없음 (\rightarrow 차수=0)
    • 따라서 1-core 에서 SS 에 속한 모든 정점이 탈락
    1. 즉, SS 를 살리고 G\SG \backslash S를 제거하면 전체 1-core 가 붕괴

k=2k=2

  • Reduce Collapsed 2-core Problem to Collapsed 1-core Problem
  • Reduction
    • Given graph G1G_1, construct another graph G2G_2
    • Each edge (v1,v2)(v_1,v_2) in G1G_1, add two virtual vertices ww and ww' and construct four edges in $G_2 : (v_1,w), (w,v_2), (v_1,w'), (w',v_2)
    • Do not need to include any virtual vertices in the optimal solution of collapsed 2-core
      because the influence of deleting a virtual vertex can be covered by deleting non-virtual vertices
    • Therefore, deletion of edge in G1G_1 is mapped to the deletion of four edges in G2G_2

k3k\geq 3

  • Reduce Collapsed k-core Problem to Maximum Coverage Problem
  • Maximum Coverage Problem
    • Finding at most bb sets to cover the largest number of elements, where bb is a given budget
  • Reduction
    • Instance of maximum coverage problem with ss sets T1,,TsT_1,\dots,T_s and tt elements {e1,,et}=1isTi\{e_1,\dots,e_t\}=\cup _{1 \leq i \leq s}T_i
    • The set of vertices in GG consists of M,V,PM,V,P
      • MM consists of (t+s)4(t+s)^4 vertices in which every pair of vertices in MM are adjacent
      • VV consists of ss vertices, each vertex viv_i corresponds to the set TiT_i
        For each vertex viv_i, add k+tTik+t-|T_i| edges from viv_i to k+tTik+t-|T_i| unique vertices in MM
        • Maximum Coverage 의 원소 eje_j 를 하나의 정점 vjv_j 로 대응
        • mim_i 제거의 결과로 차수가 kk 미만이 되면, follower 로 간주되어 k-core 에서 탈락
      • PP consists of tt parts, each part PiP_i corresponds to the element eie_i and PiP_i consists of ss vertices
        • For each PiP_i add s1s-1 edges, for each 1j<s1\leq j < s and each element eje_j
        • Add edge from pi,jp_{i,j} to pi,j+1p_{i,j+1}
        • For each set TiT_i and each element eje_j, if ejTie_j \in T_i, add an edge (vi,pj,i)(v_i, p_{j,i})
    • Key idea
      - Follower 수는 VV 로만 계산하면 됨
      - MM 은 제거되지 않음, 구조 유지
      - 모든 PiP_i 는 동일한 사이즈 유지
      - viv_i 제거하면 연결된 특정 PjP_j 가 함께 삭제

Theorem 2. Let f(A)=F(A)f(A)=|F(A)|. We have ff is monotone but not submodular for any kk

Monotonicity

  • if AA, then f(A)f(A)\text{if } A \sube A' \text{, then } f(A) \leq f(A')
  • if AA\text{if } A \sube A', AA' removes more vertices than AA
  • Thus f(A)f(A)f(A') \geq f(A) and ff is monotone

Submodularity

  • For two arbitrary collapsers sets AA and BB, f(AB)+f(AB)f(A)+f(B)f(A\cup B)+f(A\cap B) \leq f(A) + f(B)

Solution

Motivation

  • A straightforward solution is to enumerate all possible set AA with size bb
    Time complexity of O((nb)m)O(\begin{pmatrix} n \\ b \end{pmatrix} m) is cost-prohibitive
  • We only need to consider the vertices in Ck(GA)C_k(G_A) since all other vertices will be deleted by degree constraint
  • Thus, a greedy algorithm is O(bnm)O(bnm), nn and mm correspond to the number of candidate collapsers in each iteration and the cost of follower computation
  • The number of vertices in Ck(GA)C_k(G_A) at Line 3 is considerably large

Reducing Candidate Collapsers

Two pruning rules to find the vertex with the largest number of followers

  • Only vertices with degree kk in k-core and their neighbors in k-core can have followers
  • PP denotes the vertices in k-core of GG with degree kk
  • TT : CkC_k 에 있고 이웃 중 적어도 하나가 PP 에 속한 정점들

Theorem 3. Given a graph and the set PP, if a collapsed vertex xx has at least one follower, xx is from TT

  • PP : 차수가 kk 인 vertices set
  • TT : CkC_k 에 있고 이웃 중 적어도 하나가 PP 에 속한 정점들
  • Proof
    • If xG\Ck(G)x \in G \backslash C_k(G), x will be deleted in k-core computation, F(x)=0|F(x)|=0
    • If xCk(G)\Tx \in C_k(G) \backslash T, xx survived in k-core computation and for each xx's neighbor uu within Ck(G)C_k(G), deg(u,Ck(G))>kdeg(u, C_k(G)) > k since xTx\notin T

Theorem 4. Given two vertices xx and uu in graph, we have F(u)F(x)F(u) \sub F(x) if uF(x)u \in F(x)

  • Every vertex which is a follower of a vertex can be excluded from candidate collapsers
  • A vertex with more neighbors in the set PP is more promising because all its neighbors in PP will follow the vertex to be deleted

CKC Algorithm

  • To avoid the re-computation of PP and TT in the following iterations,
    update two sets at the end of each iteration
  • PP
    • P1P_1: vertices whose degrees are decreased to kk during the computation
    • P2P_2: vertices which are discarded during the computation
    • P=P(P1\P2)P=P \cup (P_1 \backslash P_2)
  • TT
    • Include new vertices in NB(P1)NB(P_1)
    • Delete vertices in NB(P2)NB(P_2)
  • Order the candidates by their number of neighbors in PP in each iteration to prune more candidate collapsers

Evaluation

Effectiveness

Baseline

  • Random
    • Randomly chooose collapsers in k-core
  • Degree
    • Choose collapsers in the candidate set TT with the largest degrees

Analysis

  • Degree based approach is outperformed by CKC with a big margin
    • This Implies that it is not effective to find collapsers simply based on degree information

Efficiency

Evaluation of Individual Techniques

  • Baseline+ represents Baseline equipped with Theorem 3
  • Theorem 3, 4 reduces the number of candidate collapsers

Performance Evaluation

  • (a) and (b)
    • CKC is scalable to the growth of the network size
  • (c) and (d)
    • CKC is scalable to the different kk and bb

0개의 댓글