2008/02/21

On the Submodularity of Influence in Social Networks (Mossel, Roch; STOC 2007)

Why does viral marketing work? How can one start a blog roll? Who is the most influential agents in a social network? How can we maximize our influence without paying too many bloggers?

Given the social graph (or any other underlying graph structure, e.g., hyperlinks or feed abonnements), applying easy heuristics (high degree nodes, short average distance to the rest of the network) can identify important nodes.

The general threshold model by Kempe, Kleinberg, and Tardos states the following: every node has a flag stating whether it is activated or not; then, a node measures/asks/checks the values of its neighbors, applies its activation function, and then, if the computed value is higher than a certain threshold, it gets activated (and stays so). Threshold values are random, activation functions (from subsets of all nodes to the rational) are monotone increasing (no node has a negative effect) and submodular (the sum of the function values of two sets U, V is at least the sum of the function values of intersection and union, i.e., f(U)+f(V)>=f(U [or] V)+f(U [and] V) drawing a Venn diagram helps to understand that, function value is not increasing too much but increase is diminishing [if we could draw the graph of f, it was not convex and the second derivative f''<=0 - is that right?]).
Of course, even the best viral marketing would need to take into account that thresholds and functions might change with time and that some people might have a negative effect.

The paper deals with the fourth question from above, how to maximize the influence. Say influence works in rounds and starts with a set S of activated nodes. At some point the number of activated nodes does not change any more (in each round at least one node gets activated or the process stops - due to this model). The number of activated nodes sigma(S) depends on the number of initially activated nodes and on their locations (i.e., S). The algorithm (of a viral marketing tool or software) tries to find this initial set of minimal size. It has been shown that even approximation is NP hard (Kempe et al. 2003). However, if sigma was submodular too (the union of sets does not give a blast to your Guerilla marketing campagin), there is hope to find such an approximate set of fixed size.
Mossel and Roch show that sigma is submodular for this process.

Long story. And the proof is still to come.
Submodular means s(A)+s(B)>=s(A [or] B)+s(A [and] B) (variable names got changed, want to prove it for sigma (s for short here)). Coupling is a technique from statistics to analyze random variables, often used to analyze independent random variables coupled to dependent random variables, which are then analyzed instead.

tbctd

@inproceedings{MosselRoch07,
author = {Elchanan Mossel and
S{\'e}bastien Roch},
title = {On the submodularity of influence in social networks},
booktitle = {Proceedings of the 39th Annual ACM Symposium on Theory of
Computing, San Diego, California, USA, June 11-13, 2007},
year = {2007},
pages = {128--134}
}

No comments: