Graph Mining

Link Prediction in Static and Dynamic Graphs

Contact Student: Jeyanthi Salem Narasimhan,


In simple terms, link prediction problem is about finding whether an edge or link might appear between two nodes of a network or in other words, it amounts to asking when two nodes of a network would communicate. The network under consideration could be a static one or dynamic. When the underlying graph is static, traditionally the problem is addressed with only the topology of the graph into account. In my work, I predominantly work with dynamic graphs, where a number of factors, both exogenous and endogenous to the network, contribute towards formation of links. Conventional way of addressing this problem takes snapshots of graphs over a period of time and tries to predict links in a future time. Out of many ways of analyzing this problem, my trials are inspired from concepts in Mathematical Sociology where the importance of macro-micro linkage is well studied and established. Since I am looking at evolution of networks in general, a domain independent modeling is the one to tackle this problem.

Currently, I am trying to model network evolution with stochastic processes with their dynamics studied using master equation and its variants. What makes such concepts interesting and important is that such concepts have been used to study evolving networks from diverse domains like chemistry (study of chemical reactions), biological networks (evolution of cells, molecular biology), economics (study of social consumption, group sentiments etc), statistical physics and many other physical systems. I think that link prediction among two nodes is a micro portion of the macro evolution of networks and hence if the modeling of network is successful, the link prediction becomes tractable. I aim to build a unified modeling framework which can be applied to study of network evolution from any domain with a little domain related knowledge. Modeling of network evolution is a hard problem because of the parameters involved in the stochastic dynamics and non-availability of closed-form solutions to the equations of interest.

Domains to which the problem is relevant: Link prediction has wide range of applicability. In addition to the physical systems mentioned in the second paragraph of the problem description, it can be used in the study of Internet and abstract networks like WWW, On-line Social Networks, Collaboration networks to mention few.

With the proliferation of social networking sites and some of them competing for the top position populationwise, it is worth noting that one of the key reasons for such success is the usage of link prediction for friends recommendation. 

Datasets (Not all are dynamic):

- SNAP (Stanford Network Analysis Platform)

- KDD cup Datasets -

- On-line Social Network @ Maxplanck -

- MEJ Newman -

- A-L Barabasi -

- Jon Kleinberg - (Scroll down for datasets)

- Pajek -

- A collection :

Relevant Publications:

- Link Prediction problem for social networks, Liben-Nowell et al,

- Evolution of the social network of Scientific collaborations, Barabasi,

- Hierarchical Structure and the prediction of missing links in networks, Clauset et al,

- Learning spectral graph transformations for link prediction, Kunegis et al,

- Link Prediction in relational data, Taskar et al,

- Temporality in Link Prediction: Understanding Social Complexity, Potgieter et al,

- Cold start link  prediction, Leroy et al,