Community Structure Discovery in Facebook

E Ferrara.
International Journal of Social Network Mining, 1(1):67–90 (2012).

In this paper I present an analysis of the community structure of Facebook.

Data were collected directly from the Facebook social network using two different sampling techniques: uniform sampling and breadth-first search (BFS) sampling.
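The two sampling strategies can be illustrated on a toy graph. The following is a minimal sketch, not the crawler used in the paper: the adjacency dictionary `adj`, the function names, and the ID space size are hypothetical, and uniform sampling is approximated here by rejection sampling over an assumed integer ID space.

```python
import random
from collections import deque

def bfs_sample(adj, seed, max_nodes):
    """Collect up to max_nodes nodes in breadth-first order starting from seed."""
    visited = [seed]
    seen = {seed}
    queue = deque([seed])
    while queue and len(visited) < max_nodes:
        node = queue.popleft()
        for nb in adj.get(node, ()):
            if nb not in seen:
                seen.add(nb)
                visited.append(nb)
                queue.append(nb)
                if len(visited) >= max_nodes:
                    break
    return visited

def uniform_sample(existing_ids, id_space_size, n, rng):
    """Rejection sampling: draw IDs uniformly from the ID space, keep those that exist."""
    if n > len(existing_ids):
        raise ValueError("cannot sample more nodes than exist")
    sample = set()
    while len(sample) < n:
        candidate = rng.randrange(id_space_size)
        if candidate in existing_ids:
            sample.add(candidate)
    return sample

# Toy undirected graph (hypothetical data, for illustration only).
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
bfs_nodes = bfs_sample(adj, seed=0, max_nodes=4)
uni_nodes = uniform_sample(set(adj), id_space_size=100, n=3, rng=random.Random(0))
```

BFS explores the neighborhood of the seed first, so it tends to over-represent well-connected regions, whereas rejection sampling picks each existing ID with equal probability regardless of its position in the graph.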

Once the datasets were obtained, I uncovered the community structure of Facebook using two computationally efficient algorithms well suited to large-scale community detection: the “Label Propagation Algorithm” (LPA) and the “Fast Network Community Algorithm” (FNCA).
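To give a feel for how label propagation works, here is a minimal sketch of the general idea: every node starts with its own label and repeatedly adopts the label most common among its neighbors. This is an illustrative, assumed variant (asynchronous updates, random tie-breaking, keeping the current label when it is already among the most frequent), not the implementation used in the paper; the function name and the toy graph are hypothetical.

```python
import random
from collections import Counter

def label_propagation(adj, rng, max_sweeps=100):
    """Asynchronous label propagation on an undirected graph given as {node: [neighbors]}."""
    labels = {node: node for node in adj}  # each node starts in its own community
    nodes = list(adj)
    for _ in range(max_sweeps):
        rng.shuffle(nodes)  # visit nodes in a fresh random order each sweep
        changed = False
        for node in nodes:
            if not adj[node]:
                continue  # isolated nodes keep their own label
            counts = Counter(labels[nb] for nb in adj[node])
            best = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == best]
            if labels[node] in candidates:
                continue  # current label is already a majority label
            labels[node] = rng.choice(candidates)  # break ties at random
            changed = True
        if not changed:
            break  # every node agrees with a majority of its neighbors
    return labels

# Two disconnected triangles (hypothetical toy data).
toy = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
toy_labels = label_propagation(toy, random.Random(42))
```

Each sweep only touches a node's immediate neighbors, so the cost per sweep grows roughly with the number of edges, which is what makes this family of methods attractive for large graphs.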

The results were compared to evaluate the bias introduced by both the sampling and the clustering processes, and thus to assess the validity of the analysis.
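One simple way to compare two sets of communities is to match each community in one partition with its most similar counterpart in the other and average the scores. The sketch below uses the Jaccard coefficient for this purpose; the choice of measure, the function names, and the toy partitions are assumptions made for illustration and are not necessarily the procedure followed in the paper.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| between two node sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def mean_best_match(partition_a, partition_b):
    """Average, over communities in A, of the best Jaccard match found in B."""
    scores = [max(jaccard(ca, cb) for cb in partition_b) for ca in partition_a]
    return sum(scores) / len(scores)

# Two hypothetical partitions of the same six nodes.
part_a = [{1, 2, 3}, {4, 5, 6}]
part_b = [{1, 2, 3}, {4, 5}, {6}]
similarity = mean_best_match(part_a, part_b)
```

Note that this score is not symmetric: swapping the two partitions can give a different value, so in practice one might report both directions or their average.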

The main findings of this work can be summarized as follows:

  • The distribution of community sizes in a large-scale real-world online social network such as Facebook follows a power law.
  • The community structure of Facebook is well-defined: the results obtained with different sampling methods and different community detection algorithms are highly overlapping and share a high degree of similarity.
  • Algorithms based on network modularity optimization (such as FNCA) introduce some bias in community detection because of the well-known resolution limit, which can cause small communities to be merged into larger ones.
  • Uniform sampling produces an unbiased sample of a large-scale network and faithfully reflects its community structure.
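The first finding, a power-law distribution of community sizes, can be checked roughly from a node-to-community assignment. The sketch below computes community sizes and a maximum-likelihood estimate of the exponent using the continuous approximation alpha = 1 + n / sum(ln(x_i / x_min)). This estimator, the function names, and the toy data are assumptions for illustration; the fitting procedure used in the paper is not stated here, and the continuous formula is only approximate for discrete sizes.

```python
import math
from collections import Counter

def community_sizes(labels):
    """Sizes of communities, largest first, from a {node: community_label} mapping."""
    return sorted(Counter(labels.values()).values(), reverse=True)

def powerlaw_alpha_mle(sizes, x_min=1):
    """Continuous-approximation MLE of the power-law exponent for sizes >= x_min."""
    tail = [s for s in sizes if s >= x_min]
    log_sum = sum(math.log(s / x_min) for s in tail)
    if log_sum == 0:
        raise ValueError("all sizes equal x_min; exponent is undefined")
    return 1 + len(tail) / log_sum

# Hypothetical community sizes, for illustration only.
example_sizes = [8, 4, 2, 2, 1, 1, 1, 1]
alpha = powerlaw_alpha_mle(example_sizes, x_min=1)
```

A single exponent estimate does not by itself establish that the data follow a power law; a goodness-of-fit check or a log-log plot of the size distribution would normally accompany it.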