Datasets

This page collects datasets acquired during my research activity.

These data may be used under two conditions:

  • The corresponding paper is cited (BibTeX is provided for this purpose).
  • The data are not redistributed without our prior consent.

Facebook Datasets

In August 2010, I collected two samples of the Facebook friendship graph, using two techniques:

  1. Breadth First Search (BFS) traversal algorithm

    The BFS sample contains about 7 million nodes and 12 million edges.

  2. Uniform (UNI) sampling approach (rejection sampling)

    The UNI sample contains about 7 million nodes and 7 million edges.
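
As a rough illustration of the two sampling ideas (not the crawler actually used to collect these data), the sketch below runs a budget-limited BFS on a toy in-memory graph and draws a uniform sample by rejection over a numeric ID space. The function names `bfs_sample` and `uni_sample`, the toy graph, and the ID range are all hypothetical and chosen only for the example.

```python
import random
from collections import deque


def bfs_sample(graph, seed, max_nodes):
    """Visit nodes breadth-first from `seed`, keeping at most `max_nodes` nodes.

    Returns the visited nodes and the undirected edges seen between them.
    """
    visited = {seed}
    queue = deque([seed])
    edges = set()
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in visited:
                if len(visited) >= max_nodes:
                    continue  # budget exhausted: ignore undiscovered nodes
                visited.add(neighbor)
                queue.append(neighbor)
            # Store each undirected edge once, smaller ID first.
            edges.add((min(node, neighbor), max(node, neighbor)))
    return visited, edges


def uni_sample(graph, id_space, n_samples, rng):
    """Rejection sampling: draw IDs uniformly from range(id_space) and
    keep only those that correspond to an existing node."""
    existing = sum(1 for node in graph if 0 <= node < id_space)
    if n_samples > existing:
        raise ValueError("not enough existing IDs in the ID space")
    sampled = set()
    while len(sampled) < n_samples:
        candidate = rng.randrange(id_space)
        if candidate in graph:  # reject IDs with no matching node
            sampled.add(candidate)
    return sampled


# Toy friendship graph (hypothetical): node ID -> list of friends.
toy_graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 7], 7: [3]}

bfs_nodes, bfs_edges = bfs_sample(toy_graph, seed=0, max_nodes=3)
uni_nodes = uni_sample(toy_graph, id_space=10, n_samples=3, rng=random.Random(0))
```

BFS tends to over-represent well-connected nodes near the seed, whereas rejection sampling picks every existing ID with equal probability, which is one reason to keep the two samples separate.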

Related Papers

The following papers are based on these datasets.
If you use these Facebook datasets, please cite the papers relevant to your work.

  1. S. Catanese, P. De Meo, E. Ferrara, and G. Fiumara.
    Analyzing the Facebook friendship graph.
    CEUR Workshop Proceedings 685:14-19 (MIFI ’10: 1st International Workshop on Mining the Future Internet), 2010.

    Useful links: PDF | BibTeX | Presentation | arXiv

  2. S. Catanese, P. De Meo, E. Ferrara, G. Fiumara, and A. Provetti.
    Extraction and analysis of Facebook friendship relations.
    Computational Social Networks: Mining and Visualization.
    Springer Verlag (in press).

    Useful links: PDF | BibTeX

  3. S. Catanese, P. De Meo, E. Ferrara, G. Fiumara, and A. Provetti.
    Crawling Facebook for social network analysis purposes.
    WIMS ’11: Proceedings of the International Conference on Web Intelligence, Mining and Semantics, 2011.

    Useful links: PDF | BibTeX | Presentation | ACM | arXiv

  4. E. Ferrara.
    Community structure discovery in Facebook.
    International Journal of Social Network Mining, 1(1):67-90, 2012.

    Useful links: PDF | BibTeX | Journal page

  5. E. Ferrara.
    A large-scale community structure analysis in Facebook.
    EPJ Data Science (under review).

    Useful links: PDF | arXiv

Download Facebook datasets

Data are provided in tab-separated “edge list” format.
User IDs are anonymized.
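
A minimal sketch of how such a file might be loaded, assuming one edge per line with two tab-separated user IDs, no header row, and undirected friendships. The exact layout is described in each sample's ReadMe, so these assumptions should be checked against it. The function name `load_edge_list` is hypothetical.

```python
import csv
from collections import defaultdict


def load_edge_list(path):
    """Read a tab-separated edge list into an undirected adjacency map.

    IDs are kept as strings, since the format of the anonymized IDs
    is an assumption here.
    """
    adjacency = defaultdict(set)
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 2:
                continue  # skip blank or malformed lines
            u, v = row[0], row[1]
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency
```

Calling `load_edge_list` on a downloaded sample returns a dictionary mapping each user ID to the set of its neighbours, so `len(adjacency)` gives the number of nodes read.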

BFS sample

ReadMe
Request BFS sample

UNI sample

ReadMe
Request UNI sample