Multimedia & Internetworking Research Group
University of Oregon





iOn-OSN: Measurement & Characterization of Online Social Networks


Overview

Over the last few years, Online Social Networks (OSNs) such as MySpace and Facebook have attracted hundreds of millions of users and have been responsible for a new wave of popular applications over the Internet. The dramatic increase in the popularity of OSNs has prompted network researchers and practitioners to examine the connectivity structure of well-known OSNs and the growth of their user population over time.

Online Social Networks (OSNs) are often represented as graphs where nodes correspond to users and links reflect the friendship or interactions among users. Such graph representations can be annotated with various node or link attributes and are usually derived from a snapshot that is obtained through measurements of the system. These graph representations enable researchers to characterize the connectivity of these systems using various techniques from graph analysis, relying on an array of graph metrics such as degree distribution, diameter, and clustering coefficient.
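As a rough illustration of the three metrics named above, the following sketch computes them on a tiny, hypothetical friendship graph stored as an adjacency dictionary. The graph, the function names, and the use of plain breadth-first search are all assumptions made for this example; they are not the tools used in the project.

```python
from collections import Counter, deque


def degree_distribution(adj):
    """Map each degree value to the number of nodes having that degree."""
    return Counter(len(neighbors) for neighbors in adj.values())


def local_clustering(adj, node):
    """Fraction of pairs of a node's neighbors that are themselves linked."""
    neighbors = list(adj[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbors[j] in adj[neighbors[i]]
    )
    return 2.0 * links / (k * (k - 1))


def eccentricity(adj, source):
    """Longest shortest-path distance from source (assumes a connected graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def diameter(adj):
    """Largest eccentricity over all nodes."""
    return max(eccentricity(adj, node) for node in adj)


# Hypothetical toy graph: a triangle of users a, b, c plus a pendant user d.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

print(degree_distribution(toy))
print(local_clustering(toy, "c"))
print(diameter(toy))
```

The all-pairs breadth-first search used for the diameter is only practical on small graphs; on a graph with millions of users one would more likely estimate it from a sample of source nodes.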

Previous studies on networked systems that have adopted this approach typically suffer from three important shortcomings:

  • Distorted Data: Captured snapshots of large-scale networked systems are likely to be distorted, which in turn leads to an inaccurate representation of the system and thus to potentially incorrect results.
  • Indirect Characterization: Commonly used graph metrics or statistics characterize the connectivity structure only indirectly and thus have limited use; one can produce a set of synthetic graphs that have exactly the same metrics or statistics but exhibit fundamentally different connectivity structures.
  • Graph Dynamics: Evolution of OSNs is in general ignored despite the fact that most real-world OSNs evolve over time, often at a fast rate. In the absence of a more careful and meaningful characterization of graph connectivity, it is difficult to examine the structural properties of these systems, study their evolution over time, or compare the connectivity graphs of different systems.
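The second shortcoming can be made concrete with a very small example. The sketch below builds two hypothetical six-node graphs that share the same degree sequence (every node has degree 2) yet differ in structure: one is a single ring, the other is two disconnected triangles. The graphs and helper names are assumptions chosen for illustration, and the example covers only the degree sequence, not every metric one might compute.

```python
def degree_sequence(edges, nodes):
    """Sorted list of node degrees for an undirected edge list."""
    degree = {n: 0 for n in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(degree.values())


def connected_components(edges, nodes):
    """Number of connected components, found with a depth-first traversal."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen = set()
    count = 0
    for start in nodes:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return count


nodes = list(range(6))
ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
two_triangles = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]

# Identical degree sequences ...
print(degree_sequence(ring, nodes) == degree_sequence(two_triangles, nodes))
# ... but one graph is connected and the other is not.
print(connected_components(ring, nodes), connected_components(two_triangles, nodes))
```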

The goal of this multi-disciplinary project is to design, develop, and rigorously evaluate theoretically grounded techniques to accurately measure and properly characterize the connectivity structure of large-scale, dynamic networked systems with multi-attribute nodes or links. Furthermore, we explore the social and technical root causes of observed properties.




Projects

Evolution of User Population & Their Activities in MySpace

While some empirical studies on Online Social Networks (OSNs) have examined the growth of these systems, little is known about the patterns of decline in user population or user activity (in terms of visiting their OSN account) in large OSNs, mainly because capturing the required information is challenging.

In this study, we examine the evolution of user population and user activity in a popular OSN, namely MySpace. Leveraging more than 360K randomly sampled profiles, we characterize both the pattern of departure and the level of activity among MySpace users. Our main findings can be summarized as follows: (i) A significant fraction of accounts have been deleted, and a large fraction of valid accounts have not been visited for more than three months. (ii) One third of public accounts are owned by users who abandon their accounts shortly after creation (i.e., tourists). We leverage this information to estimate the account creation time of other users from their user IDs. (iii) We demonstrate that the growth of allocated user IDs in MySpace was exponential, followed by a sudden and significant slow-down in April 2008 due to an increase in the popularity of Facebook. If such up- and down-turns are symptomatic of OSNs, they raise the obvious question: What are the main forces that enable some systems to compete in the OSN eco-system, while others decline and ultimately die out? See our WOSN'09 paper for more details.
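One way to picture the ID-based estimate in finding (ii) is sketched below. It assumes that user IDs are allocated in increasing order over time and that each tourist account supplies an (ID, approximate creation time) anchor, since a tourist's last visit falls shortly after creation. The linear interpolation between neighboring anchors, the function name, and the sample values are assumptions made for this example and may differ from the method in the paper.

```python
from bisect import bisect_left


def estimate_creation_time(user_id, anchors):
    """Interpolate a creation time for user_id from sorted (id, time) anchors.

    anchors must be sorted by ID with distinct IDs. Returns None when user_id
    lies outside the anchored ID range, rather than extrapolating.
    """
    ids = [anchor_id for anchor_id, _ in anchors]
    i = bisect_left(ids, user_id)
    if i < len(ids) and ids[i] == user_id:
        return anchors[i][1]
    if i == 0 or i == len(ids):
        return None
    (id_lo, t_lo), (id_hi, t_hi) = anchors[i - 1], anchors[i]
    fraction = (user_id - id_lo) / (id_hi - id_lo)
    return t_lo + fraction * (t_hi - t_lo)


# Hypothetical anchors: (user ID, days since an arbitrary reference date).
tourist_anchors = [(1000, 0.0), (2000, 100.0), (4000, 150.0)]

print(estimate_creation_time(1500, tourist_anchors))
print(estimate_creation_time(3000, tourist_anchors))
print(estimate_creation_time(500, tourist_anchors))
```

With many anchors spread across the ID space, the same anchor list also gives a view of how quickly IDs were allocated over time.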


Characterizing Fan-Owner Interactions in Flickr

Most of the existing literature on empirical studies of Online Social Networks (OSNs) has focused on characterizing and modeling the structure of their inferred friendship graphs. However, the friendship graph of an OSN does not reveal what fraction of its users actively interact with other users, how these users interact, or how these active users and their interactions evolve over time.

In this study, we characterize indirect fan-owner interactions through photos among users in a large photo-sharing OSN, namely Flickr. Our results show that a very small fraction of users in the main component of the friendship graph is responsible for the vast majority of fan-owner interactions; moreover, these interactions involve only a small fraction of photos in Flickr. We also characterize some of the temporal properties of fan arrival. For example, we show that there is no strong correlation between the age and popularity of a photo and that most photos gain a majority of their fans during the first week after their posting. Overall, our findings provide new insights into the fan-owner interactions among Flickr users. See our WOSN'09 paper for more details.
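A statement such as "a very small fraction of users is responsible for the vast majority of interactions" can be quantified in several ways. The sketch below shows one hypothetical measure: the smallest fraction of users, ranked by interaction count, whose interactions add up to a chosen share of the total. The function name, the share parameter, and the counts are assumptions for illustration only and are not data from the study.

```python
def top_user_fraction(interactions_per_user, share=0.9):
    """Smallest fraction of users that accounts for `share` of all interactions."""
    counts = sorted(interactions_per_user.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    running = 0
    for rank, count in enumerate(counts, start=1):
        running += count
        if running >= share * total:
            return rank / len(counts)
    return 1.0


# Hypothetical per-user counts of fan-owner interactions.
toy_counts = {"u1": 92, "u2": 4, "u3": 2, "u4": 1, "u5": 1}

print(top_user_fraction(toy_counts, share=0.9))
print(top_user_fraction(toy_counts, share=0.95))
```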


Unbiased Sampling of User Properties

Reliable characterization of OSNs requires capturing accurate snapshots of their desired properties. OSN owners are often not willing to share this information due to privacy concerns. Given the large size and dynamic nature of OSNs, it is often difficult to capture accurate snapshots of the system. Sampling is the most promising approach to characterize "user properties" without capturing complete snapshots of the system. In this project, we investigate the accuracy of sampling techniques that use some variations of random walk, namely Metropolized Random Walk (MRW) Sampling and Respondent Driven Sampling (RDS), over graphs that OSNs are likely to exhibit.
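The two techniques can be sketched in a few lines under textbook assumptions: an undirected, connected graph and a walk that only ever sees the neighbor list of the node it is visiting. In the MRW sketch, a proposed move from u to a uniformly chosen neighbor v is accepted with probability min(1, deg(u)/deg(v)), which targets a uniform distribution over nodes. In the RDS sketch, samples are assumed to come from a plain random walk (which visits nodes in proportion to their degree) and are re-weighted by 1/degree. The function names, the toy graph, and the fixed seed are assumptions for this example, not the project's code.

```python
import random


def metropolized_random_walk(adj, start, steps, rng):
    """Return the nodes visited by a Metropolis-Hastings walk targeting uniform."""
    samples = []
    current = start
    for _ in range(steps):
        candidate = rng.choice(adj[current])
        accept = min(1.0, len(adj[current]) / len(adj[candidate]))
        if rng.random() < accept:
            current = candidate
        # A rejected proposal counts as another sample of the current node.
        samples.append(current)
    return samples


def rds_estimate(values, degrees):
    """Weighted mean of sampled values, weighting each sample by 1/degree."""
    weights = [1.0 / d for d in degrees]
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


# Hypothetical star-like graph: the hub is much better connected than the rest.
toy = {
    "hub": ["a", "b", "c", "d"],
    "a": ["hub", "b"],
    "b": ["hub", "a"],
    "c": ["hub"],
    "d": ["hub"],
}

walk = metropolized_random_walk(toy, "hub", 1000, random.Random(0))
print(len(walk), len(set(walk)))

# Estimating mean degree from degree-biased samples with degrees 1, 1 and 2.
print(rds_estimate([1, 1, 2], [1, 1, 2]))
```

Both approaches aim to correct the bias of a plain random walk toward high-degree users; MRW does so while walking, RDS does so afterwards when the estimate is computed.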





Project Members




Publications




Code and Data

Please contact Masoud Valafar for data and code.




Acknowledgment and Disclaimer

This material is based upon work supported by the National Science Foundation under Grant No. 0917381. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.