Over the last few years, Online Social Networks (OSNs) such as MySpace and
Facebook have attracted hundreds of millions of users and have been
responsible for a new wave of popular applications over the Internet. The
dramatic increase in the popularity of OSNs has prompted network
researchers and practitioners to examine the connectivity structure of
well-known OSNs and the growth of their user population over time.
OSNs are often represented as graphs in which nodes
correspond to users and links reflect the friendship or interactions among
users. Such graph representations can be annotated by various node or link
attributes and are usually derived from a snapshot that is obtained through
measurements of the system. These graph representations enable researchers
to characterize the connectivity of these systems using various techniques
from graph analysis and relying on an array of graph metrics such as degree
distribution, diameter, and clustering coefficient.
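As a minimal illustration of the metrics named above, the following sketch computes the degree distribution, the average clustering coefficient, and the diameter of a small undirected graph. The toy friendship graph and all function names are hypothetical and chosen only for illustration; they are not taken from the project's own tools.

```python
from collections import Counter, deque
from itertools import combinations

def build_graph(edges):
    """Return an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree_distribution(adj):
    """Map each degree value to the number of nodes with that degree."""
    return Counter(len(nbrs) for nbrs in adj.values())

def local_clustering(adj, node):
    """Fraction of pairs of a node's neighbors that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    linked = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return linked / (k * (k - 1) / 2)

def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)

def diameter(adj):
    """Longest shortest-path length; assumes the graph is connected."""
    best = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:  # breadth-first search from src
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best

# Hypothetical toy graph: a triangle (a, b, c) plus a pendant user d.
adj = build_graph([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
print(degree_distribution(adj), average_clustering(adj), diameter(adj))
```

On graphs with millions of users these exact computations become expensive, so in practice they would typically be approximated or run on samples.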
Previous studies on networked systems that have adopted this approach
typically suffer from three important shortcomings:
Distorted Data: Captured snapshots of large-scale networked systems are
likely to be distorted, which in turn leads to an inaccurate representation
of the system and thus to potentially incorrect results.
Indirect Characterization: Commonly used graph metrics or statistics
characterize the connectivity structure only indirectly and thus have only
limited use; one can produce a set of synthetic graphs that have the exact
same metrics or statistics but exhibit fundamentally different connectivity
structures.
Graph Dynamics: Evolution of OSNs is in general ignored despite the fact
that most real-world OSNs evolve over time, often at a fast rate.
In the absence of a more careful and meaningful characterization of graph
connectivity, it is difficult to examine the structural properties of these
systems, study their evolution over time, or compare the connectivity graphs
of different systems.
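The point about indirect characterization can be made concrete with a small sketch. It builds two hypothetical six-node graphs that share exactly the same degree distribution (every node has degree 2) yet differ in a basic structural property: one is a single connected ring, the other splits into two disjoint triangles. The graphs and function names are assumptions made for illustration, and the example covers only the degree distribution, not every metric mentioned above.

```python
def build_graph(edges):
    """Return an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree_sequence(adj):
    """Sorted list of node degrees."""
    return sorted(len(nbrs) for nbrs in adj.values())

def count_components(adj):
    """Number of connected components, found by depth-first search."""
    seen, components = set(), 0
    for start in adj:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return components

# Graph 1: a single ring 0-1-2-3-4-5-0.
ring = build_graph([(i, (i + 1) % 6) for i in range(6)])
# Graph 2: two disjoint triangles {0, 1, 2} and {3, 4, 5}.
two_triangles = build_graph([(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)])

print(degree_sequence(ring) == degree_sequence(two_triangles))  # same degrees
print(count_components(ring), count_components(two_triangles))  # different structure
```

A reader who sees only the degree distribution cannot tell these two graphs apart, which is the sense in which such a metric describes connectivity only indirectly.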
The goal of this multi-disciplinary project is to design, develop and
rigorously evaluate theoretically grounded techniques to accurately measure
and properly characterize the connectivity structure in large-scale and
dynamic networked systems with multi-attribute nodes or links. Furthermore,
we explore social and technical root-causes of observed properties.
Projects
Evolution of User Population & Their Activities in MySpace
While some empirical studies on Online Social Networks (OSNs) have examined
the growth of these systems, little is known about the patterns of decline
in user population or user activity (in terms of visiting their OSN account)
in large OSNs, mainly because capturing the required information is
challenging.
In this study, we examine the evolution of user population and user activity
in a popular OSN, namely MySpace. Leveraging more than 360K randomly sampled
profiles, we characterize both the pattern of departure and the level of
activity among MySpace users. Our main findings can be summarized as
follows: (i) A significant fraction of accounts have been deleted and a
large fraction of valid accounts have not been visited for more than three
months. (ii) One third of public accounts are owned by users who abandon
their accounts shortly after creation (i.e., tourists). We leverage this
information to estimate the account creation time of other users from their
user IDs. (iii) We demonstrate that the growth of allocated user IDs in
MySpace was exponential, followed by a sudden and significant slow-down in
April 2008 due to an increase in the popularity of Facebook. If such up- and
down-turns are symptomatic of OSNs, they raise the obvious question: What
are the main forces that enable some systems to compete in the OSN eco-system,
while others decline and ultimately die out? See our WOSN'09 paper for
more details.
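One way the ID-based estimate in finding (ii) could work is sketched below. It assumes that user IDs are allocated in increasing order over time and that a tourist's last login date is a good proxy for the account's creation date; the creation date of any other user is then interpolated linearly between the two nearest tourists by ID. The function name, the interpolation rule, and the sample data are hypothetical and are not taken from the paper.

```python
from bisect import bisect_left
from datetime import date

def estimate_creation_date(user_id, tourists):
    """Estimate a creation date from reference points.

    tourists: list of (user_id, last_login_date) pairs with unique IDs,
    sorted by user_id. Each last login stands in for the creation date.
    """
    ids = [uid for uid, _ in tourists]
    i = bisect_left(ids, user_id)
    if i == 0:                # at or before the first reference point
        return tourists[0][1]
    if i == len(ids):         # after the last reference point
        return tourists[-1][1]
    (id_lo, d_lo), (id_hi, d_hi) = tourists[i - 1], tourists[i]
    # Linear interpolation in ID space between the two neighboring tourists.
    frac = (user_id - id_lo) / (id_hi - id_lo)
    days = round(frac * (d_hi.toordinal() - d_lo.toordinal()))
    return date.fromordinal(d_lo.toordinal() + days)

# Hypothetical reference points, not real MySpace data.
tourists = [
    (1000, date(2006, 1, 1)),
    (2000, date(2006, 1, 11)),
    (4000, date(2006, 1, 21)),
]
print(estimate_creation_date(1500, tourists))
print(estimate_creation_date(3000, tourists))
```

The denser the set of sampled tourists, the smaller the interpolation error should be, since each gap in ID space then spans less time.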
Characterizing Fan-Owner Interactions in Flickr
Most of the existing literature on empirical studies of Online Social
Networks (OSNs) has focused on characterizing and modeling the structure of
their inferred friendship graphs. However, the friendship graph of an OSN
does not demonstrate what fraction of its users actively interact with other
users, how these users interact, and how these active users and their
interactions evolve over time.
In this study, we characterize indirect fan-owner interactions through
photos among users in a large photo-sharing OSN, namely Flickr. Our results
show that a very small fraction of users in the main component of the
friendship graph is responsible for the vast majority of fan-owner
interactions; moreover, these interactions involve only a small fraction of
photos in Flickr. We also characterize some of the temporal properties of
fan arrival. For example, we show that there is no strong correlation
between age and popularity of a photo and that most photos gain a majority
of their fans during the first week after their posting. Overall, our
findings provide new insights into the fan-owner interactions among Flickr
users. See our WOSN'09 paper for more details.
Unbiased Sampling of User Properties
Reliable characterization of OSNs requires capturing accurate snapshots
of their desired properties. OSN owners are often not willing to share this
information due to privacy concerns. Given the large size and dynamic
nature of OSNs, it is often difficult to capture accurate snapshots of the
system. Sampling is the most promising approach to characterize "user
properties" without capturing complete snapshots of the system. In this
project, we investigate the accuracy of sampling techniques that use
variations of random walks, namely Metropolized Random Walk (MRW) Sampling
and Respondent Driven Sampling (RDS), over graphs that OSNs are likely to
exhibit.
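The two techniques can be sketched in their textbook form as follows. A plain random walk visits a node with probability proportional to its degree, so both methods must correct for that bias: MRW rejects some moves so that the walk itself samples nodes uniformly, while RDS keeps the plain walk and re-weights each sample by the inverse of its degree. The toy graph, the function names, and the step counts are assumptions made for illustration and do not reproduce the project's experimental setup.

```python
import random

def mrw_sample(adj, start, steps, rng):
    """Metropolized random walk whose target distribution is uniform."""
    samples, u = [], start
    for _ in range(steps):
        v = rng.choice(adj[u])  # propose a uniformly chosen neighbor
        # Accept the move with probability min(1, deg(u) / deg(v)).
        if rng.random() < len(adj[u]) / len(adj[v]):
            u = v
        samples.append(u)       # a rejected move re-samples the current node
    return samples

def rds_estimate(adj, start, steps, value, rng):
    """Plain random walk with samples re-weighted by 1 / degree."""
    num = den = 0.0
    u = start
    for _ in range(steps):
        u = rng.choice(adj[u])
        w = 1.0 / len(adj[u])   # undo the walk's bias toward high degrees
        num += w * value(u)
        den += w
    return num / den

# Hypothetical toy graph: hub 0 linked to users 1..5, plus the edge 1-2.
# It has 6 nodes and 6 edges, so the true mean degree is 2.0.
adj = {0: [1, 2, 3, 4, 5], 1: [0, 2], 2: [0, 1], 3: [0], 4: [0], 5: [0]}

def degree(u):
    return len(adj[u])

rng = random.Random(0)
mrw_nodes = mrw_sample(adj, 0, 50000, rng)
mrw_mean_degree = sum(degree(u) for u in mrw_nodes) / len(mrw_nodes)
rds_mean_degree = rds_estimate(adj, 0, 50000, degree, rng)
print(mrw_mean_degree, rds_mean_degree)
```

Both estimates should land close to the true mean degree of 2.0 on this graph, whereas the unweighted average over a plain random walk would be pulled upward by repeated visits to the hub.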
Sizing-up Online Social Networks
R. Rejaie, M. Torkjazi, M. Valafar, and W. Willinger,
IEEE Network Special Issue on Online Social Networks, Volume 24, Number 5,
September/October 2010.
This material is based upon work supported by the National Science
Foundation under Grant No. 0917381. Any opinions, findings, and conclusions
or recommendations expressed in this material are those of the author(s) and
do not necessarily reflect the views of the National Science Foundation.