Multimedia & Internetworking Research Group
University of Oregon




The ION P2P Project: Empirical Characterizations of P2P Systems


The ION P2P Project aims to improve our understanding of Peer-to-Peer systems by answering two fundamental questions:

  • How do we collect accurate measurements from P2P systems?
  • What are useful models to characterize their components?

To answer these questions, we pursue the following goals: (i) develop and verify accurate measurement techniques, (ii) present empirical results that can be used to evaluate models and improve our understanding of existing systems, and (iii) suggest useful models based on these empirical observations.

About Peer-to-Peer

Peer-to-peer systems are becoming increasingly popular, with millions of simultaneous users and a wide range of applications, from file-sharing programs like LimeWire and eMule to Internet telephony services such as Skype. Understanding existing systems and devising new P2P techniques both rely on access to representative models derived from empirical observations of existing systems. However, the large scale and dynamic nature of P2P systems make capturing accurate measurements challenging.

While some prior studies have attempted to characterize different aspects of P2P systems, they have not taken the first step of critically examining their measurement tools (i.e., answering the first question), leading to conflicting results or conclusions based on measurement artifacts (e.g., power-law degree distributions may be the result of measurement error).

Developing Accurate Measurement Techniques

There are two basic approaches to collecting measurements from P2P systems, each with advantages and disadvantages. The first approach is to capture a global view by collecting data about every peer in the system. The advantage of this approach is that all of the information is available. The typical problem is that the state of the system changes while the measurement tool is still communicating with the peers, leading to a distorted view of the system: the longer the tool takes to capture the global view, the more distorted the data. Capturing a global view also does not scale well. As P2P systems grow larger, capturing a global view becomes more time-consuming, leading to greater distortion. To be practical, this approach requires an exceptionally fast tool that can gather data from a large number of peers very quickly.
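The relationship between crawl duration and distortion can be illustrated with a toy simulation. The sketch below is not the project's measurement tool; the function name `crawl_distortion`, the churn model (each peer independently departs with a fixed probability per time step and is replaced by a new peer), and all parameter values are assumptions chosen only for illustration.

```python
import random

def crawl_distortion(num_peers, peers_per_step, churn_prob, seed=0):
    """Fraction of peers recorded by a crawl that have already departed
    by the time the crawl finishes (hypothetical toy model)."""
    rng = random.Random(seed)
    online = set(range(num_peers))       # peers currently in the system
    next_id = num_peers                  # id given to the next arriving peer
    to_visit = list(range(num_peers))    # simplification: targets known up front
    recorded = set()
    while to_visit:
        # The crawler contacts one batch of peers per time step.
        batch, to_visit = to_visit[:peers_per_step], to_visit[peers_per_step:]
        recorded.update(p for p in batch if p in online)
        # Meanwhile the system keeps changing: some peers leave, new ones join.
        departed = {p for p in online if rng.random() < churn_prob}
        online -= departed
        online.update(range(next_id, next_id + len(departed)))
        next_id += len(departed)
    stale = recorded - online            # recorded peers that are gone at the end
    return len(stale) / len(recorded)

slow = crawl_distortion(num_peers=5000, peers_per_step=50, churn_prob=0.01)
fast = crawl_distortion(num_peers=5000, peers_per_step=1000, churn_prob=0.01)
print(f"slow crawl: {slow:.1%} stale, fast crawl: {fast:.1%} stale")
```

In this model the slower crawler ends up with a noticeably larger share of stale entries. The model also ignores peers that arrive during the crawl and are never discovered, which would add further distortion.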

The second approach is to collect local samples. The advantage of this approach is that it scales well: the Law of Large Numbers tells us that the average of a large number of independent, unbiased samples will closely approximate the true average, regardless of the population size. One disadvantage of sampling is that samples cannot easily be used to examine properties that are fundamentally global in nature. For example, we cannot compute the diameter of a graph from local observations at several peers. More importantly, if the collected samples are biased in some way, the resulting data may lead us to incorrect conclusions. For this reason, the sampling tool must be carefully designed to avoid bias, and it must be validated.
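Both points about sampling, scalability and sensitivity to bias, can be seen in a small numerical sketch. The peer degrees below are synthetic (uniform integers from 1 to 10), and the biased sampler is a hypothetical one that selects a peer with probability proportional to its degree, as might happen if a tool is more likely to reach well-connected peers. Neither is taken from the project's data or tools.

```python
import random

def uniform_sample_mean(population, k, rng):
    """Mean of k values drawn uniformly at random, with replacement."""
    return sum(rng.choice(population) for _ in range(k)) / k

def size_biased_sample_mean(population, k, rng):
    """Mean of k values drawn with probability proportional to the value,
    mimicking a sampler that over-selects high-degree peers (assumption)."""
    picks = rng.choices(population, weights=population, k=k)
    return sum(picks) / k

rng = random.Random(0)
results = {}
for size in (10_000, 200_000):
    # Synthetic per-peer degrees; the same sample size is used for both populations.
    degrees = [rng.randint(1, 10) for _ in range(size)]
    true_mean = sum(degrees) / size
    results[size] = (
        true_mean,
        uniform_sample_mean(degrees, 2000, rng),
        size_biased_sample_mean(degrees, 2000, rng),
    )

for size, (true_mean, unbiased, biased) in results.items():
    print(f"N={size}: true={true_mean:.2f} uniform={unbiased:.2f} biased={biased:.2f}")
```

With the same 2,000 samples, the uniform estimate stays close to the true mean for both population sizes, while the degree-biased estimate is consistently too high no matter how many samples are drawn.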

Characterizing Properties

Systematically tackling the problem of characterizing P2P systems requires a structured organization of the different components. At the most basic level, a P2P system consists of a set of connected peers, which we can view as a graph with the peers as vertices and the connections as edges. One fundamental way to divide the problem space is into properties of the peers versus properties of the way peers are connected. Another fundamental division is between the state of the system at a given point in time (static properties) and the way the system evolves over time (dynamic properties). The table below presents an overview of several interesting properties categorized by whether they are static or dynamic, and whether they are peer properties or connectivity properties.

                      Peer Properties    Connectivity Properties
Static Properties
Dynamic Properties
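The four categories can be made concrete with a minimal sketch that treats each snapshot of the system as a graph. The two snapshots, the `files_shared` attribute, and the specific properties computed here are hypothetical illustrations of each category and are not necessarily the properties the project studies.

```python
from collections import Counter

# Two hypothetical snapshots of an overlay: peer -> set of neighbors.
snapshot_t0 = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
snapshot_t1 = {
    "a": {"c"},
    "c": {"a", "d", "e"},
    "d": {"c"},
    "e": {"c"},
}
# Hypothetical per-peer attribute observed alongside the snapshots.
files_shared = {"a": 120, "b": 0, "c": 35, "d": 7, "e": 0}

def free_rider_fraction(graph, attribute):
    """Static peer property: fraction of peers in a snapshot sharing no files."""
    return sum(1 for peer in graph if attribute[peer] == 0) / len(graph)

def degree_distribution(graph):
    """Static connectivity property: number of peers having each degree."""
    return Counter(len(neighbors) for neighbors in graph.values())

def edges(graph):
    """Undirected edge set of a snapshot."""
    return {frozenset((u, v)) for u, nbrs in graph.items() for v in nbrs}

def peer_churn(before, after):
    """Dynamic peer property: peers that departed and arrived between snapshots."""
    return set(before) - set(after), set(after) - set(before)

def edge_churn(before, after):
    """Dynamic connectivity property: edges removed and added between snapshots."""
    return edges(before) - edges(after), edges(after) - edges(before)

print(free_rider_fraction(snapshot_t0, files_shared))
print(degree_distribution(snapshot_t0))
print(peer_churn(snapshot_t0, snapshot_t1))
print(edge_churn(snapshot_t0, snapshot_t1))
```

The static functions need only one snapshot, while the dynamic ones compare two, which is why dynamic properties depend on snapshots that are both accurate and closely spaced in time.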

Acknowledgments

This material is based upon work supported in part by the National Science Foundation (NSF) under Grant No. Nets-NBD-0627202 and an unrestricted gift from Cisco Systems. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF or Cisco.