Multimedia & Internetworking Research Group
University of Oregon

The Ion P2P Project: Empirical Characterizations of P2P Systems



Cruiser is a crawler that captures snapshots of P2P overlay topologies. Each snapshot is an undirected graph with peers as vertices and overlay links as edges. These snapshots can be used for two basic types of analysis:

Static Properties:
The snapshots provide a way to examine static graph properties of the overlay, such as the degree distribution, average path lengths, resilience, and the clustering coefficient.
Dynamic Properties:
The snapshots also provide a way to examine dynamic properties of the overlay. By running the crawler back-to-back and comparing consecutive snapshots, we can examine peer arrival and departure patterns (churn) and explore how the overlay topology evolves over time.
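As a rough illustration of both kinds of analysis, the sketch below computes a degree distribution, a per-peer clustering coefficient, and the arrivals and departures between two snapshots. The helper names and the two tiny example snapshots are hypothetical and chosen for illustration only; they are not part of Cruiser or its data format.

```python
from collections import Counter
from itertools import combinations


def to_adjacency(edges):
    """Build an undirected adjacency map from (peer, peer) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degree_distribution(adj):
    """Map each degree to the number of peers that have that degree."""
    return Counter(len(neighbors) for neighbors in adj.values())


def clustering_coefficient(adj, peer):
    """Fraction of a peer's neighbor pairs that are themselves linked."""
    neighbors = adj[peer]
    k = len(neighbors)
    if k < 2:
        return 0.0
    linked = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return linked / (k * (k - 1) / 2)


def churn(earlier, later):
    """Return (arrivals, departures) between two consecutive snapshots."""
    before, after = set(earlier), set(later)
    return after - before, before - after


# Two tiny hypothetical snapshots taken back-to-back.
snap1 = to_adjacency([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
snap2 = to_adjacency([("A", "B"), ("A", "C"), ("C", "E")])

degrees = degree_distribution(snap1)
arrivals, departures = churn(snap1, snap2)
```

In this example, peer D departs and peer E arrives between the two snapshots. On real snapshots with around a million peers, a dedicated graph library would likely be a better fit than plain dictionaries.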

Because the network is changing as the crawler runs, a certain level of distortion is introduced into the snapshots, analogous to the blur in a photograph taken with a long exposure. If the level of distortion is too high, erroneous conclusions may be drawn about the nature of the topology. As with any measurement, it is critical to understand the amount of error before drawing conclusions. Prior overlay topology studies, which took an hour or more to capture a snapshot, did not address this fundamental concern.

To achieve rapid crawls, Cruiser runs on a small server farm, with each server contacting hundreds of peers in parallel. In [1], we describe some of the details of Cruiser's implementation.
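The general idea of contacting many peers in parallel can be sketched as a breadth-first crawl driven by a thread pool. This is a minimal sketch and not Cruiser's actual implementation, which is described in [1]. The `OVERLAY` dictionary and the `query_neighbors` function are hypothetical stand-ins for real network requests.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory overlay standing in for the live network.
OVERLAY = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B", "E"],
    "E": ["D"],
}


def query_neighbors(peer):
    """Stand-in for a network request asking a peer for its neighbor list."""
    return OVERLAY.get(peer, [])


def crawl(seeds, max_workers=100):
    """Crawl outward from the seed peers, querying each wave in parallel."""
    seen = set(seeds)
    frontier = list(seeds)
    edges = set()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while frontier:
            # pool.map returns results in the same order as the frontier.
            results = pool.map(query_neighbors, frontier)
            next_frontier = []
            for peer, neighbors in zip(frontier, results):
                for neighbor in neighbors:
                    # Store each undirected link once, whichever end reports it.
                    edges.add(frozenset((peer, neighbor)))
                    if neighbor not in seen:
                        seen.add(neighbor)
                        next_frontier.append(neighbor)
            frontier = next_frontier
    return seen, edges


peers, links = crawl(["A"])
```

A production crawler would also need timeouts, handling for unreachable peers, and a way to spread work across several machines, none of which is shown here.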

Our initial version of Cruiser was tailored to the Gnutella network, where it can capture 1 million nodes in under 7 minutes. In [2], we develop techniques to evaluate the accuracy of P2P crawlers and show that the Gnutella snapshots captured by Cruiser have a level of distortion of around 4%. Additionally, we demonstrate how crawling the network too slowly can lead to erroneous conclusions.
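To make these figures concrete, the sketch below works out the crawl rate implied by the numbers above and shows one simple way to quantify how much two back-to-back snapshots differ. The difference measure (symmetric difference over union of the peer sets) is an illustrative assumption and is not necessarily the distortion metric defined in [2]. The example peer sets are made up.

```python
def snapshot_difference(first, second):
    """Share of peers seen in only one of two snapshots (illustrative metric)."""
    first, second = set(first), set(second)
    union = first | second
    if not union:
        return 0.0
    return len(first ^ second) / len(union)


# 1 million peers in under 7 minutes implies at least this many peers per second.
min_peers_per_second = 1_000_000 / (7 * 60)

# Hypothetical back-to-back snapshots: two peers leave and two peers join.
first_snapshot = range(1, 101)    # peers 1..100
second_snapshot = range(3, 103)   # peers 3..102
difference = snapshot_difference(first_snapshot, second_snapshot)
```

Here four peers appear in only one snapshot out of 102 seen overall, so the difference is just under 4%. The implied crawl rate is roughly 2,380 peers per second or more.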

More recently, we modified Cruiser to use a plug-in architecture. With a small system-specific plug-in, Cruiser can capture snapshots of other P2P systems. In addition to Gnutella, Cruiser can now crawl the Kad DHT network. Due to the large size and rich routing tables in Kad, Cruiser cannot capture the entire Kad topology in a reasonable amount of time. Therefore, we added a feature that allows Cruiser to focus on a particular zone within a DHT network, capturing the topology within a certain region of the DHT geometry [3].
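One way to picture a plug-in with zone restriction is sketched below. The `CrawlerPlugin` interface, the `ZonePlugin` class, and the `in_zone` helper are hypothetical and do not reflect Cruiser's actual API. The sketch also assumes that a zone is the set of identifiers sharing a common bit prefix and that identifiers are 128 bits wide; the example uses 8-bit identifiers for readability.

```python
from abc import ABC, abstractmethod


def in_zone(node_id, prefix, prefix_bits, id_bits=128):
    """True if the top prefix_bits bits of node_id equal prefix."""
    return node_id >> (id_bits - prefix_bits) == prefix


class CrawlerPlugin(ABC):
    """Hypothetical plug-in interface; not Cruiser's actual API."""

    @abstractmethod
    def neighbors(self, peer):
        """Return the peers that this peer reports as neighbors."""

    def in_scope(self, peer):
        """Whether the crawler should follow this peer. Default: always."""
        return True


class ZonePlugin(CrawlerPlugin):
    """Restricts a crawl to DHT identifiers that share a common prefix."""

    def __init__(self, routing_tables, prefix, prefix_bits, id_bits=128):
        # routing_tables is a stand-in for querying peers over the network.
        self.routing_tables = routing_tables
        self.prefix = prefix
        self.prefix_bits = prefix_bits
        self.id_bits = id_bits

    def neighbors(self, peer):
        return self.routing_tables.get(peer, [])

    def in_scope(self, peer):
        return in_zone(peer, self.prefix, self.prefix_bits, self.id_bits)


# Toy example with 8-bit identifiers and the 4-bit zone prefix 1010.
tables = {0b10101111: [0b10100001, 0b10111111]}
plugin = ZonePlugin(tables, prefix=0b1010, prefix_bits=4, id_bits=8)
followed = [p for p in plugin.neighbors(0b10101111) if plugin.in_scope(p)]
```

In this toy example only the neighbor 10100001 falls inside the zone, so a crawler using this plug-in would follow it and skip 10111111.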

[1] Daniel Stutzbach and Reza Rejaie, "Capturing Accurate Snapshots of the Gnutella Network", Global Internet Symposium, March 2005.
[2] Daniel Stutzbach and Reza Rejaie, "Evaluating the Accuracy of Captured Snapshots by Peer-to-Peer Crawlers", Passive & Active Measurement Workshop (PAM), Extended Abstract, March 2005. Expanded Tech Report version.
[3] Daniel Stutzbach and Reza Rejaie, "Improving Lookup Performance over a Widely-Deployed DHT", to appear at INFOCOM 2006.


Although Cruiser was originally designed to capture snapshots of the overlay topology, with some changes it can be adapted to capture other peer properties. File-Cruiser is a modified version of Cruiser that captures the list of files shared by each peer in Gnutella [4]. To our knowledge, this is the only tool that attempts to capture a complete list of all the files shared in a large P2P network.

[4] Shanyu Zhao, Daniel Stutzbach, and Reza Rejaie, "Characterizing Files in the Modern Gnutella Network: A Measurement Study", SPIE/ACM Multimedia Computing and Networking, San Jose, January 2006.