Because successful peer-to-peer systems are so large, capturing the
entire system is often prohibitively expensive. Sampling is a
natural way to work around this difficulty. However,
capturing unbiased samples is challenging in peer-to-peer systems.
The non-uniform degree distribution and skewed session-time
distribution make the simplest, most obvious sampling techniques
unrepresentative. We call these the topological and temporal
causes of bias, respectively. We have developed techniques for
gathering unbiased samples and built them into a tool called ion-sampler.
In , we present some preliminary results in which we
demonstrate the heavy bias introduced by conventional techniques and
explore a few promising techniques. In this preliminary study, we
examine the temporal and topological causes of bias separately.
In , we present a modification of the
Metropolis--Hastings random walk technique and demonstrate its
effectiveness for collecting unbiased samples. We test it under a
wide range of simulation scenarios using a dynamic overlay
simulation. Building this technique into our ion-sampler tool,
we also conduct empirical validations.
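
As a rough illustration of the underlying idea, the sketch below shows the
textbook Metropolis--Hastings random walk on a static, undirected graph.
It is not the modified technique described above and not the ion-sampler
code: the function name mh_random_walk, the adjacency-dictionary input,
and the toy star graph are all assumptions made for this example. From
node x, the walk proposes a uniformly random neighbor y and accepts the
move with probability min(1, deg(x)/deg(y)), which removes the
degree-proportional bias of a plain random walk.

```python
import random
from collections import Counter

def mh_random_walk(adj, start, steps, rng):
    """Return the node reached after `steps` Metropolis-Hastings moves.

    adj maps each node to a list of its neighbors (undirected graph).
    This is the standard static-graph walk, used here only as a sketch.
    """
    x = start
    for _ in range(steps):
        y = rng.choice(adj[x])  # propose a uniformly random neighbor
        # Accept with probability min(1, deg(x)/deg(y)); otherwise stay at x.
        if rng.random() < len(adj[x]) / len(adj[y]):
            x = y
    return x

# Hypothetical toy graph: hub 0 connected to four leaves.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}

rng = random.Random(0)
walks = 5000
counts = Counter(mh_random_walk(star, 1, 50, rng) for _ in range(walks))
for node in sorted(star):
    print(node, counts[node] / walks)
```

On this toy graph each of the five nodes should be selected in roughly
one fifth of the walks, even though the hub has four times the degree of
a leaf. A real overlay also changes while the walk is in progress, which
this static sketch does not attempt to model.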
Source code for the ion-sampler tool is now available, as well
as the raw measurement data gathered for our IMC paper .