|
Luca Dall'Asta |
|
|
Università Parigi Sud |
Abstract
SMapping
the Internet generally consists in sampling the network from
a limited set of probes by using traceroute-like methods, a strategy akin
to the merging of different spanning trees to a set of destination. This
methodology has been argued to introduce uncontrolled sampling biases that
might produce statistical properties of the sampled graph which sharply
differ from the original ones.
Here
we explore these biases and provide a statistical analysis of their origin.
We derive a mean-field analytical approximation for the probability
of edge detection that exploits the role of the number of probes
and targets and allows us to relate the globaltopological properties of
the underlying network with the statistical accuracy of the sampled graph.
In particular we show that shortest path routed sampling allows a clear
discrimination of heavy-tailed underlying graphs.
We
complement the analytical discussion with a throughout numerical investigation
of simulated mapping strategies in different network models. We show
that sampled graphs provide a fair qualitative characterization of the
statistical properties of the original networks in a fair range of different
strategies and exploration parameters. The numerical study also allows
the identification of intervals of the exploration parameters that optimize
the fraction of nodes and edges discovered in the sampled graph.
This finding might hint the steps toward more efficient mapping
strategies.