Donato Debora
Università di Roma - La Sapienza
Abstract
Since the inception of the World Wide Web, understanding its topological structure has been a major research challenge. Some Web properties have been observed to differ from sample to sample and can be biased by factors such as the domain analyzed, the crawl strategy used, and the initial set of pages. Sampling methods allow researchers to extract small but representative samples of Web pages. In this talk we present a comparative study of several sampling methods, carried out to evaluate the effectiveness of each and to estimate the fraction of nodes that must be picked, under each scheme, to obtain a set of measures that resemble those computed over the whole starting graph.