On the Usage of Global Document Occurrences in
Peer-to-Peer Information Systems

Odysseas Papapetrou, Sebastian Michel, Matthias Bender, Gerhard Weikum
Max-Planck Institut für Informatik
66123 Saarbrücken, Germany

{odysseas, smichel, mbender, weikum}@mpi-inf.mpg.de

Abstract: A number of approaches to query processing in Peer-to-Peer information systems efficiently retrieve relevant information from distributed peers. However, very few of them take the overlap between peers into account: because the most popular resources (e.g., documents or files) are often present at most of the peers, a large fraction of the documents eventually received by the query initiator are duplicates. We develop a technique based on the notion of global document occurrences (GDO) that, when processing a query, penalizes frequent documents increasingly as more and more peers contribute their local results. We argue that the additional effort to create and maintain the GDO information is reasonably low, as the necessary information can be piggybacked onto existing communication. Early experiments indicate that our approach significantly decreases the number of peers that have to be involved in a query to reach a given level of recall and thus decreases user-perceived latency and the waste of network resources.

