Full-Text Indexing and Information Retrieval in P2P Systems

Odysseas Papapetrou
Forschungszentrum L3S
Appelstrasse 9a
30167 Hannover

Abstract: Current distributed IR approaches are not readily applicable to P2P scenarios. The highly dynamic nature of these networks and the high cost of building and maintaining indices over distributed hash tables (DHTs) make full-text indexing and information processing difficult to scale to large P2P networks. My work will propose new approaches that enable distributed IR over P2P networks without limiting the network size or compromising retrieval quality. The basis of these approaches is a novel distributed clustering algorithm that groups the peers of a P2P network by content similarity. This clustering substantially reduces network cost and makes new families of distributed IR algorithms possible.

Keywords: Peer-to-peer, Distributed Information Retrieval

