Mapping the Gnutella Network:
Macroscopic Properties of Large-Scale Peer-to-Peer Systems
Matei Ripeanu, Ian Foster
{matei,foster}@cs.uchicago.edu
Abstract
Despite recent excitement generated by the peer-to-peer
(P2P) paradigm and the surprisingly rapid deployment
of some P2P applications, there are few quantitative
evaluations of P2P system behavior. The open
architecture, achieved scale, and self-organizing
structure of the Gnutella network make it an interesting
P2P architecture to study. Like most other P2P
applications, Gnutella builds, at the application level, a
virtual network with its own routing mechanisms. The
topology of this overlay network and the routing
mechanisms used have a significant influence on
application properties such as performance, reliability,
and scalability. We describe techniques to discover and
analyze Gnutella’s overlay network topology and
evaluate generated network traffic. Our major findings
are: (1) although Gnutella is not a pure power-law
network, its current configuration has the benefits and
drawbacks of a power-law structure, (2) we estimate
the aggregated volume of generated traffic, and (3) the
Gnutella virtual network topology does not match the
underlying Internet topology well, leading to
ineffective use of the physical networking
infrastructure. We believe that our findings as well as
our measurement and analysis techniques have broad
applicability to P2P systems and provide useful insights
into P2P system design tradeoffs.
1. Introduction
Unlike traditional distributed systems, P2P
networks aim to aggregate large numbers of
computers that join and leave the network
frequently. In pure P2P systems, individual
computers communicate directly with each other
and share information and resources without using
dedicated servers. A common characteristic of
this new breed of systems is that they build, at the
application level, a virtual network with its own
routing mechanisms. The topology of this overlay
network and the routing mechanisms used have a
significant impact on application properties such
as performance, reliability, scalability, and, in
some cases, anonymity. The topology also
determines the communication costs associated
with running the P2P application, both at
individual hosts and in the aggregate. Note that the
decentralized nature of pure P2P systems means
that these properties are emergent properties,
determined by entirely local decisions made by
individual resources, based only on local
information: we are dealing with a self-organized
network of independent entities.
These considerations motivate us to conduct a
macroscopic study of a popular P2P system:
Gnutella (described succinctly in Section 2). In
this study, we benefit from Gnutella’s large
existing user base and open architecture, and, in
effect, use the public Gnutella network as a large-scale,
if uncontrolled, testbed.
Our measurements and analysis of the Gnutella
network are driven by two primary questions
(Section 4). The first concerns its connectivity
structure. Recent research [1] shows that
networks as diverse as natural networks formed by
molecules in a cell, networks of people in a social
group, or the Internet, organize themselves so that
most nodes have few links while a tiny number of
nodes, called hubs, have a large number of links.
[2] finds that networks following this
organizational pattern (power-law networks)
display an unexpected degree of robustness: the
ability of their nodes to communicate is unaffected
even by extremely high failure rates. However,
random error tolerance comes at a high price:
these networks are vulnerable to attacks, i.e., to
the selection and removal of a few nodes that
provide most of the network’s connectivity. We
show that, although Gnutella is not a pure
power-law network, it preserves good fault
tolerance characteristics while being less
dependent than a pure power-law network on
highly connected nodes that are easy to single out
(and attack).
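The contrast between random failures and targeted attacks can be illustrated with a small simulation. The sketch below is not part of our measurement study and does not use Gnutella data: it assumes a synthetic graph grown by preferential attachment as a stand-in for a power-law network, and all function names and parameters (n, m, fraction, seed) are hypothetical choices made for illustration. It removes the same number of nodes either at random or in decreasing order of degree, and reports the share of surviving nodes that remain in the largest connected component.

```python
import random
from collections import deque


def preferential_attachment_graph(n, m, rng):
    """Grow a graph in which each new node links to m distinct existing
    nodes, chosen with probability proportional to their current degree.
    This is an assumed stand-in model, not the measured Gnutella topology."""
    adj = {i: set() for i in range(n)}
    endpoints = []  # every node appears here once per incident edge
    # Start from a fully connected core of m + 1 nodes.
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            adj[u].add(v)
            adj[v].add(u)
            endpoints += [u, v]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints += [new, t]
    return adj


def largest_component_fraction(adj, removed):
    """Share of surviving nodes that lie in the largest connected component."""
    alive = set(adj) - removed
    seen = set()
    best = 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:  # breadth-first search over surviving nodes only
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best / len(alive) if alive else 0.0


def compare_failure_and_attack(n=1000, m=2, fraction=0.2, seed=0):
    """Return (connectivity after random failures, after targeted attack)."""
    rng = random.Random(seed)
    adj = preferential_attachment_graph(n, m, rng)
    k = int(n * fraction)
    random_removed = set(rng.sample(sorted(adj), k))
    by_degree = sorted(adj, key=lambda u: len(adj[u]), reverse=True)
    hubs_removed = set(by_degree[:k])
    return (largest_component_fraction(adj, random_removed),
            largest_component_fraction(adj, hubs_removed))


if __name__ == "__main__":
    failure, attack = compare_failure_and_attack()
    print(f"random failures: {failure:.2f}, targeted attack: {attack:.2f}")
```

Under these assumptions, removing hubs should fragment the graph far more than removing the same number of randomly chosen nodes. The exact figures depend on the model and parameters and say nothing quantitative about Gnutella itself.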
The second question concerns how well (if at
all) the Gnutella virtual network topology maps to
the physical Internet infrastructure. There are two
reasons for analyzing this issue. First, it is a
question of crucial importance for Internet Service
Providers (ISPs): if the virtual topology does not
follow the physical infrastructure, then the
additional stress on the infrastructure and,
consequently, the costs for ISPs, are immense.