Figure 1: Structure of an example Chord-based distributed storage system.
machine is only occasionally available, they can offer to store
others’ data while they are up, in return for having their data
stored elsewhere when they are down. The data’s name can
serve as a key to identify the (live) Chord node responsible
for storing the data item at any given time. Many of the
same issues arise as in the Cooperative Mirroring applica-
tion, though the focus here is on availability rather than load
balance.
Distributed Indexes to support Gnutella- or Napster-like keyword
search. A key in this application could be derived from the
desired keywords, while values could be lists of machines
offering documents with those keywords.
Large-Scale Combinatorial Search, such as code breaking. In
this case keys are candidate solutions to the problem (such as
cryptographic keys); Chord maps these keys to the machines
responsible for testing them as solutions.
Figure 1 shows a possible three-layered software structure for a
cooperative mirror system. The highest layer would provide a file-
like interface to users, including user-friendly naming and authenti-
cation. This “file system” layer might implement named directories
and files, mapping operations on them to lower-level block opera-
tions. The next layer, a “block storage” layer, would implement
the block operations. It would take care of storage, caching, and
replication of blocks. The block storage layer would use Chord to
identify the node responsible for storing a block, and then talk to
the block storage server on that node to read or write the block.
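The division of labor between these layers might look roughly like the following Python sketch. It is illustrative only: the BlockStore class, the find_successor lookup call, and the block-server client methods (get_block, put_block) are hypothetical names standing in for whatever interfaces a real system would define, not APIs given in this paper.

```python
import hashlib

class BlockStore:
    """Sketch of a block storage layer that uses Chord to locate blocks."""

    def __init__(self, chord_node, connect):
        self.chord = chord_node      # local Chord instance used for lookups
        self.connect = connect      # returns a block-server client for a node

    def _key(self, block_id: str) -> int:
        # Map the block identifier into the Chord identifier space.
        digest = hashlib.sha1(block_id.encode()).digest()
        return int.from_bytes(digest, "big")

    def read(self, block_id: str) -> bytes:
        # Ask Chord which node is responsible, then fetch the block from it.
        node = self.chord.find_successor(self._key(block_id))
        return self.connect(node).get_block(block_id)

    def write(self, block_id: str, data: bytes) -> None:
        node = self.chord.find_successor(self._key(block_id))
        self.connect(node).put_block(block_id, data)
```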
4. The Base Chord Protocol
The Chord protocol specifies how to find the locations of keys,
how new nodes join the system, and how to recover from the failure
(or planned departure) of existing nodes. This section describes a
simplified version of the protocol that does not handle concurrent
joins or failures. Section 5 describes enhancements to the base pro-
tocol to handle concurrent joins and failures.
4.1 Overview
At its heart, Chord provides fast distributed computation of a
hash function mapping keys to nodes responsible for them. It uses
consistent hashing [11, 13], which has several good properties.
With high probability the hash function balances load (all nodes
receive roughly the same number of keys). Also with high prob-
ability, when an N^th node joins (or leaves) the network, only an O(1/N) fraction of the keys are moved to a different location; this is clearly the minimum necessary to maintain a balanced load.
Figure 2: An identifier circle consisting of the three nodes 0, 1,
and 3. In this example, key 1 is located at node 1, key 2 at node
3, and key 6 at node 0.
Chord improves the scalability of consistent hashing by avoid-
ing the requirement that every node know about every other node.
A Chord node needs only a small amount of “routing” informa-
tion about other nodes. Because this information is distributed, a
node resolves the hash function by communicating with a few other
nodes. In an N-node network, each node maintains information only about O(log N) other nodes, and a lookup requires O(log N) messages. Chord must update the routing information when a node joins or leaves the network; a join or leave requires O(log^2 N) messages.
4.2 Consistent Hashing
The consistent hash function assigns each node and key an m-bit identifier using a base hash function such as SHA-1 [9]. A node's
identifier is chosen by hashing the node’s IP address, while a key
identifier is produced by hashing the key. We will use the term
“key” to refer to both the original key and its image under the hash
function, as the meaning will be clear from context. Similarly, the
term “node” will refer to both the node and its identifier under the
hash function. The identifier length m must be large enough to
make the probability of two nodes or keys hashing to the same iden-
tifier negligible.
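As a concrete illustration, node and key identifiers might be derived as in the sketch below. The paper specifies SHA-1 over the node's IP address or the key itself; the particular value of m and the reduction of the 160-bit digest modulo 2^m are assumptions made here only to keep the example small.

```python
import hashlib

M = 6  # identifier length in bits; chosen small purely for illustration

def identifier(value: str, m: int = M) -> int:
    """Hash a node's IP address or a key name to an m-bit identifier."""
    digest = hashlib.sha1(value.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

node_id = identifier("192.168.0.7")   # node identifier from its IP address
key_id = identifier("my-file.txt")    # key identifier from the key itself
```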
Consistent hashing assigns keys to nodes as follows. Identifiers are ordered in an identifier circle modulo 2^m. Key k is assigned to the first node whose identifier is equal to or follows (the identifier of) k in the identifier space. This node is called the successor node of key k, denoted by successor(k). If identifiers are represented as a circle of numbers from 0 to 2^m - 1, then successor(k) is the first node clockwise from k.
Figure 2 shows an identifier circle with m = 3. The circle has
three nodes: 0, 1, and 3. The successor of identifier 1 is node 1, so
key 1 would be located at node 1. Similarly, key 2 would be located
at node 3, and key 6 at node 0.
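The successor rule can be captured in a few lines; the following minimal sketch checks it against the Figure 2 example (m = 3, nodes 0, 1, and 3). The helper function name is ours, not part of the protocol specification.

```python
def successor(key: int, nodes: list[int]) -> int:
    """Return the first node whose identifier equals or follows key,
    moving clockwise around the identifier circle."""
    ring = sorted(nodes)
    for n in ring:
        if n >= key:
            return n
    return ring[0]  # wrap around the circle

nodes = [0, 1, 3]
assert successor(1, nodes) == 1   # key 1 is stored at node 1
assert successor(2, nodes) == 3   # key 2 is stored at node 3
assert successor(6, nodes) == 0   # key 6 wraps around to node 0
```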
Consistent hashing is designed to let nodes enter and leave the
network with minimal disruption. To maintain the consistent hash-
ing mapping when a node n joins the network, certain keys previously assigned to n's successor now become assigned to n. When node n leaves the network, all of its assigned keys are reassigned to n's successor. No other changes in assignment of keys to nodes
need occur. In the example above, if a node were to join with iden-
tifier 7, it would capture the key with identifier 6 from the node
with identifier 0.
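The example from the text can be replayed with a short sketch: when node 7 joins the ring of Figure 2, only key 6 changes hands. The assign_keys helper below is illustrative, built on the same successor rule as above.

```python
def successor(key: int, nodes: list[int]) -> int:
    ring = sorted(nodes)
    return next((n for n in ring if n >= key), ring[0])

def assign_keys(keys: list[int], nodes: list[int]) -> dict[int, list[int]]:
    """Map each key to the node responsible for it."""
    mapping = {n: [] for n in sorted(nodes)}
    for k in keys:
        mapping[successor(k, nodes)].append(k)
    return mapping

keys = [1, 2, 6]
print(assign_keys(keys, [0, 1, 3]))      # {0: [6], 1: [1], 3: [2]}
print(assign_keys(keys, [0, 1, 3, 7]))   # {0: [], 1: [1], 3: [2], 7: [6]}
# Node 7's arrival moves only key 6 (captured from node 0);
# every other key stays where it was.
```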
The following results are proven in the papers that introduced
consistent hashing [11, 13]:
THEOREM 1. For any set of N nodes and K keys, with high probability:
1. Each node is responsible for at most (1 + epsilon)K/N keys