Understanding Paxos: An Analysis of the Distributed Consensus Algorithm, from Abstraction to Implementation

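"This document is a classic introduction to the Paxos algorithm, written by Butler W. Lampson, who explains this algorithm, central to distributed computing, in a concise way. Paxos is used mainly to implement replicated state machines, a widely used technique for fault tolerance. The paper first presents an abstract version of Paxos, then derives the Byzantine, classic, and disk versions from it, and discusses their relationships, safety, and performance."

In distributed computing, Paxos is a fundamental consensus algorithm: it solves the problem of reaching agreement over an unreliable network. It was first proposed by Leslie Lamport. Its goal is to ensure that a group of processes (nodes) can agree on a single value in an asynchronous environment with message delays, message loss, and failures.

1. Abstract Paxos (AP). The abstract algorithm captures the basic idea of Paxos but cannot be implemented directly, because some of its actions are idealized. Descriptions of Paxos usually coordinate the decision through three roles: proposers, which propose values; acceptors, which receive proposals and vote; and learners, which learn and apply the agreed value. The paper itself speaks of agents and views.

2. Byzantine Paxos. Byzantine Paxos extends the basic algorithm to a harsher failure model that includes malicious nodes. In the Byzantine model a node may not only stop but also send wrong or misleading messages. Byzantine Paxos adds extra validation so that agreement is still reached in such an environment.

3. Classic Paxos. Classic Paxos is the algorithm as Lamport originally described it. It applies to the non-Byzantine setting, where nodes fail only by accident and never deliberately subvert the system. It is usually described in two phases: a prepare phase (Prepare/Promise) and an accept phase (Accept/Accepted), which together ensure that at most one value is chosen.

4. Disk Paxos. In Disk Paxos the state needed for agreement is kept on disks that processes read and write, rather than in active agents that exchange messages. Because the state is on disk, a node that restarts can recover it and continue the consensus process.

5. Safety and liveness. Safety means that a decision, once reached, never changes; liveness means that the system keeps making progress and eventually decides. The paper analyzes each version of Paxos in these terms, including the points where it can block and how to get past them.

6. Performance. The paper compares the efficiency of the versions, including communication cost and latency. For example, classic Paxos is usually more efficient than Byzantine Paxos, because the latter needs additional validation steps.

7. Keywords and categories. The paper spans software correctness proofs, fault tolerance, theory, and security; its keywords include Paxos, asynchronous consensus, fault tolerance, replicated state machine, Byzantine, and state machine.

The paper gives a thorough and deep account of Paxos and is a valuable resource for anyone who wants to understand or implement consistency in a distributed system.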

… doesn’t falsify $Q\#G$. If $G_1 \Rightarrow G_2$ then $Q\#G_1 \Rightarrow Q\#G_2$. It’s natural to define $Q_{\sim F} = \{q \mid q \notin Z_F\}$, and similarly for $Q_{\sim S}$.

Quorum sets $Q$ and $Q'$ are (mutually) exclusive if we can’t have both a $Q$ quorum for $G$ and a $Q'$ quorum for its negation: $(\forall G \mid Q\#G \Rightarrow \sim Q'\#\sim G)$. This holds if every $Q$ quorum intersects every $Q'$ quorum in a set of processes that can’t all be faulty:

$\forall q \in Q,\ q' \in Q' \mid q \cap q' \in Q_{\sim F}$

This is how we lift local exclusion $G_1 \Rightarrow \sim G_2$ to global exclusion $Q\#G_1 \Rightarrow \sim Q'\#G_2$. Exclusion is what we need for safety.
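
The intersection condition can be checked by brute force on a small example. The sketch below is only an illustration. It assumes the sets that can be faulty are exactly those with at most `max_faulty` processes, and the helper names `size_quorums` and `exclusive` are hypothetical, not taken from the paper.

```python
from itertools import combinations

def size_quorums(processes, i):
    """All subsets of `processes` with at least i members."""
    procs = sorted(processes)
    return [frozenset(c) for k in range(i, len(procs) + 1)
            for c in combinations(procs, k)]

def exclusive(Q1, Q2, max_faulty):
    """True if every quorum in Q1 meets every quorum in Q2 in more than
    `max_faulty` processes, so the intersection cannot be all faulty
    (assumption: any set of at most `max_faulty` processes may be faulty)."""
    return all(len(q1 & q2) > max_faulty for q1 in Q1 for q2 in Q2)

procs = range(4)
majorities = size_quorums(procs, 3)   # three or more of four processes
pairs = size_quorums(procs, 2)        # two or more of four processes

print(exclusive(majorities, majorities, 0))  # True
print(exclusive(pairs, pairs, 0))            # False: {0, 1} and {2, 3} are disjoint
```

With four processes and quorums of three, any two quorums share at least two processes, so the check also passes with `max_faulty=1`.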

For liveness we need to relate various quorums to the sets of possibly faulty or stopped processes. To ensure $G$ holds at some non-faulty process, we need to hear it from a good quorum, one that can’t all be faulty, that is, one in $Q_{\sim F}$. If $g = G_m$ is independent of $m$, then $Q_{\sim F}\#G \Rightarrow g$; this is how we establish $g$ by hearing from some processes. To ensure that henceforth there’s a visible $Q$ quorum satisfying a predicate $G$, we need a $Q^+$ quorum satisfying $G$ that still leaves a $Q$ quorum after losing any set that can fail:

$Q^+ = \{q \mid (\forall z \in Z_{FS} \mid q - z \in Q)\}$

If $Q^+ \neq \{\}$ then $Q$ is live: there’s always some quorum in $Q$ that isn’t failed.
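
The strengthened quorum set described above (the sets that remain a quorum after losing any set that can fail) can be computed directly for a handful of processes. This is a minimal sketch under stated assumptions: quorums and failure sets are chosen by size only, and the names `subsets` and `strengthen` are hypothetical.

```python
from itertools import combinations

def subsets(procs, lo, hi):
    """All subsets of procs whose size is between lo and hi inclusive."""
    procs = sorted(procs)
    return {frozenset(c) for k in range(lo, hi + 1)
            for c in combinations(procs, k)}

def strengthen(Q, Z, procs):
    """Sets q such that q - z is still a Q quorum for every z in Z."""
    n = len(set(procs))
    return {q for q in subsets(procs, 0, n)
            if all((q - z) in Q for z in Z)}

procs = range(5)
Q = subsets(procs, 3, 5)                 # majorities of five processes
one_failure = subsets(procs, 0, 1)       # assumption: at most one process fails
three_failures = subsets(procs, 0, 3)    # assumption: up to three may fail

print(strengthen(Q, one_failure, procs) == subsets(procs, 4, 5))  # True
print(strengthen(Q, three_failures, procs))                       # set()
```

With one possible failure the strengthened set is non-empty, so majorities are live; with three possible failures it is empty, so no majority is guaranteed to survive.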

The most popular quorum sets are based only on the size of the quorums: $Q_{\ge i} = \{q \mid |q| \ge i\}$. If there are $n$ processes, then for $Q_{\ge i}$ and $Q_{\ge j}$ to be exclusive, we need $i + j > n + f$. If $Z_F = Z_{\le f}$ then $Q_{\sim F} = Q_{\ge f+1}$. If $Z_{FS} = Z_{\le s}$ then $Q_{\ge i}^{+} = Q_{\ge s+i}$, and $Q_{\ge i}$ live requires $i \le n - s$, since $Q_{> n} = \{\}$. So we get $n + f < i + j \le 2(n - s)$, or $n > f + 2s$. Also $i > n + f - j \ge n + f - (n - s)$, or $i > f + s$. With the minimum $n = f + 2s + 1$, $f + s < i \le f + s + 1$, so we must have $i = f + s + 1$. If $s = f$, we get $n = 3f + 1$ and $i = 2f + 1$.
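
The arithmetic above can be cross-checked by searching for the smallest n that admits quorum sizes satisfying both constraints. The sketch below is an illustration only; the function name `minimal_size_quorums` is hypothetical, and it encodes just the two inequalities from the text: exclusion (i + j > n + f) and liveness (i ≤ n − s and j ≤ n − s).

```python
def minimal_size_quorums(f, s):
    """Return the smallest n, and all quorum sizes (i, j) for that n,
    such that i + j > n + f (exclusion) and i, j <= n - s (liveness)."""
    n = 1
    while True:
        sizes = [(i, j)
                 for i in range(1, n + 1)
                 for j in range(1, n + 1)
                 if i + j > n + f and i <= n - s and j <= n - s]
        if sizes:
            return n, sizes
        n += 1

print(minimal_size_quorums(1, 1))  # (4, [(3, 3)])
print(minimal_size_quorums(0, 1))  # (3, [(2, 2)])
```

For s = f = 1 the search returns n = 4 with i = j = 3, matching n = 3f + 1 and i = 2f + 1.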

With $f = 0$ there are exclusive ‘grid’ quorum sets: arrange the processes in a rectangular grid and take $Q$ to be the rows and $Q'$ the columns. If $Q$ must exclude itself, take a quorum to be a row and a column, minus the intersection if both have more than two processes. The advantage: a quorum is only $\sqrt{n}$ or $2(\sqrt{n} - 1)$ processes, not $n/2$. This generalizes to $f > 0$ because quorums of $i$ rows and $j$ columns intersect in $2ij$ processes [15].
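
A small example makes the grid construction concrete for f = 0, where exclusion just means that quorums share at least one process. This is a sketch under assumptions: a 3 × 3 grid, processes numbered row by row, and hypothetical helper names.

```python
from itertools import product

def grid(rows, cols):
    """Process ids arranged row by row in a rows x cols grid."""
    return [[r * cols + c for c in range(cols)] for r in range(rows)]

def row_quorums(g):
    return [frozenset(row) for row in g]

def column_quorums(g):
    return [frozenset(col) for col in zip(*g)]

def cross_quorums(g):
    """A row plus a column, minus the process where they cross."""
    return [(r | c) - (r & c)
            for r in row_quorums(g) for c in column_quorums(g)]

g = grid(3, 3)
rows, cols, crosses = row_quorums(g), column_quorums(g), cross_quorums(g)

# Every row meets every column, and every cross meets every other cross.
rows_meet_cols = all(r & c for r, c in product(rows, cols))
crosses_meet = all(a & b for a, b in product(crosses, crosses))

print(rows_meet_cols, crosses_meet)    # True True
print(len(rows[0]), len(crosses[0]))   # 3 4
```

With n = 9 the quorum sizes are 3 and 4, that is, √n and 2(√n − 1), compared with 5 for a majority.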

For the Intel-Microsoft example, an exclusive quorum must be the union of an exclusive quorum on each of the two sides.

3 The specification for consensus

The external actions are Input, which provides an input value from the client, and Decision, which returns the decision, waiting until there is one. Consensus collects the inputs in the input set, and the internal Decide action picks one from the set.

type X = …                             values to agree on
var  d     : (X ∪ {nil}) := nil        Decision; x/nil, not out
     input : set X := {}

Name          Guard                    State change
Input(x)                               input := input ∪ {x}
Decision: X   d ≠ nil                  ret d
Decide        d = nil ∧ x ∈ input      d := x
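
The spec can be read as a small state machine. The following is a minimal executable sketch, not the paper's notation. It makes three assumptions: `None` plays the role of nil, an action whose guard is false raises a hypothetical `NotEnabled` exception instead of waiting, and the nondeterministic choice in Decide is made by a seeded random generator.

```python
import random

class NotEnabled(Exception):
    """Raised when an action's guard is false (the spec would wait instead)."""

class ConsensusSpec:
    def __init__(self, seed=None):
        self.d = None                 # decision; None plays the role of nil
        self.input = set()            # inputs collected so far
        self._rng = random.Random(seed)

    def input_value(self, x):
        """Input(x): no guard; add x to the input set."""
        self.input.add(x)

    def decide(self):
        """Decide: guard d = nil and some x in input; then d := x."""
        if self.d is not None or not self.input:
            raise NotEnabled("Decide")
        self.d = self._rng.choice(sorted(self.input))

    def decision(self):
        """Decision: guard d != nil; return d."""
        if self.d is None:
            raise NotEnabled("Decision")
        return self.d

spec = ConsensusSpec(seed=0)
spec.input_value("a")
spec.input_value("b")
spec.decide()
print(spec.decision() in {"a", "b"})   # True
```

Once `decide` has run, later inputs are still collected but can no longer change what `decision` returns.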

For replicated state machines, the inputs are requests from the clients. Typically there is more than one at a time; those that don’t win are carried over to input for the next step.

A different spec would allow Decision to return nil if there’s no decision, but then it must be able to return nil even if there has already been a decision, since a client may do the Decision action at a process that hasn’t yet heard about the decision. For this paper it makes no difference.

It’s interesting to observe that there is a simpler spec with identical behavior. It has the same d and Decision, but drops input and Decide, doing the work in Input.

var  d : (X ∪ {nil}) := nil            Decision; x/nil, not out

Name          Guard                    State change
Input(x)                               if d = nil then optionally d := x
Decision: X   d ≠ nil                  ret d
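
One way to get a feel for the claim of identical behavior is to enumerate, for a fixed sequence of inputs, every value that Decision could eventually return under each spec. The sketch below is a rough check, not a proof: it compares only final decision values, not full traces, and the function names are hypothetical.

```python
def first_spec_decisions(inputs):
    """Every value Decision could return in the spec with input and Decide."""
    results = set()

    def run(k, seen, d):
        if d is None:
            for x in seen:            # Decide fires, picking some x in input
                run(k, seen, x)
        if k == len(inputs):
            if d is not None:
                results.add(d)
            return
        run(k + 1, seen | {inputs[k]}, d)   # the next Input(x) arrives

    run(0, frozenset(), None)
    return results

def simple_spec_decisions(inputs):
    """Every value Decision could return in the spec without Decide."""
    results = set()

    def run(k, d):
        if k == len(inputs):
            if d is not None:
                results.add(d)
            return
        run(k + 1, d)                 # Input(x) leaves d alone
        if d is None:
            run(k + 1, inputs[k])     # Input(x) takes the option d := x

    run(0, None)
    return results

xs = ["a", "b", "c"]
print(first_spec_decisions(xs) == simple_spec_decisions(xs))   # True
```

In the simpler spec the choice has to be made at the moment the winning input arrives, whereas the first spec can defer it until Decide runs.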

A simulation proof that the first spec implements the second, however, requires a prophecy variable or backward simulation.

This spec says nothing about liveness, because there is no live algorithm for asynchronous consensus [4].

4 Abstract Paxos

As we said in section 1.2, the idea of Paxos is to have a sequence of views until one of them forms a quorum that is noticed. So each view has three stages:

1. Choose an input value that is anchored: guaranteed to be the same as any previous decision.
2. Try to get a decision quorum of agents to accept the value.
3. If successful, finish the algorithm by recording the decision at the agents.

This section describes AP, an abstract version of Paxos. AP can’t run on your computers because some of the actions refer to non-local state (marked like this so you can easily see where the implementation must differ). In particular, Choose and $c_v$ are completely non-local in AP. In later sections we will see different ways to implement AP with actions that are entirely local. The key problem is implementing Choose.

AP has external actions with the same names as the spec, of course. They are almost identical to the actions of the spec.

Name            Guard          State change
Input(x)                       input := input ∪ {x}
Decision_a: X   d_a ≠ nil      ret d_a

4.1 State variables

type  V = ...                  View; totally ordered
      Y = X ∪ {out, nil}
      A ⊆ M = …                Agent
      Q = set A                Quorum
const Q_dec : set Q := ...     decision Quorum set
      Q_out : set Q := ...     out Quorum set
      v_0   : V := ...         smallest V

The views must be totally ordered, with a first view $v_0$. $Q_{dec}$ and $Q_{out}$ must be exclusive.

var   r_va     : Y := nil, but r_v0a := out    Result
      d_a      : Y := nil                      Decision; x/nil, not out
      c_v      : Y := nil                      Choice; x/nil, not out
      input    : set X := {}
      active_v : Bool := false

Each agent has a decision $d_a$, and a result $r_{va}$ for each view; we take $r_{v_0 a} = out$ for every $a$. AP doesn’t say where the other variables live.

4.2 State functions and invariants

We define a state function $r_v$ that is a summary of the $r_{va}$: the view’s choice if there’s a decision quorum for that among the agents, or out if there’s an out quorum for that, or nil otherwise.
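
The summary just described can be written as a function of one view's per-agent results. This is a minimal sketch under assumptions: quorum sets are passed in as membership predicates, `None` stands for nil, the string "out" stands for out, and the name `view_result` is hypothetical.

```python
OUT = "out"   # stands for the special result out
NIL = None    # stands for nil

def view_result(results, choice, is_dec_quorum, is_out_quorum):
    """Summarize one view. `results` maps each agent to its result in this
    view; `choice` is the view's choice (NIL if there is none yet)."""
    if choice is not NIL:
        accepted = frozenset(a for a, r in results.items() if r == choice)
        if is_dec_quorum(accepted):
            return choice
    gave_up = frozenset(a for a, r in results.items() if r == OUT)
    if is_out_quorum(gave_up):
        return OUT
    return NIL

# Assumption: three agents, and both quorum sets are "any two agents".
majority = lambda q: len(q) >= 2

print(view_result({"a": "x", "b": "x", "c": NIL}, "x", majority, majority))  # x
print(view_result({"a": OUT, "b": OUT, "c": "x"}, "x", majority, majority))  # out
print(view_result({"a": "x", "b": OUT, "c": NIL}, "x", majority, majority))  # None
```

Because the decision and out quorum sets must be exclusive, the first two branches cannot both apply in a reachable state, so the order in which they are tested does not matter.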

I am indebted to Michael Jackson for a remark that led to this idea.
