并发游戏中的随机策略：巴黎-萨克莱大学博士论文HALId:tel-01519354

168 浏览量更新于2023-12-03 收藏 12.92MB PDF 举报

身份认证购VIP最低享 7 折!

领优惠券(最高得80元）

0HAL Id: tel-015193540https://theses.hal.science/tel-015193540于2017年5月6日提交0HAL是一个多学科开放存取档案，用于存储和传播科学研究文档，无论其是否已发表。这些文档可以来自法国或国外的教育和研究机构，也可以来自公共或私人研究中心。0HAL（开放式多学科存档）是用于存储和传播法国或国外教育和研究机构、公共或私人研究中心的研究级科学文献的开放存档。0并发游戏中的随机策略0Daniel Stan0引用此版本：0Daniel Stan. Randomized strategies in concurrent games. Computer Science and Game Theory[cs.GT]. Université Paris Saclay (COmUE), 2017. English. �NNT : 2017SACLN011�. �tel-01519354�10NNT : 2017SACLN0110巴黎-萨克莱大学博士学位论文，由Cachan高等师范学校（巴黎-萨克莱高等师范学校）准备。0第580号信息与通信科学与技术博士学位学院计算机科学专业Daniel Stan 随机策略在并发游戏中的应用0博士论文于2017年3月30日在Cachan提交和答辩。0评审委员会成员：0Nathalie Bertrand女士（评审员）INRIA Antonín Ku£era先生（评审员）Masaryk大学Johanne Cohen女士（审查员）CNRS Olivier Serre先生（审查员）CNRS Joost-PieterKatoen先生（主席）RWTH大学 Nicolas Markey先生（导师）CNRS PatriciaBouyer-Decitre女士（导师）CNRS0规约与验证实验室法国巴黎-萨克莱高等师范学校，CNRS 61avenue du Président Wilson, 94235 Cachan Cedex, France0ii1Vive Tikz !2mˆeme lorsque l’on refuse de goˆuter `a la gastronomie du restaurant universitaire3parfois mˆeme des trolls4car il nous arrive de ne mˆeme plus savoir nous nommer autrementiii0致谢0首先，我要感谢我的评审委员会成员，他们同意参加我的答辩，特别是Nathalie Bertrand和AntonínKucera，他们在很短的时间内给了我很多反馈和修改建议。Nicolas和Patricia是我认识了四年的人，我非常感激他们的指导和建议，他们教会了我如何在口头和书面上展示我的工作。更重要的是，他们一直给予我动力和热情，使我能够享受研究，即使我经常错误地认为自己陷入了困境。我当然也与其他研究人员建立了这种动力，比如Arnaud Sangnier和MickaelRandour，我们一起进行了许多热烈的讨论和发现。作为助教，教学也给了我很大的灵感，得益于充满活力的导师，以前是我的教授，比如Serge Haddad、Paul Gastin、Stefan Schwoon、SylvainSchmitz和AlainFinkel，以及其他很多资源丰富的学生。LSV是一个非常愉快的实验室，人们很容易融入，咖啡休息时可以轻松展开讨论和辩论。我对过去或现在的“走廊底部”的博士生和实习生们留下了难忘的回忆，与他们一起度过了很多美好的时光：Anthony、Antoine、Samy、Simon、Remy、Guillaume。多亏了他们，我能够认识到Nadine，尤其是Pierre-JacquesHenri先生，他们给了我许多宝贵的建议。我也不会忘记其他博士生，特别是Jérémy、Lucca、Yann、Patrick、Pierre C.、François T.，还有Marie van DenBogaard，她几乎在整个博士期间都指导着博士生团队，并在我自己的写作和答辩之前给了我宝贵的建议。由于研究人员有时对行政事务感到害怕，我要感谢Catherine、Imane，尤其是Virginie的工作，以及他们的耐心和教学。Hugues和Francis在计算机方面也做出了杰出的工作，我必须承认，我很喜欢与他们讨论一些技术问题，有些人可能称之为“油腻”。说到油腻，我很难忽视Crans，它让我在整个ENS学习期间能够进行另一种形式的应用计算机科学，这与我在博士期间所学的理论完全不同。但是，这个伟大的Cachan协会没有“极客”团队将一无是处，我只用他们的化名来称呼他们：Nit、olasd、iota、Zelda、Harry、Tilgaht、Bernie、Chirac、Hamza、Guinness。这个朋友团队也是与Pauline、PEB和20-100一起合租的起源，我们在过去三年里共同分享了许多难忘的回忆，特别是与Zadou和Corentin。我也要感谢Chloé、Baptiste、Tilgaht和Lucile，虽然我很少见到他们，但我总是很喜欢与他们重逢，还有我的父母和妹妹，他们没有推动我走上今天的道路，但始终支持我的选择。encourag´e `a pers´ev´erer.Le d´ebut de ma th`ese a aussi ´et´e marqu´ee par ma rencontre avec Maud, qui n’a jamaiscess´e de croire en moi, de me r´econforter et de m’encourager, mˆeme apr`es que nos cheminsaient diverg´es, je ch´eris ces souvenirs. Je souris encore aux nombreux moments pass´es avecles ≪ Grotas ≫, mais aussi la ≪ Trolloc ≫, Ping, Marion, trolin, Ara, Nolwenn, Ariane, maisaussi Misc et Blupon. Ce dernier m´eritant ainsi d’ˆetre cit´e comme mon ﬁd`ele compagnond’infortune (je lui souhaite bon courage pour sa future th`ese) lors de cette derni`ere ann´ee der´edaction, de soutenance, et enﬁn de d´epart pour une nouvelle vi(ll)e.ivContents1Introduction11.1Game theory and computer science . . . . . . . . . . . . . . . . . . . . . . . .21.2Non-determinism, stochasticity and quantitative problems . . . . . . . . . . .31.3Strategic optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .31.4Outlines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .41.5Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .51.6Structure of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .62Preliminaries92.1Usual notations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .92.2.1Monoid over words . . . . . . . . . . . . . . . . . . . . . . . . . . . . .92.2.3Paths. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .102.3Sets and multisets. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .102.5Probability theory. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .113Concurrent games153.1Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .163.3Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .193.5Two-player zero-sum games . . . . . . . . . . . . . . . . . . . . . . . . . . . .213.6.1Deﬁnition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .263.6.3Subgame perfect equilibrium. . . . . . . . . . . . . . . . . . . . . . .28v01.7 科学出版物 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 602.2 幺半群上的运算 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 902.2.2 集合上的关系 . . . . . . . . . . . . . . . . . . . . . . . . . . . 1002.2.4 部分函数 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1002.4 良序 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100I 并发游戏中的混合策略 1303.2 行动的可见性 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1803.4 游戏的结果 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1903.6 纳什均衡 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2503.6.2 子博弈的特征 . . . . . . . . . . . . . . . . . . . . . . . . . . 2703.6.4 平衡的例子 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294Decidability of Nash Equilibria314.1Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .324.1.1One-shot games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .324.1.2k-action matching-pennies games . . . . . . . . . . . . . . . . . . . . .334.1.3Embedded game . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .344.2Modules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .404.2.1Rescale game . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .404.2.2Testing game . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .434.2.3Counting modules. . . . . . . . . . . . . . . . . . . . . . . . . . . . .444.2.4Description of the reduction . . . . . . . . . . . . . . . . . . . . . . . .464.3Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .504.3.1Unconstrained problem. . . . . . . . . . . . . . . . . . . . . . . . . .504.3.2Qualitative objectives. . . . . . . . . . . . . . . . . . . . . . . . . . .514.3.3Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .555Games that almost-surely terminate575.1Avoiding cycling behaviours . . . . . . . . . . . . . . . . . . . . . . . . . . . .575.1.1Non-cycling games . . . . . . . . . . . . . . . . . . . . . . . . . . . . .595.1.2Strong components . . . . . . . . . . . . . . . . . . . . . . . . . . . . .615.1.3Fixed point analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . .615.2.1Restricting to memoryless deviations . . . . . . . . . . . . . . . . . . .655.2.3Discussions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .68IIParametrized Stochastic Systems717Parametrized register protocols797.1Non-deterministic transition system. . . . . . . . . . . . . . . . . . . . . . .807.3Monotonicity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .827.3.2Non-atomicity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .837.4.1Qualitative analysis. . . . . . . . . . . . . . . . . . . . . . . . . . . .868Probabilistic reachability and safety898.1Existence. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .89vi0内容05.2 在不精确偏差下的均衡 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6405.2.2 存在定理 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6705.3 在不精确偏差下计算稳态均衡 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6806 交互模型 7307.2 参数化可达性：整体概览 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8207.3.1 向上封闭可达性目标 . . . . . . . . . . . . . . . . . . . 8207.4 概率转移系统 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8507.4.2 截断性质 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8708.2 符号图 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 899Almost-sure reachability939.1First examples. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .939.1.2Symbolic graph is powerless . . . . . . . . . . . . . . . . . . . . . . . .949.3Bound examples. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .979.3.1Linear cut-oﬀ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .979.3.2Counter machine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .999.3.3PSPACE-hardness . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1029.4Decision procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1059.4.1Reﬁned symbolic graph. . . . . . . . . . . . . . . . . . . . . . . . . .1059.4.2Symbolic based algorithm . . . . . . . . . . . . . . . . . . . . . . . . .1069.4.3Complexity bounds on covering . . . . . . . . . . . . . . . . . . . . . .1079.4.4General bounding scheme . . . . . . . . . . . . . . . . . . . . . . . . .10910 Extensions and discussions11310.1 Model checking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .11310.2 r-register protocol. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .11410.2.1 Tools enhancement . . . . . . . . . . . . . . . . . . . . . . . . . . . . .11410.2.2 Operations over r registers. . . . . . . . . . . . . . . . . . . . . . . .11510.2.3 Discussion on the r-register extension. . . . . . . . . . . . . . . . . .11810.2.4 Comparison with non-atomic protocols . . . . . . . . . . . . . . . . . .11910.3 Process identiﬁers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .11910.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .11911 Toward Strategy Synthesis12111.1 Deﬁnitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .12111.1.1 Allowed actions and randomization . . . . . . . . . . . . . . . . . . . .12111.1.2 Local strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .12211.1.3 Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .12411.1.4 Cut-oﬀ property. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .12511.2 Reachability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .12611.2.1 Mixed strategies. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .12611.2.2 Pure strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .12611.2.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .12811.3 Safety . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .12911.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .132Bibliography135List of ﬁgures142List of tables145R´esum´e en fran¸cais146vii0目录09.1.1 原子性防止截断存在 . . . . . . . . . . . . . . . . . . . 9309.2 存在性 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95CONTENTSviiiChapter 1IntroductionStochastic behaviours are a particular form of uncertainty that also allow desynchronizationand symmetry breaking mechanisms.Imagine for example two persons playing the well-known rock-paper-scissors game as in Figure 1.1a, on a daily basis. The game consists in aseries of rounds, where each of them consists in the two persons picking a symbol among,and. The game continues until a winner is determined, that is when a tie is not triggered.As these players often play this game against each other, they may start to learn from eachother and infer over time how their opponent reasons. Each day that passes, the players candecide to change their mind according to what was played the day before, which will resultin an endless oscillation of their played symbols, or keep playing the same way. This mayrequire them to remember the beginning of the previous plays, or at least the beginning ofthe sequence of played symbols. One day, one of the players is bored of this game and insteadof trying to “outsmart” their opponent as usual, decides at each round, to roll a dice and play;;;;2;2;2;10如果值为1或2，则为1；如果值为3或4，则为2；否则为其他。一旦他决定按照这个方案进行游戏，他就能以1的概率确保自己的胜利。02.无论对手做什么，试图猜测对手的行为或记住他前一天玩过的内容都没有意义，因此游戏对他们两个人来说变得更容易。随机玩也可能节省时间，因为现在玩家每天平均只需要花费32轮。在本论文中，我们研究了具有随机行为的系统的形式化验证和合成。在计算机科学领域，随机化确实是一种解除进程同步、避免死锁或通信协议中的冲突的关键工具。全局0(a) 石头剪刀布游戏。平局会重置游戏。0两个共享同一媒介的实体之间的机制。任何碰撞都会重置游戏。0图1.1 - 需要随机化的简单游戏201.1. 博弈论和计算机科学0在这种情况下，上述通信协议及其相关协议与我们之前的石头剪刀布游戏非常相似，如图1.1b所示。许多协议依赖于随机化，例如CSMA/CD[CSM00]，它要求参与者在发生碰撞时随机发送。因此，考虑到该协议中的随机性对于其分析是一个基本特征[DKN +12]。通过随机化进行的另一个解同步的示例是由Zeroconf协议[BSHV03，AGC05]指定的分布式链路本地地址分配。随着开源软件的兴起和代码反汇编的进展，以及通信协议的公开描述，合理地假设任何程序的源代码实际上是公共信息是合理的。这对安全性有影响，因为一些算法和数据结构在平均情况下具有合理的复杂性，但在最坏情况下具有巨大的复杂性，可以被攻击者触发。例如，使用有限的计算能力和带宽，攻击者可以在服务器端程序的确定性哈希表中触发许多碰撞，并破坏大型计算结构[CW03]。在这种情况下，随机化不仅是确保传输公平性的一种方式，而且也是对抗攻击者的一种自然对策。01.1 博弈论和计算机科学0博弈论和战略推理是一个多产的领域，由冯∙诺伊曼[VonNeumann]通过他著名的极小极大定理引入，后来通过纳什、塞尔滕等人[Nas50，Sel65]引入了更完整的形式化方法[VNM47]。这个框架已经成功地应用于经济学和社会学，因为它为人们和实体的行为和互动提供了一个相关的模型，这些人和实体被视为“玩家”或“代理”，每个人都做出决策，这些决策影响所有玩家。每个玩家根据他们的策略选择决策或行动，这些策略可能考虑到游戏及其环境的一些观察。每个玩家都有一个目标或目标，通常是一个效用函数或更一般的偏好关系。根据所有玩家的行动，给予每个玩家奖励。这些游戏通常以支付矩阵的形式给出，也就是说，游戏被描述为时间上的一个点事件。前面的例子和相关的通信协议包括一个顺序方面，在计算机科学的观点中通常用自动机理论框架表示。例如，教堂问题询问是否可以根据逻辑公式实现对无限字序列的二元关系，该电路将第一个无限序列作为输入，并产生一个相应的可能无限序列作为输出。该问题可以通过在两个玩家之间的图上进行的游戏来表示，第一个玩家通过选择标有输出序列相应字母的边来提出游戏的输出，而第二个玩家则试图逐字逐字地展示一个与输出不对应的输入序列。实现电路对应于第一个玩家的获胜策略[BL69]，从而解决了游戏。这种方法通常非常适合“反应性系统”，我们在这里对合成一个控制器感兴趣，该控制器应满足与不可预测的环境相对的逻辑属性[ALW89]。在这里，第一个玩家是“存在的”，我们将她称为“夏娃”，因为我们对单一可能的获胜策略感兴趣，而第二个玩家是“普遍的”，即“亚当”，我们考虑他所有可能的行为。1, −1−1, 1s0ℏs, rwrsℏw30第1章引言0图1.2 - H没有最优策略01.2 非确定性、随机性和定量问题0当我们说一个玩家在所有可能的策略下都能获胜时，我们采用的是最坏情况的方法，因为考虑了所有可能的行动。可以合理地认为，这种不确定性非常强大，因为我们可能考虑了在实践中可能不会发生的无关场景。例如，前面描述的简单传输模型可以假设具有公平性属性，即假设存在无限多个可用于发射的时间槽。不确定性的中间概念是随机性，即通常被引入为一个不做决策但具有随机行为的特殊玩家的环境。与普通玩家相反，环境没有目标，可以帮助或阻碍存在性玩家。将概率引入模型有两个主要后果：首先，从最坏情况的方法中丢弃了一些可能的执行集合。例如，在剪刀石头布游戏中，只要一个玩家随机出牌，平局是不可能的。这可以与某些形式的公平性属性相关联，否则很难建模。另一方面，随机性带来了新的定量问题。例如，我们可以询问一个玩家是否可以以大于给定阈值的概率获胜。01.3 战略优化0已经提出了几个关于策略最优性的概念，这取决于所需的属性。最优策略是第一个自然的解决方案概念，其中给定玩家的策略必须最大化该玩家的效用，无论其他玩家的行动如何。这个概念导致了价值的概念[Sha53]。在这种情况下，假设游戏由两个具有对立目标的玩家进行，即零和游戏。尽管两个玩家的结果值不一定总和为零，但确定性结果表明，当玩家按顺序轮流进行而没有随机化但具有常规目标时，这种直觉是正确的[Mar75]。当允许并发移动和随机化时，即使游戏是确定的，最优策略可能不存在，因为值可能渐近地达到[Mar98]。例如，考虑图1.2中所示的并发2人零和游戏H（躲藏或逃跑）。玩家1可以选择躲藏（�）或回家逃跑（r），而玩家2则试图用雪球射击他。如果玩家2在玩家1躲藏时射击，她会失去雪球并输掉游戏。如果玩家2在另一个玩家逃跑时射击，她会赢。[KS81]已经证明，这个游戏没有最优策略，尽管玩家1可以确保以接近1的概率获胜。当游戏涉及超过两个玩家或者目标从一个玩家到另一个玩家是兼容的时，我们可以认为最优策略和价值的概念不再适用：玩家可能不再彼此对抗。纳什引入的均衡概念[Nas50]在非零和游戏的研究中起着重要作用。在这样的均衡中，我们对特定策略感兴趣，每个玩家都有一个，当与其他玩家的特定策略一起玩时，她的策略是最优的，而其他玩家的策略已经固定。在图结构上，均衡形成了玩家之间的特定合作机制，以访问有利的状态。当一个玩家不遵守协议时会发生什么取决于所考虑的确切概念，因为在纳什均衡中，玩家更喜欢报复以维持他们的均衡，而不是继续优化自己的效用。这个观察正是引入纳什均衡的特定概念，称为子博弈完美均衡[Sel65]的理由，其中必须从任何状态确保均衡。其他玩家的假设行为，可能希望优化自己的效用，被称为理性。这个假设在计算机科学领域特别合理，因为玩家是程序、自治系统或个体设备，通过协议或一些预定义的模式相互交互。因此，玩家的策略可以被看作是一个程序或固件，根据研究的问题，我们可以尝试合成或仅仅假设存在[KPV15]。one player to another, we may argue that the notions of optimal strategy and value are notsuited anymore: players may not necessarily play against each other anymore. The equilib-rium notion introduced by Nash [Nas50] plays a central role for the study of non-zero-sumgames. In such an equilibrium, we are interested in particular strategies, one for each player,such that for each player

下载后可阅读完整内容，剩余1页未读，立即下载