Win-Stay, Lose-Learn: Strategy Updating Promotes Cooperation in the Spatial Prisoner’s Dilemma


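This paper examines a novel strategy updating rule, win-stay-lose-learn (WSLL), applied to the spatial prisoner’s dilemma game. The prisoner’s dilemma is a classic problem in game theory in which each of two players must choose between cooperation and defection. People normally hold on to their current strategy if it has brought them success and satisfaction. In conventional evolutionary game simulations, however, players frequently consider changing their strategy even when their payoffs differ only marginally from those of other players.

The core idea of WSLL is that a player’s aspiration and satisfaction decide whether a strategy is kept or changed. A player whose payoff meets its aspiration keeps its strategy; only a dissatisfied player learns, attempting to adopt the strategy of one of its nearest neighbors. The rule is arguably closer to real behavior, since people tend to maintain the status quo when they are doing well rather than chase small short-term gains.

Researchers from Beihang University, Chang’an University in Xi’an, the International Institute for Applied Systems Analysis, the University of Maribor and other institutions introduced the WSLL rule into the spatial prisoner’s dilemma game to observe its effect on cooperative behavior. Their simulations show that, compared with conventional strategy updating, WSLL promotes cooperation: players are more likely to hold on to their strategy in stable and favorable situations, which reduces the instability and loss of efficiency caused by frequent strategy changes. The study may also help in understanding cooperation and competition in the real world, for example in economic decision-making, social interaction and resource allocation, where behavior may follow similar rules, and it offers a potential theoretical basis for designing environments that foster cooperation.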
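The rule summarized above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the periodic square lattice with four neighbors, the payoff parametrization (R = 1, T = b, P = S = 0), the comparison of the accumulated payoff against a threshold named `aspiration`, and the Fermi imitation probability with a `noise` parameter are all choices made here for illustration. Only the win-stay and lose-learn branches themselves come from the text.

```python
import math
import random


def neighbors(i, j, size):
    """Four nearest neighbors on a periodic square lattice (assumed topology)."""
    return [((i - 1) % size, j), ((i + 1) % size, j),
            (i, (j - 1) % size), (i, (j + 1) % size)]


def payoff(grid, i, j, b):
    """Accumulated payoff of site (i, j) against its four neighbors.

    Assumed parametrization: R = 1, T = b, P = S = 0.
    grid[i][j] is 1 for a cooperator and 0 for a defector.
    """
    coop_neighbors = sum(grid[x][y] for x, y in neighbors(i, j, len(grid)))
    return coop_neighbors * (1.0 if grid[i][j] == 1 else b)


def wsll_step(grid, b, aspiration, noise=0.1, rng=random):
    """Update one randomly chosen site; return True if its strategy changed."""
    size = len(grid)
    i, j = rng.randrange(size), rng.randrange(size)
    own = payoff(grid, i, j, b)
    if own >= aspiration:
        return False  # win-stay: a satisfied player keeps its strategy
    # lose-learn: a dissatisfied player looks at one random nearest neighbor
    x, y = rng.choice(neighbors(i, j, size))
    other = payoff(grid, x, y, b)
    # Fermi imitation probability (an assumed choice, not given in the text)
    prob = 1.0 / (1.0 + math.exp((own - other) / noise))
    if grid[x][y] != grid[i][j] and rng.random() < prob:
        grid[i][j] = grid[x][y]
        return True
    return False


if __name__ == "__main__":
    rng = random.Random(0)
    size = 20
    grid = [[1 if rng.random() < 0.5 else 0 for _ in range(size)]
            for _ in range(size)]
    for _ in range(200 * size * size):
        wsll_step(grid, b=1.2, aspiration=2.0, rng=rng)
    fraction = sum(map(sum, grid)) / size ** 2
    print(f"fraction of cooperators: {fraction:.2f}")
```

Setting `aspiration` to zero makes every player satisfied, so the lattice never changes; raising it makes more players dissatisfied and therefore more willing to imitate a neighbor.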

Win-Stay-Lose-Learn Promotes Cooperation in the Spatial Prisoner’s Dilemma Game

Yongkui Liu1,2,3*, Xiaojie Chen*, Lin Zhang, Long Wang, Matjaž Perc

1 School of Automation Science and Electrical Engineering, Beihang University, Beijing, China, 2 School of Electronic and Control Engineering, Chang’an University, Xi’an, China, 3 Center for Road Traffic Intelligent Detection and Equipment Engineering, Chang’an University, Xi’an, China, 4 Evolution and Ecology Program, International Institute for Applied Systems Analysis, Laxenburg, Austria, 5 State Key Laboratory for Turbulence and Complex Systems, College of Engineering, Peking University, Beijing, China, 6 Faculty of Natural Sciences and Mathematics, Department of Physics, University of Maribor, Maribor, Slovenia

Abstract

Holding on to one’s strategy is natural and common if the latter warrants success and satisfaction. This goes against widespread simulation practices of evolutionary games, where players frequently consider changing their strategy even though their payoffs may be only marginally different from those of the other players. Inspired by this observation, we introduce an aspiration-based win-stay-lose-learn strategy updating rule into the spatial prisoner’s dilemma game. The rule is simple and intuitive, foreseeing strategy changes only by dissatisfied players, who then attempt to adopt the strategy of one of their nearest neighbors, while the strategies of satisfied players are not subject to change. We find that the proposed win-stay-lose-learn rule promotes the evolution of cooperation, and it does so very robustly and independently of the initial conditions. In fact, we show that even a minute initial fraction of cooperators may be sufficient to eventually secure a highly cooperative final state. In addition to extensive simulation results that support our conclusions, we also present results obtained by means of the pair approximation of the studied game. Our findings continue the success story of related win-stay strategy updating rules, and by doing so reveal new ways of resolving the prisoner’s dilemma.

Citation: Liu Y, Chen X, Zhang L, Wang L, Perc M (2012) Win-Stay-Lose-Learn Promotes Cooperation in the Spatial Prisoner’s Dilemma Game. PLoS ONE 7(2): e30689. doi:10.1371/journal.pone.0030689

Editor: Jürgen Kurths, Humboldt University, Germany

Received November 23, 2011; Accepted December 21, 2011; Published February 17, 2012

Copyright: © 2012 Liu et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: YL acknowledges financial support from the Special Fund for the Basic Scientific Research of Central Colleges of Chang’an University (Grant CHD2010JC134). LW acknowledges financial support from the National Natural Science Foundation of China (NSFC) (Grants 60736022 and 10972002). MP acknowledges financial support from the Slovenian Research Agency (ARRS) (Grant J1-4055). This work was additionally supported by the National Natural Science Foundation of China (NSFC) (Grants 61074144, 51005012 and 61103096). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist.

* E-mail: ykliu@chd.edu.cn (YL); chenx@iiasa.ac.at (XC); matjaz.perc@uni-mb.si (MP)

Introduction

Evolutionary game theory provides a powerful mathematical framework for studying the emergence and stability of cooperation in social, economic and biological systems [1–5]. The prisoner’s dilemma game, in particular, is frequently considered as a paradigm for studying the emergence of cooperation among selfish and unrelated individuals [6]. The outcome of the prisoner’s dilemma game is governed by pairwise interactions, such that at any instance of the game two individuals, who can either cooperate or defect, play the game against each other by selecting their strategy simultaneously and without knowing what the other player has chosen. Both players receive the reward R upon mutual cooperation, but the punishment P upon mutual defection. If one player defects while the other cooperates, however, the cooperator receives the sucker’s payoff S while the defector receives the temptation T = b. Since T > R > P > S, there is an innate tension between individual interests (the rational strategy, yielding an optimal outcome for the player regardless of what the other player chooses, is defection) and social welfare (for the society as a whole the optimal strategy is cooperation) that may result in the ‘‘tragedy of the commons’’ [7]. Five prominent rules for the successful evolution of cooperation, which may help avert an impending social decline, are kin selection, direct and indirect reciprocity, network reciprocity as well as group selection, as comprehensively reviewed in [8].
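The tension described in the paragraph above can be checked numerically. The following sketch uses illustrative payoff values that are assumptions, not taken from the paper; they were picked only to satisfy the stated ordering T > R > P > S. The helper names `best_reply` and `total_welfare` are likewise hypothetical.

```python
# Illustrative payoff values (assumed, not from the paper); they only need
# to satisfy the ordering T > R > P > S stated in the text.
T, R, P, S = 1.5, 1.0, 0.1, 0.0

# PAYOFF[(mine, theirs)] is the focal player's payoff.
# "C" stands for cooperate, "D" for defect.
PAYOFF = {
    ("C", "C"): R,
    ("C", "D"): S,
    ("D", "C"): T,
    ("D", "D"): P,
}


def best_reply(opponent_move):
    """Return the move that maximizes the focal player's payoff."""
    return max("CD", key=lambda move: PAYOFF[(move, opponent_move)])


def total_welfare(move_a, move_b):
    """Sum of both players' payoffs for a given pair of moves."""
    return PAYOFF[(move_a, move_b)] + PAYOFF[(move_b, move_a)]


if __name__ == "__main__":
    for other in "CD":
        print(f"best reply to {other}: {best_reply(other)}")
    print("welfare under mutual cooperation:", total_welfare("C", "C"))
    print("welfare under mutual defection:", total_welfare("D", "D"))
```

With any values obeying T > R > P > S, defection is the best reply to either move, yet mutual cooperation gives both players more than mutual defection, which is the dilemma in miniature.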

Since the pioneering work of Nowak and May [9] spatial games have received ample attention, and they have become inspirational for generations of researchers trying to reveal new ways by means of which cooperation can prevail over defection [10–12]. In the context of spatial games, network topology and hierarchies have been identified as crucial determinants for the success of cooperative behavior [13–28], where in particular the scale-free topology has proven very beneficial for the evolution of cooperation. In fact, payoff normalization [29–31] and conformity [32] belong to the select and very small class of mechanisms that can upset the success of cooperators on such highly heterogeneous networks. Other approaches facilitating the evolution of cooperation include the introduction of noise to payoffs and updating rules [33–38], asymmetry between interaction and replacement graphs [39,40], diversity [41–44], differences between time scales of game dynamics [44–47], as well as the simultaneous adoption of different strategies against different opponents [48]. Somewhat more personally-inspired features supporting the evolution of cooperation involve memory effects [49], heterogeneous teaching activity [50–52], preferential learning [53,54], mobility [55–59], myopically selective interactions [60], and coevolutionary partner choice [61–63], to name but a few examples studied in recent years.

Regardless of the details of mechanisms that may promote the evolution of cooperation in the spatial prisoner’s dilemma game, most frequently, it was assumed that individual players learn from

PLoS ONE | www.plosone.org 1 February 2012 | Volume 7 | Issue 2 | e30689
