Multi-Agent Monte Carlo Go
Leandro Soriano Marcolino
Matsubara Laboratory – Intelligence Information
Science Department
Future University of Hakodate
Hakodate, Japan
g2209001@fun.ac.jp
Hitoshi Matsubara
Matsubara Laboratory – Intelligence Information
Science Department
Future University of Hakodate
Hakodate, Japan
matsubar@fun.ac.jp
ABSTRACT
In this paper we propose a Multi-Agent version of UCT Monte Carlo Go. We use the emergent behavior of a great number of simple agents to improve the quality of the Monte Carlo simulations, thereby increasing the strength of the artificial player as a whole. Instead of one agent playing against itself, different agents play during the simulation phase of the algorithm, leading to a better exploration of the search space. Our player significantly outperformed Fuego, a top Computer Go program. Emergent behavior seems to be the next step in the development of Computer Go.
Categories and Subject Descriptors
I.2.1 [Artificial Intelligence]: Applications and Expert
Systems—Games
General Terms
Algorithms, Experimentation
Keywords
Emergent Behaviour, Collective Intelligence
1. INTRODUCTION
Go is a two-player turn-based strategy board game that is famous for being one of the main challenges in Artificial Intelligence. A small set of simple rules^1 leads to a game amazingly complex for a human being and to a search tree that is unbearably large for a computer. There are many reasons why it is difficult to develop a strong artificial player. First, Go is played on a large board, 19x19, with 361 intersections, creating difficulties for tree-search-based algorithms. Second, most of the intersections are generally valid moves, increasing the number of possible states reachable from a given board position. Third, the stones interact in complex ways during the game; one stone may influence a distant group, for example in situations where there is a ladder. Moreover, building an evaluation function is not trivial. Even end-of-game situations, which intuitively should be simpler, were proved to be PSPACE-hard [31]. According to [1], compared to the complexity of Chess (10^50), the complexity of Go (10^160) is larger by a factor of 10^110. We can see, therefore, how challenging it is to create an artificial player of Go.

^1 Available at many places, for example: http://www.pandanet.co.jp/English

Cite as: Multi-Agent Monte Carlo Go, Leandro Soriano Marcolino and Hitoshi Matsubara, Proc. of 10th Int. Conf. on Autonomous Agents and Multiagent Systems (AAMAS 2011), Tumer, Yolum, Sonenberg and Stone (eds.), May 2-6, 2011, Taipei, Taiwan, pp. 21-28.
Copyright © 2011, International Foundation for Autonomous Agents and Multiagent Systems (www.ifaamas.org). All rights reserved.
However, recently, with the development of evaluations
of the board state based on simulations (known as Monte
Carlo techniques), the strength of Computer Go players improved significantly. Thanks to artificial players like MoGo, Crazy Stone, Fuego, Many Faces of Go, and Zen, the best Go programs are now considered to be at amateur 2-dan level. Further
improvement was achieved by parallelization, as it increases
the computational power, allowing a deeper exploration of
the possible movements. In February 2009, Many Faces of
Go, running on a 32-core Xeon cluster, beat the professional
player James Kerwin on a 19x19 board with a handicap of 7 stones. Much recent work now invests in the parallelization of Monte Carlo techniques. However, there is always a limit to the amount of speed-up that can be gained from a parallelization design.
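To make the simulation-based evaluation concrete, the sketch below estimates the value of a position by averaging the outcomes of many random playouts, the core idea behind the Monte Carlo techniques discussed above. It uses tic-tac-toe as a tiny stand-in for Go; the game, function names, and playout count are illustrative assumptions, not taken from this paper or from any particular Go program.

```python
import random

# All eight winning lines on a 3x3 board (indices 0..8, row-major).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if a line is completed, else None."""
    for a, b, c in LINES:
        if board[a] != '.' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def random_playout(board, to_move):
    """Play uniformly random moves to the end; return the winner or None on a draw."""
    board = board[:]          # work on a copy, as a simulation must
    player = to_move
    while True:
        w = winner(board)
        if w is not None:
            return w
        moves = [i for i, s in enumerate(board) if s == '.']
        if not moves:
            return None       # board full, no winner: draw
        board[random.choice(moves)] = player
        player = 'O' if player == 'X' else 'X'

def monte_carlo_value(board, to_move, n_playouts=2000):
    """Estimate the win rate for the player to move by averaging playout results."""
    wins = sum(random_playout(board, to_move) == to_move
               for _ in range(n_playouts))
    return wins / n_playouts

empty = ['.'] * 9
print(f"estimated win rate for X from the empty board: "
      f"{monte_carlo_value(empty, 'X'):.2f}")
```

In a real Go engine the random policy is replaced by a fast, biased playout policy, and this paper's proposal is precisely to vary the simulating agent rather than use a single policy playing against itself.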
Generally, there are two ways to increase the strength
of an artificial player: advances in computational power,
which can be achieved by parallelization, and advances in
the theory, which can be achieved by new algorithms and
methods. Nowadays, the research in Monte Carlo techniques
seems to be focused on the parallelization of the current
approaches. However, it is always desirable to advance the
theory with the creation of better algorithms that lead to
stronger players even when the computational power has not
necessarily increased. We believe that the next theoretical
step lies in the investigation of Multi-Agent methodologies.
Multi-Agent systems have been used to solve a great range
of problems in Artificial Intelligence. The emergent behavior of a great number of simple agents has been applied in algorithms like Ant Colony Optimization [11] and Particle Swarm Optimization [20] in order to solve difficult optimization problems. It is also notable how emergence can lead to
complex and intricate group behavior [21, 22, 23, 28].
Emergence is a powerful concept, not only in Computer
Science, but also in a variety of disciplines, like philosophy,
systems theory, and art. The stock market and the Internet are systems important to modern life that arise from the emergent interaction of simple components. Emergence is also fundamental in biological systems. A notable example is an ant colony. It is known that the queen does not directly command the ants. Each ant is always reacting to stimuli generated