Reinforcement Learning Based
Mapless Robot Navigation
Linhai Xie
Kellogg College
University of Oxford
A thesis submitted for the degree of
Doctor of Philosophy
Trinity 2019

This thesis is dedicated to my parents
for their indispensable, altruistic support

Acknowledgements
At the close of my PhD, I feel thankful to many people. Those
who deserve the most gratitude are my supervisors Niki and
Andrew. Their diligent supervision, strict but full of tenderness, has
trained me and helped me grow. Listening to their “crazy” ideas was one
of my greatest pleasures, though they frightened me a great deal at the
beginning. Gradually, however, I realised that they taught me through
those ideas, as the bishop taught the arrested Valjean in Les Misérables,
to see things on a higher plane. In their joint supervision, Niki watches
over my research direction, patiently discussing all the possibilities with
me based on her long-accumulated academic experience. Andrew, on the
other hand, can always catch the core concepts behind my poor expressions
and help me polish them on paper. Without them, many of the
achievements in this thesis would have died before birth.
Second, I would like to thank Dr. Sen Wang and Dr. Yishu Miao. Sen
guided my first steps in robotics, like Venus to sailors on a dark sea,
illuminating my way through this unknown territory. Those memorable
days when he was sitting behind me in the office were the peak of my
progress in robotic research. Yishu, my advisor and collaborator in
machine learning, catalysed my study of deep reinforcement learning and
its bridge to robotics. He has now become my boss during my internship
at MO Intelligence, which is a wonderful place to work. Sen and Yishu
are also friends and mentors, and I deeply appreciate their help with my
life in Oxford.
Next, I want to thank Xiaoxuan, Zhihua and the other colleagues in the
Cyber-Physical Systems Group. Their enthusiasm for their work and their
exciting achievements encouraged me to dive deeper into my research.
Besides, the endless anecdotes and fun gossip in the office also brought
me laughter and happiness. Thanks to each of my friends in Oxford: I
miss all the hotpot parties, the drinks in the pub and our online reunions
in DotA. My gratitude also goes to my girlfriend Yuanyuan for embracing
me warmly during the tough period, and to my parents for their
indispensable and altruistic support.

Abstract
Navigation is one of the most fundamental capabilities required for
mobile robots, allowing them to traverse from a source to a destination.
Conventional approaches rely heavily on the existence of a predefined map
which is costly in both time and labour to acquire. In addition, maps are
only accurate at the time of acquisition and degrade over time due to
environmental changes. We argue that this strict requirement of having access
to a high-quality map fundamentally limits the realisability of robotic
systems in our dynamic world. In this thesis, we investigate how to develop
practical robotic navigation, motivated by the paradigm of mapless
navigation and inspired by recent developments in Deep Reinforcement
Learning (DRL).
One of the major issues for DRL is its requirement for diverse experience
gathered over millions of repeated trials. Such experience clearly cannot
feasibly be acquired from a real robot through trial and error, so instead
we learn in a simulated environment. This leads to the first fundamental problem
which is that of bridging the reality gap from simulated to real environ-
ments, tackled in Chapter 3. We focus on the particular challenge of
monocular visual obstacle avoidance as a low-level navigation primitive.
We develop a DRL approach that is trained within a simulated world yet
can generalise well to the real world.
Another issue which limits the adoption of DRL techniques for mobile
robotics in the real world is the high variance of the trained policies. This
leads to poor convergence and low overall reward, due to the complex,
high-dimensional search space. In Chapter 4 we leverage simple classical
controllers to provide guidance for the task of local navigation with
DRL, avoiding purely random initial exploration. We demonstrate that
this novel accelerated approach greatly reduces sample variance and sig-
nificantly increases achievable average reward.
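The idea of using a classical controller to guide early exploration can be illustrated with a minimal sketch. All names and numbers below are illustrative assumptions, not taken from the thesis: a toy proportional controller supplies actions with some probability during rollouts, so the replay buffer is seeded with sensible transitions instead of purely random ones.

```python
import random

def proportional_controller(position, goal, gain=0.5):
    """Toy classical controller: steer proportionally towards the goal."""
    return gain * (goal - position)

def untrained_policy(position):
    """Stand-in for an untrained DRL policy: uniform random action."""
    return random.uniform(-1.0, 1.0)

def collect_episode(goal=10.0, steps=50, guide_prob=0.8):
    """Roll out one episode on a 1-D line, mixing controller guidance
    with policy exploration, and return (s, a, r, s') transitions."""
    position = 0.0
    transitions = []
    for _ in range(steps):
        if random.random() < guide_prob:
            # Guided step: the classical controller chooses the action.
            action = proportional_controller(position, goal)
        else:
            # Exploratory step: the (untrained) policy chooses the action.
            action = untrained_policy(position)
        next_position = position + action
        reward = -abs(goal - next_position)  # dense shaping reward
        transitions.append((position, action, reward, next_position))
        position = next_position
    return transitions

buffer = collect_episode()
print(len(buffer))  # 50 transitions seeded for off-policy DRL training
```

In a real system the replay buffer would then be consumed by an off-policy DRL algorithm, and `guide_prob` would typically be annealed towards zero as the learned policy improves.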