A Stable and Energy-Efficient Routing Algorithm Based on Learning Automata Theory for MANET 45
is influenced by not only the relative mobility between nodes
but also the distribution of nodes. Overall energy conserva-
tion and balance should also be taken into consideration. In
In addition to the above two points, it must be stressed that the
traditional heuristic techniques and machine learning methods
used to design MANET routing protocols generally lack
scalability, require considerable hand-tuning, and incur
relatively high computation costs. To resolve these
problems, this paper proposes a stable and energy-efficient routing
algorithm for MANET using LA theory. Compared with tra-
ditional machine learning methods and heuristic algorithms,
LA theory has the following advantages: (1) LA theory is supported
by a complete mathematical proof [23-26]; (2) LA theory
is capable of global optimization at relatively low
cost in a dynamic environment; (3) LA theory has good
scalability, which is needed to optimize large-scale MANET
routing performance; (4) LA theory can map the computation
space to a probability space, ensuring normalization. The
optimization efficiency of traditional heuristic algorithms and
machine learning methods (e.g., ACO, PSO, GA) relies on the
construction of a heuristic function. In a dynamic
environment, it is difficult to construct a sufficiently good
heuristic function. In addition, these methods cannot always ensure
normalization. As a general rule, LA theory is used in
bio-computation and stochastic system control, both of which can be
regarded as dynamic environments. Owing to the dynamic
nature of MANET, we can use LA theory to find the
optimal route among the available paths.
In our solution, we begin by constructing a new node sta-
bility measurement model and defining the effective energy
ratio function. On this basis, we introduce a node-weighted
value function, which is used as the iteration parameter. We
then use LA theory to construct a MANET environment feed-
back model. In this feedback model, each node is equipped
with a learning automaton, enabling it to take actions by sensing
the surrounding environment. Based on LA theory, this pro-
cess can be represented through a rigorous linear probability
iteration; in other words, the relay node in the available paths
updates its weighted value according to the feedback signal,
which represents the result of sensing the environment (a
judging function, explained in Part A of Section IV, is used to
distinguish the type of feedback signal). When the feedback
signal is a reward signal, the node increases its weighted
value; conversely, it reduces its weighted value. Thus,
the current node can decide which node should be chosen as
the next hop from a group of available candidate nodes. Accordingly,
the path value defined in this feedback mechanism
is increased or reduced (before executing this mechanism,
we find all available paths from the source node to the
destination node, as explained in Part A of Section IV). Because of
the convergence of LA, we can finally choose the optimal path
with the highest path value from all of the available paths. The
chosen path will be stable enough to ensure overall energy
saving and balance.
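The linear weighted-value update described above can be sketched minimally as follows. The node names, the feedback labels, and the learning rates `a` and `b` are illustrative assumptions, not values taken from the paper:

```python
def update_weight(w, feedback, a=0.2, b=0.1):
    """Linearly reward or penalize a relay node's weighted value.

    w is assumed to lie in [0, 1]; a and b are illustrative
    learning rates for the reward and penalty cases.
    """
    if feedback == "reward":
        return w + a * (1.0 - w)   # reward: move the weight toward 1
    return w * (1.0 - b)           # penalty: shrink the weight toward 0

# Candidate next-hop nodes and their current weighted values (made-up data).
weights = {"n1": 0.5, "n2": 0.5, "n3": 0.5}

# Suppose the environment feedback rewards n2 twice and penalizes n1 once.
weights["n2"] = update_weight(weights["n2"], "reward")
weights["n2"] = update_weight(weights["n2"], "reward")
weights["n1"] = update_weight(weights["n1"], "penalty")

# The current node picks the candidate with the highest weighted value.
next_hop = max(weights, key=weights.get)
```

Under this scheme a repeatedly rewarded node accumulates weight toward 1 while a penalized node decays toward 0, which is why the path with the highest accumulated path value eventually dominates.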
III. PRELIMINARIES AND BACKGROUND
In this section, we present a brief overview of routing pro-
tocols and some preliminary information on LA theory.
A. Overview of Routing Protocols
Based on the relation with information [20,27], routing protocols
can be divided into several categories. In general, we
classify the protocols into three kinds: proactive, reactive and
hybrid protocols. Proactive routing protocols periodically
update routing messages so that data packet transmission
can be ensured. Reactive protocols initiate route discovery on
demand: when the source node has data packets to
be sent to a given destination node, it initiates route discovery
by broadcasting a route request packet. Upon receiving
the request packet, relay nodes rebroadcast it.
This process continues until the request packet arrives at the
destination node. Similar to the handshake mechanism, the
destination node generates a route reply packet and sends it
to the source node. In other words, the reply packet tracks
the reverse route already taken by the corresponding request
packet. As a compromise scheme, hybrid routing protocols
combine these two approaches and can be used in
hierarchically structured networks. Generally, proactive protocols
cause more energy consumption, which degrades
the network lifetime. Hybrid routing protocols use more control
information than reactive protocols and require a hierarchical
network structure.
B. Learning Automata
LA theory is a self-learning mechanism based on the theory
of stochastic processes [23,26]. As an adaptive decision-making
system, LA can enhance performance by using previous
knowledge to choose the best action from a limited set of
actions through repeated interactions with a random environ-
ment. Basic LA contains three key factors: a random environ-
ment, an automaton and a feedback system. The automaton
chooses actions based on the random environment and the en-
vironment responds to these actions by producing a feedback
signal. Based on the effect on the automaton, the feedback
signal can be divided into a ‘positive’ signal (reward signal)
and a ‘negative’ signal (penalty signal). Over a period of time, the
automaton can learn from the feedback signal to find an opti-
mal action (Fig. 1 shows the operating principle of LA).
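A minimal sketch of such an automaton, using the classic linear reward-penalty (L_R-P) update scheme against a simulated random environment; the learning rates `a` and `b` and the environment's reward probabilities are assumed for illustration:

```python
import random

class LearningAutomaton:
    """Linear reward-penalty (L_R-P) automaton over r actions.

    a is the reward learning rate, b the penalty learning rate
    (illustrative values, not taken from the paper).
    """
    def __init__(self, r, a=0.1, b=0.05):
        self.r, self.a, self.b = r, a, b
        self.p = [1.0 / r] * r  # start with a uniform action-probability vector

    def choose(self):
        # Sample an action index according to the current probabilities.
        x, acc = random.random(), 0.0
        for i, pi in enumerate(self.p):
            acc += pi
            if x < acc:
                return i
        return self.r - 1

    def update(self, i, rewarded):
        if rewarded:  # reward: shift probability mass toward chosen action i
            for j in range(self.r):
                if j == i:
                    self.p[j] += self.a * (1.0 - self.p[j])
                else:
                    self.p[j] *= (1.0 - self.a)
        else:         # penalty: shift probability mass away from action i
            for j in range(self.r):
                if j == i:
                    self.p[j] *= (1.0 - self.b)
                else:
                    self.p[j] = self.b / (self.r - 1) + (1.0 - self.b) * self.p[j]

# Simulated random environment: action 0 is rewarded 90% of the time,
# the other actions only 20% of the time.
random.seed(1)
la = LearningAutomaton(r=3)
for _ in range(2000):
    i = la.choose()
    la.update(i, random.random() < (0.9 if i == 0 else 0.2))
```

Both update rules keep the action-probability vector normalized (the terms sum to 1 after every step), which is the normalization property the paper attributes to LA theory; over repeated interactions the probability mass concentrates on the most frequently rewarded action.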
Definition 1 (Environment): The random environment is
an object interacting with the automaton. Usually, we set
E = {A, B, C} to describe the random environment, where
A = {α1, α2, ..., αt} represents the limited set of inputs
performed by the automaton, and αt represents the action at time