ZoneDefense: A New Fault-Tolerant Routing Strategy for 2-D Meshes Without Virtual Channels


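"ZoneDefense is a fault-tolerant routing algorithm for 2-D meshes without virtual channels, intended to provide reliable on-chip communication for multicore processors. The main problem it addresses is handling faults at the network edges while preventing deadlock. Conventional solutions disable the faulty nodes or gather all faults into a single faulty block, which sacrifices a large number of fault-free nodes. ZoneDefense routing instead encloses faults in convex faulty blocks and distributes each block's position information along the corresponding columns, forming defense zones. Packets thus learn the location of faults in advance and route around them, which improves fault tolerance. Compared with existing algorithms, ZoneDefense tolerates more faults while sacrificing fewer fault-free nodes. In addition, ZoneDefense does not degrade network performance in the fault-free case, and its performance in the presence of faults is comparable to that of existing algorithms."

The core of the ZoneDefense routing algorithm is the defense zone, a way of using fault information to plan paths in advance. When faults appear in the network, the affected nodes form a faulty block, and information about that block is propagated along the corresponding columns. Every node that holds this information becomes part of the defense zone and helps packets steer clear of the faulty region, preserving the continuity and efficiency of communication.

In 2-D mesh network-on-chip (NoC) architectures the routing strategy is critical, because these networks typically face requirements of high density, high bandwidth, and low latency. By not relying on virtual channels, ZoneDefense lowers hardware complexity, which is especially valuable for resource-constrained systems-on-chip. The 2-D mesh topology it targets also helps simplify routing logic and reduce power consumption. The design without virtual channels benefits resource utilization and efficiency, since virtual channels usually add buffer storage and scheduling complexity. Furthermore, by accounting for faults at the network edges, ZoneDefense strengthens edge robustness, which many conventional schemes overlook.

The turn model is another key element of routing algorithms: it defines the rules by which packets may turn in a mesh network. ZoneDefense may incorporate a specific turn model that lets packets change direction flexibly when they meet a fault, so that routing still succeeds.

ZoneDefense is a fault-tolerant routing technique that maintains network performance while tolerating more faults and sacrificing fewer fault-free nodes, which makes it significant for building reliable and efficient communication systems for multicore processors.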
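The defense-zone idea described above can be illustrated with a small sketch. This is only one possible reading of the summary, not the paper's actual construction: it assumes a rectangular faulty block, assumes the block information is spread over the full height of every column the block spans, and uses hypothetical names (`Block`, `build_defense_zone`, `height`) that do not come from the source.

```python
from typing import Dict, Tuple

Node = Tuple[int, int]             # (x, y) mesh coordinates (assumed convention)
Block = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), assumed rectangular


def build_defense_zone(block: Block, height: int) -> Dict[Node, Tuple[int, int]]:
    """Map each node in the block's columns to the block's row span.

    Assumption: every node outside the block but in one of its columns
    stores (y_min, y_max), so a packet arriving there knows in advance
    which rows of that column are blocked.
    """
    x_min, y_min, x_max, y_max = block
    zone: Dict[Node, Tuple[int, int]] = {}
    for x in range(x_min, x_max + 1):
        for y in range(height):
            if y_min <= y <= y_max:
                continue  # nodes inside the faulty block are disabled, not part of the zone
            zone[(x, y)] = (y_min, y_max)
    return zone


if __name__ == "__main__":
    zone = build_defense_zone((2, 2, 3, 3), height=6)
    print(sorted(zone))
```

In this sketch a packet entering node `(2, 0)` could look up `zone[(2, 0)]` and learn that rows 2 to 3 of its column are blocked before it reaches the block boundary.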

FU et al.: FAULT-TOLERANT ROUTING FOR 2-D MESHES 115

Fig. 4. Faulty blocks without shared boundary channels. Dark nodes represent faults and gray nodes indicate unsafe nodes.

According to Definition 4, node (4, 3) changes to unsafe in the second iteration because it has a danger neighbor (3, 3) in the x-dimension and a semi-safe-y neighbor (5, 3). Meanwhile, nodes (5, 3), (3, 4), and (4, 4) also change to unsafe according to Definition 4. Faulty blocks are formed in two iterations.

It is worth noting that allowing faulty blocks to share boundaries could further reduce the number of sacrificed fault-free nodes. However, shared boundaries would significantly increase the routing complexity. The tradeoff between the number of sacrificed fault-free nodes and the routing complexity is left as future work.

III. RELATED WORK

Since the NoCs considered in this paper do not have virtual channels, fault-tolerant routing algorithms that rely on virtual channels, such as [12], [15]–[17], are not reviewed in this section. There is also another kind of fault-tolerant routing algorithm, the stochastic algorithm. Stochastic routing algorithms enhance NoC reliability by sending multiple replicated packets through redundant routes, such as the probabilistic gossip flooding algorithm [18] and the N-Random walk algorithm [19], or by deflection, such as [20], [21]. Although stochastic routing algorithms can be highly resilient, they also face design challenges, such as high energy and bandwidth consumption. Thus, this paper mainly focuses on nonstochastic routing algorithms. Some flow control techniques can also be used to avoid deadlock, such as bubble flow control [22] and the one proposed in [23]. Since this paper focuses on wormhole-switched networks, those flow control techniques are not reviewed in this section.

Fault-tolerant routing algorithms designed for networks without virtual channels can be categorized into two classes, turn model-based and segment-based. For example, Glass and Ni [9] proposed a nonminimal version of negative-first routing [6]. Wu proposed a fault-tolerant routing algorithm based on the odd–even turn model [10]. Zhang et al. [11] proposed a reconfigurable router to tolerate one faulty block.

Fick et al. [24], [25] proposed a distributed algorithm to reconfigure the routing table. Fu et al. [26] proposed a multiple-round dimension-order routing.

Segment-based routing classifies networks into subnets, and subnets into segments [27]. By placing a bidirectional turn restriction in each segment, the network is guaranteed to be deadlock free. Combined with logic-based distributed routing [28] or universal LBDR [29], segment-based routing provides a way to improve the reliability of NoCs.

We should note that fault-tolerant routing algorithms are expected to offer high resilience, high performance, high scalability, and low cost. However, these objectives conflict to some extent. Therefore, designing a fault-tolerant routing algorithm involves a tradeoff.

For example, algorithms relying on off-line analysis with global fault information, such as the segment-based routing algorithms [26], [27], [29], can tolerate more faults. However, for NoCs that cannot afford virtual channels, collecting and dumping global fault information is usually too expensive.

A routing table provides the flexibility to reconfigure the network in the presence of faults. However, algorithms relying on a routing table, such as [24], [30], are not suitable for large-scale NoCs, especially those without virtual channels, because of their cost [28].

Logic-based fault-tolerant routing algorithms, such as those in [9]–[11], are low cost. However, the main problem in [9] and [11] is that only one fault can be tolerated. Zhang et al. [11] claimed that their algorithm can be extended to tolerate multiple faults by including them in one convex faulty block. However, this usually leads to a large number of disabled fault-free nodes. The main problem in [10] is the way it handles faults located on the four network edges and on the two columns adjacent to the left and right network edges. If a fault appears at one of these places, all nodes of the corresponding edge or column are disabled, so the number of disabled nodes is large.
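To make the cost of merging faults into a single block concrete, the following sketch counts the fault-free nodes that would be disabled. It is an illustration under stated assumptions, not the method of [11]: the convex block is taken to be the smallest axis-aligned rectangle covering all faults, and the names `bounding_block` and `sacrificed_nodes` are hypothetical.

```python
from typing import Iterable, Set, Tuple

Node = Tuple[int, int]  # (x, y) mesh coordinates (assumed convention)


def bounding_block(faults: Iterable[Node]) -> Tuple[int, int, int, int]:
    """Return (x_min, y_min, x_max, y_max) of the smallest rectangle covering all faults."""
    fault_list = list(faults)
    if not fault_list:
        raise ValueError("at least one fault is required")
    xs = [x for x, _ in fault_list]
    ys = [y for _, y in fault_list]
    return min(xs), min(ys), max(xs), max(ys)


def sacrificed_nodes(faults: Iterable[Node]) -> Set[Node]:
    """Fault-free nodes disabled when all faults are merged into one rectangular block."""
    fault_set = set(faults)
    x_min, y_min, x_max, y_max = bounding_block(fault_set)
    block = {
        (x, y)
        for x in range(x_min, x_max + 1)
        for y in range(y_min, y_max + 1)
    }
    return block - fault_set


if __name__ == "__main__":
    # Two distant faults force a 4 x 5 block: 20 nodes, 18 of them fault free.
    print(len(sacrificed_nodes([(1, 1), (4, 5)])))  # prints 18
```

Even two faults placed far apart disable 18 healthy nodes under this rectangular assumption, which is the effect the paragraph above describes.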

This paper concerns NoCs without virtual channels. We believe that this kind of NoC is quite cost sensitive. Therefore, we select logic-based fault-tolerant routing algorithms, such as [10] and [11], as the baseline algorithms. The main difference between the proposed ZoneDefense routing and previous work [10], [11] is the use of defense zones. The main benefit of ZoneDefense routing is the significantly reduced number of disabled fault-free nodes.

IV. DEFENSE ZONES

According to [7], a network is deadlock free if all rightmost columns are removed. As shown in Fig. 3, ES, SW, EN, and NW turns are necessary to form rightmost columns. To distinguish them from other turns, they are called unexpected turns. Unfortunately, unexpected turns may be introduced if a packet hits the boundary of a faulty block. To avoid unexpected turns, we introduce defense zones, so that packets can detect the faulty block and route around it in advance.
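As a minimal sketch of the terminology above, the following code labels the turns along a path and picks out the unexpected ones. It assumes a turn is named by the incoming heading followed by the outgoing heading (so travelling east and then south is an ES turn); the function names and the list-of-headings path encoding are hypothetical and not taken from the paper.

```python
from typing import List

# The four turns the text calls "unexpected".
UNEXPECTED_TURNS = frozenset({"ES", "SW", "EN", "NW"})


def turns_in_path(directions: List[str]) -> List[str]:
    """Label each change of heading as incoming + outgoing, e.g. 'E' then 'S' gives 'ES'.

    `directions` holds one heading ('N', 'E', 'S', or 'W') per hop.
    Consecutive hops with the same heading produce no turn.
    """
    return [a + b for a, b in zip(directions, directions[1:]) if a != b]


def unexpected_turns(directions: List[str]) -> List[str]:
    """Return the turns on the path that belong to the unexpected set."""
    return [t for t in turns_in_path(directions) if t in UNEXPECTED_TURNS]


if __name__ == "__main__":
    print(unexpected_turns(["E", "E", "S", "S", "W"]))  # ['ES', 'SW']
    print(unexpected_turns(["W", "N", "E"]))            # []
```

A check of this kind could be used in a simulator to flag paths that take an unexpected turn when a packet runs into a faulty block boundary.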

The formation of defense zones is triggered by the detection of faults using techniques such as built-in self-test [31]. In this paper, we use the dynamic fault model but, as in [10], assume that no new fault occurs during a routing process. In practice, however, faults may occur at any time. To support dynamic faults, one can exploit more reliable flow control techniques, such as APCS [32] and the one proposed in [33]. These
