TriBA: A Computationally Efficient, Group-Locality-Oriented Interconnection Architecture for Multi-Processor Systems-on-Chip

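This entry summarizes the paper "Computationally Efficient Locality-Aware Interconnection Topology for Multi-Processor System-on-Chip (MP-SoC)", a study in multi-processor system-on-chip (MP-SoC) design. The paper focuses on the Triplet-Based Architecture (TriBA), a new chip multiprocessor architecture and a class of Direct Interconnection Network (DIN). The core idea of TriBA is a 2D grid of small, programmable processing units, each physically connected to its three neighbors, so that the benefits of group locality can be fully exploited.

Locality is a key property for characterizing the performance of a communication model: any communication pattern can be described by its locality properties, and every network topology has its own intrinsic structural and locality characteristics. TriBA is designed for computational efficiency. By organizing processing units into a tightly connected grid, it aims to support data sharing and cooperative work effectively and thereby improve overall system performance. The authors propose a new performance evaluation criterion that accounts for the effect of locality on MP-SoC performance. According to this summary, TriBA lets designers reduce memory access and communication latency while maintaining high performance, which matters for modern embedded systems, cloud computing environments and large-scale parallel computing. VLSI (Very Large Scale Integration) layout is also taken into account, with a view to efficient hardware implementation and energy use.

The paper was published in October 2010 by a research team from the School of Computer Science and Technology, Beijing Institute of Technology (Haroon-Ur-Rashid Khan et al.). The manuscript was received on March 16, 2009 and accepted on May 1, 2010. The results are relevant to both academia and industry as guidance for designing high-performance MP-SoCs, and they offer new design ideas and optimization strategies for building more efficient multi-core systems.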

SPECIAL TOPICS:

Computer Science & Technology

October 2010 Vol.55 No.29: 3363–3371

doi: 10.1007/s11434-010-4118-z

Computationally efficient locality-aware interconnection topology for multi-processor system-on-chip (MP-SoC)

Haroon-Ur-Rashid Khan*, SHI Feng, JI WeiXing, GAO YuJin, WANG YiZhuo, LIU CaiXia, DENG Ning & LI JiaXin

School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China

Received March 16, 2009; accepted May 1, 2010

This paper evaluates the Triplet Based Architecture, TriBA, a new idea in chip multiprocessor architectures and a class of Direct Interconnection Network (DIN). TriBA consists of a 2D grid of small, programmable processing units, each physically connected to its three neighbors so that advantageous features of group locality can be fully and efficiently utilized. Any communication model can be well characterized by locality properties, and any topology has its intrinsic, structural, locality characteristics. We propose a new criterion in performance evaluation that is based on the concept of locality in an interconnection network, the "lower layer complete connect". Our proposed criterion depicts how completely a processing node is connected to all its neighbors. TriBA is compared with 2D Mesh and Binary Tree as static interconnection networks. The comparison is made from three orthogonal viewpoints, viz., computational speed, physical layout and cost. Our analysis concludes that TriBA is a computationally efficient interconnection strategy that exploits group locality in processing nodes.

Keywords: multiprocessor, locality, interconnection network, VLSI layout, performance evaluation

Citation: Khan H U R, Shi F, Ji W X, et al. Computationally efficient locality-aware interconnection topology for multi-processor system-on-chip (MP-SoC). Chinese Sci Bull, 2010, 55: 3363−3371, doi: 10.1007/s11434-010-4118-z
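The abstract names 2D Mesh and Binary Tree as the static baselines against which TriBA is compared. As a rough illustration of what such a structural comparison involves, the sketch below builds both baselines and reports a few standard topology figures (node count, link count, maximum node degree, diameter). It is only a sketch under stated assumptions: the function names, the chosen sizes and the choice of metrics are illustrative and are not taken from the paper, and TriBA itself is not modeled because this excerpt does not give its construction rule or a formula for the "lower layer complete connect" criterion.

```python
from collections import deque


def mesh_2d(rows, cols):
    """Adjacency sets of a rows x cols 2D mesh without wraparound links."""
    adj = {(r, c): set() for r in range(rows) for c in range(cols)}
    for r, c in adj:
        # Link each node to its neighbor below and to its right, if present.
        for dr, dc in ((1, 0), (0, 1)):
            nb = (r + dr, c + dc)
            if nb in adj:
                adj[(r, c)].add(nb)
                adj[nb].add((r, c))
    return adj


def complete_binary_tree(levels):
    """Adjacency sets of a complete binary tree, heap-numbered from 1."""
    n = 2 ** levels - 1
    adj = {i: set() for i in range(1, n + 1)}
    for i in range(2, n + 1):
        parent = i // 2
        adj[i].add(parent)
        adj[parent].add(i)
    return adj


def eccentricity(adj, src):
    """Largest hop count from src to any node, found by breadth-first search."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def summarize(adj):
    """Basic structural figures for an undirected, connected topology."""
    degrees = [len(nbrs) for nbrs in adj.values()]
    return {
        "nodes": len(adj),
        "links": sum(degrees) // 2,  # each link is counted at both ends
        "max_degree": max(degrees),
        "diameter": max(eccentricity(adj, s) for s in adj),
    }


# Illustrative sizes only (hypothetical, not the configurations in the paper).
mesh_stats = summarize(mesh_2d(4, 4))
tree_stats = summarize(complete_binary_tree(4))
print("2D mesh 4x4:", mesh_stats)
print("binary tree, 4 levels:", tree_stats)
```

Figures of this kind relate to the cost and layout viewpoints mentioned in the abstract (link count, node degree) and to communication distance (diameter). A locality-aware criterion such as the one the authors propose would need additional information about which nodes form a group, which this sketch does not attempt to capture.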

Multiprocessor Systems on Chip (MPSoC) combine the parallel computing advantages of multiprocessors with the single-chip integration of SoCs. MPSoCs are employed in embedded systems that require high performance data processing capabilities [1–4]. Examples include network processors (NPs), parallel multimedia processors (PMPs) and other application specific array processors (ASAPs). Improvements in semiconductor technology have made it possible to include multiple processor cores on a single die. Chip Multi-Processors (CMP) are an attractive choice for future billion transistor architectures due to their low design complexity, high clock frequency, and high throughput. Multi-processor SoC (MP-SoC) platforms are emerging as the latest trend in SoC design. These MP-SoCs consist of a large number of Intellectual Property (IP) blocks in the form of functionally homogeneous/heterogeneous embedded processors. In this new design paradigm, IP blocks need to be integrated using a structured interconnect template, for example, according to high-performance parallel computing architectures. A formal evaluation process is required before adopting a specific parallel architecture to the SoC domain [1,5].

*Corresponding author (email: haroon@bit.edu.cn)

Complex Systems on Chip (SoCs) consisting of billions of transistors can be realized in 65 nm technology [5,6]. The emergence of SoC platforms consisting of large, heterogeneous sets of embedded processors is imminent [1,2,6]. A key focus of such multiprocessor SoC platforms is the interconnect topology. Therefore, the on-chip interconnect topology should resemble the interconnect architecture of high-performance parallel computing systems [1,2]. Many interconnection networks for on-chip multiprocessor architecture have been proposed in the literature over the past three decades. Extensive accounts of these networks and their performance evaluation have been re-
