Improved Winograd Fourier Transform Algorithm: A Unified Architecture for 3, 5, and 7 Points

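"This research paper proposes an improved unified architecture for the 3-, 5-, and 7-point Winograd Fourier Transform Algorithm (WFTA). By defining mapping rules for the inputs and outputs and modifying the WFTA matrices through matrix transformation, the design reduces hardware resource consumption, cutting the number of multiplexers from 7 to 3, and simplifies the control flow by performing data input and port mapping according to specific rules. Keywords: WFTA, unified architecture, simplified control flow."

This article describes an optimization of the Winograd Fourier Transform Algorithm for transform sizes of 3, 5, and 7 points. WFTA is a variant of the fast Fourier transform (FFT) that offers a more efficient way to compute the discrete Fourier transform (DFT). In modern wireless communication systems, such as those based on Orthogonal Frequency Division Multiplexing (OFDM), the DFT is a core component, and an FFT is usually used to compute it. The Fourier transform is widely used in signal processing and communications: it converts a signal from the time domain to the frequency domain, which helps in analyzing its frequency content. Traditional FFT algorithms, however, require many multiplications and additions, and their computational complexity is high for large transform sizes. Winograd's algorithm, by contrast, reduces the number of multiplications through carefully designed matrix operations and so lowers the computational load.

The improved unified architecture proposed in the paper aims to optimize WFTA further, saving hardware resources by modifying the matrices in the algorithm. Reducing the number of multiplexers lowers the complexity of the hardware implementation, which matters most in resource-constrained embedded systems and integrated circuit design. Rules for data input and port mapping simplify the control flow, which helps improve operating efficiency and real-time performance. Simplifying the control flow is itself one of the key optimizations of the design. In digital signal processing systems, the complexity of the control logic often directly affects speed and programmability. Reducing that complexity makes the design easier to understand and implement and lowers the likelihood of errors.

The paper offers a new design approach for an efficient, low-resource Winograd Fourier transform. It has practical value for applications that need fast Fourier transforms but are limited by hardware resources, such as wireless communication, image processing, and audio coding. With this optimization, resource utilization and overall efficiency can be improved without sacrificing performance.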

Improved Unified Architecture for 3, 5, and 7-point Winograd Fourier Transform Algorithm

Qiang Zhang, Changyin Liu, Peng Zhang, Yuanzhi Chen, Jianhe Du

School of Information and Communication Engineering

Communication University of China

Beijing, China

qiangzhang@cuc.edu.cn, liuchy@cuc.edu.cn, zhangpeng@cuc.edu.cn, chenyuanzhi@cuc.edu.cn, dujianhe1@gmail.com

Abstract—This paper proposes an improved unified architecture for the 3, 5, and 7-point Winograd Fourier Transform Algorithm (WFTA). We define the mapping rules of inputs and outputs and modify the matrices in WFTA by matrix transformation. This design not only reduces hardware resource consumption by reducing the number of multiplexers from 7 to 3, but also simplifies the control flow by performing data input and port mapping through certain rules.

Keywords—WFTA; unified architecture; simplified control flow

I. INTRODUCTION

Orthogonal Frequency Division Multiplexing (OFDM) technology, which has strong immunity to frequency-selective fading, is widely used in modern wireless communication systems. One of its core technologies is the discrete Fourier transform (DFT). In general, the fast Fourier transform (FFT) is used to reduce the complexity of the DFT.

FFTs can be divided into two categories: power-of-two and non-power-of-two. For power-of-two FFTs, algorithm simulation and hardware implementation are very mature. Currently, some standards involve non-power-of-two FFTs. For example, 3G Long Term Evolution (LTE) involves a 1536-point DFT [1], Chinese Digital Terrestrial/Television Multimedia Broadcasting (DTMB) involves a 3780-point DFT [2], and Digital Radio Mondiale (DRM) involves a 228-point DFT [3]. Non-power-of-two FFTs are mostly calculated using the Cooley-Tukey algorithm and the Prime Factor Algorithm (PFA) to decompose an N-point DFT into smaller DFTs, which are calculated by the Winograd Fourier Transform Algorithm (WFTA). For example, for DTMB, the 3780-point DFT is decomposed into 3, 4, 5, 7, and 9-point DFTs

[4,

WFTA calculates small-point DFTs through matrix decomposition, which greatly reduces the number of multiplications. The designs in references [4] and [5] need different circuits to calculate different small-point WFTAs, so the architecture is not unified. Reference [6] proposed a unified architecture for 2, 3, 4, 5, and 7-point WFTA that calculates all five sizes in the same circuit, reducing resource consumption. However, it is just a direct combination of the 2, 3, 4, 5, and 7-point signal flow graphs. The input data and port mapping have no regularity, which leads to complicated timing control. We consider that 2 and 4-point DFTs need only 4 adders to implement, which takes few resources, so the improved unified architecture does not include them. When the architecture in reference [6] is used for 3, 5, and 7-point WFTA, 36 complex adders, 16 real multipliers, and 7 multiplexers are needed, which results in large circuit differences and a complex control flow.
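The decomposition of an N-point DFT into smaller DFTs, described above for non-power-of-two sizes, can be illustrated with a two-factor Prime Factor (Good-Thomas) index mapping. The following is only a sketch under stated assumptions: the function names `dft` and `pfa_dft` are hypothetical, the example size N = 15 = 3 × 5 is chosen for brevity and does not come from the paper, and the small DFTs are computed directly rather than with WFTA.

```python
import cmath
import math


def dft(x):
    """Direct O(n^2) DFT, used here as the small-DFT kernel and as a reference."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * t * k / n) for t in range(n))
        for k in range(n)
    ]


def pfa_dft(x, n1, n2):
    """Two-factor prime factor DFT; n1 and n2 must be coprime (assumed layout)."""
    n = n1 * n2
    if len(x) != n or math.gcd(n1, n2) != 1:
        raise ValueError("need len(x) == n1 * n2 with coprime n1, n2")

    # Input index map: n = (n2 * i1 + n1 * i2) mod N.
    grid = [[x[(n2 * i1 + n1 * i2) % n] for i2 in range(n2)] for i1 in range(n1)]

    # Length-n2 DFT along each row; rows[i1][k2].
    rows = [dft(row) for row in grid]

    # Length-n1 DFT along each column; cols[k2][k1]. No twiddle factors are needed.
    cols = [dft([rows[i1][k2] for i1 in range(n1)]) for k2 in range(n2)]

    # Output index map (Chinese remainder theorem): k1 = k mod n1, k2 = k mod n2.
    return [cols[k % n2][k % n1] for k in range(n)]


if __name__ == "__main__":
    samples = [complex(i, -0.5 * i * i) for i in range(15)]
    error = max(abs(p - d) for p, d in zip(pfa_dft(samples, 3, 5), dft(samples)))
    print("max deviation from direct DFT:", error)
```

In a hardware design of the kind discussed here, the calls to `dft` inside `pfa_dft` are where a small-point WFTA block would be used instead.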

Section II introduces the N-point WFTA. Section III proposes the improved unified architecture design for the N-point WFTA. The last section concludes the paper.

II. N-POINT WFTA

The N-point WFTA can be expressed as:

$$X = CBAx, \quad x = [x_0, x_1, \ldots, x_{N-1}]^T, \quad X = [X_0, X_1, \ldots, X_{N-1}]^T \tag{1}$$

where $A$ is an $M \times N$ matrix related to the input ($M \ge N$), $C$ is the $N \times M$ matrix associated with the output, and $B$ is an $M \times M$ diagonal matrix that depends on $\theta$ ($\theta = 2\pi/N$). The diagonal elements are denoted $\{B_0 = 1, B_1, \ldots, B_{M-1}\}$, where $B_1, \ldots, B_{M-1}$ are purely real or purely imaginary.
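One consequence of the diagonal entries being purely real or purely imaginary is that each of them can scale a complex value with two real multiplications, whereas a general complex product in its schoolbook form takes four. A minimal Python sketch of that observation follows; the helper names `mul_by_real` and `mul_by_imag` are hypothetical and do not appear in the paper.

```python
def mul_by_real(c, z):
    """Return c * z for a real coefficient c, using two real multiplications."""
    return complex(c * z.real, c * z.imag)


def mul_by_imag(s, z):
    """Return (j * s) * z for a real s, using two real multiplications."""
    # (j*s) * (p + j*q) = -s*q + j*(s*p): the real and imaginary parts swap roles.
    return complex(-s * z.imag, s * z.real)


if __name__ == "__main__":
    z = complex(1.5, -2.0)
    print(mul_by_real(0.5, z), 0.5 * z)
    print(mul_by_imag(0.5, z), 0.5j * z)
```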

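For the 3-point WFTA, M = 3, N = 3, and the Winograd algorithm gives the following matrices:

$$A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 1 & -1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos(2\pi/3) - 1 & 0 \\ 0 & 0 & j\sin(2\pi/3) \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & -1 \\ 1 & 1 & 1 \end{bmatrix}. \tag{2}$$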

Substituting (2) into (1) gives the 3-point WFTA signal flow graph shown in Fig. 1. The 3-point WFTA involves 6 complex adders and 4 real multipliers.
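As a numerical check of the 3-point factorization, the three stages a = Ax, b = Ba, X = Cb can be evaluated and compared with a direct DFT. The sketch below assumes the matrix entries A = [[1, 1, 1], [0, 1, 1], [0, 1, -1]], B = diag(1, cos(2π/3) − 1, j sin(2π/3)), and C = [[1, 0, 0], [1, 1, -1], [1, 1, 1]] as read from (2), and the forward DFT convention with kernel exp(−j2πnk/N). The names `wfta3`, `matvec`, and `dft` are illustrative only.

```python
import cmath
import math

THETA = 2 * math.pi / 3

# Matrices of the 3-point WFTA as read from (2); B is stored as its diagonal.
A = [[1, 1, 1], [0, 1, 1], [0, 1, -1]]
B_DIAG = [1, math.cos(THETA) - 1, 1j * math.sin(THETA)]
C = [[1, 0, 0], [1, 1, -1], [1, 1, 1]]


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def wfta3(x):
    """3-point DFT evaluated as X = C (B (A x))."""
    a = matvec(A, x)                            # stage I: input additions
    b = [d * a_i for d, a_i in zip(B_DIAG, a)]  # stage II: diagonal scaling
    return matvec(C, b)                         # stage III: output additions


def dft(x):
    """Direct DFT used as a reference."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * t * k / n) for t in range(n))
        for k in range(n)
    ]


if __name__ == "__main__":
    x = [1 + 2j, -0.5 + 0.25j, 3 - 1j]
    error = max(abs(w - d) for w, d in zip(wfta3(x), dft(x)))
    print("max deviation from direct DFT:", error)
```

The generic `matvec` hides the operation count. Written out by hand, stage I needs three additions (x1 + x2, x1 − x2, and x0 + (x1 + x2)) and stage III needs three more, which matches the 6 complex adders stated above.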

[Figure: three-stage signal flow graph. Stage Ⅰ: a = A[x_0, x_1, x_2]^T; stage Ⅱ: b = Ba; stage Ⅲ: [X_0, X_1, X_2]^T = Cb.]

Fig. 1 3-point WFTA signal flow graph

The operation rules of the adders in Fig. 1 are explained in Fig. 2. Fig. 2(a) shows subtraction, in which the input of the lower branch is subtracted from the input of the upper branch. Fig. 2(b) shows addition.

[Figure: (a) subtraction symbol; (b) addition symbol.]

Fig. 2 Symbol description


2019 IEEE 3rd Advanced Information Management,Communicates,Electronic and Automation Control Conference (IMCEC 2019)
