Fast BiCS Decoding and Parallel Hardware Design Optimization


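This paper studies binary-input compressive sensing (BiCS) in wireless communications, in particular as a modulation and coding scheme that achieves seamless rate adaptation. Whereas conventional channel codes generate binary symbols with exclusive-OR (XOR) operations, BiCS generates multilevel symbols with weighted-sum operations. BiCS decoding, however, requires computing convolutions of probability functions, and the resulting decoding complexity is an obstacle in practice.

To overcome this obstacle, the authors propose a fast BiCS decoding algorithm. Its core is a set of lookup tables through which the convolution of probability functions is converted into polynomial form, which significantly reduces the amount of computation. Specifically, using log-likelihood ratios (LLRs) as the messages in message passing decoding, the authors develop an approximate computation that enables fast decoding. The method greatly reduces the number of multiplications and can, in theory, cut the computational load by nearly 90%.

The paper also addresses hardware design. To reduce memory conflicts and increase parallelism, the authors propose a multilevel cyclic-shift strategy for generating the compressive sensing (CS) measurement matrix. The hardware implementation uses horizontal unit processors together with the proposed table design to make the iterative computation more efficient. The resulting field-programmable gate array (FPGA) design achieves a decoding speed comparable to the communication rates of modern wireless networks.

The main contributions are an algorithm that effectively lowers BiCS decoding complexity and the corresponding hardware optimizations. By reducing the computational load and improving hardware efficiency, this work brings BiCS closer to practical use in wireless systems that need seamless rate adaptation at higher data rates and lower energy consumption.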

WANG et al.: FAST DECODING AND HARDWARE DESIGN FOR BINARY-INPUT COMPRESSIVE SENSING 593

Inspired by the similarity between CS sampling and encoding of channel codes, some research has started to relate CS to channel coding. Sarvotham et al. [36] design Sudocodes, which can be used as erasure codes for real-valued data. By limiting the measurement matrix to sparse binary matrices, Sudocodes dramatically reduce both encoding (sampling) and decoding (recovery) complexity. In particular, the worst-case decoding complexity is only […]. By using bipartite expander graphs, Xu et al. further reduce decoding complexity to […] [37], [38]. In addition to binary measurement matrices, binary input signals are also considered in the literature. Wu et al. [39] characterize the CS decoding process by a set of differential equations and derive its closed-form formulation. Liu et al. [40] improve the formulation by leveraging the asymmetrical property on decoding of bits “1” and “0.” However, these works all assume erasure channels (i.e., a measurement is either correctly received or completely lost), and decoding in noisy channels (i.e., measurements are contaminated with noise) is not discussed.

Although the problem of CS decoding from noisy measurements has been heavily discussed [41]–[44], CS-BP [45], [46] is the first work that reduces decoding complexity by adding certain restrictions. In particular, CS-BP adopts a sparse measurement matrix whose nonzero entries are drawn from the Rademacher distribution, and adopts message passing algorithms for decoding. As such, the CS-BP decoding algorithm uses […] measurements and […] computation. Note that although the complexity is higher than that of Sudocodes, it is achieved in noisy settings. Cui et al. [7] apply CS to practical wireless communications, and design a sparse integer measurement matrix from the channel capacity perspective. The decoding algorithm is a variation of CS-BP and therefore has similar complexity to CS-BP. In this application of CS, the input signal is a block of bits, so we name it binary-input CS, or BiCS.

Although the message passing algorithm can tolerate additive noise in measurements, the messages at weighted sum nodes have to be calculated by convolution of probabilities. This greatly increases the decoding complexity compared with the message passing algorithms for binary channel codes. Furthermore, the fast decoding algorithms for channel codes have not been applied to CS decoding yet.
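To make the cost of a weighted sum node concrete, the following minimal sketch computes the probability distribution of a weighted sum of independent bits by repeated convolution. The function names, weights, and bit probabilities are illustrative assumptions and are not taken from the paper.

```python
def bit_pmf(p_one, weight):
    """PMF of weight * b for a bit b with P(b = 1) = p_one, as {value: prob}."""
    pmf = {0: 1.0 - p_one}
    pmf[weight] = pmf.get(weight, 0.0) + p_one
    return pmf


def convolve(pmf_a, pmf_b):
    """PMF of the sum of two independent variables (discrete convolution)."""
    out = {}
    for value_a, prob_a in pmf_a.items():
        for value_b, prob_b in pmf_b.items():
            total = value_a + value_b
            # One multiplication per pair of support points.
            out[total] = out.get(total, 0.0) + prob_a * prob_b
    return out


def weighted_sum_pmf(p_ones, weights):
    """PMF of sum_i weights[i] * b_i for independent bits b_i."""
    pmf = {0: 1.0}
    for p_one, weight in zip(p_ones, weights):
        pmf = convolve(pmf, bit_pmf(p_one, weight))
    return pmf


if __name__ == "__main__":
    # Hypothetical node with three neighbours and weights 1, 2, 4.
    print(weighted_sum_pmf([0.5, 0.2, 0.9], [1, 2, 4]))
```

The support of the running distribution grows with every neighbour that is folded in, so the number of multiplications grows with it. A parity check over bits, by contrast, only ever tracks two outcomes.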

C. Fast Decoding of Channel Codes

Due to the similarity between BiCS and channel codes, we review the fast decoding algorithms for two typical channel codes, namely Turbo codes and LDPC codes. The two key techniques to realize fast decoding are to use the log-likelihood ratio (LLR) as messages and to use various approximations for probability computation.

The optimal decoding algorithm for Turbo codes is maximum a posteriori probability (MAP) estimation. Its performance is close to the Shannon limit in terms of bit error rate [12]. However, the MAP algorithm needs to calculate a large number of multiplications of probabilities, which not only require intensive computation but also result in numerical representation problems. The key idea to solve these problems is to calculate them in the log domain. This variant is well known as the Log-MAP algorithm [47], where the multiplications of probabilities are converted to additions of their log values.
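The numerical representation problem and its log-domain remedy can be seen in a small sketch. The probability values below are made up for illustration; the point is only that a long product of small probabilities underflows in double precision while the sum of their logarithms does not.

```python
import math


def product(probs):
    """Multiply probabilities directly (prone to floating-point underflow)."""
    result = 1.0
    for p in probs:
        result *= p
    return result


def log_product(probs):
    """Same quantity in the log domain: multiplications become additions."""
    return sum(math.log(p) for p in probs)


if __name__ == "__main__":
    probs = [1e-5] * 100          # illustrative values only
    print(product(probs))         # underflows to 0.0 in double precision
    print(log_product(probs))     # finite value, about -1151.29
```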

Furthermore, the logarithm-of-a-sum function in the Log-MAP algorithm can be approximated by a maximum term and a correction term. The Max-Log-MAP algorithm [48], [49] is proposed to keep only the maximum term and discard the correction term. Thus it is suboptimal and has a performance degradation of about 0.5 dB [47]. Subsequently, more approaches have been reported to improve the approximation. For example, Vogt et al. introduce a scaling operation [50]. Cheng et al. [51] and Wang et al. [52] propose to approximate the correction term as a linear function and a piecewise function, respectively.
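The split into a maximum term and a correction term is commonly written as the Jacobian logarithm, ln(e^a + e^b) = max(a, b) + ln(1 + e^(-|a - b|)). The sketch below compares the exact form, the Max-Log approximation, and a linear approximation of the correction term. The slope and intercept of the linear variant are illustrative assumptions, not the constants used in [51].

```python
import math


def max_star(a, b):
    """Exact Jacobian logarithm: ln(e^a + e^b)."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def max_log(a, b):
    """Max-Log approximation: keep the maximum, drop the correction term."""
    return max(a, b)


def max_star_linear(a, b, slope=0.25, intercept=math.log(2.0)):
    """Correction term replaced by a clipped linear function (assumed constants)."""
    return max(a, b) + max(0.0, intercept - slope * abs(a - b))


if __name__ == "__main__":
    for a, b in [(0.0, 0.0), (1.0, 2.0), (0.0, 5.0)]:
        print(a, b, max_star(a, b), max_log(a, b), max_star_linear(a, b))
```

The correction term is largest, ln 2, when the two arguments are equal and decays toward zero as they move apart. This is why dropping it costs little when one hypothesis clearly dominates.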

The MAP algorithm can also be applied to the decoding of binary LDPC codes and be calculated in the log domain [53]. Later on, the Log-MAP algorithm is further extended to LDPC codes over GF(q) with q > 2 (known as nonbinary LDPC codes) [54]. Similar to Turbo codes, different approximations used in the Log-MAP algorithms for LDPC codes provide different trade-offs among implementation feasibility, computational complexity, and numerical stability.
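As one well-known instance of such a trade-off for binary LDPC codes, the sketch below contrasts the exact LLR check node update (the tanh rule) with the min-sum approximation. It is a generic textbook illustration, not code from the cited works, and the input LLRs are made up.

```python
import math


def check_node_exact(llrs):
    """Exact check node output: 2 * atanh(prod_i tanh(L_i / 2))."""
    prod = 1.0
    for llr in llrs:
        prod *= math.tanh(llr / 2.0)
    # Clamp so atanh stays finite when the inputs are very reliable.
    limit = 1.0 - 1e-15
    prod = max(-limit, min(limit, prod))
    return 2.0 * math.atanh(prod)


def check_node_min_sum(llrs):
    """Min-sum approximation: product of signs times the smallest magnitude."""
    sign = 1.0
    for llr in llrs:
        if llr < 0:
            sign = -sign
    return sign * min(abs(llr) for llr in llrs)


if __name__ == "__main__":
    incoming = [2.0, -1.5, 3.0]   # illustrative incoming LLRs
    print(check_node_exact(incoming), check_node_min_sum(incoming))
```

Min-sum needs only comparisons and sign flips, which suits hardware, but it overestimates the magnitude of the exact message.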

BiCS differs from Turbo codes and LDPC codes in that the operations to generate symbols are arithmetic, not logical. Since the messages at weighted sum nodes are calculated by convolution, directly using LLR as messages cannot bring any benefit to the decoding of CS. This paper addresses exactly this problem. To the best of our knowledge, few papers study fast message passing algorithms for CS.
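The difference between logical and arithmetic symbol generation can be sketched as follows. The bit block, the selected positions, and the weights are hypothetical values chosen for illustration only.

```python
def xor_symbol(bits, indices):
    """Logical generation: XOR of the selected bits, always 0 or 1."""
    symbol = 0
    for i in indices:
        symbol ^= bits[i]
    return symbol


def weighted_sum_symbol(bits, indices, weights):
    """Arithmetic generation: weighted sum of the selected bits, multilevel."""
    return sum(w * bits[i] for i, w in zip(indices, weights))


if __name__ == "__main__":
    bits = [1, 0, 1, 1]                                   # example block
    print(xor_symbol(bits, [0, 2, 3]))                    # binary symbol
    print(weighted_sum_symbol(bits, [0, 2, 3], [1, 2, 4]))  # multilevel symbol
```

An XOR symbol has two possible values, so a single LLR describes it completely. A weighted sum symbol has many possible values, so its message is a full distribution rather than a single ratio.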

D. Hardware Design for LDPC Codes

Since both LDPC codes and CS can be represented by a bipartite graph, we follow the hardware design for LDPC codes. The architectures of LDPC hardware decoders can be categorized into two classes: full-parallel decoding and partial-parallel decoding.

In full-parallel decoding [55], each row and each column of the parity check matrix is directly mapped to a different processing unit, and all these processing units operate in parallel. In partial-parallel decoding [56], the parity check matrix is partitioned into nonoverlapping regions such that a set of check nodes and variable nodes are updated per cycle. In general, most partial-parallel decoders have lower decoding throughput and higher energy dissipation than full-parallel decoders. However, full-parallel decoders have much larger silicon area than partial-parallel decoders [57].

Unfortunately, high parallelism inevitably introduces the collision problem in memory access. This is a well-known problem that has been addressed in parallel implementations [58]. Two main approaches are proposed to deal with the collision problem. The first one is to design collision-free codes. The second one is to design a decoder architecture able to avoid or at least mitigate collision effects. The second approach is well suited to flexible and general architectures [59].
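One common way to obtain collision-free access by construction is a cyclic-shift (circulant) block structure. The sketch below is a generic illustration under stated assumptions: z parallel processing units each handle one row of a z-by-z block, row r connects to column (r + shift) mod z, and column c is stored in memory bank c. It is not the specific construction used in the paper.

```python
def banks_accessed(z, shift):
    """Bank touched by each of z parallel units for one cyclic-shift block."""
    return [(row + shift) % z for row in range(z)]


def is_collision_free(banks):
    """True if no two units touch the same bank in the same cycle."""
    return len(set(banks)) == len(banks)


if __name__ == "__main__":
    z = 8                                    # assumed block size
    for shift in range(z):
        assert is_collision_free(banks_accessed(z, shift))
    # An unstructured row-to-column assignment may collide: bank 3 is hit twice.
    print(is_collision_free([0, 3, 3, 1]))
```

Because a cyclic shift is a permutation of the z banks, every unit reads a distinct bank regardless of the shift value, so no arbitration logic is needed for that block.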

Current research on decoder architecture and hardware implementation for LDPC codes has made them suitable for high-speed communication in modern wireless systems. In this paper, we propose the first hardware design for fast BiCS decoding. It is a small but solid step toward making BiCS applicable to practical systems.
