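A New Systematic Comma-Free Code: The Generalized F(n, s, t) Code

This research paper examines a new coding method, "A Generalized Systematic Comma Free Code", which aims to prevent the false synchronization caused by substitution, insertion and deletion errors in communication channels. The paper is written by Tianbo Xue and Francis C. M. Lau, who are affiliated with the Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, and The Hong Kong Polytechnic University Shenzhen Research Institute.

In wireless communication and data transmission, the channel often introduces several types of errors: substitution errors, insertion errors and deletion errors. These errors break code synchronization and can cause the receiver to misinterpret the transmitted information. The code designer's task is therefore to construct codes that minimize the probability of false synchronization. The generalized F(n, s, t) code proposed in the paper is a new systematic comma-free code. The authors first prove that it is a synchronous code, and then derive, through theoretical analysis and simulation, the probabilities of false synchronization when a substitution, insertion or deletion error occurs. They compare the theoretical and simulation results under different parameters, and they also compare the proposed code with the classical F code, a known comma-free code designed to avoid false synchronization. The proposed code may offer better performance, particularly against specific types of channel errors. The work provides a theoretical basis for designing efficient, interference-resistant communication systems and a new tool for understanding and optimizing error-control coding strategies.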

A Generalized Systematic Comma Free Code

Tianbo Xue∗† and Francis C. M. Lau∗†

∗Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hong Kong

†The Hong Kong Polytechnic University Shenzhen Research Institute, China

Email: tianbo.xue@connect.polyu.hk and francis-cm.lau@polyu.edu.hk

Abstract—For channels that introduce substitution, insertion and deletion errors, one challenging problem for a code designer is to avoid false code synchronization. In other words, the probability of false codewords occurring should be minimized with appropriate code design. In this paper, we propose a new class of systematic comma-free codes called the generalized F(n, s, t) code. We first prove that this code is a synchronous code and then we derive the probabilities of false synchronization when a substitution, insertion or deletion error occurs. We compare the theoretical and simulation results under different parameters. We also compare the performance of our proposed code with the classical F code.

I. INTRODUCTION

Many communication channels corrupt transmitted signals by adding noise, causing intersymbol interference, introducing adjacent channel interference, etc. Some channels cause errors by substituting the transmitted symbols with other symbols. There are also channels that not only substitute symbols, but also insert new symbols and remove transmitted symbols. One such example is the deoxyribonucleic acid (DNA)-based data storage system, where data are stored as DNA sequences [1–4]. During the synthesis and sequencing of the DNA sequences, nucleotides storing information can be substituted, added or removed, causing substitution, insertion and deletion errors, respectively. When insertion and/or deletion errors occur, synchronization between codewords will be lost.

One approach toward maintaining synchronization in the presence of errors is to transmit “commas” between codewords [5]. Another approach that does not depend on “commas” was first proposed by Golomb et al. [6]. Such codes that do not depend on “commas” are called “comma-free codes”. Specifically, a comma-free code (CFC) is a set of codewords of length $n$ over an alphabet such that given any two codewords $u = u_1 \cdots u_n$ and $v = v_1 \cdots v_n$ belonging to this set, the $n$-letter concatenation $w = u_k \cdots u_n v_1 \cdots v_{k-1}$ $(k = 2, 3, \ldots, n)$ will not form a codeword. The class of CFCs proposed by Golomb et al., however, does not provide fixed locations for carrying information.
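The comma-free condition above can be checked by brute force for small codes. The following is a minimal Python sketch and is not taken from the paper; the function name `is_comma_free` and the representation of codewords as equal-length strings are assumptions made for illustration.

```python
from itertools import product


def is_comma_free(code):
    """Return True if no n-letter overlap of two codewords is a codeword.

    `code` is an iterable of equal-length strings (hypothetical
    representation chosen for this sketch).
    """
    words = set(code)
    if not words:
        return True
    n = len(next(iter(words)))
    if any(len(w) != n for w in words):
        raise ValueError("all codewords must have the same length")

    # u and v range over all ordered pairs, including u == v.
    for u, v in product(words, repeat=2):
        # w = u_k ... u_n v_1 ... v_{k-1} for k = 2, ..., n (1-indexed).
        for k in range(2, n + 1):
            w = u[k - 1:] + v[:k - 1]
            if w in words:
                return False
    return True
```

For instance, the set {"100", "101"} passes the check, whereas {"01", "10"} fails because the overlap of "01" with itself yields "10". The search costs on the order of |C|² · n substring comparisons, so it is only practical for small codes.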

In [7], systematic fixed block length CFCs are introduced. Systematic CFCs form a subclass of CFCs that use fixed locations in each codeword for maintaining synchronization and the remaining locations for transmitting information. One strategy often used is to fix the first few positions of a codeword to either 0 or 1. This fixed sequence, which appears at the beginning of each codeword, is called the primer drive of the code.

An “F code” is a classic systematic CFC. The class of F codes [7] does not preclude the existence of a primer drive in the body of a codeword, but it ensures that no codeword is formed across the boundary between concatenated codewords. F codes are more efficient in terms of transmitting information than other codes such as Gilbert codes [5]. Denote the code length by $n > 1$ and suppose $r$ is the least integer $\geq \sqrt{2(n-1)}$, $s$ is the least integer $\geq r/2$, and $t = r - s$. Then an F code of length $n$ has 0s fixed in the positions $1, 2, \ldots, s$ and 1s fixed in the positions $n, n-s, n-2s, \ldots, n-(t-1)s$, while all other positions are arbitrary and can be used to transmit information symbols. It has been proved that the F code is the most efficient systematic CFC that uses fixed places to maintain synchronization [7]. For example, when $n = 15$ (so that $r = 6$ and $s = t = 3$), the information “101101100” will be encoded into the F codeword “000101101111001”, where the bits in positions 1, 2, 3, 9, 12 and 15 are the fixed bits used for maintaining synchronization.
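The F code construction can be sketched in a few lines of Python. This is an illustration only, not the authors' implementation. It assumes that r is the least integer not smaller than the square root of 2(n − 1) and that s is the least integer not smaller than r/2, a reading that reproduces the n = 15 example in the text. The function names `f_code_params` and `f_encode` are hypothetical.

```python
import math


def f_code_params(n):
    """Return (r, s, t) for an F code of length n > 1.

    Assumption: r = ceil(sqrt(2(n - 1))), s = ceil(r / 2), t = r - s.
    """
    if n <= 1:
        raise ValueError("code length n must exceed 1")
    m = 2 * (n - 1)
    r = math.isqrt(m - 1) + 1  # integer ceiling of sqrt(m) for m >= 1
    s = (r + 1) // 2           # integer ceiling of r / 2
    t = r - s
    return r, s, t


def f_encode(info, n):
    """Place the information bits around the fixed 0s and 1s."""
    r, s, t = f_code_params(n)
    # 1-indexed positions n, n - s, ..., n - (t - 1)s hold fixed 1s.
    ones = {n - j * s for j in range(t)}
    if len(info) != n - r:
        raise ValueError(f"expected {n - r} information bits")

    bits = iter(info)
    out = []
    for pos in range(1, n + 1):
        if pos <= s:
            out.append("0")         # fixed leading zeros
        elif pos in ones:
            out.append("1")         # fixed ones
        else:
            out.append(next(bits))  # free position carries information
    return "".join(out)
```

With n = 15 the parameters evaluate to r = 6, s = 3 and t = 3, and `f_encode("101101100", 15)` returns "000101101111001", matching the example above.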

Decoding of a received data sequence is assumed to be performed using a fixed-length decoding window of size n, which is simply the length of an F codeword. The decoder only knows the beginning and end of the received sequence but has no direct information about the boundaries between codewords. Whenever the decoding window detects the structure of an F code, this codeword is decoded immediately. Then the decoding window shifts n positions to the right and attempts detecting and decoding the next codeword. If any substitution/insertion/deletion errors occur, the n bits in the decoding window may not form a valid codeword. In this case, the decoding window will only shift one position at a time to the right until a valid F codeword is found. These procedures repeat until the last bit in the received data sequence is reached.
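The sliding-window procedure described above can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the authors' decoder: the parameters n, s and t are passed in explicitly, a window counts as a valid F codeword when all of its fixed positions match, and the names `is_f_codeword` and `decode_stream` are hypothetical.

```python
def is_f_codeword(window, s, ones):
    """Check the fixed 0s in positions 1..s and the fixed 1s in `ones`."""
    return window[:s] == "0" * s and all(window[p - 1] == "1" for p in ones)


def decode_stream(received, n, s, t):
    """Return the information bits of every codeword the window detects."""
    ones = [n - j * s for j in range(t)]          # 1-indexed fixed 1s
    fixed = set(range(1, s + 1)) | set(ones)      # all fixed positions
    messages = []
    i = 0
    while i + n <= len(received):
        window = received[i:i + n]
        if is_f_codeword(window, s, ones):
            # Keep only the free (information) positions.
            messages.append(
                "".join(window[p - 1] for p in range(1, n + 1) if p not in fixed)
            )
            i += n   # valid codeword: jump to the next window
        else:
            i += 1   # no codeword: slide one position to the right
    return messages
```

Using the n = 15 codeword from the text with s = 3 and t = 3, two concatenated copies of "000101101111001" decode to two copies of "101101100". If a stray "1" precedes a single codeword, the first window is rejected, the window slides by one position, and the codeword is then recovered.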

By definition, a valid codeword cannot be formed from bits that straddle two consecutive codewords of a CFC such as the systematic F code. However, a valid codeword or a false codeword can be formed when substitution, insertion or deletion errors occur in CFCs. For example, if an insertion error occurs in an F codeword, this F codeword will not be detected or decoded. At

2017 23rd Asia-Pacific Conference on Communications (APCC)
