Longest Common Sub-sequence Computation and
Retrieve for Encrypted Character Strings
Minghao Zhao
1
, Zhen Li
1,2
, Yilei Wang
3,4
, Qiuliang Xu
1
1
School of Computer Science and TechnologyShandong University, Jinan, China
2
School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, China
3
School of Information and Electric Engineering, Ludong University, Yantai, Shandong
4
Shandong Provincial Key Laboratory of Software Engineering, Jinan, China
Email: zhaominghao@hrbeu.edu.cn; xql@sdu.edu.cn
Abstract—Longest Common Sub-sequence is a basic
algorithm problem. It serves as a basic component for a variety
of applications in information processing and bioinformatics. It is
a NP-hard problem and often manipulated using dynamic
programming, which is relatively fast but involves large memory
space. Fortunately, cloud computing and outsourced computing
provides a practical method for overload alleviation. However,
for the security and privacy concern, clients hope to encrypt their
data before upload them to the cloud, meanwhile maintain the
ability for the cloud to process on the data. In this paper, we
propose a method to computing Longest Common Sub-sequence
using somewhat homomorphic encryption. Beyond that, we show
how to use our achievement into searchable encryption to achieve
rich expressiveness.
Keywords—homomorphic encryption, information retrieve,
longest Common Sub-sequence, searchable encryption
I. INTRODUCTION
With the proliferation of newly emerging technology in
multimedia, software and storage, we have stepped into the era
of big data, and data have become a torrent flowing into every
area of the global economy [1]. However, generally big data
processing involves tremendous overhead for storage and
computing (e.g. processor), which is a huge burden for
individuals and small or medium-sized enterprises. Fortunately,
cloud computing and outsourced computing provides a feasible
method for burden release for the client, and gets their
popularity nowadays. Having considered the fact that the
service provider is generally untrustworthy, individuals begin
to pay increasingly attention on data security and privacy. It is
desirable to provide a method that enables the cloud service
provider to process on the data, meanwhile prevent him to
acquire sensitive information about the data.
Generally, Full-homomorphic encryption (FHE) [2] and
secure multi-party computation (SMPC) [3-4] are generic
cryptographic achieving this target. Specifically, Full-
homomorphic encryption is an encryption scheme that enables
the cloud to perform any computation on the ciphertext, and
multi-party computation is a cryptographic protocol that
enables distributed parties to jointly compute functionality
without revealing each party’s input and output. However,
although after many years research, the low efficiency of FHE
and SMPC still prevent them for real application. Thus,
researchers are engaged in designing specific and appropriative
method for certain problems without using full-homomorphic
encryption and secure multiparty computation.
Longest Common Sub-sequence is a basic algorithm
problem which captures a wide range of application in
computational biology, computational linguistics and other
utilization of string manipulation. In this paper, we propose a
method to compute longest common sub-sequence use
somewhat homomorphic encryption. In addition, we will show
how our method can be used in searchable encryption to
achieve richer expensiveness.
II. PRELIMINERARISE
A. String Manipulation for Rich Expensiveness in Searchable
Encryption.
Searchable encryption is a cryptographic primitive that
enables a client to outsource his encrypted document to the
cloud, meanwhile maintains the ability to perform keyword
based search and retrieval the document. The security ensures
that only minimal information is leaked to the cloud. After
firstly proposed by song et al. [5], many scholars conduced to
researches on rich expensiveness, such as fuzzy search [6],
ranked search [7] and Boolean query [8].
In traditional searchable encryption, keyword is regarded as
the basic search unit, which indicates that, the search term is
restricted to a certain keyword (or logic combination of
multiple keywords). As they treat the keyword as an entire
component rather than stings, theme scheme do not support
wildcard search, sub-string search and regular expression query.
Generally, using keywords as search criteria is satisfiable
for a wide range of languages, especially for isolating language
and fusional language. But in terms of agglutinative language
(e.g. Finnish and German), in which a word is composed of
many morpheme. We seldom search for such a long word, but
instead of that, we tend to search for the morphemes. Existing
natural language parsing tools can be used for morpheme
division, but it is difficult to be absolutely accurate. Thus a
simple and method is to support any query text string, instead
of the whole word. Especially in CJK language (i.e. Chinese,
Japanese and Korean), the text is a sequence on a large
alphabet, and each symbol has an independent meaning. We
2016 19th International Conference on Network-Based Information Systems
2157-0426/16 $31.00 © 2016 IEEE
DOI 10.1109/NBiS.2016.82
496
2016 19th International Conference on Network-Based Information Systems
2157-0426/16 $31.00 © 2016 IEEE
DOI 10.1109/NBiS.2016.82
496