DNA Barcode Goes Two-Dimensions: DNA QR Code Web Server
Chang Liu1*., Linchun Shi1., Xiaolan Xu1., Huan Li2, Hang Xing2, Dong Liang2, Kun Jiang3, Xiaohui Pang1, Jingyuan Song1, Shilin Chen1*
1 Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, People’s Republic of China, 2 School of
Computer Science and Engineering, Beijing University of Aeronautics, Beijing, People’s Republic of China, 3 Pidit Inc, Edison, New Jersey, United States of America
Abstract
DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present,
“DNA barcode” actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval.
Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A
comprehensive set of sequences for five DNA barcode markers (ITS2, rbcL, matK, psbA-trnH, and CO1) was used as the test
data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based
on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR)
code was found to have the largest coding capacity and relatively high compression ratio. To facilitate the further usage of
QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server
allows users to retrieve the QR code for a species of interest, convert a DNA sequence to and from a QR code, and perform
species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of
various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for
DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA
barcoding applications.
Citation: Liu C, Shi L, Xu X, Li H, Xing H, et al. (2012) DNA Barcode Goes Two-Dimensions: DNA QR Code Web Server. PLoS ONE 7(5): e35146. doi:10.1371/
journal.pone.0035146
Editor: Robert DeSalle, American Museum of Natural History, United States of America
Received January 3, 2012; Accepted March 8, 2012; Published May 4, 2012
Copyright: © 2012 Liu et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted
use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This work was supported by the “Xiehe Scholar” start-up fund to C. Liu from the Chinese Academy of Medical Sciences (Grant No. PUMC20112569). The
funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: Kun Jiang is an employee of Pidit Inc. This does not alter the authors’ adherence to all the PLoS ONE policies on sharing data and
materials.
* E-mail: cliu@implad.ac.cn (CL); slchen@implad.ac.cn (SC)
. These authors contributed equally to this work: Chang Liu, Linchun Shi, and Xiaolan Xu.
Introduction
The DNA barcoding technology uses a short standard piece of
DNA sequence for species identification and has gained wide
acceptance as a standard and effective method for biodiversity
research, conservation genetics, wildlife forensics, and so on. The
648 bp region of the mitochondrial cytochrome c oxidase subunit I
(CO1) gene has been accepted as the DNA barcode for animals
[1,2]. For plants, two chloroplast genes, namely, rbcL and matK,
were proposed by the plant working group of the Consortium for
the Barcode of Life (http://www.barcodeoflife.org/) as core barcodes
[3] after integrating the results obtained from a number of studies
[4,5,6,7,8,9,10,11,12]. More recently, the internal transcribed
spacer (ITS) and its subsequence (ITS2) have also been proposed
as additional core barcodes [13]. Furthermore, psbA-trnH remains
a supplementary DNA barcode for further evaluation [14]. For
fungi, ITS was proposed as the core barcode in the fourth
International Barcode of Life Conference (Adelaide, Australia
2011). In summary, through numerous studies, consensus has been
reached for core barcodes for animals and plants to date.
With the core DNA barcodes for these two kingdoms of life
determined, efforts can now shift to practical applications of
DNA barcoding technologies. At present, “DNA barcode” actually
refers to a DNA sequence, a representation that has several
limitations in practical applications. First, it lacks information
compression, which results in a large printout size. Second, a
printed DNA sequence is difficult to retrieve by direct scanning.
Consequently, a new format for representing DNA barcode
sequences is urgently needed so that DNA barcode information
can be displayed and retrieved efficiently.
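To make the compression point concrete, the following minimal Python sketch compares the payload size of a DNA sequence under three encodings: one byte per base, the QR code alphanumeric mode (11 bits per pair of characters, 6 bits for a trailing single character), and a hypothetical 2-bit-per-base packing. It is an illustration only, not the evaluation procedure used in this study. The function names are assumptions, the estimates ignore mode indicators, length fields, and error-correction overhead, and the 2-bit packing assumes sequences that contain only A, C, G, and T (no ambiguity codes).

```python
import math

# Hypothetical 2-bit code for the four unambiguous bases (an assumption
# for illustration; real barcode records may contain ambiguity codes).
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def payload_bits(sequence: str) -> dict:
    """Estimate the data payload, in bits, of a sequence under three encodings.

    Mode indicators, length fields, and error correction are not counted.
    """
    n = len(sequence)
    return {
        "byte_mode": 8 * n,                                   # 8 bits per character
        "alphanumeric_mode": 11 * (n // 2) + 6 * (n % 2),    # QR alphanumeric mode
        "two_bit_packing": 2 * n,                             # 2 bits per base
    }


def pack(sequence: str) -> bytes:
    """Pack an A/C/G/T sequence into bytes, 2 bits per base, big-endian."""
    value = 0
    for base in sequence:
        value = (value << 2) | BASE_TO_BITS[base]
    return value.to_bytes(math.ceil(2 * len(sequence) / 8), "big")


def unpack(data: bytes, length: int) -> str:
    """Recover a sequence of the given length from its packed form."""
    value = int.from_bytes(data, "big")
    return "".join(
        BITS_TO_BASE[(value >> (2 * (length - 1 - i))) & 3] for i in range(length)
    )


if __name__ == "__main__":
    # A synthetic 648-character sequence, matching the length of the CO1 barcode.
    synthetic = "ACGT" * 162
    for name, bits in payload_bits(synthetic).items():
        print(f"{name}: {bits} bits")
    assert unpack(pack(synthetic), len(synthetic)) == synthetic
```

The sequence length must be stored alongside the packed bytes, because leading A bases are encoded as zero bits and cannot otherwise be distinguished from padding.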
Barcode technology has been adopted in the manufacturing and
retailing industries for many years. Thus, it is logical to
investigate whether these well-developed technologies can be
applied to represent the so-called DNA barcode. Indeed, one study
suggested the use of the PDF417 symbology for the “DNA Barcode” [15],
which affords efficient information retrieval. However, no comprehensive
evaluations of the available barcode types for suitability in
encoding DNA barcode sequences have been reported to date.
Furthermore, no computational tools have been developed that
allow users from a wide range of research communities, industries,
and regulatory agencies to utilize barcode symbologies for DNA
barcoding applications.
In the current study, a systematic comparison of various
one-dimensional (1D) and two-dimensional (2D) barcoding symbologies
has been conducted using the sequences of the five most
widely accepted plant and animal barcodes (ITS2, rbcL, matK,
PLoS ONE | www.plosone.org 1 May 2012 | Volume 7 | Issue 5 | e35146