PRINTED DOCUMENTS FINGERPRINT IDENTIFICATION
Changyou Wang, Xiangwei Kong, Shize Shang, Xin’gang You
School of Information and Communication Engineering, Dalian University of
Technology, Dalian, 116024, China
Beijing Institute of Electronic Technology and Application, Beijing, 100091, China
ABSTRACT
Document identification can be applied to validate paper
ballots, bank notes and contracts. In this paper, the surface
texture of a printed document is extracted as the page’s
unique fingerprint, and a document identification method
based on the fingerprint is proposed. First, image blocks
are segmented from the blank margin of the whole document
image. After the image blocks are enhanced and normalized,
a unique feature matrix can be extracted to fingerprint the
document based on a bank of LogGabor filters. Then two
matching approaches are adopted for validation. Unlike
previous work, our method successfully identifies
all the pages used in the experiment without modifying them
in any way, and each page is scanned at only one orientation.
Index Terms— Document identification, LogGabor
filters, feature matrix, majority vote
1. INTRODUCTION
Document identification is needed in several settings.
Advances in digital technologies have made it much easier for
criminals to counterfeit money and contracts for economic
gain. In many cases, a document can meet the requirements
of forensic evidence if it can be properly authenticated.
Furthermore, some careerists may forge paper ballots to gain
political power in an election. Thus, a reliable and objective
way to identify documents is required whenever questions
of document integrity are raised.
Conventional methods generally employ watermarks or
spectrum analysis to authenticate documents [1-2]. However,
watermarks can be easily concealed or removed by scanners
or copiers, and spectrum analysis requires specialized
devices or human expertise. Several new approaches have been
proposed in recent years [3-8]. Eric Metois et al. [3] did
pioneering work on using surface texture to uniquely
authenticate a document by measuring “inhomogeneities in
the substrate” with a specialized imaging device. Baoshi Zhu
et al. [4] studied the random ink splatter that occurs
around the edges of any feature printed on a document to
generate a signature for authentication. Wiwi Samsul et al. [5]
employed the Laser Surface Authentication method
to identify documents by making use of the unique
microstructure of each part of a document. Xianliang
Wang et al. [6] achieved document authorization by
embedding a personal identification number into the
deformed characters of a document before it is printed.
The methods mentioned above either require specialized
equipment, or need to modify the original document. To
achieve lossless authentication, Buchanan and Cowburn’s
patent [7] first introduced a method using only a flatbed
scanner to identify documents. A feature vector is
generated by scanning a document at multiple
orientations to extract additional information. However, the
extracted feature vector can be easily removed and forged.
William et al. [8] proposed measuring the unique three-dimensional
surface of a document by using a commodity
scanner, and good results can be achieved with this
physical feature. However, their method requires scanning a
document at four different orientations, and the test region
must be matched precisely.
We address these limitations by employing a bank of LogGabor
filters to extract the surface texture from the blank margin of a
scanned document image, without modifying the page in any
way and with the document scanned at only one orientation.
The rest of this paper is organized as follows. Section 2
introduces the enhancement process used to obtain image blocks,
and Section 3 describes in detail how to generate a unique
feature matrix. The experimental results are shown in
Section 4, and Section 5 gives the conclusions.
2. IMAGE BLOCK ENHANCEMENT
In this section, we first segment the image blocks from the
blank margin of a scanned document image and then employ
histogram equalization and normalization methods to
enhance the contrast of each image block.
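The enhancement step can be illustrated with a minimal sketch. It assumes 8-bit grayscale blocks, a standard cumulative-histogram equalization, and zero-mean, unit-variance normalization; the exact normalization is not specified in this section, and the function names (equalize_block, normalize_block, enhance_block) are hypothetical rather than taken from the paper.

```python
import numpy as np


def equalize_block(block: np.ndarray) -> np.ndarray:
    """Histogram-equalize an 8-bit grayscale block using its cumulative histogram."""
    hist = np.bincount(block.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # cumulative count at the darkest gray level present
    denom = cdf[-1] - cdf_min
    if denom == 0:                     # constant block: nothing to stretch
        return block.copy()
    # Map each gray level to its rescaled cumulative frequency in [0, 255].
    lut = np.round((cdf - cdf_min) / denom * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[block]


def normalize_block(block: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Assumed normalization: shift to zero mean and scale to unit variance."""
    block = block.astype(np.float64)
    return (block - block.mean()) / (block.std() + eps)


def enhance_block(block: np.ndarray) -> np.ndarray:
    """Equalize first, then normalize, following the order given in the text."""
    return normalize_block(equalize_block(block))


# Usage on a synthetic low-contrast block standing in for a blank-margin patch.
rng = np.random.default_rng(0)
margin_block = rng.integers(100, 120, size=(64, 64), dtype=np.uint8)
enhanced = enhance_block(margin_block)
```

Equalization spreads the narrow gray-level range of blank paper over the full 8-bit scale, and the normalization makes blocks comparable regardless of overall scan brightness.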
To avoid the influence of printed content, both a row
projection and a column projection are
used to separate the blank margin image from a scanned