COL 10(6), 061001(2012) CHINESE OPTICS LETTERS June 10, 2012
Motion-compensated interpolation for face-centered
-orthorhombic sampled video sequence
Ning Guan∗, Xu Zhang, and Hongda Chen
State Key Laboratory on Integrated Optoelectronics, Institute of Semiconductors,
Chinese Academy of Science, Beijing 100083, China
∗Corresponding author: guanning@semi.ac.cn
Received November 22, 2011; accepted December 28, 2011; posted online February 24, 2012
Face-centered orthorhombic (FCO) sampling can be implemented more easily on CMOS image sensors
than on other video acquisition devices. The sampling efficiency of FCO is the highest among all three-
dimensional (3D) sampling schemes. However, interpolation of FCO-sampled data is inevitable in bridging
human perception and machine-vision algorithms. In this letter, the concept of motion compensation is
borrowed from deinterlacing, which displays interlaced videos on progressively scanned devices. The combination
of motion estimation based on intrafield interpolated frames and motion-compensated interfield
interpolation is found to provide the best performance by evaluating different combinations of motion
estimation and interpolation.
OCIS codes: 100.3010, 110.3010, 110.4155, 100.2000.
doi: 10.3788/COL201210.061001.
The pixel rate, defined as the number of pixel values
that can be extracted from the pixel array during a
given period, is limited in resource-constrained situations
such as high-speed and ultra-low-power video acquisition.
The efficiency of the sampling scheme is therefore
critical to the overall performance. Multidimensional
sampling theory [1] proves that face-centered
orthorhombic (FCO) sampling (Fig. 1(b)) offers the
highest sampling efficiency among all three-dimensional
(3D) sampling schemes, just as hexagonal pixels
(Fig. 1(a)) do for two-dimensional (2D) image sampling.
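As a rough illustration of such a sampling pattern, the following Python sketch builds a readout mask on a rectangular pixel grid. It assumes that the FCO lattice is realized as a checkerboard whose phase alternates from field to field, i.e., a pixel (x, y) is read out at field t when x + y + t is even. This parity rule and the names `fco_mask` and `fco_sample` are assumptions made for illustration and are not taken from Ref. [2].

```python
import numpy as np

def fco_mask(height, width, t):
    """Boolean readout mask for field t (True = pixel is sampled).

    Assumption: the FCO lattice is modeled as the set of integer
    points (x, y, t) whose coordinate sum is even.
    """
    y, x = np.indices((height, width))
    return (x + y + t) % 2 == 0

def fco_sample(video):
    """Keep only the assumed FCO samples of a (T, H, W) array.

    Unsampled pixels are marked with NaN.
    """
    T, H, W = video.shape
    out = np.full(video.shape, np.nan, dtype=float)
    for t in range(T):
        m = fco_mask(H, W, t)
        out[t][m] = video[t][m]
    return out
```

Under this assumption, half of the pixels are read out in each field, and the pixels skipped in one field are exactly those read out in the next.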
Conventional video sampling apparatuses such as vacuum
tubes and charge-coupled devices (CCDs) are restricted
by their scanning readout nature and are barely
capable of realizing interlaced scanning. With
complementary metal oxide semiconductor (CMOS) image
sensors (CISs), FCO sampling [2] can be implemented
because the readout circuitry of CISs resembles that of
random-access memories.
FCO-sampled sequences have to be interpolated up to
full resolution to bridge FCO-sampled video sequences
and existing video-processing algorithms. This
problem is similar to the deinterlacing [3] process, which
displays interlaced video sequences on progressively
scanned screens. However, the final judgment on
deinterlacing is only human perception. FCO interpolation
must further fulfill the needs of machine-vision
applications. Thus, absolute accuracy, and not just
perceived quality, is a main concern.
Existing deinterlacing algorithms are categorized into
linear, motion-adaptive, and motion-compensated methods.
Linear algorithms include intra- and interfield
interpolations. Some linear algorithms [4] even derive
interpolating coefficients based on 3D sampling, as was
also done by Guan et al. [2]. Motion-adaptive
deinterlacing algorithms [5−7] use motion detectors to switch
between intra- and interfield interpolating methods in
different areas of a field. They are practical for display
because human eyes are less sensitive to the details of
moving objects. However, they are unsuitable for
general-purpose interpolation because no extra information is
added where intrafield interpolation is used.
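The switching idea behind motion-adaptive deinterlacing can be sketched in Python as follows. This is a minimal illustration for interlaced (line-alternating) fields, not any of the algorithms in Refs. [5−7]: the motion detector is assumed to be a simple absolute difference between the previous and next fields, and the function name, the `parity` argument, and the `threshold` value are hypothetical.

```python
import numpy as np

def motion_adaptive_deinterlace(prev_f, cur_f, next_f, parity, threshold=10.0):
    """Fill the missing lines of cur_f (rows with index % 2 != parity).

    prev_f and next_f are frames whose rows of the opposite parity
    are valid. All inputs are 2D arrays with at least two rows.
    """
    out = cur_f.astype(float).copy()
    H = out.shape[0]
    for r in range(H):
        if r % 2 == parity:
            continue  # this line was sampled in the current field
        # Intrafield estimate: average of the lines above and below
        # (the single available neighbor is reused at the borders).
        up = cur_f[r - 1] if r > 0 else cur_f[r + 1]
        down = cur_f[r + 1] if r + 1 < H else cur_f[r - 1]
        intra = 0.5 * (up.astype(float) + down)
        # Interfield estimate: average of the same line in the
        # previous and next fields.
        inter = 0.5 * (prev_f[r].astype(float) + next_f[r])
        # Assumed motion detector: large temporal difference.
        moving = np.abs(next_f[r].astype(float) - prev_f[r]) > threshold
        out[r] = np.where(moving, intra, inter)
    return out
```

In areas flagged as moving, only samples of the current field are used, which is why no extra information is gained there.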
In contrast, motion-compensated algorithms [8−10]
add extra information over the whole field. Generally
speaking, these algorithms first shift corresponding areas
in the previous and next fields using a motion vector
(MV). Then, they interpolate based on the current and
shifted fields as if they represent the same stationary
scene. However, the motion estimators (MEs) and
interpolators cannot be directly used because the
sampling schemes differ: one is the interlaced scheme and the
other is FCO.
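The shift-then-interpolate procedure described above can be sketched in Python for the FCO case. This is only an illustration under strong assumptions, not the interpolator evaluated in this letter: a single global integer MV is used for the whole field, shifts wrap around at the borders, and each field comes with a boolean mask marking its sampled pixels. The names `shift` and `mc_interpolate` are hypothetical.

```python
import numpy as np

def shift(field, mask, mv):
    """Shift a field and its validity mask by an integer MV (dy, dx).

    Wrap-around borders are assumed for simplicity.
    """
    return np.roll(field, mv, axis=(0, 1)), np.roll(mask, mv, axis=(0, 1))

def mc_interpolate(prev_f, prev_m, cur_f, cur_m, next_f, next_m, mv):
    """Fill missing pixels of cur_f from motion-compensated neighbors.

    mv = (dy, dx) is the assumed global displacement per field interval.
    Returns the filled field and a mask of pixels that are now valid.
    """
    p, pm = shift(prev_f, prev_m, mv)                # previous field moved forward
    n, nm = shift(next_f, next_m, (-mv[0], -mv[1]))  # next field moved backward
    # Sum the valid compensated samples and count them per pixel.
    acc = np.where(pm, p, 0.0) + np.where(nm, n, 0.0)
    cnt = pm.astype(int) + nm.astype(int)
    out = np.where(cur_m, cur_f, 0.0).astype(float)
    # Treat the shifted fields as the same stationary scene and
    # average whatever samples land on a missing pixel.
    fill = ~cur_m & (cnt > 0)
    out[fill] = acc[fill] / cnt[fill]
    return out, cur_m | fill
```

If the sampled pixels are modeled as those with even x + y + t, an integer MV with even dy + dx maps the neighboring fields exactly onto the missing pixels, whereas an MV with odd dy + dx maps them onto pixels that are already sampled. In the latter case the returned mask is unchanged and another interpolator would be needed for the remaining pixels.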
Therefore, we aim to determine which combination
of ME and motion-compensated interpolator is most
suitable for reconstructing FCO-sampled videos. Both
objective and subjective criteria are used in the evalua-
tion.
The original signal on the focal plane is a time-varying
2D illumination ψ(x, y, t), which is a 3D signal if the
magnitudes of x, y, and t are ignored. The
multidimensional Nyquist condition is not simply the
superposition of multiple one-dimensional (1D) criteria [11].
Fig. 1. Most efficient sampling schemes for (a) 2D and (b) 3D
signals. (a) Hexagonal pixel arrangement on the focal plane
of the (b) FCO sampling scheme, in which only white pixels
are read out at each sampling time.
1671-7694/2012/061001(5) 061001-1 © 2012 Chinese Optics Letters