Setting this mathematical issue aside, a real-time system should react gracefully to spectral changes.
Examples. Before proceeding, we examine two straightforward sampling solutions in light of the Xampling criteria. Uniform sampling at $f_{\text{NYQ}}$ obviously contradicts (X2) and cannot be considered a sub-Nyquist system. Furthermore, in wideband settings, the implementation may be impractical, since the rates of conventional Nyquist ADC devices are still far below the wideband regime [32], contradicting (X3). Baseband processing (X4) is also not possible: since by definition the samples arrive at the high Nyquist rate, extracting the information signals I(t), Q(t) must involve computational complexity proportional to $f_{\text{NYQ}}$.
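To make this complexity claim concrete, the following sketch (our own illustration; the 10 GHz Nyquist rate, the band parameters, and all variable names are hypothetical, not from the paper) extracts I(t), Q(t) from Nyquist-rate samples of a single band by digital downconversion. Even when the carrier is known, the mixing and lowpass filtering touch every Nyquist-rate sample, so the workload scales with $f_{\text{NYQ}}$ rather than with the information bandwidth B.

```python
import numpy as np
from scipy.signal import firwin, lfilter

# Hypothetical numbers, for illustration only.
f_nyq = 10e9                 # wideband Nyquist rate [Hz]
B     = 10e6                 # width of one transmission [Hz]
f_i   = 3.17e9               # its carrier (assumed known here)
T     = 1e-5                 # observation window [s]

n = int(f_nyq * T)           # 100,000 Nyquist-rate samples
t = np.arange(n) / f_nyq
rng = np.random.default_rng(0)

# Toy band: piecewise-constant I(t), Q(t) at the information rate B.
sps = int(f_nyq / B)                        # Nyquist samples per symbol
I = np.repeat(rng.standard_normal(n // sps), sps)
Q = np.repeat(rng.standard_normal(n // sps), sps)
x = I * np.cos(2 * np.pi * f_i * t) - Q * np.sin(2 * np.pi * f_i * t)

# Digital downconversion: both steps process all n = f_NYQ * T samples,
# so the cost of reaching I(t), Q(t) is proportional to f_NYQ.
bb = x * np.exp(-2j * np.pi * f_i * t)      # mix to baseband: O(n)
h = firwin(201, B, fs=f_nyq)                # lowpass of width ~B
iq = 2 * lfilter(h, 1.0, bb)                # filter: O(n) per tap
I_hat, Q_hat = iq.real[::sps], iq.imag[::sps]  # information-rate outputs
```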
The second approach utilizes the scheme of Fig. 1 by searching for the carrier frequencies $f_i$ prior to sampling, namely using analog components. The motivation for this solution is twofold. First, standard receivers often compensate for slight deviations from the prespecified carrier values $f_i$ by fine analog tuning of the local oscillator until its frequency is locked to the actual carrier value [33], [34]; the idea is therefore to use the same locking mechanism to search for $f_i$ over the entire wideband regime. Second, once the carriers $f_i$ are known, Fig. 1 satisfies all four Xampling criteria.
Unfortunately, this solution is practically infeasible when the carrier frequencies $f_i$ are unknown a priori and can lie anywhere below $f_{\max}$. Locking over the entire wideband spectrum is time consuming, and during this time the signal cannot be acquired. To shorten this period, the sampling rate must be increased well above the minimal rate, contradicting (X2). Furthermore, such a locking stage is excessive in both hardware and software. A standard tunable oscillator covers only a narrow range of frequencies [35], so hundreds of devices may be required to cover the range up to $f_{\max}$. In addition, when the mechanism is initialized far from the true carrier, it may lock onto a spurious frequency, and only high-level data-aided algorithms can identify this situation and reinitialize the hardware. This severely burdens the DSP, contradicting (X3) as well. Finally, whenever the band positions change, the locking must be reinitiated, and again the signal cannot be acquired until this task is completed.
B. Xampling = Compressed sensing for analog signals
We now briefly describe the CS framework [14], [15], and
distinguish between CS and Xampling.
The majority of compressed sensing publications study
variants of the underdetermined sparse recovery problem. The
signal model assumes a vector x in the finite-dimensional space $\mathbb{R}^n$ or $\mathbb{C}^n$,
which has only a few nonzero entries. Sampling, referred to
as sensing in this framework, is carried out by computing the
linear projection
$y = Ax$, (4)
with A having far fewer rows than columns. Results from
this field [14], [15] show that under suitable conditions, the
linear sensing is stably invertible, even when the length of y
is proportional to the number of nonzeros in x, rather than the
ambient dimension n.
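As a toy numerical counterpart of (4) (our own sketch; the dimensions, the constant in the row count, and the use of scikit-learn's orthogonal matching pursuit for recovery are illustrative assumptions, not part of the paper), the snippet below senses a k-sparse vector with a number of rows proportional to k log(n/k) and then stably inverts the sensing:

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(1)
n, k = 2000, 10                       # ambient dimension, nonzeros
m = int(4 * k * np.log(n / k))        # rows ~ k log(n/k), far fewer than n

x = np.zeros(n)                       # k-sparse vector
support = rng.choice(n, size=k, replace=False)
x[support] = rng.standard_normal(k)

A = rng.standard_normal((m, n)) / np.sqrt(m)   # random sensing matrix
y = A @ x                             # compressed measurements, as in (4)

# Recover x from the short vector y with a greedy algorithm.
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False)
x_hat = omp.fit(A, y).coef_
print(set(np.flatnonzero(x_hat)) == set(support))     # support found?
print(np.linalg.norm(x_hat - x) / np.linalg.norm(x))  # relative error
```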
Sensing of sparse vectors is the discrete counterpart of
the sub-Nyquist problem illustrated in Fig. 1. However, it is
not straightforward to generalize the discrete CS formulation
to analog signals. The difficulty can be noticed immediately
in the signal model. Sparsity is defined in CS by counting the number of nonzeros in x, while analog sparsity of
x(t) involves an uncountable number of zeros and nonzeros.
Another difference relates to the sensing matrix A. In the
analog context, A corresponds to generalized sampling, where
measurements are inner products with the input x(t) [17],
[36]–[38]. This leads to a structured and deterministic matrix
A, as it needs to be implemented in hardware. In contrast,
mainstream CS results are stated for random unstructured
sensing matrices.
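To illustrate this contrast (our own construction, with arbitrary dimensions), the snippet below places an unstructured Gaussian draw next to one structured, deterministic alternative built from rows of the DFT matrix; once the row subset is fixed, the matrix is fully reproducible in hardware and admits a fast transform. This is a sketch of the distinction, not any specific Xampling front end.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 64, 1024

# Mainstream CS analysis: i.i.d. Gaussian entries. Convenient for
# proofs, but no analog front end realizes such a mixture exactly.
A_gauss = rng.standard_normal((m, n)) / np.sqrt(m)

# A structured alternative: m rows of the n-point DFT matrix, i.e.,
# the hardware measures a fixed subset of frequency values. Once the
# row set is chosen, the matrix is deterministic and has a fast
# transform (an FFT followed by subsampling).
rows = np.sort(rng.choice(n, size=m, replace=False))
F = np.fft.fft(np.eye(n), axis=0) / np.sqrt(n)   # full DFT matrix
A_dft = F[rows, :]
```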
A final important difference is the issue of recovery complexity. Naïve extensions of CS-type algorithms to infinite dimensions, such as the $\ell_1$ linear program
$\min_x \|x\|_1 \;\; \text{s.t.} \;\; \|y - Ax\| \leq \epsilon$, (5)
or greedy techniques, lead to undefined or difficult problems.
For example, consider an optimization over the continuous signal x(t) [39]:
$\min_{x(t)} \text{obj}(x(t)) \;\; \text{s.t.} \;\; \|y[n] - A(x(t))\|_{\ell_2} \leq \epsilon$, (6)
where the objective is a sparsity-promoting function $L_2(\mathbb{R}) \to \mathbb{R}$ and the constraint involves the infinite sample set y[n], the sampling operator $A : L_2(\mathbb{R}) \to \ell_2$, and the continuous signal itself. A program such as (6) is not a well-defined optimization structure [39]. In turn, discretization methods result in very large-scale CS systems, which impose a severe burden on the digital processing units. In contrast, continuous reconstruction in sampling theory, though hypothetically involving infinite sequences, is in practice performed in real time over a well-localized set of samples.
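Both halves of this comparison can be made concrete. In finite dimensions, (5) with $\epsilon = 0$ is an ordinary linear program; the sketch below (ours, with illustrative sizes and a 10 GHz example rate) solves it via the standard positive/negative split and then shows how Nyquist-grid discretization inflates the problem dimension:

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, y):
    """min ||x||_1 s.t. Ax = y, via the split x = u - v with u, v >= 0."""
    m, n = A.shape
    c = np.ones(2 * n)                       # objective: sum(u) + sum(v)
    res = linprog(c, A_eq=np.hstack([A, -A]), b_eq=y,
                  bounds=[(0, None)] * (2 * n))
    return res.x[:n] - res.x[n:]

rng = np.random.default_rng(3)
n, k, m = 256, 5, 60
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_hat = basis_pursuit(A, A @ x)
print("finite-dimensional l1 recovery error:", np.linalg.norm(x_hat - x))

# Discretizing an analog problem instead: a 10 GHz Nyquist rate over a
# 1 ms block (illustrative numbers) already yields 10^7 unknowns, and
# the digital processing must then handle a matrix acting on all of them.
n_grid = int(10e9 * 1e-3)
print("unknowns after Nyquist-grid discretization:", n_grid)
```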
Baseband processing is another distinct aspect. Lately, there has been growing interest in processing in the compressed domain, e.g., machine learning and classification [40], or manifold learning [41], [42]. These works exploit the fact that certain properties are approximately invariant under a linear transformation, which allows, to some extent, learning and classification tasks to be carried out directly on the short-length vector y of (4). The methods of [40]–[42] are specific to discrete vectors. In contrast, sub-Nyquist methods, both classic [3]–[6] and recent [7]–[12], are all based on spreading or aliasing techniques, which result in a mixture of the signal content. In particular, the information signals I(t), Q(t) are not invariant under the mixture, and a DSP algorithm cannot be run directly on the samples. To emphasize, baseband processing in Xampling means the ability to translate the seemingly corrupted samples into the original information bits or analog message. Learning, classification, and other DSP algorithms can then follow, so there is no need to approximate their results in the compressed domain. The crucial requirement of (X4) is that the information be extracted without interpolating to the Nyquist grid.
The CS paradigm aims at avoiding high-rate redundant
sampling. The discrete CS framework [14], [15] initiated a
long line of highly influential works. However, it still remains