[Figure 1: three pipeline stages with function blocks F1, F2, F3. Each stage also shows a completion generator (aC), a stage controller producing pc and eval, and a matched delay. Req and Done signals link the stages.]
Figure 1. Block diagram of an HC pipeline
5-bit address word and only 32 entries. This represents a
dramatic reduction in the size of memory required, from
one 1024 entry table, to two 32 entry tables. However, there
is a slight tradeoff: twice the number of partial sums are
generated, requiring an additional adder stage to combine
them.
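The table split can be checked with a minimal sketch. It assumes a conventional distributed-arithmetic table in which each address bit selects one coefficient and the entry is the sum of the selected coefficients. The coefficient values and the names COEFFS, build_table and lookup_split are hypothetical and are not taken from the filter chip.

```python
# Hypothetical 10-tap coefficient set; the values are illustrative only.
COEFFS = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3]

def build_table(coeffs):
    """Entry for address a = sum of coeffs[i] for every set bit i of a."""
    n = len(coeffs)
    return [sum(c for i, c in enumerate(coeffs) if (a >> i) & 1)
            for a in range(1 << n)]

full = build_table(COEFFS)       # one table with 2**10 = 1024 entries
low = build_table(COEFFS[:5])    # 32 entries, addressed by bits 0..4
high = build_table(COEFFS[5:])   # 32 entries, addressed by bits 5..9

def lookup_split(addr):
    """Two 5-bit lookups combined by one extra addition."""
    return low[addr & 0x1F] + high[(addr >> 5) & 0x1F]

# The two small tables reproduce every entry of the large one.
assert all(full[a] == lookup_split(a) for a in range(1 << 10))
```

The extra addition in lookup_split corresponds to the additional adder stage mentioned above.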
Exploiting Symmetry. To halve the table size once more, a
data representation is used that makes the table symmetric.
In particular, the signed-digit offset binary notation [8] is
used, in which a “0” or “1” in bit position i contributes −2^i
or +2^i, respectively. For example, in this notation, the 4-bit
number “1001” stands for the value 8 − 4 − 2 + 1 = 3. The
advantage of this representation is that arithmetic negation
is achieved simply by complementing each bit: “0110” stands
for the value −3. An interesting
feature of the filter equation (Equation 1) is that, if all the
inputs are negated, the filter output is also negated. Con-
sequently, when this representation is used, if two address
words for the lookup table are bit-wise complements of each
other, then the corresponding table entries will also be bit-
wise complements of each other. Exploiting this symmetry,
half of the table can be discarded.
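The notation and the resulting table symmetry can be sketched as follows. The sketch assumes that each address bit selects +c or −c for its coefficient, and it shows the symmetry as numeric negation of the entry. The coefficient values and the names sd_value, entry and lookup_half are hypothetical.

```python
def sd_value(bits):
    """Value of a signed-digit offset binary string, MSB first.
    A '1' contributes +2**i and a '0' contributes -2**i."""
    n = len(bits)
    return sum((1 if b == "1" else -1) * 2 ** (n - 1 - k)
               for k, b in enumerate(bits))

def complement(bits):
    """Bit-wise complement of a bit string."""
    return "".join("1" if b == "0" else "0" for b in bits)

assert sd_value("1001") == 3
assert sd_value(complement("1001")) == -3   # "0110" stands for -3

# Hypothetical 5-tap coefficient set; the values are illustrative only.
COEFFS = [3, -1, 4, 1, -5]
N = len(COEFFS)
MASK = (1 << N) - 1

def entry(addr):
    """Table entry: +c_i if address bit i is 1, otherwise -c_i."""
    return sum(c if (addr >> i) & 1 else -c for i, c in enumerate(COEFFS))

# Store only the addresses whose top bit is 0: 16 entries instead of 32.
half = [entry(a) for a in range(1 << (N - 1))]

def lookup_half(addr):
    """Recover any entry from the half table using the symmetry."""
    if addr >> (N - 1):               # top bit set: use the complement
        return -half[~addr & MASK]
    return half[addr]

assert all(lookup_half(a) == entry(a) for a in range(1 << N))
```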
2.2. High-Capacity Asynchronous Pipelines
This subsection reviews the pipelining approach adapted
for the asynchronous portion of the filter chip. The chip uses
a class of pipelines called high-capacity (HC), which are
targeted to dynamic logic implementations [12]. They are
based on a novel protocol that maximizes the pipeline stor-
age capacity by allowing every dynamic stage to hold a
distinct data item. In contrast, in traditional latch-free
asynchronous dynamic pipelines (e.g., [18, 13]), alternat-
ing stages usually must contain “spacers,” or “reset tokens,”
limiting the pipeline capacity to 50%.
The key idea in the HC approach is one of decoupled
control: the pull-up and pull-down of the dynamic
gates are made separately controllable. Therefore, the
precharge and evaluate controls can both be simultaneously
de-asserted, allowing the gate to enter a special “isolate
phase”—between “evaluation” and “precharge”—in which
its output is protected from further input changes. As a
result, every pipeline stage can store a distinct data item,
so the pipeline supports 100% storage capacity. In addition,
the decoupled control increases overall pipeline concurrency,
which in turn significantly increases throughput.
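The three phases can be illustrated with a small behavioral model. This is a sketch under stated assumptions, not the circuit itself: the controls are modeled as "asserted" booleans rather than voltage levels, a precharged output is represented by None, and the case where both controls are asserted, which the text does not describe, is treated as illegal. The class name DynamicStage is hypothetical.

```python
class DynamicStage:
    """Behavioral model of one dynamic stage with decoupled controls."""

    def __init__(self, func):
        self.func = func
        self.output = None   # None models a reset spacer (precharged)

    def step(self, pc_asserted, eval_asserted, inputs):
        if pc_asserted and eval_asserted:
            # Assumption: not covered by the text, rejected in this sketch.
            raise ValueError("pc and eval asserted together")
        if pc_asserted:
            self.output = None                # precharge: output reset
        elif eval_asserted:
            self.output = self.func(inputs)   # evaluate: compute new data
        # Both de-asserted: isolate phase, output ignores the inputs.
        return self.output

stage = DynamicStage(sum)
assert stage.step(False, True, [1, 2]) == 3     # evaluate
assert stage.step(False, False, [9, 9]) == 3    # isolate: data is held
assert stage.step(True, False, [9, 9]) is None  # precharge
```

Because the isolate phase holds the stage's data while its inputs change, the preceding stage is free to precharge and accept a new item, which is what lets adjacent stages hold distinct data.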
2.2.1. Structure
Figure 1 shows a simple block diagram of an HC pipeline.
Each stage consists of three components: a function block, a
completion generator and a stage controller. In steady-state
operation, the function block alternately produces data to-
kens and reset spacers for the next stage, and its completion
generator indicates completion of the stage’s evaluation or
precharge. The third component, the stage controller, gen-
erates the decoupled control signals—pc and eval—which
control the function block and the completion generator.
HC pipelines use a single-rail bundled datapath [11, 1].
A control signal, Req, indicates arrival of new inputs to a
stage. A high value of Req indicates the arrival of new data:
the previous stage has completed evaluation. A low Req in-
dicates the arrival of a spacer: the previous stage has com-
pleted precharge. For correct operation, a simple timing
constraint must be satisfied: Req must arrive after the data
inputs to the stage are stable and valid. This requirement is
met by inserting a “matched delay” which is greater than or
equal to the worst-case delay through the function block.
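The bundling constraint amounts to a simple inequality, sketched below. The path delays are hypothetical numbers in picoseconds, and the function names are illustrative only.

```python
def min_matched_delay(path_delays_ps):
    """Smallest matched delay that satisfies the bundling constraint."""
    return max(path_delays_ps)

def bundling_ok(matched_delay_ps, path_delays_ps):
    """True if Req cannot arrive before the slowest data path settles."""
    return matched_delay_ps >= max(path_delays_ps)

def req_meaning(req_high):
    """Interpret the level of Req as described in the text."""
    return "data" if req_high else "spacer"

# Hypothetical delays through one function block, in picoseconds.
paths = [310, 275, 342, 298]

assert min_matched_delay(paths) == 342
assert bundling_ok(342, paths)
assert not bundling_ok(341, paths)
```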
Function Block. Figure 2 shows one gate of a dynamic
function block in a pipeline stage. In general, for a multiple
output function block, there will be one such dynamic gate
for each output.¹
The pc input controls the pull-up network
and the eval input controls the “foot” of the pull-down net-
work. Precharge occurs when pc is asserted low and eval
is de-asserted low. Evaluation occurs when eval is asserted
high and pc is de-asserted high. In HC pipelines, the two
control signals, pc and eval, are separately generated and are
decoupled. Therefore, when both signals are de-asserted,
the gate output is effectively isolated from the gate inputs;
[Footnote 1: For complex logic, where a single dynamic gate would be too large
and slow, decomposition into a multi-level monotonic network is used.]
Proceedings of the Eighth International Symposium on Asynchronous Circuits and Systems (ASYNC’02)
1522-8681/02 $17.00 © 2002 IEEE