sensor readings are routed up the aggregation tree.
For the distribution phase, TAG uses a tree based routing
scheme rooted at the sink node. The sink broadcasts a message
asking nodes to organize into a routing tree and then sends its
queries. In each message there is a field specifying the level,
or distance from the root, of the sending node (the level of the
root is equal to zero). Whenever a node receives a message
and it does not yet belong to any level, it sets its own level to
be the level of the message plus one. It also elects the node
from which it receives the message as its parent. The parent is
the node that is used to route messages towards the sink. Each
sensor then rebroadcasts the received message adding its own
identifier (ID) and level. This process continues until all nodes
have been assigned an ID and a parent. The routing messages
are periodically broadcast by the sink in order to keep the tree
structure updated. After the construction of the tree, the queries
are sent along the structure to all nodes in the network. TAG
adopts the selection and aggregation facilities of the database
query languages (SQL). Accordingly, TAG queries have the
following form:
SELECT{agg(expr), attrs} from SENSOR
WHERE{selPreds}
GROUP BY{attrs}
HAVING{havingPreds}
EPOCH DURATION i
In practice, the sink sends a query, where it specifies the
quantities that it wants to collect (attrs field), how these must
be aggregated (agg(expr)) and the sensors that should be
involved in the data retrieval. This last request is specified
through the WHERE, GROUP and HAVING clauses [5]. Fi-
nally, an EPOCH duration field specifies the time (in seconds)
each device should wait before sending new sensor readings.
This means the readings used to compute an aggregate record
all belong to the same time interval, or epoch.
During the data collection phase, due to the tree structure,
each parent has to wait for data from all of its children before
it can send its aggregate up the tree. Epochs are divided into
shorter intervals called communication slots. The number of
these slots equals the maximum depth of the routing tree. The
slot mechanism gives a nice benefit. As the time is slotted,
sensor nodes can be put to sleep until the next scheduled
transmission interval. In practice, a node goes back to sleep
soon after it has finished sending its readings to its parent.
Data aggregation is performed by all intermediate nodes.
However, in order not to limit TAG to the few and very
simple aggregation functions defined by the SQL language
(such as COUNT, MIN, MAX, SUM, and AVERAGE) a
more general classification is accounted for by partitioning
aggregates according to the Duplicate Sensitivity, Exemplary
and Summary and Monotonic properties [5].
As for most tree-based schemes, TAG may be inefficient in
case of dynamic topologies or link/device failures: as discussed
above, trees are particularly sensitive to failures at intermediate
nodes as the related subtree may become disconnected. In
addition, as the topology changes, TAG has to re-organize the
tree structure and this means high costs in terms of energy
consumption and overhead.
Directed Diffusion [1] - Directed Diffusion is a reactive data
centric protocol. The routing scheme is specifically tailored
for those applications where one or few sinks ask some
specific information by flooding the network with their queries.
Directed Diffusion is organized in three phases (see Fig. 2,
originally shown in [1]): 1) interest dissemination, 2) gradient
setup and 3) data forwarding along the reinforced paths
(path reinforcement and forwarding). When a certain sink is
interested in collecting data from the nodes in the network,
it propagates an interest message (interest dissemination),
describing the type of data the node is interested in, and
setting a suitable operational mode for its collection. Each
node, on receiving the interest, re-broadcasts it to its neighbors.
In addition, the node sets up interest gradients, i.e., vectors
containing the next hop that has to be used to propagate the
result of the query back to the sink node (gradient setup).
As an illustrative example (see Fig. 2), if the Sink sends an
interest which reaches nodes a and b, and both forward the
interest to node c, then node c sets up two vectors indicating
that the data matching that interest should be sent back to
a and/or b. The strength of such a gradient can be adapted,
which may result in a different amount of information being
redirected to each neighbor. To this end, various metrics such
as the node’s energy level, its communication capability and
its position within the network can be used. Each gradient is
related to the attribute it has been set up for. As the gradient
setup phase for a certain interest is complete, only a single
path for each source is reinforced and used to route packets
towards the sink (path reinforcement and forwarding).
Data aggregation is performed when data is forwarded to
the sink by means of proper methods, which can be selected
according to application requirements. The data gathering tree
(i.e., reinforced paths) must be periodically refreshed by the
sink and this can be expensive in case of dynamic topologies.
A tradeoff, depending on the network dynamics, is involved
between the frequency of the gradient setup (i.e., energy
expenditure) and the achieved performance. A valuable feature
of Directed Diffusion consists of the local interaction among
nodes in setting up gradients and reinforcing paths. This allows
for increased efficiency as there is no need to spread the
complete network topology to all nodes in the network.
We observe that attention is to be paid to MAC Layer
design. Consider as an example the IEEE802.11 wireless
technology. As said above, queries are propagated by means
of broadcasts (basic access in IEEE802.11). However, data is
sent back to the sink via unicast transmissions. This means
that when either the node density increases or the duplicate
suppression rule is not used, due to MAC collisions and
subsequent backoffs, the delay may become excessively large.
Hence, the local traffic should be kept at an acceptably low
level in order to avoid collisions. Several approaches [36], [48],
[49] have been proposed to reduce the control traffic generated
by the local interactions among nodes with Directed Diffusion.
In these solutions, the authors use properly defined aggregation
trees with the main purpose of reducing both traffic and delay.
In [48] a modified version of Directed Diffusion, Enhanced
Directed Diffusion (EDD), is proposed. This protocol jointly
exploits Directed Diffusion to collect data and a cluster-based
architecture to increase the efficiency of the local interactions