A parallel evolutionary algorithm to optimize dynamic data types in embedded systems 1159
Fig. 1 Code before and after the exploration of DDT
Table 1 DDT library
Representation DDT Description
1 AR Array
2 AR(P) Array of pointers
3 SLL Single-linked list
4 DLL Doubly-linked list
5 SLL(O) Single-linked list with roving pointer
6 DLL(O) Doubly-linked list with roving pointer
7 SLL(AR) Single-linked list of arrays
8 DLL(AR) Doubly-linked list of arrays
9 SLL(ARO) Single-linked list of arrays and roving pointer
10 DLL(ARO) Doubly-linked list of arrays and roving pointer
to be instantiated as a certain DDT from the set of possi-
ble implementation of DDTs library D presented in Atienza
et al. (2007) and Daylight et al. (2004). Thus, the goal of our
optimization flow is to obtain a set of pairs (variable, DDT)
{v
i
∈ V, d
j
∈ D}, such that minimizes memory accesses,
memory usage and power consumption for the target embed-
ded system. Additional constraints as the minimum and max-
imum values for all three objectives may be defined. Clearly,
this is a multi-objective optimization problem.
To measure the quality of a solution, we have defined the
equations to evaluate the behavior of DDT implementations
by means of parameters such as the number of sequential
accesses, random accesses, average size, etc. In our case we
have classified the DDT implementations in basic DDT and
multi-layer implementations relevant for embedded multi-
media applications. Table 1 contains the DDTs implemented
(Atienza et al. 2007).
Once we have fixed the problem optimization process for
DDTs, we can describe the whole process shown in Fig. 2.
It has three main steps: profiling of the application, estima-
tion of the parameters and multi-objective optimization algo-
rithms execution. These three steps are described in the next
paragraphs.
2.1 Profiling
In order to evaluate the different metrics we need to obtain
the real execution information from the application. Unfor-
tunately, the execution of the whole application is not a viable
solution. An alternative good solution recently proposed Day-
light et al. (2004) is to obtain a profiling report of the applica-
tion where the following information is logged: number and
location of the accesses of an element, addition of an element,
removal of an element, the clearing of the container, iterator
operations such as pre-increment or dereference, constructor,
destructor, copy constructor and swap operation. To this end,
we need to replace all the candidate variables in the appli-
cation by our vector DDT implementation, which logs all
the information needed to evaluate them the using equations
developed in Atienza et al. (2007).
2.2 Parameters estimation
In this phase, we extract all information needed from the
profiling report. The purpose is to measure the quality of a
solution (v
i
, d
j
) in the DDT exploration, using several para-
meters, namely, the number of candidate variables, number
of elements stored in the DDT (the worst case), average of the
number of elements stored, size of the elements (in bytes),
size of the pointers (in bytes), number of read accesses, num-
ber of write accesses and cache misses. All these parameters
can be extracted from the profiling report. To this end, we
have developed a tool called Profile Analyzer. Cache misses
are also obtained by means of simulation, generating memory
traces from the profiling report and the DDT library, using
them as input for the Dinero IV cache simulator (Edler and
Hill 2007) for the particular memory configuration of the
target embedded system. This phase is the most-time con-
suming part of the exploration, although it is done only once
for each target architecture, and for each tested applications.
2.3 Optimization
The last phase is the optimization process and this is where
MOEAs work. They take as input the information previously
estimated and try to minimize three different objectives:
memory accesses, memory usage and energy consumption.
To this end, we have implemented sequential versions of
two elitist MOEAs, called SPEA2 and NSGA-II and three
pMOEAs, which combine SPEA2 and NSGA-II. A detailed
explanation is presented in the next section.
When the three phases of the optimization process end,
we obtain a set of DDT instantiation policies, i.e., which
variable should be instantiated by which DDT. We also obtain
123