incorporates nine cores, one general-purpose PowerPC architecture and eight
special-purpose “synergistic processing element (SPE)” processors that
emphasize 32-bit arithmetic, with a peak performance of 204 gigaflop/s in
32-bit arithmetic per chip at 3.2 GHz.
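The quoted peak rate can be checked with simple arithmetic. The sketch below is illustrative only: the SPE count and clock rate come from the text, whereas the four-lane single-precision SIMD width and the counting of a fused multiply-add as two operations are assumptions that the text does not state. The function name peak_gflops is hypothetical.

```python
# Figures taken from the text.
SPE_COUNT = 8        # special-purpose SPE cores per chip
CLOCK_GHZ = 3.2      # clock rate in GHz

# Assumptions, not stated in the text: each SPE issues one 4-lane
# single-precision SIMD fused multiply-add per cycle, counted as 2 flops.
SIMD_LANES = 4
FLOPS_PER_FMA = 2


def peak_gflops(spes, clock_ghz, lanes, flops_per_op):
    """Peak rate in gigaflop/s for `spes` cores under the stated assumptions."""
    return spes * clock_ghz * lanes * flops_per_op


peak = peak_gflops(SPE_COUNT, CLOCK_GHZ, SIMD_LANES, FLOPS_PER_FMA)
print(f"peak 32-bit rate: {peak:.1f} gigaflop/s")
```

Under these assumptions the product agrees with the quoted figure to within rounding. The general-purpose PowerPC core is left out of the count.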
Heterogeneous computing, like multicore structures, offers possible new
opportunities in performance and power efficiency but imposes significant,
perhaps even daunting, challenges on application users and software designers.
Partitioning the work among parallel processors has proven hard enough, but
having to qualify such partitioning by the nature of the work performed and
employing multi-instruction set architecture (ISA) environments aggravates
the problem substantially. While the promise may be great, so are the problems
that have to be resolved. This year has seen initial efforts to address these
obstacles and garner the possible performance wins. Teaming between Intel
and ClearSpeed is just one example of new and concerted efforts to accomplish
this. Recent work at the University of Tennessee applying an iterative
refinement technique has demonstrated that 64-bit accuracy can be obtained at
eight times the performance of the normal 64-bit mode of the Cell architecture
by exploiting the 32-bit SPEs (Buttari et al., 2007).
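The idea behind iterative refinement can be sketched in a few lines. The code below is a minimal illustration, not the implementation of Buttari et al.: it assumes a dense linear system Ax = b, emulates 32-bit arithmetic in pure Python by rounding every intermediate result, and uses hypothetical helper names (f32, lu_factor_f32, lu_solve_f32, refine_solve). The expensive factorization and the triangular solves run in 32-bit arithmetic, while only the residual and the solution update use 64-bit arithmetic.

```python
import struct


def f32(x):
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_f32(a):
    """LU factorization with partial pivoting, each operation rounded to 32 bits."""
    n = len(a)
    lu = [[f32(v) for v in row] for row in a]
    perm = list(range(n))  # perm[i] = original index of the row now at position i
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] = f32(lu[i][k] / lu[k][k])
            for j in range(k + 1, n):
                lu[i][j] = f32(lu[i][j] - f32(lu[i][k] * lu[k][j]))
    return lu, perm


def lu_solve_f32(lu, perm, b):
    """Solve with the 32-bit factors: forward, then back substitution."""
    n = len(lu)
    y = [f32(b[perm[i]]) for i in range(n)]
    for i in range(n):                      # L has an implicit unit diagonal
        for j in range(i):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
        y[i] = f32(y[i] / lu[i][i])
    return y


def residual(a, x, b):
    """r = b - A x, computed in full 64-bit arithmetic."""
    return [bi - sum(aij * xj for aij, xj in zip(row, x))
            for row, bi in zip(a, b)]


def refine_solve(a, b, tol=1e-14, max_iter=20):
    """Solve A x = b with 32-bit factors and 64-bit refinement steps."""
    lu, perm = lu_factor_f32(a)             # O(n^3) work in 32-bit arithmetic
    x = lu_solve_f32(lu, perm, b)           # initial 32-bit solution
    for _ in range(max_iter):
        r = residual(a, x, b)               # O(n^2) work in 64-bit arithmetic
        if max(abs(v) for v in r) <= tol * max(abs(v) for v in b):
            break
        d = lu_solve_f32(lu, perm, r)       # correction from the 32-bit factors
        x = [xi + di for xi, di in zip(x, d)]   # update kept in 64 bits
    return x


# Small, well-conditioned example; the exact solution is [1, -2, 3].
a = [[4.0, 1.0, 2.0], [1.0, 5.0, 3.0], [2.0, 3.0, 6.0]]
b = [8.0, 0.0, 14.0]
print(refine_solve(a, b))
```

The sketch relies on the matrix being reasonably well conditioned; for ill-conditioned systems the 32-bit factors may be too inaccurate for the corrections to converge. The performance advantage comes from doing the cubic-cost factorization on the fast 32-bit units and only the quadratic-cost steps in 64-bit arithmetic.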
Japan has undertaken an ambitious program: the “Kei-soku” project to
deploy a 10-petaflops-scale system for initial operation by 2011. While the
planning for this initiative is still ongoing and the exact structure of the system
is under study, key activities are being pursued with a new national High
Performance Computing (HPC) Institute being established at RIKEN (2008).
Technology elements being studied include various aspects of interconnect
technologies, both wire and optical, as well as low-power device technologies,
some of which are targeted to a 0.045-μm feature size. NEC, Fujitsu, and
Hitachi are providing strong industrial support with academic partners,
including University of Tokyo, Tokyo Institute of Technology, University of
Tsukuba, and Keio University, among others. The actual design is far from certain, but
there are some indications that a heterogeneous system structure is receiving
strong consideration, integrating both scalar and vector processing
components, possibly with the addition of special-purpose accelerators such as the
MD-Grape (Fukushige et al., 1996). With a possible budget equivalent to over
US$1 billion (just under 1 billion euros) and a power consumption of 36 MW
(including cooling), this would be the most ambitious computing project yet
pursued by the Asian community, and it is providing strong leadership toward
inaugurating the Petaflops Age (1–1000 petaflops).
1.3 HETEROGENEOUS CLUSTERS
A heterogeneous cluster (Fig. 1.2) is a dedicated system designed mainly for
high-performance parallel computing, which is obtained from the classical
homogeneous cluster architecture by relaxing one of its three key properties,
thus leading to the situation wherein: