2142 K. Yang et al.
depend greatly on sufficiently large and fixed sample sizes.
For instance, Drugan and Nowe (2013) developed a multi-
objective approach to solve the problem of multi-objective
multi-armed bandits in the field of reinforcement learning,
based on a standard upper confidence bound approach and a
Pareto dominance order relationship. Unfortunately, despite
being capable of solving the multi-objective problem directly,
such approach causes low performance efficiency. Addition-
ally, we have also investigated two immune-inspired multi-
objective optimization approaches to solve MEVP (Zhang
and Tu 2007b) and multi-objective chance-constrained pro-
gramming (Zhang et al. 2013b). They involve in several
design inspirations such as immune suppression, immune
selection, aging and probabilistic dominance, in which a sam-
ple allocation technique is designed to assign large sample
sizes to high-quality individuals.
3 Problem statement and preliminaries
Consider the following multi-objective expected value pro-
gramming problem of the form:
MEVP min
x∈D
f (x) = E [ f
1
(x, ξ ), f
2
(x, ξ ),..., f
q
(x, ξ )],
with bounded and closed domain D in R
p
, decision vec-
tor x in D, where ξ is a r -dimensional random vector with
unknown distribution; E [ f
1
(x, ξ ), f
2
(x, ξ )..., f
q
(x, ξ )]
denotes the expected value vector function
(E [ f
1
(x, ξ )], E [ f
2
(x, ξ )],...,E [ f
q
(x, ξ )]); E [.] is the
operator of expectation; f
j
(x, ξ ) is the jth nonlinear stochas-
tic subobjective function. In order to seek MEVP’s solutions,
the concept of ε-dominance (Batista et al. 2011) is usually
picked up to execute solution comparison. In other words,
for two given candidates x, y ∈ D we say that x ε-dominates
y(x ≺
ε
y),if
E[ f
j
(x, ξ )]+ε ≤ E[ f
j
(y, ξ )], (1)
with 1 ≤ j ≤ q, and there exists k satisfying
E[ f
k
(x, ξ )]+ε<E[ f
k
(y, ξ )]. (2)
This way, x
∗
∈ D is called an ε-Pareto optimal solution, if
there is no candidate z∈ D such that z ≺
ε
x
∗
. In particu-
lar, for a given finite population A, x ∈ A is said to be
an ε-nondominated individual, if there is no individual y
in A such that y ε-dominates x. Similarly, the concept of
ε-dominance above may be naturally extended into the ver-
sion of β-dominance (Trautmann et al. 2009; Eskandari and
Geiger 2009) which will be used in identifying competi-
tive individuals. In other words, x β-dominates y with given
β = (β
1
,β
2
,...,β
q
) and β
i
> 0, if inequalities (1) and (2)
are true after replacing ε by β
j
and β
k
, respectively. Corre-
spondingly, x ∈ A is said to be a β-nondominated individual,
if there is no individual y in A such that y β-dominates x.
Generally, when ξ is with known distribution F
ξ
(z), each
of the above subobjective functions can be replaced by
E [ f
j
(x, ξ )]=
z∈R
r
f
j
(x,z)dF
ξ
(z), (3)
and hence the above MEVP can be changed into an analyt-
ically deterministic multi-objective programming problem.
However, in many practical problems, the noisy information
of ξ is unknown and accordingly, the model approximation
handling method is an alternative way to solve such kind
of problem. Sample average approximation is a simple and
popular method used in coping with expected value program-
ming problems with unknown noise distributions. Therefore,
we use it to transform the above problem into the following
multi-objective sample average approximation model:
SAA min
x∈D
ˆ
f (x) =
ˆ
f
1
(x),
ˆ
f
2
(x),...,
ˆ
f
q
(x)
,
s.t.,
ˆ
f
j
(x) =
1
m
m
i=1
f
j
(x, ξ
i
),
with 1 ≤ j ≤ q; m denotes the fixed sampling size; ξ
i
,
1 ≤ i ≤ m,arethei.i.d samples of ξ . x is said to empirically
ε-dominate y (simply say x ≺
ˆε
y) if satisfying
ˆ
f
j
(x) + ε ≤
ˆ
f
j
(y) with 1 ≤ j ≤ q and
ˆ
f
k
(x) + ε<
ˆ
f
k
(y) for some k.
Similar to the version of ε-orβ-dominance above, x is called
an ε-orβ-empirical nondominated individual if there does
not exist any individual y in A such that y ε-orβ-dominates
x empirically.
We easily know that the set of solutions for the above
problem SAA can approach that of the MEVP above when
m is sufficiently large according to the law of large number.
However, in such case any optimization method will cause
expensive computational cost. Consequently, we require that
different candidates be attached different sample sizes so as
to reduce the cost of computation. Hence, the above SAA
is transformed into the following multi-objective sample-
dependent approximation (SDA) model:
SDA min
x∈D
ˆ
f (x) =
ˆ
f
1
(x),
ˆ
f
2
(x), . . . ,
ˆ
f
q
(x)
,
s.t.,
ˆ
f
j
(x) =
1
m(x)
m(x)
k=1
f
j
(x, ξ
k
), 1 ≤ j ≤ q,
where m(x) is the sample size of ξ at the point x. We next cite
the following conclusions to help us design a novel racing
ranking approach to be used in deciding those competitive
individuals in a given population.
123