HK.NCCP International Journal of Intelligent Information and Management Science
ISSN: 2307-0692, Volume 5, Issue 6, December, 2016
45
A Fast Algorithm for Computing
Dominance Classes
Yan LI, Qun YU
Key Lab. Of Machine Learning and Computational Intelligence, College of Mathematics and Information Science
Hebei University, Baoding, 071002, China
Abstract: Traditional rough set theory (TRS) is based on the concept of equivalence relation to define upper
and lower approximation sets of a given target concept, and therefore uncertainties in information systems can
be represented. By using equivalence relations, TRS only considers whether attribute values are distinguished
or not, regardless of the preference information contained in attribute values. Rough sets based on dominance
relations effectively solve this problem and can deal with preference-ordered data. In these dominance-based
approaches, the computational cost of the dominance classes greatly affects the efficiency of attribute reduc-
tion and rule extraction. This paper presents an efficient method of computing dominance classes in an or-
dered information system by rapidly reducing the search space. Based on the definition of dominance class,
the inferior class of an object is gradually removed from the universe with the increase of the attributes in the
computation process. Experiments on ten UCI data sets show that the proposed algorithm obviously improves
the efficiency of computing dominance classes, especially for large-scale data.
Keywords: Rough set, dominance class, ordered information systems, fast algorithm
1. Introduction
Rough set theory [1] is a mathematical theory proposed
by Professor Z.Pawlak in 1982 to deal with imprecise,
incomplete and incompatible knowledge. It has been
widely used in machine learning, data mining and pattern
recognition [2, 3] and other fields. In practical problems,
especially in multi-criteria decision analysis, some
attribute values are preference-ordered. For example, the
attribute "score" can be numerical or can be divided into
three attribute values: high, medium and low. This type
of attributes is often used to evaluate examples in the
universe, for example, to score a student in a few subjects.
In this case, the order information contained in the
attribute values must be considered for more accurate
decision making. In this case, Greco et al. firstly pro-
posed dominance relation based rough set approach
(DRSA) in 1999 [4,5] by replacing equivalence relations
in TRS with dominance relations and considering the
preference ordered information among objects. DRSA is
very useful to deal with practical problems with conti-
nuous-valued partial ordered attributes. In the literature
[6,7,12], information systems with dominance relations
are referred to as ordered information systems.
Many scholars have made extensive studies on domin-
ance based rough sets [8-11], and most research work
focuses on attribute reduction under dominance relations.
Note that the computation of dominance classes is a ne-
cessary step in most related algorithms, and traditional
algorithms [12,13] need to compare the values of each
attribute of all samples, consuming a large amount of
time and memory. This will greatly affect the efficiency
of computing dominance classes and further affect the
computation of the approximate sets, attribute reduction
[18], as well as rule extraction[19]. Nowadays, efficiently
processing of large-scale data with dominance relations
has become a main concern [20,21]. However, most of
the existing acceleration algorithms are designed for the
computation of equivalence classes [14-17], and there-
fore we propose a fast method to improve the computa-
tional efficiency of dominance classes by rapidly reduc-
ing the search space. Obviously, this method can be fur-
ther used in attribute reduction, rule extraction and other
related algorithms.
2. Basic Concepts
Definition 2.1 (Information system) An information sys-
tem is a 4-tuple
= ,where
,,...
,
12
U
xx
= is a
non-empty finite set of objects;
,...,
,
2
A
a
= is a non-
empty finite set of attributes;
is the value set of
attributes
; :
×→ is an information function that
specifies the attribute value of each object
in, that is,
,fxa
∈
for every ,
.
For a given information system S, if there is a partial or-
der relation
on the range of the attribute
, we
call
as a criterion.
,,
a
∈≥
represents that
is at
least as good as
under criterion
,that is,
is better
than
.