The feature extraction problem is implemented as a search for some new features that are
more relevant for classification and are defined (in some language) by means of the existing
features. These new features can be e.g. of the form a ∈ (0.5, 1) or 2a + 3b > 0.75.
Their values on a given object are computed from given values of conditional attributes on the
object. The new features are often binary, taking value 1 on a given object iff the specified
condition is true on this object. In the case of symbolic value attributes we look for new
features like a ∈ {French, English, Polish} with value 1 iff a person speaks any of these
languages. The important issues in feature extraction are problems of discretization of real
value attributes, grouping of symbolic (nominal) value attributes, and searching for new
features defined by hyperplanes or more complex surfaces defined over existing attributes.
In Section 1.7.2 discretization based on the rough set and Boolean reasoning approach is
discussed. Some other approaches to feature extraction that are based on Boolean reasoning
are also discussed. All cases of the feature extraction problem mentioned above may be
described in terms of searching for relevant features in a particular language of features.
Boolean reasoning plays the crucial role of an inference engine for feature selection problems.
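The kinds of new features mentioned above can be illustrated by a short sketch. The
following Python fragment (not from the text; attribute names, thresholds, and the example
object are invented for illustration) builds three binary features: an interval test on a
real-valued attribute, a hyperplane (weighted-sum) test, and a symbolic-value membership
test, each taking value 1 iff its condition holds on the object.

```python
def interval_feature(obj, attr, low, high):
    """1 iff the real-valued attribute falls in the open interval (low, high)."""
    return 1 if low < obj[attr] < high else 0

def hyperplane_feature(obj, weights, threshold):
    """1 iff a weighted sum of attribute values exceeds a threshold."""
    return 1 if sum(w * obj[a] for a, w in weights.items()) > threshold else 0

def membership_feature(obj, attr, values):
    """1 iff a symbolic attribute value belongs to the given set."""
    return 1 if obj[attr] in values else 0

# An invented object described by conditional attributes.
person = {"a": 0.6, "b": 0.1, "language": "Polish"}

f1 = interval_feature(person, "a", 0.5, 1.0)             # a in (0.5, 1)
f2 = hyperplane_feature(person, {"a": 2, "b": 3}, 0.75)  # 2a + 3b > 0.75
f3 = membership_feature(person, "language",
                        {"French", "English", "Polish"})
```

Each helper returns a binary feature value computed from the conditional attributes,
exactly in the spirit of the feature languages described above.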
Feature extraction and feature selection are usually implemented in a pre-processing stage
of the whole modeling process. There are some other aspects related to this stage of modeling
such as, for instance, elimination of noise from the data or treatment of missing values. More
information related to these problems can be found in [344, 345] and in the bibliography
included in these books.
In the next stage of the synthesis of target concept approximations, descriptions of the target
concepts are constructed from the extracted relevant features (relevant primitive concepts) by
applying some operations. In the simplest case, when the Boolean connectives ∨ and ∧ are
chosen, these descriptions form the so-called decision rules. In Sect. 1.7.3 we give a short
introduction to methods for decision rule synthesis that are based on rough set methods and
Boolean reasoning. Two main cases of decision rules are discussed: exact (deterministic) and
approximate (non-deterministic) rules. More information on decision rule synthesis using the
rough set approach may be found in [344, 345] and in the bibliography included in these books.
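The distinction between exact and approximate rules can be made concrete with a toy
decision table (the table and attribute names below are invented for illustration, not taken
from the text). A rule "if conditions then d = v" is exact iff every object matching the
conditions has decision v; otherwise it is approximate, and its confidence is the fraction
of matching objects with that decision.

```python
# A toy decision table: conditional attributes a, b and decision d.
table = [
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 1, "b": 1, "d": "no"},
    {"a": 1, "b": 1, "d": "yes"},  # makes rules conditioned on b=1 approximate
]

def rule_support_and_confidence(table, conditions, decision):
    """Count matching objects and the fraction of them with the given decision."""
    matching = [row for row in table
                if all(row[a] == v for a, v in conditions.items())]
    hits = sum(1 for row in matching if row["d"] == decision)
    confidence = hits / len(matching) if matching else 0.0
    return len(matching), confidence

# Exact (deterministic) rule: every object with a=1, b=0 has d="yes".
support, conf = rule_support_and_confidence(table, {"a": 1, "b": 0}, "yes")
# Approximate (non-deterministic) rule: objects with b=1 split between decisions.
support2, conf2 = rule_support_and_confidence(table, {"b": 1}, "yes")
```

A rule with confidence 1.0 is deterministic on the table; any confidence below 1.0 marks
the rule as approximate.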
Finally, it is necessary to estimate the quality of the constructed approximations of target
concepts. Let us observe that the "building blocks" from which different approximations of
target concepts are constructed may be inconsistent on new, so far unseen objects (i.e. some
objects from the same class may be classified to disjoint concepts). This creates a necessity
to develop methods for resolving these inconsistencies. The quality of target concept
approximations can be considered acceptable if the inconsistencies may be resolved by using
these methods. In Sect. 1.7.4 some introductory comments on this problem are presented, and
references are given to rough set methods that resolve conflicts among different decision
rules by voting for the final decision.
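One simple form such conflict resolution can take is weighted voting, sketched below under
invented assumptions (the rules, their weights, and the test object are illustrative, not the
specific scheme of Sect. 1.7.4): each rule matching the object votes for its decision with a
weight, and the decision with the greatest total weight wins.

```python
from collections import defaultdict

def classify_by_voting(rules, obj):
    """rules: list of (conditions, decision, weight) triples.
    Every rule whose conditions match the object casts a weighted vote;
    the decision with the largest total vote is returned."""
    votes = defaultdict(float)
    for conditions, decision, weight in rules:
        if all(obj.get(a) == v for a, v in conditions.items()):
            votes[decision] += weight
    return max(votes, key=votes.get) if votes else None

# Invented conflicting rules; weights might come e.g. from rule support.
rules = [
    ({"a": 1}, "yes", 3.0),
    ({"b": 1}, "no", 1.0),
    ({"a": 1, "b": 1}, "no", 1.5),
]
decision = classify_by_voting(rules, {"a": 1, "b": 1})  # "yes": 3.0 vs "no": 2.5
```

Here two rules vote "no" with total weight 2.5 and one rule votes "yes" with weight 3.0,
so the conflict is resolved in favor of "yes".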
1.7.1 Significance of Attributes and Approximate Reducts
One of the first ideas [297] was to consider as relevant features those in the core of an
information system, i.e. features that belong to the intersection of all reducts of the
information system. It can be easily checked that several definitions of relevant features
that are used by the machine learning community [4] can be interpreted by choosing a relevant
decision system corresponding to the information system.
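The relation between reducts and the core can be illustrated by brute force on a tiny
invented decision table (the data and attribute names below are illustrative only). A subset
of condition attributes preserves discernibility if objects agreeing on it never carry
different decisions; a reduct is a minimal such subset, and the core is the intersection of
all reducts.

```python
from itertools import combinations

# An invented decision table: (condition attributes, decision).
table = [
    ({"a": 0, "b": 0, "c": 0}, "no"),
    ({"a": 0, "b": 1, "c": 1}, "yes"),
    ({"a": 1, "b": 0, "c": 0}, "yes"),
    ({"a": 1, "b": 1, "c": 1}, "no"),
]
attrs = ["a", "b", "c"]

def consistent(subset):
    """True iff objects agreeing on `subset` never have different decisions."""
    seen = {}
    for conds, dec in table:
        key = tuple(conds[a] for a in subset)
        if seen.setdefault(key, dec) != dec:
            return False
    return True

# Enumerate subsets by increasing size; keep only minimal consistent ones.
reducts = []
for r in range(1, len(attrs) + 1):
    for subset in combinations(attrs, r):
        if consistent(subset) and \
                not any(set(red) <= set(subset) for red in reducts):
            reducts.append(subset)

core = set(attrs).intersection(*map(set, reducts))
```

For this table the reducts are {a, b} and {a, c}, so the core is {a}: attribute a occurs
in every reduct and cannot be dropped without losing discernibility. This brute-force
enumeration is exponential in the number of attributes; it only illustrates the definitions.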
Another approach is related to dynamic reducts (see e.g. [19]), i.e. conditional attribute
sets appearing "sufficiently often" as reducts of samples of the original decision table. The
attributes belonging to the "majority" of dynamic reducts are defined as relevant. The value