In some concrete information integration scenarios, the common ontology may
either exist physically or be virtual. Below, we discuss these scenarios in some
detail.
1.2.1 Schema integration
Schema integration is the oldest scenario [Batini et al., 1986, Sheth and Larson, 1990,
Parent and Spaccapietra, 1998]. Suppose that two (or more) enterprises undergo a
merger or an acquisition. Ultimately, these enterprises have to integrate their
databases into a single one.
Usually, a first technical step is to identify correspondences between semantically
related entities of the schemas before merging the databases. This step, known as
matching, is required even if the databases to be integrated come from the
same domain of interest, e.g., book selling or car rental. This is because the schemas
have been designed and developed independently. In fact, people follow diverse
modelling principles and patterns, even if they have to encode the same real-world
object. Finally, the schemas to be integrated might have been developed according
to different business goals. This makes the matching problem even harder.
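
To make the matching step concrete, the following is a minimal sketch of a purely name-based matcher. It is only an illustration under stated assumptions: the two schemas, the helper names (`normalise`, `similarity`, `match_schemas`) and the threshold of 0.7 are all hypothetical, and string similarity via Python's standard `difflib` is just one simple technique, not a method prescribed by the text.

```python
from difflib import SequenceMatcher


def normalise(name: str) -> str:
    """Lower-case a name and drop separators, so 'Book_Title' becomes 'booktitle'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1] between two normalised entity names."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def match_schemas(schema_a, schema_b, threshold=0.7):
    """For each entity of schema_a, keep its best counterpart in schema_b
    if the similarity reaches the (assumed) threshold."""
    correspondences = []
    for a in schema_a:
        best = max(schema_b, key=lambda b: similarity(a, b))
        score = similarity(a, best)
        if score >= threshold:
            correspondences.append((a, best, round(score, 2)))
    return correspondences


# Hypothetical attribute names of two independently designed book-selling schemas.
bookshop_1 = ["Book_Title", "Author", "Price", "ISBN"]
bookshop_2 = ["title", "author_name", "isbn", "cost"]

for entity_a, entity_b, score in match_schemas(bookshop_1, bookshop_2):
    print(entity_a, "<->", entity_b, score)
```

In this toy example the pair Price and cost is not found, although both attributes denote the same real-world notion. This illustrates why independently designed schemas make matching hard and why names alone are usually not sufficient.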
Some other scenarios can be classified under the schema integration heading, for
example, (tightly-coupled) federated databases [Sheth and Larson, 1990]. These
typically have one global schema providing unified access to the federation of
component databases. Component databases, in turn, are autonomous. Thus, in this
application, when, for example, one component schema of the federated database is
changed, the federated (global) schema consequently has to be reconsidered as well.
Matching can help in identifying those changes.
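
As a rough sketch of how such changes might be identified, the code below compares two versions of a component schema and lists the global-schema entities that would need to be reconsidered. Everything in it is an assumption made for illustration: schemas are represented as dictionaries from entity names to attribute sets, the names `diff_schema`, `affected_global_entities` and `global_mapping` are hypothetical, and the comparison is by exact name only, which is far simpler than real matching (a renamed attribute shows up as one removal plus one addition).

```python
def diff_schema(old, new):
    """Compare two versions of a component schema (entity -> set of attributes)."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = {
        entity: {
            "added": sorted(new[entity] - old[entity]),
            "removed": sorted(old[entity] - new[entity]),
        }
        for entity in sorted(set(old) & set(new))
        if old[entity] != new[entity]
    }
    return {"added": added, "removed": removed, "changed": changed}


def affected_global_entities(changes, global_mapping):
    """Global-schema entities whose component counterpart changed or disappeared."""
    touched = set(changes["removed"]) | set(changes["changed"])
    return sorted(g for g, c in global_mapping.items() if c in touched)


# Hypothetical component schema of a car rental database, before and after a change.
old_version = {"Customer": {"id", "name"}, "Rental": {"id", "car", "date"}}
new_version = {
    "Customer": {"id", "full_name"},
    "Rental": {"id", "car", "date"},
    "Invoice": {"id", "amount"},
}
# Hypothetical correspondences: global entity -> component entity.
global_mapping = {"Client": "Customer", "Booking": "Rental"}

changes = diff_schema(old_version, new_version)
print(changes)
print(affected_global_entities(changes, global_mapping))
```

Here only the global entity mapped to Customer is flagged, while the new Invoice entity is reported as an addition that the global schema does not yet cover.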
Finally, it is worth noting the applications that we do not discuss here,
e.g., distributed database systems [Özsu and Valduriez, 1999]. These are usually
designed in a centralised way, e.g., by a database administrator, and therefore
semantic heterogeneity does not exist there by construction [Elmagarmid et al., 1999].
1.2.2 Catalogue integration
In Business-to-Business (B2B) applications, trade partners store information about
their products in electronic catalogues. Typical examples of catalogues are product
directories of electronic sales portals, such as Amazon or eBay. In order for a
merchant to participate in the marketplace, e.g., eBay, it has to determine
correspondences between entries of its catalogues and those of the marketplace catalogue
(see Fig. 1.3). This process of finding correspondences among entries of the
catalogues is referred to as the catalogue matching problem [Bouquet et al., 2003c].
Notice that if we look at this problem from a merchant viewpoint, matching has to
be performed for each marketplace in which it would like to participate. Once the
correspondences between the entries of the catalogues have been identified, they are
further analysed in order to generate query expressions that automatically translate
data instances between the catalogues. Finally, having matched the catalogues, users of a marketplace