DSpace System Documentation:
Functional Overview
pre-configured with the DSpace source code. However, you can configure multiple schemas and select metadata
fields from a mix of configured schemas to describe your items.
Other descriptive metadata about items (e.g. metadata described in a hierarchical schema) may be held in serialized
bitstreams. Communities and collections have some simple descriptive metadata (a name, and some descriptive
prose), held in the DBMS.
Administrative Metadata
This includes preservation metadata, provenance and authorization policy data. Most of this is held within
DSpace's relational DBMS schema. Provenance metadata (prose) is stored in Dublin Core records. Additionally,
some other administrative metadata (for example, bitstream byte sizes and MIME types) is replicated in Dublin
Core records so that it is easily accessible outside of DSpace.
Structural Metadata
This includes information about how to present an item, or bitstreams within an item, to an end-user, and the
relationships between constituent parts of the item. As an example, consider a thesis consisting of a number of
TIFF images, each depicting a single page of the thesis. Structural metadata would include the fact that each image
is a single page, and the ordering of the TIFF images/pages. Structural metadata in DSpace is currently fairly
basic; within an item, bitstreams can be arranged into separate bundles as described above. A bundle may also
optionally have a primary bitstream. This is currently used by the HTML support to indicate which bitstream in
the bundle is the first HTML file to send to a browser.
In addition to some basic technical metadata, each bitstream also has a 'sequence ID' that uniquely identifies it within
an item. This is used to produce a 'persistent' bitstream identifier for each bitstream.
Additional structural metadata can be stored in serialized bitstreams, but DSpace does not currently understand
this natively.
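The structural relationships described above (bundles within an item, an optional primary bitstream per bundle, and a sequence ID unique within the item) can be pictured with a minimal sketch. The class names, fields, and the identifier layout in persistentId are assumptions made for illustration; they are not the DSpace object model or its actual identifier format.

```java
import java.util.ArrayList;
import java.util.List;

public class StructuralMetadataSketch {

    /** Hypothetical stand-in for a bitstream; not the DSpace class. */
    static final class Bitstream {
        final int sequenceId; // unique within the owning item
        final String name;

        Bitstream(int sequenceId, String name) {
            this.sequenceId = sequenceId;
            this.name = name;
        }
    }

    /** Hypothetical bundle: an ordered group of bitstreams with an optional primary one. */
    static final class Bundle {
        final String name;
        final List<Bitstream> bitstreams = new ArrayList<>();
        Bitstream primary; // stays null when the bundle has no primary bitstream

        Bundle(String name) {
            this.name = name;
        }

        void setPrimary(Bitstream b) {
            if (!bitstreams.contains(b)) {
                throw new IllegalArgumentException("primary bitstream must belong to the bundle");
            }
            primary = b;
        }
    }

    /** Hypothetical item that hands out sequence IDs unique within itself. */
    static final class Item {
        final String id;
        final List<Bundle> bundles = new ArrayList<>();
        private int nextSequenceId = 1;

        Item(String id) {
            this.id = id;
        }

        Bundle addBundle(String name) {
            Bundle bundle = new Bundle(name);
            bundles.add(bundle);
            return bundle;
        }

        Bitstream addBitstream(Bundle bundle, String name) {
            Bitstream bs = new Bitstream(nextSequenceId++, name);
            bundle.bitstreams.add(bs);
            return bs;
        }

        /** Assumed identifier layout (item id + sequence ID), for illustration only. */
        String persistentId(Bitstream bs) {
            return id + "/" + bs.sequenceId;
        }
    }

    public static void main(String[] args) {
        Item item = new Item("item-1");
        Bundle html = item.addBundle("ORIGINAL");
        Bitstream index = item.addBitstream(html, "index.html");
        Bitstream chapter = item.addBitstream(html, "chapter1.html");
        html.setPrimary(index); // the first HTML file to send to a browser

        System.out.println("primary: " + html.primary.name);
        System.out.println("identifier: " + item.persistentId(chapter));
    }
}
```

Because the sequence ID is assigned by the item rather than by the bundle, moving a bitstream between bundles in this sketch would not change its identifier.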
2.4. Packager Plugins
Packagers are software modules that translate between DSpace Item objects and a self-contained external
representation, or "package". A Package Ingester interprets, or ingests, the package and creates an Item. A Package
Disseminator writes out the contents of an Item in the package format.
A package is typically an archive file, such as a Zip or "tar" file, that includes a manifest document containing metadata
and a description of the package contents. The IMS Content Package [http://www.imsglobal.org/content/packaging/] is
a typical packaging standard. A package might also be a single document or media file that contains its own metadata,
such as a PDF document with embedded descriptive metadata.
Package ingesters and package disseminators are each a type of named plugin (see Plugin Manager), so it is easy to
add new packagers specific to the needs of your site. You do not have to supply both an ingester and disseminator for
each format; it is perfectly acceptable to just implement one of them.
Most packager plugins call upon Crosswalk plugins to translate the metadata between DSpace's object model and the
package format.
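The division of labour between ingesters and disseminators can be sketched as two small interfaces looked up by name. Everything in this sketch is hypothetical: the Ingester and Disseminator interfaces, the simplified Item and Pkg types, and the name-keyed maps that stand in for the Plugin Manager are illustrative assumptions, not the DSpace packager API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PackagerSketch {

    /** Hypothetical, heavily simplified item: a title plus named file contents. */
    static final class Item {
        final String title;
        final Map<String, String> files = new LinkedHashMap<>();

        Item(String title) {
            this.title = title;
        }
    }

    /** Hypothetical package: a manifest (metadata) plus the packaged files. */
    static final class Pkg {
        final Map<String, String> manifest = new LinkedHashMap<>();
        final Map<String, String> files = new LinkedHashMap<>();
    }

    /** Reads a package and creates an item. */
    interface Ingester {
        Item ingest(Pkg pkg);
    }

    /** Writes out the contents of an item as a package. */
    interface Disseminator {
        Pkg disseminate(Item item);
    }

    /** Name-keyed registries standing in for a plugin manager (an assumption). */
    static final Map<String, Ingester> INGESTERS = new LinkedHashMap<>();
    static final Map<String, Disseminator> DISSEMINATORS = new LinkedHashMap<>();

    static {
        // A made-up "toy" format that supports both directions.
        INGESTERS.put("toy", pkg -> {
            Item item = new Item(pkg.manifest.getOrDefault("title", "(untitled)"));
            item.files.putAll(pkg.files);
            return item;
        });
        DISSEMINATORS.put("toy", item -> {
            Pkg pkg = new Pkg();
            pkg.manifest.put("title", item.title);
            pkg.files.putAll(item.files);
            return pkg;
        });
        // A made-up format with a disseminator only; no matching ingester is registered.
        DISSEMINATORS.put("export-only", item -> {
            Pkg pkg = new Pkg();
            pkg.manifest.put("title", item.title);
            return pkg;
        });
    }

    public static void main(String[] args) {
        Item item = new Item("Sample thesis");
        item.files.put("page1.tif", "(image bytes)");

        Pkg pkg = DISSEMINATORS.get("toy").disseminate(item);
        Item copy = INGESTERS.get("toy").ingest(pkg);

        System.out.println(copy.title + " " + copy.files.keySet());
        System.out.println("export-only has ingester: " + INGESTERS.containsKey("export-only"));
    }
}
```

In a fuller design the manifest handling inside each lambda would be delegated to a crosswalk rather than written inline.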
2.5. Crosswalk Plugins
Crosswalks are software modules that translate between DSpace object metadata and a specific external representation.
An Ingestion Crosswalk interprets the external format and crosswalks it to DSpace's internal data structure, while a
Dissemination Crosswalk does the opposite.
For example, a MODS ingestion crosswalk translates descriptive metadata from the MODS format to the metadata
fields on a DSpace Item. A MODS dissemination crosswalk generates a MODS document from the metadata on a
DSpace Item.
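The two crosswalk directions can be illustrated with a single field-mapping table applied forwards for ingestion and backwards for dissemination. This is a sketch under stated assumptions: the external field names (ext:title, ext:author), the internal field names, and the flat key/value record shape are invented for the example and do not reflect the real MODS schema or the DSpace crosswalk interfaces.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CrosswalkSketch {

    /** Illustrative mapping: external field name -> internal field name (both made up). */
    static final Map<String, String> EXTERNAL_TO_INTERNAL = new LinkedHashMap<>();

    static {
        EXTERNAL_TO_INTERNAL.put("ext:title", "title");
        EXTERNAL_TO_INTERNAL.put("ext:author", "contributor.author");
    }

    /** Ingestion direction: external record -> internal fields. Unmapped fields are dropped. */
    static Map<String, String> ingest(Map<String, String> external) {
        Map<String, String> internal = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : external.entrySet()) {
            String target = EXTERNAL_TO_INTERNAL.get(field.getKey());
            if (target != null) {
                internal.put(target, field.getValue());
            }
        }
        return internal;
    }

    /** Dissemination direction: internal fields -> external record, using the same table in reverse. */
    static Map<String, String> disseminate(Map<String, String> internal) {
        Map<String, String> external = new LinkedHashMap<>();
        for (Map.Entry<String, String> mapping : EXTERNAL_TO_INTERNAL.entrySet()) {
            String value = internal.get(mapping.getValue());
            if (value != null) {
                external.put(mapping.getKey(), value);
            }
        }
        return external;
    }

    public static void main(String[] args) {
        Map<String, String> external = new LinkedHashMap<>();
        external.put("ext:title", "A Sample Thesis");
        external.put("ext:author", "Doe, Jane");
        external.put("ext:unmapped", "ignored on ingest");

        Map<String, String> internal = ingest(external);
        System.out.println(internal);
        System.out.println(disseminate(internal));
    }
}
```

Keeping both directions driven by one table means a round trip preserves every mapped field, while anything outside the table is lost on ingestion.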