The Impact of the MIT-BIH Arrhythmia Database
History, Lessons Learned, and Its Influence on Current and Future Databases
The MIT-BIH Arrhythmia Database was
the first generally available set of standard
test material for evaluation of arrhythmia
detectors, and it has been used
for that purpose as well as for basic research
into cardiac dynamics at about 500
sites worldwide since 1980. It has lived a
far longer life than any of its creators ever
expected. Together with the American
Heart Association (AHA) Database, it
played an interesting role in stimulating
manufacturers of arrhythmia analyzers to
compete on the basis of objectively measurable
performance, and much of the current
appreciation of the value of common
databases, both for basic research and for
medical device development and evaluation,
can be attributed to this experience.
In this article, we briefly review the history
of the database, describe its contents,
discuss what we have learned about database
design and construction, and take a
look at some of the later projects that have
been stimulated by both the successes and
the limitations of the MIT-BIH Arrhythmia
Database.
Nature of the Data
Electrocardiograms (ECGs) are very
widely used as an inexpensive and
noninvasive means of observing the physiology
of the heart. In 1961, Holter [1] introduced
techniques for continuous
recording of the ECG in ambulatory subjects
over periods of many hours; the
long-term ECG (Holter recording), typically
with a duration of 24 hours, has since
become the standard technique for observing
transient aspects of cardiac electrical
activity.
Since the mid-1970s, our research group
has studied abnormalities of cardiac rhythm
(arrhythmias) as reflected in long-term
ECGs as well as automated methods for
identifying arrhythmias. Many other research
groups in academia and industry
have had similar interests. Until 1980, it was
necessary for those wishing to pursue such
work to collect their own data. Although the
recordings themselves are plentiful, access
to these data is not universal, and thorough
characterization of the recorded waveforms
is a tedious and expensive process. Furthermore,
there is very wide variability in ECG
rhythms and in details of waveform morphology,
both between subjects and within
individuals over time, so that a useful representative
collection of long-term ECGs for
research must include many recordings.
During the 1960s and 1970s, development
of automated arrhythmia analysis algorithms
was hampered by a lack of
universally accessible data. Each group
that performed such work acquired its
own set of recordings and often
evaluated its algorithms using the
same data that had been used to develop
them. From the earliest days,
it was clear that performance of these algorithms
was invariably data-dependent,
and the use of different data for the evaluation
of each algorithm did not permit objective
comparisons of algorithms from
different groups.
Selection of Data
In 1975, recognizing that we would
need a suitable set of well-characterized
long-term ECGs for our own research, we
began collecting, digitizing, and annotating
long-term ECG recordings obtained
by the Arrhythmia Laboratory of
Boston’s Beth Israel Hospital (BIH; now
the Beth Israel Deaconess Medical Center).
From the outset, however, we
planned to make these recordings available
to the research community at large, in
order to stimulate work in this field and to
encourage strictly reproducible and objectively
comparable evaluations of different
algorithms [2]. We expected that
the availability of a common database
May/June 2001 IEEE ENGINEERING IN MEDICINE AND BIOLOGY 45
0739-5175/01/$10.00 ©2001 IEEE
George B. Moody and Roger G. Mark
Harvard-MIT Division of
Health Sciences and Technology