A Min-Max Framework of Cascaded Classifier with Multiple Instance Learning
for Computer Aided Diagnosis
Dijia Wu (1,∗), Jinbo Bi (2), Kim Boyer (1)
(1) Rensselaer Polytechnic Institute, Troy, NY 12180 USA, wud5@rpi.edu
(2) Siemens Medical Solutions, Malvern, PA 19355 USA, jinbo.bi@siemens.com
Abstract
Computer aided diagnosis (CAD) problems of detecting
potentially diseased structures in medical images are
typically distinguished by the following challenging
characteristics: extremely unbalanced data between the negative
and positive classes; stringent real-time requirements for
online execution; and multiple positive candidates generated for
the same malignant structure that are highly correlated and
spatially close to each other. To address all these problems,
we propose a novel learning formulation to combine cas-
cade classification and multiple instance learning (MIL) in
a unified min-max framework, leading to a joint optimiza-
tion problem which can be converted to a tractable quadrat-
ically constrained quadratic program and efficiently solved
by block-coordinate optimization algorithms.
We apply the proposed approach to the CAD problems of
detecting pulmonary embolism and colon cancer from com-
puted tomography images. Experimental results show that
our approach significantly reduces the computational cost
while yielding comparable detection accuracy to the current
state-of-the-art MIL or cascaded classifiers. Although not
specifically designed for balanced MIL problems, the pro-
posed method achieves superior performance on balanced
MIL benchmark data such as MUSK and image data sets.
1. Introduction
Over the years, computer aided diagnosis (CAD) sys-
tems have been widely used to assist physicians in inter-
preting medical images from different modalities such as
magnetic resonance imaging (MRI), X-ray, and computed
tomography (CT) and to identify potentially diseased regions
such as lesions or tumors. Most CAD systems comprise three
stages: identify candidate structures, i.e., potentially
unhealthy regions, in the image; generate features for
each candidate; classify each candidate as normal (negative)
or diseased (positive). To maintain high sensitivity,
a very large number of candidates are generated in the first
stage because any malignant region missed at this stage can
never be recovered later in the CAD system. Consequently,
the majority of the candidates generated, typically more than
99%, are false positives, which makes the data extremely
unbalanced. In this situation, cascaded classifiers can be
used to speed up candidate classification by quickly discarding
numerous negative samples with low-cost features
at early stages and spending more computation on promising
disease-like candidates [15].
∗ This work was conducted when Dijia Wu was with Siemens Medical
Solutions at Malvern PA, USA.
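The early-rejection behaviour of a cascade described above can be
sketched in a few lines. This is only an illustration under stated
assumptions, not the formulation developed in this paper: each stage is
assumed to be a linear classifier that rejects a candidate when its score
is below zero, and the names cascade_predict, X_stages, weights and
biases are hypothetical.

```python
import numpy as np

def cascade_predict(X_stages, weights, biases):
    """Run candidates through a cascade of linear stages.

    X_stages[k] holds the stage-k features of every candidate (cheap
    features first). A candidate is rejected as negative as soon as one
    stage scores it below zero; only survivors reach later stages.
    """
    n = X_stages[0].shape[0]
    alive = np.ones(n, dtype=bool)      # candidates not yet rejected
    evaluated = []                      # number of candidates scored per stage
    for X, w, b in zip(X_stages, weights, biases):
        idx = np.flatnonzero(alive)     # score survivors only
        evaluated.append(int(idx.size))
        scores = X[idx] @ w + b
        alive[idx[scores < 0]] = False  # reject at this stage
    return alive, evaluated

# Toy data (assumed): four candidates, two stages.
X1 = np.array([[1.0], [-1.0], [2.0], [-3.0]])
X2 = np.array([[1.0, 0.0], [5.0, 5.0], [-1.0, 0.0], [5.0, 5.0]])
alive, evaluated = cascade_predict(
    [X1, X2],
    [np.array([1.0]), np.array([1.0, 1.0])],
    [0.0, -0.5],
)
print(alive, evaluated)
```

In this toy run the second stage scores only the two candidates that
survive the first stage, which is where the computational saving of a
cascade comes from.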
Moreover, for CAD data, a candidate is labeled as posi-
tive if it is sufficiently close to a radiologist’s mark (ground
truth) and labeled as negative otherwise. Multiple candi-
dates are usually generated corresponding to the same ab-
normal structure so that if any such candidate is detected,
the underlying structure is found. Therefore, CAD problems
are better modeled as multiple instance learning (MIL)
problems, in which all candidates within a certain distance of
a radiologist's mark are grouped into a positive bag [6].
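As a minimal sketch of the bag construction described above, the
following groups candidates by their distance to each mark and declares
a bag detected if any of its instances is classified positive. The
Euclidean distance, the radius value, the zero decision threshold, and
the names build_bags and bag_detected are assumptions made for
illustration only.

```python
import numpy as np

def build_bags(candidates, marks, radius):
    """Group candidates into one positive bag per radiologist mark.

    candidates: (n, d) array of candidate locations.
    marks:      (m, d) array of ground-truth mark locations.
    Returns (bags, negatives): bags[j] lists the indices of candidates
    within `radius` of mark j; negatives lists candidates near no mark.
    """
    diff = candidates[:, None, :] - marks[None, :, :]
    dist = np.linalg.norm(diff, axis=2)             # shape (n, m)
    bags = [np.flatnonzero(dist[:, j] <= radius) for j in range(marks.shape[0])]
    negatives = np.flatnonzero((dist > radius).all(axis=1))
    return bags, negatives

def bag_detected(scores, bag):
    """A bag counts as detected if any of its instances scores >= 0."""
    return bag.size > 0 and bool(np.max(scores[bag]) >= 0)

# Toy data (assumed): five candidates, two marks.
candidates = np.array([[0, 0, 0], [1, 0, 0], [10, 10, 10],
                       [11, 10, 10], [50, 50, 50]], dtype=float)
marks = np.array([[0.5, 0, 0], [10, 10, 10]], dtype=float)
bags, negatives = build_bags(candidates, marks, radius=2.0)
scores = np.array([-1.0, 0.3, -2.0, -0.1, -5.0])    # hypothetical classifier outputs
print([bag_detected(scores, b) for b in bags], negatives)
```

Taking the maximum score over a bag encodes the statement that finding
any one candidate of a structure is enough to find the structure.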
In this paper, we propose a novel approach to combine
MIL classifiers in a cascade. In particular, we start by
formulating MIL as an optimization problem in a min-max
framework in Section 2. Section 3 reviews the joint
optimization principle [5] used to construct all hyperplane
classifiers of a cascade in one shot, and describes a new
min-max formulation for optimization of the cascade. The
two min-max frameworks are fused as discussed in Sec-
tion 4 to form a unified approach that optimizes a cascade
of MIL classifiers simultaneously. Experimental results on
two CAD applications and MIL benchmark datasets are
given in Section 5 together with some discussion. We con-
clude with a review of our contributions and potential ex-
tensions in Section 6.
978-1-4244-3991-1/09/$25.00 ©2009 IEEE