Automated Quantitative Image Analysis of Hematoxylin-eosin Staining Slides in
Lymphoma Based on Hierarchical Kmeans Clustering
Peng Shi
School of Mathematics and Computer Science
Fujian Normal University
Fuzhou, China
e-mail: pshi@fjnu.edu.cn
Jing Zhong, Rongfang Huang, and Jianyang Lin
Fujian Provincial Cancer Hospital
Fuzhou, China
Abstract—Microscopic images of tissue sections stained with
hematoxylin-eosin (HE) are an essential part of histopathology
research. Automated HE image processing remains
challenging because the shapes and distributions of cells and
other tissue structures are highly irregular and lack clear
boundaries, especially in high-throughput analysis, which
demands accurate and efficient quantification
for the reference of pathologists. To address this problem, we
propose an automated quantitative image analysis pipeline
based on hierarchical clustering of local correlations, which
segments the image into nuclei, cytoplasm and extracellular
spaces by classifying image pixels on the basis of local
correlation features. Segmentation of precise nucleus
boundaries is then performed, and finally a set of indicators
characterizing tissue structures is extracted to complete the
quantification of HE images. Experimental results showed high
accuracy and adaptability in cell segmentation despite data
variance. The quantitative indicators obtained in this paper
provide reliable evidence for the analysis of HE-stained
lymphoma pathological images.
Keywords-hematoxylin-eosin staining; image segmentation;
tumor microstructure; lymphoma; kmeans clustering
I. INTRODUCTION
Hematoxylin-eosin (HE) staining is one of the most
commonly used techniques for staining pathological paraffin
sections, especially in the analysis of tumor tissue microscopic
images [1,2]: the nucleus is stained blue-purple by
alkaline hematoxylin, while the cytoplasm is stained red by
acidic eosin. Lymphoma is a malignant tumor whose
incidence has risen in recent years. Infiltrated lymph nodes are
difficult to distinguish from the metastatic lymph nodes of
other malignant carcinomas. Generally, lymphoma has two
main types, Hodgkin's disease (HD) and non-Hodgkin's
lymphoma (NHL), each with many complicated pathological
subtypes [3]. Biological research on lymphoma
is often conducted on HE pathological
images using ImageJ [4], but it remains difficult because the
various color aggregations often show no obvious differences
from, or clear borders with, one another.
In recent years, automated segmentation of HE-stained
images has been an important topic in computer-aided
pathological analysis. A series of studies have addressed
the automated segmentation of histopathologic HE
images so far, most of which use segmentation methods
based on morphology [5,6], textural features [7-10] or
classification [11-14]. Morphological methods exploit
pixel intensity, gradient flow and other characteristic
morphological differences between the two sides of cell
boundaries, on which level set [5], active
contour [6] and other algorithms rely to locate boundaries.
Methods based on textural features usually adopt local
textural features extracted from graph run length matrices
(GLSM) or gray-level co-occurrence matrices (GLCM) for
boundary detection, while classification methods take the
single pixel as the object of study, so that pixels in the same
category together make up the cells and
other tissue components. However, when analyzing actual
pathological sections of lymphoma, the above methods may
face several difficulties. First, there are no clear boundaries
between stained nuclei, cytoplasm and extracellular
spaces (ECS). Second, cell nuclei often adhere to one
another because of the extremely high cell density in tissue
sections. Third, because nucleus shapes are so diverse, it is
difficult to establish shape models for nuclei detection and
segmentation.
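The classification idea described above, in which every pixel is assigned to a category, can be illustrated with a minimal sketch. It runs plain k-means on raw RGB values with k = 3 and is not the hierarchical, local-correlation-based method proposed in this paper. The function name `kmeans_pixels`, the intensity-quantile initialisation, the synthetic colours, and the reading of the darkest, intermediate and brightest clusters as nucleus-like, cytoplasm-like and ECS-like regions are all assumptions made for illustration.

```python
import numpy as np


def kmeans_pixels(image, k=3, n_iter=20):
    """Cluster the pixels of an H x W x C image into k colour groups.

    Plain k-means on raw colour values (an illustrative simplification).
    Returns (labels, centroids), with clusters ordered from darkest to
    brightest centroid.
    """
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(float)

    # Deterministic initialisation (assumption): take pixels at evenly
    # spaced positions along the sorted mean intensity.
    order = np.argsort(pixels.mean(axis=1))
    picks = np.linspace(0, len(pixels) - 1, k).astype(int)
    centroids = pixels[order[picks]].copy()

    def assign(cents):
        # Squared Euclidean distance from every pixel to every centroid.
        dist = ((pixels[:, None, :] - cents[None, :, :]) ** 2).sum(axis=2)
        return dist.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centroids)
        updated = centroids.copy()
        for j in range(k):
            members = pixels[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated

    labels = assign(centroids)

    # Relabel so that 0 is the darkest cluster and k - 1 the brightest.
    by_brightness = np.argsort(centroids.mean(axis=1))
    rank = np.empty(k, dtype=int)
    rank[by_brightness] = np.arange(k)
    return rank[labels].reshape(h, w), centroids[by_brightness]


# Synthetic 6 x 6 "slide" with three flat colour bands (hypothetical values):
# dark purple (nucleus-like), pink (cytoplasm-like), near-white (ECS-like).
demo = np.zeros((6, 6, 3), dtype=np.uint8)
demo[0:2] = (80, 40, 120)
demo[2:4] = (230, 150, 180)
demo[4:6] = (245, 240, 245)

labels, centroids = kmeans_pixels(demo, k=3)
# Fraction of the image area assigned to each cluster.
fractions = np.bincount(labels.ravel(), minlength=3) / labels.size
print(fractions)
```

On real slides, colour alone rarely separates the three classes this cleanly, which is consistent with the difficulties listed above: unclear boundaries and touching nuclei call for richer per-pixel features than raw RGB.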
Figure 1. Workflow of proposed lymphoma HE image processing and
analysis pipeline
In practice, high-throughput computer-aided HE
image processing needs high adaptability and little human
intervention to avoid staining variability and subjective error,
2016 8th International Conference on Information Technology in Medicine and Education
978-1-5090-3906-7/16 $31.00 © 2016 IEEE
DOI 10.1109/ITME.2016.190