One of the most exciting aspects of deep learning-enhanced optical microscopy and image
transformations is that, once the network is trained, the inference is non-iterative and very fast to
compute without the need for any parameter search or optimization. In this sense, compared to other
solutions to inverse problems in optics, such as deconvolution, convex optimization, or compressive
sampling/sensing techniques, a trained deep neural network presents significant advantages in terms of
computation speed and inference time, even on modest computers and processors. In fact, with the
emergence of new processor architectures that are specifically optimized for deep learning, this
important advantage will become even more significant, enabling real-time neural network inference
even on mobile phones and other low-end consumer devices.
It is true that the network training process can be quite lengthy (from a few hours to more than a
day), depending on, e.g., the size of the training data, the available hardware, and the model complexity;
however, once the model is trained it remains fixed and can also be used to warm-start new models
when new data become available or new tasks arise. This process of adapting a trained neural
network to a new task with new data is known as transfer learning, and it significantly accelerates a
new network’s training.
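The warm-starting idea behind transfer learning can be illustrated with a minimal sketch. The toy two-layer NumPy model below is purely illustrative (it is not the architecture used in [2]): the learned feature extractor of a "pretrained" model is copied into a fresh model, and only the new task-specific head is fine-tuned.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_model(d_in, d_hidden, d_out):
    # A toy two-layer network: hidden = relu(x @ W1), output = hidden @ W2.
    return {"W1": rng.normal(0.0, 0.1, (d_in, d_hidden)),
            "W2": rng.normal(0.0, 0.1, (d_hidden, d_out))}

def forward(model, x):
    hidden = np.maximum(x @ model["W1"], 0.0)
    return hidden, hidden @ model["W2"]

def mse(pred, y):
    return float(np.mean((pred - y) ** 2))

# Stand-in for a network already trained on the original task.
pretrained = init_model(8, 16, 1)

# Warm start: reuse the learned feature extractor (W1) and re-initialize
# only the task-specific output head (W2) for the new task.
model = {"W1": pretrained["W1"].copy(),
         "W2": rng.normal(0.0, 0.1, (16, 1))}

# Fine-tune only the head on (synthetic) new data; W1 stays frozen, so far
# fewer parameters need to be learned than when training from scratch.
x = rng.normal(size=(64, 8))
y = rng.normal(size=(64, 1))
loss_before = mse(forward(model, x)[1], y)
for _ in range(200):
    hidden, pred = forward(model, x)
    grad_W2 = hidden.T @ (2.0 * (pred - y) / y.size)  # dMSE/dW2
    model["W2"] -= 0.1 * grad_W2
loss_after = mse(forward(model, x)[1], y)
```

In practice the same pattern is applied to deep convolutional networks: early layers are initialized from the trained model (and optionally updated with a small learning rate), while the output layers are retrained on the new data.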
One of the first applications of deep learning to enhance optical microscopy images was demonstrated
using bright-field microscopy [2]. In this work, histochemically stained lung tissue sections were imaged
using a bright-field microscope with a 40×/0.95NA objective lens to obtain lower-resolution (LR) images
of the specimen, and a 100×/1.4NA oil-immersion objective lens was used to obtain the corresponding
high-resolution (HR) labels, or gold-standard images, for training a convolutional neural network. A
deep neural network architecture was then designed to transform LR images (used as input) into
enhanced images that match the HR labels. The task that the network aimed to learn in this case
can be thought of as predicting the pixel values of the HR image, given the LR input. Therefore, an
important step prior to training was to correctly align, or register, the LR and HR training images with
respect to each other; this forces the deep network to learn solely the LR-to-HR transformation,
rather than some arbitrary affine transformation between the input and output images. Following the
accurate alignment of the images, the network model can be trained with the matched LR and HR image
pairs. One key advantage of this deep learning-based image transformation is that, unlike other image
enhancement or deconvolution methods, no a priori information about the object or the image
formation process is needed. Stated differently, the point-spread function, spatial and spectral
aberrations, illumination properties, and other physical parameters of the imaging system or the
object, together with their impact on the acquired image, do not need to be known or estimated,
since the neural network inherently learns these details from the training image data within its
multi-dimensional solution space.
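The registration step described above can be illustrated with a minimal sketch. The actual registration pipeline in [2] is more involved; the snippet below, on synthetic data, only recovers a rigid integer-pixel translation between two fields of view via phase correlation:

```python
import numpy as np

def phase_correlation_shift(ref, mov):
    """Return the integer (dy, dx) translation to apply to `mov` so that it
    aligns with `ref`, estimated from the normalized cross-power spectrum."""
    cross = np.fft.fft2(ref) * np.conj(np.fft.fft2(mov))
    cross /= np.abs(cross) + 1e-12            # keep phase information only
    corr = np.fft.ifft2(cross).real           # sharp peak at the shift
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Wrap peaks in the upper half of each axis to negative shifts.
    if dy > ref.shape[0] // 2:
        dy -= ref.shape[0]
    if dx > ref.shape[1] // 2:
        dx -= ref.shape[1]
    return dy, dx

# Synthetic example: an "HR label" and a copy misaligned by (3, -5) pixels,
# standing in for the two fields of view before registration.
rng = np.random.default_rng(1)
hr = rng.random((64, 64))
misaligned = np.roll(hr, shift=(3, -5), axis=(0, 1))

dy, dx = phase_correlation_shift(hr, misaligned)
aligned = np.roll(misaligned, shift=(dy, dx), axis=(0, 1))
```

For real LR/HR microscopy pairs, sub-pixel accuracy, scaling, and rotation would also have to be corrected; libraries such as scikit-image offer ready-made routines (e.g., `phase_cross_correlation`) for this purpose.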
Following its training, this bright-field microscopic image enhancement network was blindly tested on
Masson’s Trichrome stained lung tissue sections, which were taken from a different patient. The
network output, in response to LR input images, super-resolved the blurry and distorted features of the
input, producing images similar to those acquired with a 100×/1.4NA objective [2].
This trained network was also robust to new types of samples that were not part of the training
data (see Fig. 2). For example, the same model was tested on a different tissue type (kidney), which was
also stained using the Masson’s Trichrome stain. The inference results showed that the network was
indeed able to enhance the resolution of the imaged specimen, demonstrating the generalization of its