model for nano-scale defect inspection and classification. An unsupervised machine learning model was applied
to reduce false-positive defects arising from SEM noise. Dey et al. also benchmarked another ensemble defect
detection framework based on the state-of-the-art one-stage object detection model YOLOv5 [9], which outperformed
the previous ResNet-based ensemble framework.
A data-centric approach was also demonstrated [10], showing a way to ensemble labels by consensus from
different anonymous labelers for the same dataset, requiring minimal expert intervention to meet defect labeling
expectations. The final combined label, along with expert-aware post-processing data, yielded the highest test
mAP of 0.919 with the state-of-the-art YOLOv8 model. Pseudo-label prediction and model ensembling across
different labeling partitions were also proposed to avoid duplicate labeling work. Cheon et al. [11] utilized
a Convolutional Neural Network (CNN) along with a k-Nearest Neighbors (k-NN) method to classify wafer surface
defects from SEM images. The final activation layer outputs were used as features for k-NN, with a
threshold for clustering that allows the algorithm to identify unknown defect classes. However, when a new class
is discovered, the CNN needs retraining to ensure consistent high performance over time. To address this issue,
the authors proposed the idea of incremental training. In a previous study [12], noise characterization for future
denoising algorithms demonstrated that the noise distribution shifts from Gaussian to Gamma as scanning speed
increases. In recent years, CD-SEM tool vendors have also incorporated machine learning to enhance tool
efficiency [2, 3].
The idea of ML-based super-resolution defect inspection was proposed [3],
where a CNN model takes the database of designed features and CD-SEM images as input and generates the
pixel distribution for defect prediction. Their model utilized non-defect regions to train the algorithm, generating
the pixel luminosity distribution for the given database. If the brightness value of a pixel in the original image
deviated significantly from the center of the predicted pixel distribution, the pixel was classified as defective. This
approach demonstrated the potential for identifying bridge and necking defects even in lower-resolution
images of 4 nm/pixel. However, this method requires knowledge of the design database (mask layout),
making automated defect type classification challenging. Nonetheless, their research highlights the advantages
of massive metrology, showcasing how a FOV of 80 × 80 µm can facilitate automatic defect identification
through machine learning. Thus, the development of a super-resolution ML method capable of both identifying
and classifying defects without requiring knowledge of the mask design remains a crucial missing component.
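The pixel-level test described above for the design-database-based approach (a pixel is flagged when its brightness deviates significantly from the center of the predicted distribution) can be sketched as follows. This is only an illustrative sketch, not the cited authors' implementation: it assumes the predicted distribution is summarized by a per-pixel mean and standard deviation, and that "significantly" means more than `k` standard deviations. The function name `flag_defective_pixels`, the threshold `k = 3`, and the toy image are all hypothetical.

```python
import numpy as np


def flag_defective_pixels(image, predicted_mean, predicted_std, k=3.0):
    """Return a boolean mask of pixels lying more than k sigma from the predicted center.

    Assumption: the predicted per-pixel distribution is described by a mean
    and a standard deviation; the k-sigma rule is an illustrative choice.
    """
    image = np.asarray(image, dtype=np.float64)
    mean = np.asarray(predicted_mean, dtype=np.float64)
    # Guard against zero spread so the division below is always defined.
    std = np.maximum(np.asarray(predicted_std, dtype=np.float64), 1e-6)
    deviation = np.abs(image - mean) / std
    return deviation > k


# Toy data (hypothetical): a flat 8x8 patch with mild noise and one bright outlier.
rng = np.random.default_rng(0)
predicted_mean = np.full((8, 8), 100.0)
predicted_std = np.full((8, 8), 5.0)
image = predicted_mean + rng.normal(0.0, 1.0, size=(8, 8))
image[3, 4] = 160.0  # simulated defective pixel, 12 sigma above the center

mask = flag_defective_pixels(image, predicted_mean, predicted_std)
print(np.argwhere(mask))
```

In practice the threshold would be tuned against the noise level of the imaging condition, since a fixed `k` trades missed defects against false positives.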
3. METHODOLOGY
In this section, we provide a brief introduction to our training datasets, the proposed data augmentation strategy,
an overview of the architecture of our proposed SEMI-SuperYOLO-NAS, and the strategy for preparing the
dataset of High-resolution/Low-resolution (HR/LR) image pairs to train our SR-branch assisted defect detector
model.
3.1 Dataset
Two Line-Space (LS) pattern datasets were utilized. The first dataset, SEM-ADI, comprises 1324 images captured
with the CD-SEM tool during After-Development-Inspection (ADI). The second dataset, EDR-AEI, comprises
527 images captured with the EDR tool during After-Etching-Inspection (AEI). Each image in these datasets
contains at least one defect. SEM-ADI and EDR-AEI have original pixel resolutions of 1024×1024 and 480×480,
respectively. SEM-ADI encompasses 5 different defect types, while EDR-AEI has 4 different defect types. We
divided the two datasets into training and validation sets, as outlined in Table 1 and Table 2, where we also
provide details regarding the original defect classes and their respective total number of instances. Fig. 3
illustrates the defect types in the SEM-ADI dataset, and Fig. 4 presents those in the EDR-AEI
dataset. The imbalance in defect classes and the limited availability of data in both datasets prompted us to
develop novel augmentation strategies tailored to SEM and EDR imaging characteristics and assess their impact
on model performance.
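One simple way to quantify the class imbalance mentioned above is to compute inverse-frequency weights from the per-class instance counts. The sketch below is a generic illustration, not the procedure used in this work: the class names and counts are hypothetical placeholders (the actual counts are those reported in Table 1 and Table 2), and the helper `inverse_frequency_weights` is an assumed name.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Map each class to total / (n_classes * class_count).

    A perfectly balanced dataset gives every class a weight of 1.0;
    rarer classes receive proportionally larger weights.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    n_classes = len(counts)
    return {cls: total / (n_classes * n) for cls, n in counts.items()}


# Hypothetical toy labels, NOT the real dataset statistics.
labels = ["class_a"] * 6 + ["class_b"] * 3 + ["class_c"] * 1
weights = inverse_frequency_weights(labels)
for cls, w in sorted(weights.items()):
    print(f"{cls}: {w:.3f}")
```

Such weights could, for instance, guide how often images containing a rare defect class are selected for augmentation.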
3.2 Proposed Data-augmentation strategy
We implemented different image augmentation [13] techniques to enhance the robustness of our model and address
imbalances in the number of instances across defect classes. Within our image augmentation pipeline, utilizing
the Albumentations library [14], we integrated the following enhancements to introduce variability in illumination,
noise, and atmospheric conditions: