Semi-supervised Transfer Learning for Convolutional Neural Network based
Chinese Character Recognition
Yejun Tang, Bing Wu, Liangrui Peng and Changsong Liu
Tsinghua National Laboratory for Information Science and Technology
Department of Electronic Engineering, Tsinghua University, Beijing, China
Email: {tangyj,plr,lcs}@ocrserv.ee.tsinghua.edu.cn, bingwuthu@gmail.com
Abstract—Although transfer learning has attracted great interest from researchers, how to utilize unlabeled data remains an open and important problem in this field. We propose a novel semi-supervised transfer learning (STL) method that incorporates the Multi-Kernel Maximum Mean Discrepancy (MK-MMD) loss into the traditional fine-tuning based Convolutional Neural Network (CNN) transfer learning framework for Chinese character recognition. The proposed method includes three steps. First, a CNN model is trained on massive labeled samples in the source domain. Then the CNN model is fine-tuned with a few labeled samples in the target domain. Finally, the CNN model is trained with both a large number of unlabeled samples and the limited labeled samples in the target domain to minimize the MK-MMD loss. Experiments investigate detailed configurations and parameters of the proposed STL method with several widely used CNN architectures, including AlexNet,
GoogLeNet, and ResNet. Experimental results on practical
Chinese character transfer learning tasks, such as Dunhuang
historical Chinese character recognition, indicate that the pro-
posed method can significantly improve recognition accuracy
in the target domain.
I. INTRODUCTION
With the emergence of deep learning, Optical Character
Recognition (OCR) has achieved great progress in recent
years. However, the deep learning framework faces two challenges. First, training a deep neural network requires massive labeled samples, which are hard to obtain in some tasks. Second, many machine learning methods work well only under the assumption that the training data and the test data follow exactly the same distribution [1], whereas the two distributions often differ slightly in real-world scenarios, so the performance of these methods is likely to be unsatisfactory there. The domain of the training samples is often denoted as the source domain, and the domain of the test samples is denoted as the target domain. In such cases, transfer learning is necessary to transfer classification knowledge from the source domain to the target domain.
Transfer learning can be classified into three categories according to the training data: supervised transfer learning, whose training samples are labeled; unsupervised transfer learning, whose training samples are unlabeled; and semi-supervised transfer learning, whose training data comprise mostly unlabeled samples and a few labeled samples. Supervised transfer learning is currently the most straightforward and commonly used approach, in both the traditional feature-extractor-and-classifier framework and the deep neural network framework. In the traditional framework, Zhang et al. [2] proposed a linear style transfer mapping method, Li et al. [3] applied this method to historical Chinese character recognition, and Feng et al. [4] proposed a nonlinear transfer mapping method based on Gaussian processes. These methods utilize the samples in the source domain and the labeled samples in the target domain to train a parameter transfer mapping unit; the linear case is sketched below.
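The sketch below fits a regularized linear style transfer mapping in NumPy: given target-domain features x_i and corresponding source-domain targets t_i (e.g. class means), it solves min_{A,b} sum_i ||A x_i + b - t_i||^2 + beta*||A - I||_F^2 + gamma*||b||^2 in closed form. The function name, regularization weights, and least-squares layout are illustrative assumptions, not the exact formulation of [2].

    import numpy as np

    def style_transfer_mapping(X, T, beta=1.0, gamma=1.0):
        """Fit an affine map (A, b) minimizing
        sum_i ||A x_i + b - t_i||^2 + beta*||A - I||_F^2 + gamma*||b||^2.
        X: (n, d) target-domain features; T: (n, d) source-domain targets."""
        n, d = X.shape
        Xa = np.hstack([X, np.ones((n, 1))])           # augmented features [x; 1]
        # Ridge-style regularizer: pull A toward the identity and b toward zero.
        R = np.diag([beta] * d + [gamma])
        W0 = np.vstack([np.eye(d), np.zeros((1, d))])  # regularization target [I; 0]
        # Normal equations: (Xa^T Xa + R) W = Xa^T T + R W0, with W = [A^T; b^T].
        W = np.linalg.solve(Xa.T @ Xa + R, Xa.T @ T + R @ W0)
        return W[:d].T, W[d]                           # A: (d, d), b: (d,)

At recognition time, a target-domain feature x would be mapped to A x + b and then classified with the source-domain classifier.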
In the deep neural network framework, Oquab et al. [5] added an adaptation layer on top of AlexNet and transferred the weights by fine-tuning the network with labeled samples in the target domain for an image recognition task. Zhang et al. [6] added an unsupervised adaptation layer to their network to adapt to the variation of writing styles in handwritten Chinese character recognition tasks. Tang et al. [7] applied the parameter fine-tuning based transfer learning method to the Dunhuang historical Chinese character recognition task. The method proved effective, but its recognition accuracy depends heavily on the number of fine-tuning samples; a minimal fine-tuning sketch follows.
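The sketch below illustrates such parameter fine-tuning in PyTorch; the network choice, checkpoint path, class count, frozen layers, and learning rate are illustrative assumptions rather than the configurations used in [5]-[7].

    import torch
    import torch.nn as nn
    from torchvision import models

    NUM_CLASSES = 3755  # e.g. the GB2312 level-1 character set (illustrative)

    # A torchvision AlexNet stands in for the source-domain CNN; the
    # checkpoint path is assumed to hold the source-domain weights.
    net = models.alexnet(num_classes=NUM_CLASSES)
    net.load_state_dict(torch.load("source_cnn.pth"))

    # Optionally freeze the convolutional layers, which tend to capture
    # generic stroke features, and fine-tune the fully connected layers.
    for p in net.features.parameters():
        p.requires_grad = False

    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(
        (p for p in net.parameters() if p.requires_grad),
        lr=1e-4, momentum=0.9)  # small learning rate preserves source knowledge

    net.train()
    for images, labels in target_loader:  # assumed DataLoader over the few labeled target samples
        optimizer.zero_grad()
        loss = criterion(net(images), labels)
        loss.backward()
        optimizer.step()

A small learning rate and frozen early layers reduce the risk of overwriting the source-domain knowledge when only a few target samples are available.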
Due to this limitation of supervised transfer learning, researchers have recently shown increasing interest in unsupervised transfer learning. Domain adaptation is one of the major approaches to unsupervised transfer learning. It aims to find a representation that minimizes the discrepancy between the probability distributions of the source domain and the target domain. The key problem in this process is how to compare probability distributions and define their discrepancy. Various similarity measures have been used, such as the Kullback-Leibler divergence, the total variation distance [8], the Kolmogorov distance [9], and the Wasserstein distance [10]. Gretton et al. [11] showed that a kernel embedding of probability distributions into a reproducing kernel Hilbert space (RKHS) allows two probability measures to be compared via the distance between their respective embeddings, and proposed the Maximum Mean Discrepancy (MMD), which yields a consistent estimate at low computational cost; a sketch of the empirical estimate is given below.
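MMD embeds each distribution as the mean of its kernel features in the RKHS and measures the distance between the two mean embeddings, MMD^2(p, q) = || E_{x~p}[phi(x)] - E_{y~q}[phi(y)] ||_H^2. The NumPy sketch below computes the standard unbiased empirical estimate, averaged uniformly over several Gaussian bandwidths as a simple multi-kernel stand-in; the bandwidth set and the uniform kernel weights are illustrative assumptions (MK-MMD in [12] instead learns the kernel weights).

    import numpy as np

    def gaussian_kernel(a, b, sigma):
        # Pairwise Gaussian kernel matrix between the rows of a and of b.
        d2 = (np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :]
              - 2.0 * a @ b.T)
        return np.exp(-d2 / (2.0 * sigma**2))

    def mk_mmd2(x, y, sigmas=(1.0, 2.0, 4.0)):
        """Unbiased estimate of squared MMD between samples x: (m, d) and
        y: (n, d), averaged over a family of Gaussian kernels."""
        m, n = len(x), len(y)
        k_xx = sum(gaussian_kernel(x, x, s) for s in sigmas) / len(sigmas)
        k_yy = sum(gaussian_kernel(y, y, s) for s in sigmas) / len(sigmas)
        k_xy = sum(gaussian_kernel(x, y, s) for s in sigmas) / len(sigmas)
        # Exclude diagonal (self-similarity) terms for the unbiased estimate.
        return ((k_xx.sum() - np.trace(k_xx)) / (m * (m - 1))
                + (k_yy.sum() - np.trace(k_yy)) / (n * (n - 1))
                - 2.0 * k_xy.mean())

Computed on hidden-layer features of source- and target-domain batches, such an estimate can be added to the classification loss as a domain-discrepancy penalty.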
Long et al. [12] proposed a novel network structure in which the MK-MMD loss is adopted to minimize