Combining Deep Learning and Active Learning: A New Perspective

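"A survey paper on deep active learning"

Active learning (AL) is a machine learning strategy that selectively asks a user or expert to label data, aiming to maximize model performance while reducing the dependence on large numbers of labeled samples. Traditional machine learning algorithms usually need a large amount of labeled data to train a model, so active learning becomes especially important when labels are expensive to obtain or the dataset is very large. By choosing the most valuable unlabeled data for annotation, it improves the generalization ability of the model at minimal labeling cost.

Deep learning (DL) is a current focus of artificial intelligence research; it uses multi-layer neural networks to learn high-level representations of data automatically. A notable drawback of deep learning, however, is its appetite for large amounts of labeled data, which can impose substantial time and financial costs during data collection and annotation. Combining deep learning with active learning can ease this problem, allowing a model to reach better performance with limited labeled data.

In recent years, the rapid development of internet technology has led to explosive growth in data. This mass of data provides rich raw material for deep learning, but it also raises a challenge: how to use the data efficiently while reducing the reliance on manual annotation. The fusion of deep learning and active learning has therefore become a new research direction.

Although deep learning has achieved breakthroughs in image recognition, natural language processing, speech recognition, and other fields, active learning has received comparatively little research attention. The main reason is that, before the rise of deep learning, traditional machine learning methods needed relatively little labeled data, so the value of active learning was not fully recognized. As deep learning models grow more complex and datasets grow larger, the importance of active learning is becoming increasingly evident.

The core of deep active learning is the design of an effective query strategy that selects the most representative and informative unlabeled samples for annotation. Common strategies include uncertainty sampling (such as entropy sampling and maximum margin sampling), diversity sampling (exploring the diversity and mutual information in the dataset), and representative sampling (finding samples that represent the overall data distribution). The goal of these strategies is to make the model converge faster while reducing its dependence on additional labeled data.

In addition, to apply active learning more effectively within a deep learning framework, researchers are exploring new methods, such as combining the uncertainty estimates of multiple models, using unsupervised pre-training to strengthen the initial representations of the model, and developing network architectures better suited to active learning. These efforts help improve the efficiency of deep learning models and reduce the dependence on large labeled datasets, making high-quality model training possible in resource-limited settings.

Deep active learning combines the idea of active learning with the strong learning capacity of deep learning and is an effective way to address the scarcity of labeled data. Future research will continue to focus on optimizing query strategies, improving model architectures, and developing deep active learning frameworks for different domains.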

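The two uncertainty sampling strategies named in the summary above, entropy sampling and margin sampling, can be illustrated with a minimal sketch. The softmax outputs in `probs` and the helper names `entropy_scores` and `margin_scores` are hypothetical and chosen only for illustration; they are not taken from the survey.

```python
import numpy as np

def entropy_scores(probs):
    """Predictive entropy per sample; a higher value means a more uncertain prediction."""
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -(p * np.log(p)).sum(axis=1)

def margin_scores(probs):
    """Gap between the two largest class probabilities; a smaller gap means more uncertainty."""
    ordered = np.sort(probs, axis=1)
    return ordered[:, -1] - ordered[:, -2]

# Hypothetical softmax outputs for three unlabeled samples over three classes.
probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # probability spread over all classes
    [0.50, 0.49, 0.01],  # two classes nearly tied
])

most_uncertain_by_entropy = int(np.argmax(entropy_scores(probs)))
smallest_margin = int(np.argmin(margin_scores(probs)))
```

On this toy input the two criteria pick different samples: entropy favors the prediction spread over all three classes, while the margin criterion favors the near tie between two classes.
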
6 Ren and Chang, et al.

3 DEEP ACTIVE LEARNING

In this section, we will provide a comprehensive and systematic overview of DAL-related works.

Fig. 1c illustrates a typical example of a DAL model architecture. The parameters $\theta$ of the deep learning model are initialized or pre-trained on the labeled training set $L_0$, while the samples of the unlabeled pool $U$ are passed through the deep learning model to extract features. The next steps are to select samples according to the corresponding query strategy and to query their labels from the oracle to form a new labeled training set $L$, then to train the deep learning model on $L$ and update $U$ at the same time. This process is repeated until the label budget is exhausted or the pre-defined termination conditions are reached. From the DAL framework example in Fig. 1c, we can roughly divide the DAL framework into two parts: the AL query strategy on the unlabeled dataset and the DL model training method. These are discussed and summarized in Sections 3.1 and 3.2, respectively. Finally, we discuss the efforts made by DAL on the generalization of the model in Section 3.3.
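The select, query, retrain, update cycle described above can be sketched as a short pool-based loop. This is a minimal illustration under stated assumptions, not the architecture of Fig. 1c: a logistic regression trained by gradient descent stands in for the deep model, the data are synthetic, the hidden array `y_all` plays the role of the oracle, and the names `train`, `query`, `budget`, and `batch_size` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, steps=200, lr=0.5):
    """Fit logistic regression by gradient descent (assumed stand-in for the deep model)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * X.T @ (sigmoid(X @ w) - y) / len(y)
    return w

def query(w, X, pool, batch_size):
    """Query strategy: pick the pool samples whose predicted probability is closest to 0.5."""
    gap = np.abs(sigmoid(X[pool] @ w) - 0.5)
    order = np.argsort(gap)[:batch_size]  # smallest gap first
    return [pool[i] for i in order]

# Synthetic two-class data; y_all is only read for samples that have been "queried".
X_all = rng.normal(size=(300, 2))
y_all = (X_all[:, 0] + X_all[:, 1] > 0).astype(float)

# Initial labeled set and unlabeled pool, stored as index lists.
labeled = [int(i) for i in rng.choice(len(X_all), size=10, replace=False)]
labeled_set = set(labeled)
pool = [i for i in range(len(X_all)) if i not in labeled_set]

budget, batch_size = 50, 10
w = train(X_all[labeled], y_all[labeled])
while len(labeled) < budget and pool:
    picked = query(w, X_all, pool, batch_size)   # select samples
    labeled.extend(picked)                       # oracle supplies y_all[picked]
    picked_set = set(picked)
    pool = [i for i in pool if i not in picked_set]  # update the unlabeled pool
    w = train(X_all[labeled], y_all[labeled])    # retrain on the enlarged labeled set

accuracy = float(((sigmoid(X_all @ w) > 0.5) == y_all).mean())
```

The loop stops when the label budget is spent, which mirrors the termination condition in the text; any other stopping rule, such as a target validation accuracy, could replace the `while` condition.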

3.1 Query Strategy Optimization in DAL

In the pool-based method, we define $U^n = \{\mathcal{X}, \mathcal{Y}\}$ as an unlabeled dataset with $n$ samples; here, $\mathcal{X}$ is the sample space, $\mathcal{Y}$ is the label space, and $P(x, y)$ is a potential distribution, where $x \in \mathcal{X}$, $y \in \mathcal{Y}$. $L^m = \{X, Y\}$ is the current labeled training set with $m$ samples, where $x \in X$, $y \in Y$. Under the standard supervision environment of DAL, our main goal is to design a query strategy $Q$, $U^n \xrightarrow{Q} L^m$, using the deep model $f \in \mathcal{F}$, $f: \mathcal{X} \to \mathcal{Y}$. The optimization problem of DAL in a supervised environment can be expressed as follows:

$$\arg\min_{L^m \subseteq U^n,\ (x, y) \in L^m} \mathbb{E}_{(x, y)}\left[\ell(f(x), y)\right], \tag{1}$$

where $\ell(\cdot) \in \mathbb{R}^{+}$ is the given loss function, and we expect that $m \ll n$. Our goal is to make $m$ as small as possible while ensuring a predetermined level of accuracy. Therefore, the query strategy $Q$ in DAL is crucial to reduce the labeling cost.

3.1.1 Batch Mode DAL (BMDAL). The main difference between DAL and classic AL is that DAL uses batch-based sample querying. In traditional AL, most algorithms use a one-by-one query method, which leads to frequent training of the learning model but little change in the training data. The training set obtained by this query method is not only inefficient in the training of the DL model, but can also easily lead to overfitting. Therefore, it is necessary to investigate BMDAL in more depth. In the context of BMDAL, at each acquisition step, we score the batch of candidate unlabeled data samples $B = \{x_1, x_2, \ldots, x_b\} \subseteq U$ based on the acquisition function $a$ used and the deep model $f_\theta(L)$ trained on $L$, to select a new batch of data samples $B^* = \{x_1^*, x_2^*, \ldots, x_b^*\}$. This problem can be formulated as follows:

$$B^* = \arg\max_{B \subseteq U} a_{batch}(B, f_\theta(L)). \tag{2}$$

A naive approach would be to continuously query a batch of samples based on the one-by-one strategy. For example, [ ] adopts the method of batch acquisition, and chooses to query Bayesian Active Learning by Disagreement (BALD) [ ] to obtain the top samples with the highest scores. Obviously, however, this method is not feasible, as it is very likely to choose a set of information-rich but similar samples. The information provided to the model by such similar samples is essentially the same, which not only wastes labeling resources, but also makes it difficult for the model to learn genuinely useful information. Therefore, the core of BMDAL is to query a set of samples that are both information-rich and diverse. Fig. 2 illustrates a schematic diagram of this idea.
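The contrast between the naive top-scoring batch and a batch that is both informative and diverse can be made concrete with a small sketch. The greedy rule below (score multiplied by the distance to the nearest already-selected sample) is an assumed heuristic chosen for illustration, not a method from the survey, and the feature vectors and acquisition scores are made up. It assumes the batch size does not exceed the pool size.

```python
import numpy as np

def top_b(scores, b):
    """Naive batch: the b highest-scoring samples, ignoring how similar they are."""
    return [int(i) for i in np.argsort(-scores, kind="stable")[:b]]

def greedy_diverse(features, scores, b):
    """Greedy batch: start from the best score, then repeatedly add the sample that
    maximizes score times distance to the nearest already-selected sample."""
    selected = [int(np.argmax(scores))]
    min_dist = np.linalg.norm(features - features[selected[0]], axis=1)
    while len(selected) < b:
        gain = scores * min_dist  # already-selected samples have gain 0
        nxt = int(np.argmax(gain))
        selected.append(nxt)
        dist_to_new = np.linalg.norm(features - features[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return selected

# Hypothetical pool: three near-duplicate samples with high scores, two distinct ones.
features = np.array([
    [0.00, 0.00], [0.01, 0.00], [0.00, 0.01],
    [5.00, 0.00], [0.00, 5.00],
])
scores = np.array([0.95, 0.94, 0.93, 0.80, 0.70])

naive = top_b(scores, 3)                        # [0, 1, 2]: all near-duplicates
diverse = greedy_diverse(features, scores, 3)   # [0, 3, 4]: one per region
```

The naive batch spends all three labels on nearly identical samples, while the greedy variant keeps the top-scoring sample and then trades a little score for coverage of the other regions.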
