Multiple-Instance Learning with Empirical
Estimation Guided Instance Selection
Liming Yuan, Xianbin Wen, and Haixia Xu
School of Computer Science and Engineering
Tianjin University of Technology
Tianjin, China 300384
Email: yuanleeming@163.com
Lu Zhao
School of Computer and
Information Engineering
Tianjin Chengjian University
Tianjin, China 300384
Abstract—The embedding-based framework handles multiple-instance learning (MIL) via instance selection and embedding. How to select instance prototypes is the main difference between the various algorithms. Most existing studies rely on a single criterion for selecting instance prototypes. In this paper, we adopt two kinds of instance-selection criteria from two different views. To combine the two-view criteria, we also present an empirical estimator under which the two criteria compete for the instance selection. Experimental results validate the effectiveness of the proposed empirical-estimator-based instance-selection method for MIL.
I. INTRODUCTION
Multiple-instance learning (MIL) is a variant of conventional supervised learning. In MIL, each example, called a bag, comprises a variable number of feature vectors, called instances. Every bag is associated with a label, but the label of any individual instance is unknown. Since this particular framework was introduced by Dietterich et al. [1], it has been successfully applied to numerous real-world tasks, e.g., region-based image categorization [2], object detection [3], tracking [4], localization [5], etc.
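To make the setting concrete, the following minimal sketch (Python with NumPy; the representation is an illustrative choice of ours, not one prescribed by the cited works) stores a bag as a variable-length collection of instances with one bag-level label:

    import numpy as np

    # A bag is a variable number of instances (d-dimensional feature
    # vectors) plus a single bag-level label; instance labels are unknown.
    bag_1 = {"instances": np.random.randn(7, 16), "label": +1}  # 7 instances
    bag_2 = {"instances": np.random.randn(3, 16), "label": -1}  # 3 instances
    training_set = [bag_1, bag_2]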
Different applications have induced two main assumptions on the relationship between the label of a bag and those of its inner instances. The standard MIL assumption [1] states that a positive bag contains at least one positive instance, while all instances of a negative bag are negative. Various generalized assumptions [6]–[11] commonly state that the class of a positive bag is jointly determined by one or more different kinds of its instances, whereas this is not the case for a negative bag.
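Under the standard assumption, the bag label is simply a logical OR over the (hidden) instance labels. A minimal illustration, assuming for the moment that instance labels were observable:

    def bag_label_standard(instance_labels):
        # Standard MIL assumption: a bag is positive iff it contains
        # at least one positive instance; otherwise it is negative.
        return +1 if any(y == +1 for y in instance_labels) else -1

    assert bag_label_standard([-1, -1, +1]) == +1  # one positive suffices
    assert bag_label_standard([-1, -1, -1]) == -1  # all negative, so negative bag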
The embedding-based MIL framework [8] can tackle problems satisfying various assumptions. It relies on a set of instance prototypes to embed each bag into a new bag-level feature space, i.e., by computing the distance between the bag and every instance prototype. How to choose instance prototypes is thus the key to this framework. However, most existing embedding-based MIL algorithms rely on a single criterion for instance selection.
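As a rough sketch of the embedding step (the exact mapping differs between algorithms; the minimum-distance form below is one common choice and an assumption here), each bag is mapped to a vector whose j-th entry measures how close the bag comes to the j-th prototype:

    import numpy as np

    def embed_bag(bag_instances, prototypes):
        # bag_instances: (n, d) array; prototypes: (m, d) array.
        # Returns an m-dimensional bag-level feature vector: the minimum
        # Euclidean distance from each prototype to any instance of the bag.
        diffs = bag_instances[:, None, :] - prototypes[None, :, :]
        dists = np.linalg.norm(diffs, axis=2)  # (n, m) pairwise distances
        return dists.min(axis=0)

After the embedding, any standard single-instance classifier (e.g., an SVM) can be trained on the resulting bag-level vectors.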
In this paper, we consider jointly applying two kinds of instance-selection criteria from two different views. To combine the two-view criteria, we also provide an empirical estimator that enables the two criteria to compete for the instance selection: if one criterion is significantly better than the other under the empirical estimator, we use the better one; otherwise, we use both of them.
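The competition rule can be summarized schematically as follows; empirical_estimate and the margin tau are placeholders for the concrete estimator and significance test, which Section III specifies:

    def select_prototypes(protos_a, protos_b, empirical_estimate, tau):
        # Let the two candidate prototype sets (one per criterion) compete:
        # keep the clearly better set, or the union when neither dominates.
        score_a = empirical_estimate(protos_a)
        score_b = empirical_estimate(protos_b)
        if score_a - score_b > tau:
            return protos_a
        if score_b - score_a > tau:
            return protos_b
        return protos_a + protos_b  # no clear winner: use both criteria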
The rest of this paper is organized as follows: Section II
gives an overview of some related work. Section III details the
proposed MIL algorithm. Section IV provides the experimental
results and analysis on six data sets. Finally, Section V
concludes this paper with some discussion.
II. RELATED WORK
Most earlier MIL algorithms are based on the standard assumption. APR [1] optimizes an axis-parallel rectangle by forcing it to include at least one instance from every positive bag and to exclude all instances from negative bags. DD [12] defines a function named diverse density, which describes the likelihood that an instance appears in all positive bags and does not appear in any negative bag. DD is further extended by EM-DD [13], which applies expectation maximization to explore complex and disjoint concepts. Several other algorithms aim at adapting conventional supervised learning techniques to the MIL setting. Citation-kNN [14] adapts kNN (k-nearest neighbors) using the Hausdorff distance. Both mi-SVM and MI-SVM [15] are built upon the SVM (support vector machine).
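For reference, a minimal sketch of the classical (max-min) Hausdorff distance between two bags; Citation-kNN is often described with a minimal variant of this distance, so the form below is shown only as the textbook definition:

    import numpy as np

    def hausdorff(bag_a, bag_b):
        # Classical Hausdorff distance between two bags, each an
        # (n, d) array of instances.
        d = np.linalg.norm(bag_a[:, None, :] - bag_b[None, :, :], axis=2)
        h_ab = d.min(axis=1).max()  # farthest instance of A from B
        h_ba = d.min(axis=0).max()  # farthest instance of B from A
        return max(h_ab, h_ba)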
The embedding-based MIL algorithms follow the generalized assumption. DD-SVM [2] is considered the first MIL algorithm to apply the idea of instance selection and embedding; it regards the local maxima of diverse density as instance prototypes. MILES [8] is perhaps the best-known embedding-based MIL algorithm. It first achieves the embedding for bags using all instances in the training set, and then applies a 1-norm SVM for selecting instances and constructing the classifier at the same time. CCE [16] first determines k clusters in the feature space, and then transforms every bag into a k-dimensional feature vector in which the value of the i-th feature is one if some instance of the bag falls within the i-th cluster and zero otherwise.
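A minimal sketch of this CCE-style binary embedding, assuming scikit-learn's KMeans for the clustering step (the clustering algorithm itself is an illustrative choice here, not the one fixed by CCE):

    import numpy as np
    from sklearn.cluster import KMeans

    def cce_embed(bags, k):
        # Cluster all training instances into k groups, then encode each
        # bag as a k-dimensional binary vector: entry i is 1 iff some
        # instance of the bag falls into cluster i.
        km = KMeans(n_clusters=k, n_init=10).fit(np.vstack(bags))
        features = np.zeros((len(bags), k))
        for b, bag in enumerate(bags):
            features[b, km.predict(bag)] = 1.0
        return features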
Both MILD [17] and MILIS [18] identify from every bag (only positive bags for MILD) the single instance with the highest ability to classify training bags. MI-AdaBoost [19] applies the AdaBoost framework to jointly select instances and build the classifier. miVLAD [20] establishes the