Subset Selection by Pareto Optimization
Chao Qian, Yang Yu, Zhi-Hua Zhou
National Key Laboratory for Novel Software Technology, Nanjing University
Collaborative Innovation Center of Novel Software Technology and Industrialization
Nanjing 210023, China
{qianc,yuy,zhouzh}@lamda.nju.edu.cn
Abstract
Selecting the optimal subset from a large set of variables is a fundamental problem in various learning tasks such as feature selection, sparse regression, and dictionary learning. In this paper, we propose the POSS approach, which employs evolutionary Pareto optimization to find a small-sized subset with good performance. We prove that for sparse regression, POSS efficiently achieves the best-so-far theoretically guaranteed approximation performance. In particular, for the Exponential Decay subclass, POSS is proven to achieve an optimal solution. Empirical study verifies the theoretical results and exhibits the superiority of POSS over greedy and convex relaxation methods.
1 Introduction
Subset selection is the problem of selecting a subset of size k from a total set of n variables so as to optimize some given criterion. This problem arises in many applications, e.g., feature selection, sparse learning, and compressed sensing. The subset selection problem is, however, generally NP-hard [13, 4]. Previously employed techniques mainly fall into two branches: greedy algorithms and convex relaxation methods. Greedy algorithms iteratively select or abandon the single variable that best improves the current value of the criterion [9, 19]; they are, however, limited by this greedy behavior. Convex relaxation methods usually replace the set size constraint (i.e., the $\ell_0$-norm) with convex constraints, e.g., the $\ell_1$-norm constraint [18] or the elastic net penalty [29], and then find optimal solutions to the relaxed problem, which, however, can be distant from the true optimum.
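As a concrete illustration of the convex relaxation branch, the sketch below replaces the $\ell_0$ size constraint with an $\ell_1$ penalty via the Lasso. It is a minimal example assuming NumPy and scikit-learn are available; the toy data and the penalty strength alpha are illustrative choices, not part of the original formulation.

import numpy as np
from sklearn.linear_model import Lasso

# Toy data: n = 10 observation variables, of which only 3 are relevant.
rng = np.random.RandomState(0)
X = rng.randn(100, 10)
w_true = np.zeros(10)
w_true[[1, 4, 7]] = [2.0, -1.5, 1.0]
z = X @ w_true + 0.1 * rng.randn(100)

# l1 relaxation: the penalty strength alpha controls the subset size
# only indirectly; it does not enforce |S| <= k exactly, which is one
# source of the gap to the true optimum mentioned above.
lasso = Lasso(alpha=0.1).fit(X, z)
selected = np.flatnonzero(lasso.coef_)
print("selected variables:", selected)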
Pareto optimization solves a problem by reformulating it as a bi-objective optimization problem and employing a bi-objective evolutionary algorithm; this methodology has developed significantly in recent years, both in theoretical foundation [22, 15] and in applications [16]. This paper proposes the POSS (Pareto Optimization for Subset Selection) method, which treats subset selection as a bi-objective optimization problem that simultaneously optimizes the given criterion and the subset size. To investigate the performance of POSS, we study a representative example of subset selection, sparse regression.
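To make the bi-objective reformulation concrete, below is a simplified Python sketch of the Pareto optimization idea, not the exact POSS procedure analyzed in this paper: the evaluation function f, the iteration budget T, and the bit-wise mutation scheme are illustrative assumptions. It maintains an archive of non-dominated (criterion value, subset size) solutions and finally returns the best archived subset within the size budget.

import numpy as np

def pareto_subset_selection(f, n, k, T=10000, seed=0):
    """Sketch of Pareto optimization for subset selection.

    f: maps a boolean mask of length n to a cost (smaller is better),
    e.g., the mean squared error of a least squares fit.
    """
    rng = np.random.RandomState(seed)
    empty = np.zeros(n, dtype=bool)
    archive = [(f(empty), 0, empty)]  # non-dominated (cost, size, mask)
    for _ in range(T):
        # Mutate a random archived solution: flip each bit w.p. 1/n.
        s = archive[rng.randint(len(archive))][2]
        s_new = np.where(rng.rand(n) < 1.0 / n, ~s, s)
        cost, size = f(s_new), int(s_new.sum())
        # Insert s_new unless some archived solution is at least as
        # good on both objectives; then prune what s_new dominates.
        if not any(c <= cost and z <= size for c, z, _ in archive):
            archive = [(c, z, m) for c, z, m in archive
                       if not (cost <= c and size <= z)]
            archive.append((cost, size, s_new))
    feasible = [(c, z, m) for c, z, m in archive if z <= k]
    return min(feasible, key=lambda t: t[0])[2]

The key difference from a greedy method is that the archive keeps the whole trade-off front between criterion value and subset size, so a temporarily inferior subset can survive and later mutate into a better one.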
The subset selection problem in sparse regression is to best estimate a predictor variable by linear
regression [12], where the quality of estimation is usually measured by the mean squared error, or
equivalently, the squared multiple correlation $R^2$ [6, 11]. Gilbert et al. [9] studied the two-phase approach with orthogonal matching pursuit (OMP) and proved a multiplicative approximation guarantee of $1 + \Theta(\mu k^2)$ for the mean squared error when the coherence $\mu$ (i.e., the maximum correlation between any pair of observation variables) is $O(1/k)$. This approximation bound was later improved by [20, 19]. Under the same small-coherence condition, Das and Kempe [2] analyzed the forward regression (FR) algorithm [12] and obtained an approximation guarantee of $1 - \Theta(\mu k)$ for $R^2$.
These results, however, break down when $\mu \in \omega(1/k)$. By introducing the submodularity ratio $\gamma$, Das and Kempe [3] proved an approximation guarantee of $1 - e^{-\gamma}$ on $R^2$ for the FR algorithm; this guarantee is considered the strongest so far, since it applies under any coherence.
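For intuition about the FR algorithm that these guarantees concern, here is a minimal greedy sketch under the $R^2$ criterion. It assumes NumPy, assumes the predictor z is centered, and naively refits least squares on every candidate subset for clarity.

import numpy as np

def r_squared(X, z, subset):
    """Squared multiple correlation R^2 when regressing z on X[:, subset].

    Assumes z is centered (mean zero), so z @ z is the total sum of squares.
    """
    if not subset:
        return 0.0
    Xs = X[:, subset]
    coef, *_ = np.linalg.lstsq(Xs, z, rcond=None)
    residual = z - Xs @ coef
    return 1.0 - (residual @ residual) / (z @ z)

def forward_regression(X, z, k):
    """Greedily add, k times, the variable that most increases R^2."""
    selected = []
    remaining = set(range(X.shape[1]))
    for _ in range(k):
        best = max(remaining, key=lambda j: r_squared(X, z, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected

The submodularity ratio $\gamma$ of [3] quantifies how close this $R^2$ objective is to being submodular, which is what drives the $1 - e^{-\gamma}$ bound above.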
Note that sparse regression is similar to the problem of sparse recovery [7, 25, 21, 17], but they are for