Scaling Simultaneous Optimistic Optimization for High-Dimensional Non-Convex Functions with Low Effective Dimensions∗

Hong Qian and Yang Yu
National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China
{qianh,yuy}@lamda.nju.edu.cn
Abstract
Simultaneous optimistic optimization (SOO) is a recently proposed global optimization method with a strong theoretical foundation. Previous studies have shown that SOO performs well on low-dimensional optimization problems; however, its performance is unsatisfactory when the dimensionality is high. This paper adapts random embedding to scale up SOO, resulting in the RESOO algorithm. We prove that the simple regret of RESOO depends only on the effective dimension of the problem, while that of SOO depends on the dimension of the solution space. Empirically, on high-dimensional non-convex test functions as well as hyper-parameter tuning tasks for multi-class support vector machines, RESOO shows significantly better performance than SOO.
Introduction
Problem.
Solving sophisticated optimization problems is one of the central problems of artificial intelligence. Let $f : X \to \mathbb{R}$ be a function defined on a bounded region $X \subseteq \mathbb{R}^D$, for which we assume that a global maximizer $x^* \in X$ always exists. The optimization problem can be formally written as
$$x^* = \arg\max_{x \in X} f(x).$$
Without loss of generality, we assume $X = [-1, 1]^D$ in this paper. We treat $f$ as a black-box function that can only be evaluated point-wisely, i.e., we can only access $f(x)$ for any given solution $x \in X$. We assume that $f$ is deterministic, i.e., every call of $f(x)$ returns the same value for the same $x$.
The performance of an optimization algorithm is evaluated
by the simple regret (Bubeck, Munos, and Stoltz 2009), i.e.,
given n function evaluations, for maximization,
r
n
=max
x∈X
f(x) − f(x(n)),
where
x(n) ∈X
is the solution with the highest function
value found by the algorithm when the budget of
n
function
evaluations is used up. The simple regret measures the differ-
ence between the true maximum of
f
and the best found by
the algorithm. An algorithm is no-regret if it possesses the
desirable asymptotic property lim
n→∞
r
n
=0.
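To make the setting concrete, the following is a small Python sketch added here only for illustration: it evaluates a deterministic black-box $f$ point-wisely on $X = [-1, 1]^D$ under a budget of $n$ evaluations, using uniform random search as a placeholder optimizer, and reports the simple regret $r_n$ of the best solution found against the known maximum of a toy objective. The toy objective, the random-search baseline, and all function names are our own assumptions, not part of the paper.

```python
import numpy as np

def sphere(x):
    """Toy deterministic objective on X = [-1, 1]^D; its known maximum
    is f(x*) = 0 at x* = 0 (we maximize the negated sphere function)."""
    return -float(np.sum(x ** 2))

def random_search(f, dim, n, seed=0):
    """Placeholder optimizer: spend the budget of n point-wise
    evaluations on uniformly sampled solutions and keep the best one."""
    rng = np.random.default_rng(seed)
    best_x, best_v = None, -np.inf
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0, size=dim)
        v = f(x)                      # one point-wise call to the black box
        if v > best_v:
            best_x, best_v = x, v
    return best_x, best_v

if __name__ == "__main__":
    dim, f_star = 10, 0.0             # f_star: true maximum of the toy objective
    for n in (100, 1000, 10000):
        _, v_n = random_search(sphere, dim, n)
        print(f"n = {n:6d}   simple regret r_n = {f_star - v_n:.4f}")
```

A no-regret algorithm is exactly one for which the printed quantity converges to zero as the budget $n$ grows.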
∗This research was supported by the NSFC (61375061, 61422304, 61223003) and the Foundation for the Author of National Excellent Doctoral Dissertation of China (201451).
Related Work.
Methods with different principles have been proposed to address black-box global optimization problems. Most of them can be roughly categorized into three kinds: meta-heuristic search, deterministic Lipschitz optimization methods, and Bayesian optimization methods. Meta-heuristic search algorithms are designed with inspired heuristics, such as evolutionary strategies (Hansen, Müller, and Koumoutsakos 2003), which, however, are very weak in their theoretical foundations. Deterministic Lipschitz optimization methods require a Lipschitz continuity assumption on $f$, either globally (Pintér 1996; Kearfott 1996; Strongin and Sergeyev 2000) or locally (Kleinberg, Slivkins, and Upfal 2008; Bubeck et al. 2011; Jones, Perttunen, and Stuckman 1993; Bubeck, Stoltz, and Yu 2011; Slivkins 2011; Munos 2011), and can have sound theoretical foundations. In addition, when a function evaluation is very expensive, Bayesian optimization methods (Brochu, Cora, and de Freitas 2010; Snoek, Larochelle, and Adams 2012) are particularly suitable; they are often theoretically supported under the assumption of Gaussian process priors.
We are interested in deterministic Lipschitz optimization methods, and particularly the recently proposed Simultaneous Optimistic Optimization (SOO) algorithm (Munos 2011; 2014), since the local Lipschitz continuity that SOO requires is intuitive, easy to satisfy, and relatively easy to verify. SOO incorporates an optimistic estimation of the function value with the branch-and-bound principle. It is worth mentioning that, although SOO assumes local Lipschitz continuity with respect to some semi-metric, this semi-metric fortunately does not need to be known. Because of these advantages, SOO has attracted attention; for example, it has been hybridized with Bayesian optimization to eliminate the optimization of acquisition functions (Wang et al. 2014). Also, variants of SOO have been proposed for, e.g., stochastic optimization (Valko, Carpentier, and Munos 2013) and parallel optimistic optimization (Grill, Valko, and Munos 2015).
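To make the branch-and-bound principle behind SOO concrete, below is a minimal Python sketch of the deterministic algorithm on $X = [-1, 1]^D$, written from the description in Munos (2011) rather than taken from any released implementation: the space is partitioned into a ternary tree, every sweep expands the best-valued leaf of each depth whenever its value is at least the best value selected so far in that sweep, and the tree depth is capped by $h_{\max}(t) = \sqrt{t}$. The splitting rule (trisect a cell along its longest side), the depth schedule, and the re-evaluation of every child center are simplifying assumptions of this sketch.

```python
import numpy as np

def soo_maximize(f, dim, n_evals, k=3):
    """Minimal SOO sketch: optimistic expansion of a k-ary partition
    tree over X = [-1, 1]^dim (maximization of a deterministic f)."""
    lower, upper = -np.ones(dim), np.ones(dim)
    center = (lower + upper) / 2.0
    v0 = f(center)
    evals = [1]                                     # f-evaluations spent so far
    leaves = {0: [(lower, upper, center, v0, 0)]}   # leaves grouped by depth
    best_x, best_v = center, v0

    def expand(node):
        """Split the node's cell into k sub-cells along its longest side
        and evaluate f at each child's center."""
        lo, up, _, _, h = node
        d = int(np.argmax(up - lo))                 # longest dimension of the cell
        width = (up[d] - lo[d]) / k
        children = []
        for j in range(k):
            c_lo, c_up = lo.copy(), up.copy()
            c_lo[d] = lo[d] + j * width
            c_up[d] = lo[d] + (j + 1) * width
            c = (c_lo + c_up) / 2.0
            children.append((c_lo, c_up, c, f(c), h + 1))
            evals[0] += 1
        return children

    t = 1                                           # expansions performed so far
    while evals[0] < n_evals:
        v_max = -np.inf                             # best value selected in this sweep
        h_cap = int(np.sqrt(t))                     # depth cap h_max(t) = sqrt(t)
        for h in sorted(leaves.keys()):
            if h > h_cap or not leaves[h]:
                continue
            i = max(range(len(leaves[h])), key=lambda j: leaves[h][j][3])
            node = leaves[h][i]
            if node[3] >= v_max:                    # optimistic: expand the best leaf per depth
                v_max = node[3]
                leaves[h].pop(i)
                for child in expand(node):
                    leaves.setdefault(h + 1, []).append(child)
                    if child[3] > best_v:
                        best_x, best_v = child[2], child[3]
                t += 1
                if evals[0] >= n_evals:
                    break
    return best_x, best_v
```

For instance, soo_maximize(lambda x: -float(np.sum(x ** 2)), dim=2, n_evals=200) returns a point close to the maximizer of the toy objective while accessing $f$ only point-wisely; no Lipschitz constant or semi-metric is ever supplied.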
However, previous studies have noticed that SOO may perform poorly in high-dimensional optimization problems (Preux, Munos, and Valko 2014; Derbel and Preux 2015). Meanwhile, it has been observed that in a wide range of high-dimensional optimization problems, such as hyper-parameter optimization in neural networks (Bergstra and Bengio 2012), intrinsically only a few low dimensions are effective. For these optimization problems with low effective