A Global Solution to Parameter Selection for Support Vector Machine Regression

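PDF format | 3.03 MB | updated 2024-07-21

Support vector machines (SVMs) are widely used machine learning models for classification and regression. In SVM regression (SVR), the quality of the fitted model depends strongly on its parameters, so parameter selection largely determines how well the model generalizes to unseen data.

The paper "Global resolution of the support vector machine regression parameters selection problem with LPCC" treats this selection as a global optimization problem. Here LPCC stands for a linear program with complementarity constraints. The parameters being tuned are the penalty parameter C and the tube width ε of the ε-insensitive loss. A larger C penalizes training errors more heavily and can lead to overfitting, while ε sets the residual size below which errors are ignored.

The parameters are selected through cross-validation, formulated as a bi-level program. The lower-level problems train an SVR on each fold of training data for a fixed (C, ε), and the upper-level problem chooses the (C, ε) with the best validation performance. The lower-level problems are equivalent to linear complementarity problems (LCPs). The excerpt below describes a (C, ε)-rectangle search algorithm that examines every invariancy region of the (C, ε)-plane, so that the parameters it returns are globally optimal rather than only locally optimal.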

206 Y.-C. Lee et al.

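$$
\begin{array}{ll}
\displaystyle \min_{b^1,\ldots,b^F} & \displaystyle \sum_{f=1}^{F} \; \sum_{i=\mathrm{front}_f}^{\mathrm{end}_f} p_i \\[1ex]
\text{subject to} & b^f_{\min} \le b^f \le b^f_{\max}, \quad \forall f = 1,\ldots,F, \\[0.5ex]
& (x_i)^T w^1 + b^1 - y_i \le p_i, \quad \forall i = n+1,\ldots,n+m_1, \\
& -(x_i)^T w^1 - b^1 + y_i \le p_i, \quad \forall i = n+1,\ldots,n+m_1, \\[0.5ex]
& (x_i)^T w^2 + b^2 - y_i \le p_i, \quad \forall i = n+m_1+1,\ldots,n+m_1+m_2, \\
& -(x_i)^T w^2 - b^2 + y_i \le p_i, \quad \forall i = n+m_1+1,\ldots,n+m_1+m_2, \\
& \qquad \vdots \\
& (x_i)^T w^F + b^F - y_i \le p_i, \quad \forall i = \mathrm{front}_F,\ldots,\mathrm{end}_F, \\
& -(x_i)^T w^F - b^F + y_i \le p_i, \quad \forall i = \mathrm{front}_F,\ldots,\mathrm{end}_F.
\end{array}
\tag{7}
$$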

Note that the $b^f$ obtained by solving (7) will be the same as the global solution produced by the bi-level model (3) when the intervals $[b^f_{\min}, b^f_{\max}]$ and $w^f$ in (7) are obtained at the globally optimal parameters C and ε. In other words, the bi-level formulation implicitly produces an optimal $b = \sum_{f=1}^{F} b^f / F$ and $w = \sum_{f=1}^{F} w^f$ simultaneously with the global parameter. The analysis of this (w, b) has not yet been covered in this work.
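As a small worked illustration of (7), assume the objective is the sum of the residual bounds p_i and that each w^f is held fixed. Under that assumption the problem separates by fold, and each fold reduces to minimizing a sum of absolute values in the single variable b^f over an interval. The minimizer is then the median of the residuals y_i − (x_i)^T w^f, clipped to the interval. The function names below are hypothetical, and this is a sketch of that one-dimensional argument rather than the authors' procedure.

```python
from statistics import median

def best_offset(residuals, b_min, b_max):
    """Minimize sum_i |b - r_i| over b in [b_min, b_max].

    residuals: r_i = y_i - x_i^T w^f for the validation points of one fold,
    with w^f held fixed (an assumption of this sketch).
    """
    b_free = median(residuals)               # unconstrained minimizer of sum |b - r_i|
    return min(max(b_free, b_min), b_max)    # objective is convex in b, so clipping is optimal

def validation_error(residuals, b):
    """Sum of absolute validation residuals for a given offset b."""
    return sum(abs(b - r) for r in residuals)

# Toy data (made up for illustration only).
toy_residuals = [0.2, 1.0, 3.0]
b_inside = best_offset(toy_residuals, 0.0, 5.0)     # median lies inside the interval
b_clipped = best_offset(toy_residuals, -1.0, 0.5)   # median lies outside, so the bound is active
```

When the interval bound is active, as in the second call, the offset sits at the end of the interval closest to the median.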

3 (C, ε)-rectangle search algorithm

We present a global optimization algorithm that solves program (5). At every iteration, we fix the design parameters at different values to solve the lower-level problem, and with this lower-level solution, an upper bound of the outer-level problem can be obtained. The algorithm is named rectangle search because the search for the parameter values proceeds along the boundary of rectangular areas to obtain information for termination or further area partitioning.

For any fixed values of (C, ε), we can solve the lower-level problem (4), an LCP, by existing methodologies, such as the semismooth method. The active and inactive constraints of the solved LCP provide a set of linear equalities and inequalities. We call this set of linear equalities and inequalities a piece. Replacing the complementarity constraints in model (5) by the linear constraints (piece) restricts the feasible C and ε to a smaller but convex region. This region is called an invariancy region on the (C, ε)-plane. Because of the “invariancy”, it is sufficient to find the local best (C, ε) pair and a local minimum of (5) by solving a linear program (restricted linear program) subject to the piece resulting from the LCP fixed at an arbitrary point within the same convex region. Then, we search for the next (C, ε) (outside the current invariancy region) at which the lower-level problem is again fixed and solved. The above procedure repeats until the algorithm exhausts all the piecewise-convex feasible regions and achieves global optimality.

Footnote: By exhausting all the invariancy regions, we mean that for every invariancy region, at least one (C, ε) point within the region is chosen and the associated restricted linear program is solved.


The specialty of our algorithm lies in the search technique for the next (C, ε). The rectangle search scheme we propose decomposes the entire feasible region into small rectangles. Searching along the boundary of these rectangles reduces the effort of identifying the geometric location of the invariancy regions to identifying the geometric location of invariancy intervals on a given line. Consider the initial $[\underline{C}, \overline{C}] \times [\underline{\varepsilon}, \overline{\varepsilon}]$ rectangular area on the 2-dimensional (C, ε)-plane. We first fix the values of (C, ε) at one vertex of the rectangular area and solve the lower-level program there. Along one side of the boundary, we can easily identify the endpoints of the invariancy interval to which this (C, ε) point belongs. Then, we find a new (C, ε) point on the same side of the boundary but outside the current invariancy interval, and solve another lower-level problem with this new (C, ε). All the invariancy intervals along the four sides of the boundary are identified with this repeated procedure.
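The sweep along one side of a rectangle can be sketched as follows. This is a minimal illustration, not the authors' procedure: the excerpt does not say how the endpoints are computed, so the sketch swaps in plain bisection, which is valid only because an invariancy region is convex and therefore meets a line in a single interval. The oracle `grouping(C, eps)` is hypothetical and stands in for solving the lower-level problem and reading off which invariancy region the point belongs to. `toy_grouping` is made-up data.

```python
def interval_endpoint(grouping, c_start, c_far, eps, tol=1e-8):
    """Right endpoint (up to tol) of the invariancy interval containing c_start
    on the horizontal side eps = const, searching towards c_far."""
    g0 = grouping(c_start, eps)
    if grouping(c_far, eps) == g0:
        return c_far                 # by convexity the whole segment is one interval
    lo, hi = c_start, c_far          # invariant: grouping(lo) == g0, grouping(hi) != g0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if grouping(mid, eps) == g0:
            lo = mid
        else:
            hi = mid
    return lo

def sweep_side(grouping, c_lo, c_hi, eps, tol=1e-8):
    """List (start, end, grouping) for the invariancy intervals on one side."""
    intervals, c = [], c_lo
    while c < c_hi:
        end = interval_endpoint(grouping, c, c_hi, eps, tol)
        intervals.append((c, end, grouping(c, eps)))
        # Step just outside the interval that was found, or stop at the corner.
        c = end + 2 * tol if end < c_hi else c_hi
    return intervals

def toy_grouping(C, eps):
    """Made-up oracle: two lines cut the plane into convex cells."""
    return (C > 2 * eps, C + eps > 1.5)

side = sweep_side(toy_grouping, 0.0, 2.0, 0.25)
```

On this toy side the grouping changes at C = 0.5 and C = 1.25, so three intervals are reported.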

For each invariancy region, there is a corresponding allocation of the SVR data points in the feature space. We call one specific allocation of the data points a grouping. Recording groupings is equivalent to recording the invariancy regions that have been found.

Based on the grouping information associated with the invariancy intervals along the boundary, one can conclude either that all the invariancy regions contained in this rectangular area have been examined or that further area partitioning is required. Throughout the algorithm, we maintain a queue of rectangular areas to be examined. A rectangular area can be removed from the queue if (1) the four corner points of the (C, ε)-rectangular area have the same grouping vector, or (2) the (C, ε)-rectangular area is bisected into two regions by a straight line, each belonging to one invariancy region. These two sufficient conditions are called the 1st and the 2nd stage of the algorithm, respectively.

To summarize, this algorithm contains the following 7 key procedural tasks:

1. Solve the lower-level problem with a fixed (C, ε) by known methodologies, such as the semismooth method (Ferris and Munson 2004).
2. Replace the complementarity constraints by linear constraints (piece), which restricts the feasible region to a smaller but convex region (invariancy region).
3. Solve for the local best (C, ε) and the local minimum within the invariancy region. This can be done by solving a linear program.
4. Search for the next (C, ε) (outside the current invariancy region) at which the lower-level problem is fixed and solved. We propose the rectangle search scheme, which searches along the boundary of a rectangle and thus reduces the invariancy region to an invariancy interval.
5. Partition the initial $[\underline{C}, \overline{C}] \times [\underline{\varepsilon}, \overline{\varepsilon}]$ area into small rectangular regions at chosen points and maintain a queue of rectangular areas to be examined.
6. Maintain a list of visited invariancy regions by recording their corresponding data allocation in space (grouping).
7. Eliminate a rectangular area from the queue if (1) the four corner points of the (C, ε)-rectangular area have the same grouping vector, or (2) the (C, ε)-rectangular area is bisected into two regions by a straight line, each belonging to one invariancy region.
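The queue bookkeeping in tasks 5 to 7 can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' algorithm. The oracle `grouping(C, eps)` is hypothetical and stands in for tasks 1 and 2. Only elimination condition (1) is implemented, which is sufficient because a convex invariancy region that contains all four corners contains the whole rectangle. Condition (2) is replaced by a sketch-only size cutoff `min_size`, rectangles are split at their midpoints rather than at chosen points, and the restricted linear program of task 3 is omitted.

```python
from collections import deque

def rectangle_search(grouping, c_lo, c_hi, e_lo, e_hi, min_size=1e-2):
    """Return the set of groupings met at rectangle corners.

    grouping(C, eps) must return a hashable label of the invariancy
    region containing (C, eps); it is an assumption of this sketch.
    """
    visited = set()                               # task 6: recorded groupings
    queue = deque([(c_lo, c_hi, e_lo, e_hi)])     # task 5: rectangles still to examine
    while queue:
        c0, c1, e0, e1 = queue.popleft()
        labels = {grouping(c, e) for c in (c0, c1) for e in (e0, e1)}
        visited |= labels     # a full implementation would solve the restricted LP for each new label
        if len(labels) == 1:
            continue          # task 7, condition (1): all four corners share one grouping
        if max(c1 - c0, e1 - e0) <= min_size:
            continue          # sketch-only cutoff standing in for condition (2)
        if c1 - c0 >= e1 - e0:                    # bisect the longer side
            cm = 0.5 * (c0 + c1)
            queue.extend([(c0, cm, e0, e1), (cm, c1, e0, e1)])
        else:
            em = 0.5 * (e0 + e1)
            queue.extend([(c0, c1, e0, em), (c0, c1, em, e1)])
    return visited

def toy_grouping(C, eps):
    """Made-up oracle: two lines cut the plane into four convex cells."""
    return (C > 2 * eps, C + eps > 1.5)

found = rectangle_search(toy_grouping, 0.0, 2.0, 0.0, 1.0)
```

With the size cutoff, a region smaller than `min_size` that contains no examined corner can be missed, which is the gap that the second elimination condition in the text is meant to close.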

Task 1 is discussed in Sect. 3.1; the linear constraints (piece) and the convex region (invariancy region) mentioned in task 2 are defined in Sect. 3.2; the allocation of data points in the feature space (grouping) mentioned in task 6 and the degeneracy issue are also discussed in Sect. 3.2; the linear program mentioned in task 3 is introduced in Sect. 3.3; the search strategy along the boundary of a rectangle mentioned in task 4 can be seen in Sect. 3.4; the partitioning, recording, and eliminating steps mentioned in tasks 5–7 are shown in Sects. 3.5 and 3.6.

3.1 Lower-level problem with fixed (C, ε)

The lower-level problem with a fixed (C, ε) is the SVM regression model for each fold of training data. We have shown that the lower-level problem is equivalent to a collection of LCPs. To solve these LCPs, we employ the semismooth method (Luca et al. 1996), which involves the use of the semismooth Fischer–Burmeister function. Consider the general complementarity problem as follows:

$$
\begin{array}{ll}
0 \le U_i(a) \perp a_i \ge 0, & \forall i \in I, \\
0 = U_i(a) \perp a_i : \text{free}, & \forall i \in E,
\end{array}
\tag{8}
$$

where I and E denote the nonoverlapping sets of indices for inequalities and equalities, respectively. The Fischer–Burmeister function for the LCP (8) is defined as:

$$
\phi(a_i, U_i(a)) := a_i + U_i(a) - \sqrt{a_i^2 + U_i(a)^2}.
$$

The equality $\phi(a_i, U_i(a)) = 0$ holds if and only if $0 \le U_i(a) \perp a_i \ge 0$. In the semismooth method, the merit function being used is of the following form:

$$
\Phi(a) :=
\begin{bmatrix}
\phi(a_i, U_i(a)), & \forall i \in I \\
U_i(a), & \forall i \in E
\end{bmatrix}.
$$
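The Fischer–Burmeister function and the merit vector are easy to check numerically. The sketch below uses hypothetical function names and a made-up two-variable map U, and it evaluates the squared-norm merit value ½ Φ(a)ᵀΦ(a) that is standard for this reformulation. It only illustrates that φ vanishes exactly on complementary pairs; it is not the authors' code.

```python
import math

def fischer_burmeister(a_i, u_i):
    """phi(a_i, u_i) = a_i + u_i - sqrt(a_i^2 + u_i^2)."""
    return a_i + u_i - math.hypot(a_i, u_i)

def merit_vector(a, U, ineq, eq):
    """Stack phi(a_i, U_i(a)) for i in I (ineq) and U_i(a) for i in E (eq)."""
    u = U(a)
    return [fischer_burmeister(a[i], u[i]) for i in ineq] + [u[i] for i in eq]

def merit_value(a, U, ineq, eq):
    """Half the squared Euclidean norm of the merit vector."""
    return 0.5 * sum(v * v for v in merit_vector(a, U, ineq, eq))

def toy_U(a):
    """Made-up affine map used only for this illustration."""
    return [a[0] + a[1] - 1.0, a[0] - a[1]]

# Index 0 is an inequality (a_0 >= 0, U_0 >= 0), index 1 an equality (U_1 = 0).
at_solution = merit_value([0.5, 0.5], toy_U, ineq=[0], eq=[1])
off_solution = merit_value([1.0, 0.0], toy_U, ineq=[0], eq=[1])
```

At the point (0.5, 0.5) both components of the toy map vanish, so the merit value is zero, while at (1, 0) the equality component is violated and the merit value is positive.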

It has been proven in many research papers, such as Facchinei and Soares (1997), that the merit function $\Phi(a)$ has some desirable properties, including that $\Phi(a)$ is a semismooth function and that $g(a) := \tfrac{1}{2}\Phi(a)^T \Phi(a)$ is continuously differentiable. Most importantly, for any $H \in \partial_B \Phi(a)$, where $\partial_B \Phi(a)$ represents the B-subdifferential of $\Phi(a)$, we have $\nabla g(a) = H^T \Phi(a)$. These properties hold under the continuous differentiability of U(a), which is satisfied in the application of SVR. Thus, by solving the equation g(a) = 0, a solution to the complementarity problem (8) is found. Theoretical foundations of the semismooth method can be seen in Luca et al. (1996), Billups (1995), and Ferris and Munson (2004). The damped Newton method (Facchinei and Soares 1997) embedded in our algorithm to solve the lower-level problems of the SVR is given in Appendix A.
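Since Appendix A is not part of this excerpt, the following is only a generic sketch of a damped semismooth Newton iteration on the Fischer–Burmeister reformulation, written for the plain LCP 0 ≤ a ⊥ Ma + q ≥ 0 with all indices in I. The names `solve_lcp` and `solve_linear`, the Armijo constant, the starting point, and the element of the generalized Jacobian chosen at a_i = U_i = 0 are assumptions of this sketch, not details taken from the paper.

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    T = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(T[r][c]))
        if abs(T[p][c]) < 1e-14:
            raise ValueError("singular matrix")
        T[c], T[p] = T[p], T[c]
        for r in range(c + 1, n):
            f = T[r][c] / T[c][c]
            for k in range(c, n + 1):
                T[r][k] -= f * T[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (T[r][n] - sum(T[r][k] * x[k] for k in range(r + 1, n))) / T[r][r]
    return x

def solve_lcp(M, q, tol=1e-16, max_iter=200):
    """Damped semismooth Newton for 0 <= a  perp  M a + q >= 0."""
    n = len(q)
    U = lambda a: [sum(M[i][j] * a[j] for j in range(n)) + q[i] for i in range(n)]
    Phi = lambda a: [ai + ui - math.hypot(ai, ui) for ai, ui in zip(a, U(a))]
    g = lambda a: 0.5 * sum(v * v for v in Phi(a))
    a = [0.0] * n
    for _ in range(max_iter):
        u, phi = U(a), Phi(a)
        g0 = 0.5 * sum(v * v for v in phi)
        if g0 <= tol:
            break
        H = []                                   # one element of the generalized Jacobian of Phi
        for i in range(n):
            r = math.hypot(a[i], u[i])
            if r > 0:
                da, du = 1 - a[i] / r, 1 - u[i] / r
            else:                                # degenerate index: a common fallback choice
                da = du = 1 - 1 / math.sqrt(2)
            H.append([du * M[i][j] + (da if i == j else 0.0) for j in range(n)])
        grad = [sum(H[i][j] * phi[i] for i in range(n)) for j in range(n)]   # H^T Phi
        try:
            d = solve_linear(H, [-v for v in phi])                            # Newton direction
        except ValueError:
            d = [-v for v in grad]
        slope = sum(gj * dj for gj, dj in zip(grad, d))
        if slope >= 0:                           # fall back to steepest descent on g
            d = [-v for v in grad]
            slope = -sum(v * v for v in grad)
        t = 1.0                                  # Armijo backtracking (the damping)
        while t > 1e-12 and g([aj + t * dj for aj, dj in zip(a, d)]) > g0 + 1e-4 * t * slope:
            t *= 0.5
        a = [aj + t * dj for aj, dj in zip(a, d)]
    return a

interior = solve_lcp([[2.0, 1.0], [1.0, 2.0]], [-5.0, -6.0])   # both a_i > 0 at the solution
boundary = solve_lcp([[2.0, 1.0], [1.0, 2.0]], [-1.0, 1.0])    # second variable sits at its bound
```

For the first toy problem both components of Ma + q vanish at the solution (4/3, 7/3); for the second the solution is (0.5, 0), with a positive second component of Ma + q.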

