the entire set of non-zero Lagrange multipliers has been identified, hence the last step solves the
large QP problem.
Chunking seriously reduces the size of the matrix from the number of training examples squared
to approximately the number of non-zero Lagrange multipliers squared. However, chunking still
cannot handle large-scale training problems, since even this reduced matrix cannot fit into
memory.
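The chunking working-set update described above can be sketched as follows. This is a minimal illustration of the bookkeeping only: the numerical QP solve on each chunk is not shown, and the function name and arguments are our own, not from the original papers. Each step keeps the examples whose Lagrange multipliers remained non-zero and appends the M worst KKT violators from the rest of the training set.

```python
def next_chunk(alphas, current_chunk, violators, m):
    """One chunking step (bookkeeping sketch, not the QP solve).

    alphas        -- current Lagrange multiplier values, indexed by example
    current_chunk -- indices optimized in the previous sub-problem
    violators     -- indices of KKT violators, worst first
    m             -- how many new violators to add this step
    """
    # Discard examples whose multipliers went to zero in the last solve.
    kept = [i for i in current_chunk if alphas[i] != 0.0]
    # Add the m worst violators not already retained.
    new = [i for i in violators if i not in kept][:m]
    return kept + new
```

Note that the chunk can grow without bound: the number of retained non-zero multipliers tends to increase as training proceeds, which is exactly why chunking fails on problems whose support-vector set is itself too large for memory.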
In 1997, Osuna et al. [16] proved a theorem that suggests a whole new set of QP algorithms
for SVMs. The theorem proves that the large QP problem can be broken down into a series of
smaller QP sub-problems. As long as at least one example that violates the KKT conditions is
added to the examples for the previous sub-problem, each step will reduce the overall objective
function and maintain a feasible point that obeys all of the constraints. Therefore, a sequence of
QP sub-problems that always adds at least one violator is guaranteed to converge. Notice
that the chunking algorithm obeys the conditions of the theorem, and hence will converge.
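The KKT conditions referred to above can be made concrete. For the SVM dual with box constraint 0 ≤ α_i ≤ C, an example with α_i = 0 must satisfy y_i f(x_i) ≥ 1, an unbounded support vector (0 < α_i < C) must satisfy y_i f(x_i) = 1, and a bounded support vector (α_i = C) must satisfy y_i f(x_i) ≤ 1. A sketch of a violation check follows; the function name, argument layout, and tolerance are our own choices for illustration:

```python
def kkt_violations(alphas, y, f, C, tol=1e-3):
    """Return indices of examples violating the SVM dual KKT conditions.

    alphas -- Lagrange multipliers alpha_i
    y      -- labels, each -1 or +1
    f      -- current SVM outputs f(x_i)
    C      -- box-constraint upper bound
    tol    -- numerical tolerance on the conditions
    """
    bad = []
    for i, (a, yi, fi) in enumerate(zip(alphas, y, f)):
        r = yi * fi
        if a < tol:                      # alpha_i == 0: need y_i f_i >= 1
            if r < 1 - tol:
                bad.append(i)
        elif a > C - tol:                # alpha_i == C: need y_i f_i <= 1
            if r > 1 + tol:
                bad.append(i)
        else:                            # 0 < alpha_i < C: need y_i f_i == 1
            if abs(r - 1) > tol:
                bad.append(i)
    return bad
```

Any non-empty return value means the current point is not optimal, so the theorem guarantees that moving one such violator into the next sub-problem makes progress.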
Osuna et al. suggest keeping a constant-size matrix for every QP sub-problem, which implies
adding and deleting the same number of examples at every step [16] (see figure 2). Using a
constant-size matrix allows training on arbitrarily sized data sets. The algorithm given in
Osuna’s paper [16] suggests adding one example and subtracting one example every step.
Clearly this would be inefficient, because it would use an entire numerical QP optimization step
to cause one training example to obey the KKT conditions. In practice, researchers add and
subtract multiple examples according to unpublished heuristics [17]. In any event, a numerical
QP solver is required for all of these methods. Numerical QP is notoriously tricky to get right;
there are many numerical precision issues that need to be addressed.
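The constant-size working-set update can be sketched as a swap. Which examples to drop is precisely the unpublished heuristic mentioned above; dropping the oldest members of the working set here is only a placeholder for such a heuristic, and the function name is our own:

```python
def swap_working_set(working_set, violators, k):
    """Osuna-style update: keep the working set at a constant size by
    removing as many examples as are added.

    working_set -- indices currently in the QP sub-problem
    violators   -- KKT-violator indices outside the working set, worst first
    k           -- how many examples to swap in (and out)
    """
    # Bring in up to k violators not already being optimized.
    incoming = [i for i in violators if i not in working_set][:k]
    k = len(incoming)            # swap out exactly as many as we add
    # Placeholder heuristic: drop the k oldest working-set members.
    kept = working_set[k:]
    return kept + incoming
```

With k = 1 this is the variant given in Osuna's paper, which spends a full numerical QP solve per example moved; larger k amortizes that cost, which is why practical implementations swap several examples at a time.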
[Figure 2 diagram: three panels, labeled Chunking, Osuna, and SMO]
Figure 2. Three alternative methods for training SVMs: Chunking, Osuna’s algorithm, and SMO. For
each method, three steps are illustrated. The thin horizontal line at every step represents the training
set, while the thick boxes represent the Lagrange multipliers being optimized at that step. For
chunking, a fixed number of examples are added every step, while the zero Lagrange multipliers are
discarded at every step. Thus, the number of examples trained per step tends to grow. For Osuna’s
algorithm, a fixed number of examples are optimized every step: the same number of examples is
added to and discarded from the problem at every step. For SMO, only two examples are analytically
optimized at every step, so that each step is very fast.