Global resolution of the support vector machine regression... 207
The specialty of our algorithm lies in the search technique for the next (C
e
,ε
e
).
The rectangle search scheme we proposed decomposes the entire feasible regions into
small rectangles. Searching along the boundary of these rectangles reduces the effort
of identifying geometric location of the invariancy regions to identifying the geometric
location of invariancy intervals on a given line. Consider the initial [C
, C]×[ε, ε]
rectangular area on the 2-dimensional (C
e
,ε
e
)-plane. We first fix the values of (C
e
,ε
e
)
where the lower-level program is solved at one vertex of the rectangular area. Along
one side of the boundary, we can easily identify the endpoints of the invariancy interval
where this (C
e
,ε
e
) point belongs. Then, we find a new (C
e
,ε
e
) point on the same side
of the boundary but outside the current invariancy interval, and solve another lower-
level problem with this new (C
e
,ε
e
). All the invariancy intervals along the four sides
of the boundary are identified with this repeated procedure.
For each invariancy region, there is a corresponding SVR data points allocation in
the feature space. We call one specific data points allocation a grouping. Recording
groupings is equivalent to recording the invariancy regions that have been found.
Based on the grouping information associated with the invariancy intervals along
the boundary, one can either conclude that all the invariancy regions contained in
this rectangular area have been examined or that further area partitioning is required.
Throughout the algorithm, we maintain a queue of rectangular areas to be examined.
The rectangular areas can be removed from the queue if (1) the four-corner points
of the (C
e
,ε
e
)-rectangular area have the same grouping vector, or (2) the (C
e
,ε
e
)-
rectangular area is bisected into two regions by a straight line, each belonging to one
invariancy region. These two sufficient conditions are called the 1st and the 2nd stage
of the algorithm respectively.
To summarize, this algorithm contains the f ollowing 7 key procedural tasks:
1. Solve the lower-level problem with a fixed (C
e
,ε
e
) by known methodologies, such
as semismooth method (Ferris and Munson 2004).
2. Replace the complementarity constraints by linear constraints (piece), which
restricts the feasible region in a smaller but convex region (invariancy region).
3. Solve for the local best (C
e
,ε
e
) and the optimal minimum within the invariancy
region. This can be done by solving a linear program.
4. Search for the next (C
e
,ε
e
) (outside the current invariancy region) at which the
lower-level problem is fixed and solved. We propose the rectangle search scheme to
search along the boundary of a rectangle and thus the invariancy region is reduced
to invariancy interval.
5. Partition the initial [C
, C]×[ε, ε] area into small rectangular regions at chosen
points and maintain a queue of rectangular areas in to be examined.
6. Maintain a list of visited invariancy regions by recording their corresponding data
allocation in space (grouping).
7. Eliminate the rectangular areas from the queue if (1) the four-corner points of
the (C
e
,ε
e
)-rectangular area have the same grouping vector, or (2) the (C
e
,ε
e
)-
rectangular area is bisected into two regions by a straight line, each belonging to
one invariancy region.
Task 1 is discussed in Sect. 3.1; the linear constraints (piece) and the convex region
(invariancy region) mentioned in task 2 are defined in Sect. 3.2; the allocation of data
123