pegasos.h File Reference
PEGASOS SVM. More...
#include "generic.h"Go to the source code of this file.
Detailed Description
pegasos.h provides basic impelmentation of the PEGASOS [1] SVM solver.
- Overview Overview
- Bias Bias
- Non-linear kernels Non-linear kernels
- Algorithm Algorithm
- References References
Overview
PEGASOS solves the linear SVM learning problem
where
are data vectors in
,
are binary labels,
is the regularization parameter, and
is the hinge loss. The result of the optimization is a model
that yields the decision function
It is well known that the hinge loss is a convex upper bound of the 01-loss of the decision function:
PEGASOS is accessed by calling vl_pegasos_train_binary_svm_d or vl_pegasos_train_binary_svm_f, operating respectively on double or float data.
Bias
PEGASOS SVM formulation does not incorporate a bias. To learn an SVM with bias, the vector
can be optionally extended by a constant component
. In this case, the model
is
dimensional and the decision function is
. If B is large enough, the weight
remains small and it has small contribution in the regularization term
; however, a large
makes the optimization harder.
Non-linear kernels
PEGASOS can be extended to work with kernels, but the algorithm is not particularly efficient in this setting. A preferable solution may be to compute an explicit feature map representing the kernel.
Let
be a kernel. A feature map is a function
such that
. Using this representation, that exists for any positive definite kernel as long as one accepts infinite dimensional feature maps, non-linear SVM learning writes:
Thus the only difference with the linear case is that
is used in place of
.
The ability of computing a feature-map representation depends on the kernel. A general solution is to take the (incomplete) Cholesky decomposition
of the kernel matrix
(in this case
is the i-th columns of V). Alternatively, for additive kernels (e.g. intersection, Chi2) homkermap.h can be used.
Algorithm
PEGASOS is a stochastic subgradient optmizer with automatically tuned learning rate. At the t-iteration, the algorithm:
- Samples uniformly at random as subset
of k of training pairs
from the m pairs provided. - Compute a subgradient
of the function
. - Do a step
with learning rate
. - Back project on the hypersphere of radius
to obtain the next model estimate
:
References
[1] S. Shalev-Shwartz, Y. Singer, and N. Srebro. Pegasos: Primal estimated sub-GrAdient SOlver for SVM. In Proc. ICML, 2007.
Definition in file pegasos.h.