The source files for all examples can be found in /examples.
Support Vector Machine
We are showing how to solve a support vector machine problem with COSMO (and JuMP).
Generating the Dataset
We want to classify the points in this example dataset with $m = 100$ samples and $n = 2$ features:
using Distributions: MvNormal
using Plots, LinearAlgebra, SparseArrays, Random, Test
# Generate dataset
rng = Random.MersenneTwister(123);
num_samples = 100;
Xpos = rand(rng, MvNormal([1.5, 1.5], 1.25), div(num_samples, 2))';
Xneg = rand(rng, MvNormal([-1.5, -1.5], 1.25), div(num_samples, 2))';
ypos = ones(div(num_samples, 2));
yneg = -ones(div(num_samples, 2));
# Plot dataset
plot(Xpos[:, 1], Xpos[:, 2], color = :red, st=:scatter, markershape = :rect, label = "positive", xlabel = "x1", ylabel = "x2")
plot!(Xneg[:, 1], Xneg[:, 2], color = :blue, st=:scatter, markershape = :circle, label = "negative")
with samples $(x_1, x_2, \ldots, x_m) \in \mathbb{R}^2$ and labels $y_i \in \{-1,1\}$.
Solving SVM as a QP
We want to compute the weights $w$ and bias term $b$ of the (soft-margin) SVM classifier:
\[\begin{array}{ll} \text{minimize} & \|w\|^2 + \lambda \sum_{i=1}^m \text{max}(0, 1 - y_i(w^\top x_i - b)), \end{array}\]
where $\lambda$ is a hyperparameter. This problem can be solved as a quadratic program. We can rewrite above problem into an optimisation problem in primal form by introducing the auxiliary slack variables $t_i$:
\[t_i = \text{max}(0, 1 - y_i(w^T x_i - b)), \quad t_i \geq 0.\]
This allows us to write the problems in standard QP format:
\[\begin{array}{ll} \text{minimize} & \|w\|^2 + \lambda \sum_{i=1}^m t_i\\ \text{subject to} & y_i (w^\top x_i - b) \geq 1 - t_i, \quad \text{for } i = 1,\ldots, m\\ & t_i \geq 0, \quad \text{for } i = 1,\ldots, m. \end{array}\]
Next, we will remove the bias term $b$ by adding an initial feature $x_0 = -1$ to each sample (now: $n = 3$):
X = [-ones(num_samples) [Xpos; Xneg]];
y = [ypos; yneg];
m, n = size(X)
(100, 3)
Modelling in JuMP
We can model this problem using JuMP
and then hand it to COSMO
:
using JuMP, COSMO
λ = 1.0; # hyperparameter
model = JuMP.Model(optimizer_with_attributes(COSMO.Optimizer, "verbose" => true));
@variable(model, w[1:n]);
@variable(model, t[1:m] >= 0.);
@objective(model, Min, w' * w + λ * ones(m)' * t);
@constraint(model, diagm(0 => y) * X * w .+ t .- 1 .>= 0);
status = JuMP.optimize!(model)
------------------------------------------------------------------
COSMO v0.8.9 - A Quadratic Objective Conic Solver
Michael Garstka
University of Oxford, 2017 - 2022
------------------------------------------------------------------
Problem: x ∈ R^{103},
constraints: A ∈ R^{200x103} (500 nnz),
matrix size to factor: 303x303,
Floating-point precision: Float64
Sets: Nonnegatives of dim: 200
Settings: ϵ_abs = 1.0e-05, ϵ_rel = 1.0e-05,
ϵ_prim_inf = 1.0e-04, ϵ_dual_inf = 1.0e-04,
ρ = 0.1, σ = 1e-06, α = 1.6,
max_iter = 5000,
scaling iter = 10 (on),
check termination every 25 iter,
check infeasibility every 40 iter,
KKT system solver: QDLDL
Acc: Anderson Type2{QRDecomp},
Memory size = 15, RestartedMemory,
Safeguarded: true, tol: 2.0
Setup Time: 0.18ms
Iter: Objective: Primal Res: Dual Res: Rho:
1 -1.2711e+03 1.8883e+01 9.5020e+00 1.0000e-01
25 1.4390e+01 1.9116e-01 2.2694e-01 1.0000e-01
50 1.5260e+01 6.6030e-02 2.2582e-02 1.0000e-01
75 1.5404e+01 1.4637e-01 3.4191e-02 1.0000e-01
100 1.5406e+01 3.5197e-02 9.2455e-04 1.0000e-01
125 1.5407e+01 3.5177e-02 9.4742e-04 1.0000e-01
150 1.5407e+01 3.5098e-02 2.8603e-05 1.0000e-01
175 1.5561e+01 4.0316e-03 1.1067e+00 1.2825e+01
200 1.5556e+01 7.9630e-04 1.9263e+00 9.9789e-01
225 1.5555e+01 1.0381e-03 2.0181e-02 9.9789e-01
250 1.5555e+01 6.0072e-04 4.3297e-03 9.9789e-01
275 1.5555e+01 1.7551e-05 4.0054e-04 9.9789e-01
300 1.5555e+01 4.4664e-06 8.3466e-05 9.9789e-01
325 1.5555e+01 8.0736e-07 4.8018e-06 1.3197e-01
------------------------------------------------------------------
>>> Results
Status: Solved
Iterations: 335 (incl. 10 safeguarding iter)
Optimal objective: 15.55
Runtime: 0.005s (4.64ms)
The optimal weights $w = [w_0, w_1, w_2]^\top$ (where $w_0 = b$) are:
w_opt = JuMP.value.(w)
3-element Vector{Float64}:
0.1531242891284758
1.0341323254762473
0.520268546984156
Plotting the hyperplane
The separating hyperplane is defined by $w^\top x - b = 0$. To plot the hyperplane, we calculate $x_2$ over a range of $x_1$ values:
\[x_2 = (-w_1 x_1 - w_0) / w_2, \text{ where } w_0 = b.\]
x1 = -4:0.1:4;
x2 = (-w_opt[2] * x1 .- w_opt[1]) / w_opt[3]
plot!(x1, x2, label = "SVM separator", legend = :topleft)
Modelling with COSMO
The problem can also be solved by transforming it directly into COSMO
's problem format. Define COSMO
`s $x$-variable to be $x=[w, t]^\top$ and choose $P$, $q$, accordingly:
P = blockdiag(spdiagm(0 => ones(n)), spzeros(m, m));
q = [zeros(n); 0.5 * λ * ones(m)];
Next we transform the first constraint $y_i (w^\top x_i - b) \geq 1 - t_i, \quad \text{for } i = 1,\ldots, m$ into COSMO
's constraint format: $Ax + b \in \mathcal{K}$.
A1 = [(spdiagm(0 => y) * X) spdiagm(0 => ones(m))];
b1 = -ones(m);
cs1 = COSMO.Constraint(A1, b1, COSMO.Nonnegatives);
It remains to specify the constraint $t_i \geq 0, \quad \text{for } i = 1,\ldots, m$:
A2 = spdiagm(0 => ones(m));
b2 = zeros(m);
cs2 = COSMO.Constraint(A2, b2, COSMO.Nonnegatives, m+n, n+1:m+n);
Create, assemble and solve the COSMO.Model
:
model2 = COSMO.Model();
assemble!(model2, P, q, [cs1; cs2]);
result2 = COSMO.optimize!(model2);
w_opt2 = result2.x[1:3];
@test norm(w_opt2 - w_opt, Inf) < 1e-3
Test Passed
This page was generated using Literate.jl.