Convex Optimization Reading Notes (9)
Chapter 10: Equality constrained minimization
10.1 Equality constrained minimization problems
Consider an optimization problem of the form

$$\begin{aligned} {\rm minimize} \ \ \ \ & f(x) \\ {\rm subject \ to} \ \ \ \ & Ax=b \end{aligned}$$

where $f : \mathbf{R}^n \to \mathbf{R}$ is convex and twice continuously differentiable, and $A \in \mathbf{R}^{p\times n}$ with $\mathbf{rank}\ A = p < n$. Solving this problem is equivalent to finding the $x^*$ and $\nu^*$ that satisfy the optimality conditions
$$Ax^*=b, \ \ \ \ \nabla f(x^*)+A^T\nu^*=0$$

The first set of equations, $Ax = b$, are called the primal feasibility equations, which are linear. The second set of equations, $\nabla f(x^*) + A^T \nu^* = 0$, are called the dual feasibility equations, and are in general nonlinear.
10.1.1 Equality constrained convex quadratic minimization
Consider the equality constrained convex quadratic minimization problem
$$\begin{aligned} {\rm minimize} \ \ \ \ & f(x)=\frac{1}{2}x^TPx+q^Tx+r\\ {\rm subject \ to} \ \ \ \ & Ax=b \end{aligned}$$

where $P\in \mathbf{S}^n_+$ and $A\in \mathbf{R}^{p\times n}$. Here the optimality conditions are
$$Ax^* =b, \ \ \ \ Px^* +q+A^T\nu^* =0,$$

which we can write as

$$\left[ \begin{array}{cc} P & A^T \\ A & 0 \end{array} \right] \left[ \begin{array}{c} x^* \\ \nu^* \end{array} \right]= \left[ \begin{array}{c} -q \\ b \end{array} \right]$$

This set of $n + p$ linear equations in the $n + p$ variables $x^*, \nu^*$ is called the KKT system for the equality constrained quadratic optimization problem.
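As a concrete illustration, here is a minimal numpy sketch that forms the $(n+p)\times(n+p)$ KKT matrix and solves it directly. The toy data $P, q, A, b$ are assumptions for the example, not from the text, and the code assumes the KKT matrix is nonsingular:

```python
import numpy as np

def solve_eq_qp(P, q, A, b):
    """Solve minimize (1/2) x^T P x + q^T x  s.t.  Ax = b
    by solving the (n+p) x (n+p) KKT system."""
    n, p = P.shape[0], A.shape[0]
    # KKT matrix [[P, A^T], [A, 0]]
    K = np.block([[P, A.T], [A, np.zeros((p, p))]])
    rhs = np.concatenate([-q, b])
    sol = np.linalg.solve(K, rhs)   # assumes the KKT matrix is nonsingular
    return sol[:n], sol[n:]         # x*, nu*

# Toy example: n = 3 variables, p = 1 constraint (hypothetical data).
P = np.diag([1.0, 2.0, 3.0])
q = np.array([1.0, -1.0, 0.0])
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])
x_star, nu_star = solve_eq_qp(P, q, A, b)
# Check the optimality conditions: Ax* = b and Px* + q + A^T nu* = 0
assert np.allclose(A @ x_star, b)
assert np.allclose(P @ x_star + q + A.T @ nu_star, 0)
```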
10.1.2 Eliminating equality constraints
One general approach to solving the equality constrained problem is to eliminate the equality constraints and then solve the resulting lower-dimensional unconstrained problem. We first find a matrix $F \in \mathbf{R}^{n\times(n-p)}$ and a vector $\tilde{x} \in \mathbf{R}^n$ that parametrize the (affine) feasible set:
$$\{ x\mid Ax=b \}=\{ Fz+\tilde{x}\mid z\in \mathbf{R}^{n-p} \}$$

Here $\tilde{x}$ can be chosen as any particular solution of $Ax = b$, and $F\in\mathbf{R}^{n\times(n-p)}$ is any matrix whose range is the nullspace of $A$. We then solve the reduced unconstrained problem
m
i
n
i
m
i
z
e
f
(
x
)
=
(
F
z
+
x
~
)
\begin{aligned} {\rm minimize} \ \ \ \ & f(x)=(Fz+\tilde{x})\\ \end{aligned}
minimize f(x)=(Fz+x~)
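A minimal sketch of this elimination for the quadratic example above, assuming scipy is available for the nullspace computation (the toy data are the same hypothetical values as before):

```python
import numpy as np
from scipy.linalg import null_space

P = np.diag([1.0, 2.0, 3.0])
q = np.array([1.0, -1.0, 0.0])
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])

F = null_space(A)                               # columns span the nullspace of A
x_tilde = np.linalg.lstsq(A, b, rcond=None)[0]  # any particular solution of Ax = b

# Reduced objective: (1/2) z^T (F^T P F) z + (P x_tilde + q)^T F z + const
H = F.T @ P @ F
g = F.T @ (P @ x_tilde + q)
z_star = np.linalg.solve(H, -g)
x_star = F @ z_star + x_tilde                   # matches the KKT-system solution
```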
10.1.3 Solving equality constrained problems via the dual
Another approach is to solve the dual problem

$${\rm maximize} \ \ \ \ -b^T\nu-f^*(-A^T\nu)$$

where $f^*$ is the conjugate of $f$, and then recover the optimal primal variable $x^*$ from the dual solution $\nu^*$.
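For example, for the quadratic objective of Section 10.1.1 with $P \succ 0$, the conjugate is $f^*(y)=\frac{1}{2}(y-q)^TP^{-1}(y-q)-r$, so the dual becomes the unconstrained concave maximization (a standard computation, added here for concreteness):

$${\rm maximize} \ \ \ \ -b^T\nu-\frac{1}{2}(A^T\nu+q)^TP^{-1}(A^T\nu+q)+r$$

From the dual solution $\nu^*$, the dual feasibility equation $Px^*+q+A^T\nu^*=0$ recovers the primal solution as $x^* = -P^{-1}(A^T\nu^*+q)$.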
10.2 Newton’s method with equality constraints
The initial point must be feasible, and we need to make sure that the Newton step $\Delta x_{\rm nt}$ is a feasible direction: $A\Delta x_{\rm nt} = 0$.
10.2.1 The Newton step
We replace the objective with its second-order Taylor approximation near $x$, to form the problem

$$\begin{aligned} {\rm minimize} \ \ \ \ & \hat{f}(x+v)=f(x)+\nabla f(x)^Tv+\frac{1}{2}v^T\nabla^2f(x)v\\ {\rm subject \ to} \ \ \ \ & A(x+v)=b \end{aligned}$$

The Newton step $\Delta x_{\rm nt}$ of the constrained problem, together with the associated optimal dual variable $w$, satisfies
$$\left[ \begin{array}{cc} \nabla^2f(x) & A^T \\ A & 0 \end{array} \right] \left[ \begin{array}{c} \Delta x_{\rm nt} \\ w \end{array} \right]= \left[ \begin{array}{c} -\nabla f(x) \\ 0 \end{array} \right]$$
10.2.2 Newton’s method with equality constraints
The outline of Newton’s method with equality constraints is exactly the same as for unconstrained problems.
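A minimal sketch of that outline, assuming user-supplied callables `f`, `grad`, `hess`, a feasible starting point ($Ax^{(0)}=b$), and the usual backtracking parameters (all of which are choices for this illustration, not prescribed by the text):

```python
import numpy as np

def newton_eq(f, grad, hess, A, x0, tol=1e-10, alpha=0.1, beta=0.5):
    """Feasible start Newton's method with equality constraints Ax = b."""
    x = x0.copy()
    p = A.shape[0]
    while True:
        g, H = grad(x), hess(x)
        # Solve the KKT system above for the Newton step (w is discarded here).
        K = np.block([[H, A.T], [A, np.zeros((p, p))]])
        rhs = np.concatenate([-g, np.zeros(p)])
        dx = np.linalg.solve(K, rhs)[:len(x)]
        # Newton decrement squared: lambda^2 = -g^T dx; stop when small.
        lam2 = -g @ dx
        if lam2 / 2 <= tol:
            return x
        # Backtracking line search; x + t*dx stays feasible since A dx = 0.
        t = 1.0
        while f(x + t * dx) > f(x) + alpha * t * (g @ dx):
            t *= beta
        x = x + t * dx

# Usage with the toy QP from 10.1.1 (x0 must satisfy A x0 = b):
# f = lambda x: 0.5 * x @ P @ x + q @ x
# grad = lambda x: P @ x + q; hess = lambda x: P
# x_star = newton_eq(f, grad, hess, A, np.array([1/3, 1/3, 1/3]))
```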
10.2.3 Newton’s method and elimination
10.2.4 Convergence analysis
In particular, the practical performance of Newton's method with equality constraints is exactly like the performance of Newton's method for unconstrained problems. Once $x^{(k)}$ is near $x^*$, convergence is extremely rapid, with a very high accuracy obtained in only a few iterations.
10.3 Infeasible start Newton method
10.3.1 Newton step at infeasible points
We substitute $x + \Delta x$ for $x^*$ and $w$ for $\nu^*$ in the optimality conditions, and use the first-order approximation

$$\nabla f(x+\Delta x) \approx \nabla f(x) +\nabla^2 f(x) \Delta x$$

for the gradient to obtain

$$A(x + \Delta x) = b, \ \ \ \ \nabla f(x) + \nabla^2 f(x)\Delta x + A^T w = 0.$$

This is a set of linear equations for $\Delta x$ and $w$:

$$\left[ \begin{array}{cc} \nabla^2f(x) & A^T \\ A & 0 \end{array} \right] \left[ \begin{array}{c} \Delta x_{\rm nt} \\ w \end{array} \right]= -\left[ \begin{array}{c} \nabla f(x) \\ Ax-b \end{array} \right]$$
10.3.2 Infeasible start Newton method
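A minimal sketch of the method, under the same assumptions as before (callables `grad` and `hess`, full-rank $A$). Here $x^{(0)}$ need not satisfy $Ax=b$, and the backtracking line search is applied to the norm of the residual $r(x,\nu)=(\nabla f(x)+A^T\nu,\ Ax-b)$ rather than to $f$:

```python
import numpy as np

def infeasible_newton(grad, hess, A, b, x0, nu0, tol=1e-10, alpha=0.1, beta=0.5):
    """Infeasible start Newton method: iterates on (x, nu) jointly."""
    x, nu = x0.copy(), nu0.copy()
    n, p = len(x0), len(b)

    def residual(x, nu):
        return np.concatenate([grad(x) + A.T @ nu, A @ x - b])

    r = residual(x, nu)
    while np.linalg.norm(r) > tol:
        # Primal-dual Newton step: solve K (dx, dnu) = -r(x, nu).
        K = np.block([[hess(x), A.T], [A, np.zeros((p, p))]])
        step = np.linalg.solve(K, -r)
        dx, dnu = step[:n], step[n:]
        # Backtracking on ||r||: shrink t until the residual norm decreases enough.
        t = 1.0
        while np.linalg.norm(residual(x + t * dx, nu + t * dnu)) > (1 - alpha * t) * np.linalg.norm(r):
            t *= beta
        x, nu = x + t * dx, nu + t * dnu
        r = residual(x, nu)
    return x, nu

# Usage: x_star, nu_star = infeasible_newton(grad, hess, A, b, x0, np.zeros(p))
```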
10.3.3 Convergence analysis
We show that once the norm of the residual is small enough, the algorithm takes full steps (which implies that feasibility is achieved), and convergence is subsequently quadratic. We also show that the norm of the residual is reduced by at least a fixed amount in each iteration before the region of quadratic convergence is reached.
10.3.4 Convex-concave games
We can use the infeasible start Newton method to compute a solution of a convex-concave game with twice differentiable payoff function. We define the residual as
$$r(u,v) = \nabla f(u,v) = \left[ \begin{array}{c} \nabla_u f(u,v) \\ \nabla_v f(u,v) \end{array} \right]$$
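A minimal sketch of this game solver: damped Newton on $r(u,v)=0$, with the line search on the residual norm as in the infeasible start method. The toy payoff $f(u,v)=u^2/2+uv-v^2/2$ (convex in $u$, concave in $v$) is a hypothetical example used only to exercise the code:

```python
import numpy as np

def saddle_newton(grad, hess, z0, tol=1e-10, alpha=0.1, beta=0.5):
    """Find a saddle point of a convex-concave f via Newton on r(z) = grad f(z)."""
    z = z0.copy()
    r = grad(z)
    while np.linalg.norm(r) > tol:
        dz = np.linalg.solve(hess(z), -r)   # Newton step on r(z) = 0
        t = 1.0
        while np.linalg.norm(grad(z + t * dz)) > (1 - alpha * t) * np.linalg.norm(r):
            t *= beta
        z = z + t * dz
        r = grad(z)
    return z

# Toy payoff f(u, v) = u^2/2 + u*v - v^2/2, stacked as z = (u, v):
grad = lambda z: np.array([z[0] + z[1], z[0] - z[1]])
hess = lambda z: np.array([[1.0, 1.0], [1.0, -1.0]])
print(saddle_newton(grad, hess, np.array([1.0, 1.0])))  # -> [0, 0]
```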
10.3.5 Examples
10.4 Implementation
10.4.1 Elimination
To implement the elimination method, we have to compute a full rank matrix $F$ and a vector $\tilde{x}$ such that

$$\{ x\mid Ax=b \}=\{ Fz +\tilde{x}\mid z\in\mathbf{R}^{n-p} \}$$
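One standard way to compute such an $F$ and $\tilde{x}$, sketched here as one possible choice rather than the only method, is a full QR factorization of $A^T$: the trailing $n-p$ columns of $Q$ give an orthonormal basis of the nullspace of $A$, and the triangular factor yields a particular solution.

```python
import numpy as np

def eliminate(A, b):
    """Return F with range(F) = nullspace(A) and x_tilde with A x_tilde = b."""
    p, n = A.shape
    Q, R = np.linalg.qr(A.T, mode='complete')   # A^T = Q [R1; 0]
    Q1, Q2 = Q[:, :p], Q[:, p:]
    R1 = R[:p, :]
    # Particular solution: x_tilde = Q1 R1^{-T} b, so that A x_tilde = b.
    x_tilde = Q1 @ np.linalg.solve(R1.T, b)
    return Q2, x_tilde                          # F = Q2

A = np.array([[1.0, 1.0, 1.0]])                 # toy data (assumption)
b = np.array([1.0])
F, x_tilde = eliminate(A, b)
assert np.allclose(A @ x_tilde, b)
assert np.allclose(A @ F, 0)
```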