Convex Optimization Reading Notes (9)

Chapter 10: Equality constrained minimization

10.1 Equality constrained minimization problems

Consider an equality constrained optimization problem:

$$\begin{aligned} {\rm minimize} \ \ \ \ & f(x) \\ {\rm subject \ to} \ \ \ \ & Ax=b \end{aligned}$$

where $f : \mathbf{R}^n \to \mathbf{R}$ is convex and twice continuously differentiable, and $A \in \mathbf{R}^{p \times n}$ with $\mathbf{rank}\ A = p < n$. Solving this problem is equivalent to finding a pair $(x^\star, \nu^\star)$ that satisfies the KKT conditions

$$Ax^\star=b, \ \ \ \ \nabla f(x^\star)+A^T\nu^\star=0$$

The first set of equations, $Ax^\star = b$, are called the primal feasibility equations, which are linear. The second set of equations, $\nabla f(x^\star) + A^T\nu^\star = 0$, are called the dual feasibility equations, and are in general nonlinear.

10.1.1 Equality constrained convex quadratic minimization

Consider the equality constrained convex quadratic minimization problem
$$\begin{aligned} {\rm minimize} \ \ \ \ & f(x)=\frac{1}{2}x^TPx+q^Tx+r \\ {\rm subject \ to} \ \ \ \ & Ax=b \end{aligned}$$

where $P \in \mathbf{S}^n_+$ and $A \in \mathbf{R}^{p \times n}$. Here the optimality conditions are

$$Ax^\star = b, \ \ \ \ Px^\star + q + A^T\nu^\star = 0,$$

which we can write as

$$\left[ \begin{array}{cc} P & A^T \\ A & 0 \end{array} \right] \left[ \begin{array}{c} x^\star \\ \nu^\star \end{array} \right] = \left[ \begin{array}{c} -q \\ b \end{array} \right]$$

This set of $n + p$ linear equations in the $n + p$ variables $x^\star, \nu^\star$ is called the KKT system for the equality constrained quadratic optimization problem.
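Since solving this QP reduces to one linear solve, it is easy to check numerically. Here is a minimal numpy sketch on made-up data (the sizes and the variables `P`, `q`, `A`, `b` are invented for illustration):

```python
import numpy as np

# Made-up equality constrained QP:
#   minimize (1/2) x^T P x + q^T x + r  subject to  Ax = b
n, p = 4, 2
rng = np.random.default_rng(0)
M = rng.standard_normal((n, n))
P = M @ M.T + np.eye(n)          # positive definite, so P is in S^n_+
q = rng.standard_normal(n)
A = rng.standard_normal((p, n))  # full row rank with probability 1
b = rng.standard_normal(p)

# Assemble and solve the (n+p) x (n+p) KKT system
KKT = np.block([[P, A.T],
                [A, np.zeros((p, p))]])
sol = np.linalg.solve(KKT, np.concatenate([-q, b]))
x_star, nu_star = sol[:n], sol[n:]

print(np.allclose(A @ x_star, b))                      # primal feasibility
print(np.allclose(P @ x_star + q + A.T @ nu_star, 0))  # dual feasibility
```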

10.1.2 Eliminating equality constraints

One general approach to solving the equality constrained problem is to first eliminate the equality constraints and then solve the resulting unconstrained problem with standard unconstrained minimization methods. We first find a matrix $F \in \mathbf{R}^{n \times (n-p)}$ and a vector $\tilde{x} \in \mathbf{R}^n$ that parametrize the (affine) feasible set:

$$\{ x \mid Ax=b \} = \{ Fz+\tilde{x} \mid z \in \mathbf{R}^{n-p} \}$$

Here $\tilde{x}$ can be chosen as any particular solution of $Ax = b$, and $F \in \mathbf{R}^{n \times (n-p)}$ is any matrix whose range is the nullspace of $A$. We then solve the reduced unconstrained problem

$$\begin{aligned} {\rm minimize} \ \ \ \ & \tilde{f}(z) = f(Fz+\tilde{x}) \end{aligned}$$

with variable $z \in \mathbf{R}^{n-p}$, and recover the solution of the original problem as $x^\star = Fz^\star + \tilde{x}$.
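As a sketch of elimination in code, the reduced problem can be formed explicitly for the quadratic objective of 10.1.1. Computing `F` with `scipy.linalg.null_space` and `x_tilde` by least squares is just one valid choice among several; the data are again made up:

```python
import numpy as np
from scipy.linalg import null_space

# Made-up QP data: minimize (1/2) x^T P x + q^T x + r  subject to  Ax = b
n, p = 4, 2
rng = np.random.default_rng(0)
M = rng.standard_normal((n, n))
P = M @ M.T + np.eye(n)
q = rng.standard_normal(n)
A = rng.standard_normal((p, n))
b = rng.standard_normal(p)

F = null_space(A)                                # range(F) = nullspace of A
x_tilde = np.linalg.lstsq(A, b, rcond=None)[0]   # a particular solution of Ax = b

# Reduced unconstrained QP in z:
#   minimize (1/2) z^T (F^T P F) z + (F^T (P x_tilde + q))^T z + const
z_star = np.linalg.solve(F.T @ P @ F, -F.T @ (P @ x_tilde + q))
x_star = F @ z_star + x_tilde

print(np.allclose(A @ x_star, b))                # x_star is feasible
```

For this data, `x_star` agrees with the solution of the KKT system in the previous sketch, since both solve the same strictly convex QP.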

10.1.3 Solving equality constrained problems via the dual

Another approach is to solve the dual problem and then recover the optimal primal variable $x^\star$ from the optimal dual variable. The dual of the equality constrained problem is

$$\begin{aligned} {\rm maximize} \ \ \ \ & -b^T\nu - f^*(-A^T\nu) \end{aligned}$$

where $f^*$ is the conjugate of $f$.
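As a worked instance, take the quadratic objective of 10.1.1 with the stronger assumption $P \succ 0$, made here so that the conjugate has a simple closed form: $f^*(y) = \frac{1}{2}(y-q)^T P^{-1}(y-q) - r$. The dual is then the unconstrained concave quadratic problem

$${\rm maximize} \ \ \ \ -b^T\nu - \frac{1}{2}(A^T\nu + q)^T P^{-1}(A^T\nu + q) + r$$

and from an optimal $\nu^\star$ the primal solution is recovered by minimizing the Lagrangian over $x$, which gives $x^\star = -P^{-1}(q + A^T\nu^\star)$.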

10.2 Newton’s method with equality constraints

The initial point must be feasible, and we need to make sure that the Newton step $\Delta x_{\rm nt}$ is a feasible direction: $A\Delta x_{\rm nt} = 0$.

10.2.1 The Newton step

We replace the objective with its second-order Taylor approximation near $x$, to form the problem

$$\begin{aligned} {\rm minimize} \ \ \ \ & \hat{f}(x+v)=f(x)+\nabla f(x)^Tv+\frac{1}{2}v^T\nabla^2f(x)v \\ {\rm subject \ to} \ \ \ \ & A(x+v)=b \end{aligned}$$

The Newton step $\Delta x_{\rm nt}$ is the solution $v$ of this quadratic problem; by the KKT system of 10.1.1, it satisfies, together with the associated dual variable $w$,

$$\left[ \begin{array}{cc} \nabla^2f(x) & A^T \\ A & 0 \end{array} \right] \left[ \begin{array}{c} \Delta x_{\rm nt} \\ w \end{array} \right] = \left[ \begin{array}{c} -\nabla f(x) \\ 0 \end{array} \right]$$

Note the second block row: since $Ax = b$ at a feasible $x$, the constraint $A(x+v)=b$ reduces to $Av = 0$.

10.2.2 Newton’s method with equality constraints

The outline of Newton’s method with equality constraints is exactly the same as for unconstrained problems.

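The algorithm box itself did not survive here, so the following is a minimal numpy sketch of the feasible start method. The backtracking parameters `alpha` and `beta`, the tolerance `eps`, and the analytic centering objective $f(x) = -\sum_i \log x_i$ are assumptions made for illustration:

```python
import numpy as np

# Newton's method with equality constraints, sketched for the stand-in
# problem: minimize -sum(log x_i) subject to Ax = b
# (gradient -1/x, Hessian diag(1/x^2)); requires a feasible starting x.
def newton_eq(A, b, x, alpha=0.1, beta=0.5, eps=1e-8, max_iter=50):
    n, p = x.size, A.shape[0]
    f = lambda x: -np.sum(np.log(x))
    for _ in range(max_iter):
        g, H = -1.0 / x, np.diag(1.0 / x**2)
        # Newton step: solve the KKT system of 10.2.1
        KKT = np.block([[H, A.T], [A, np.zeros((p, p))]])
        dx = np.linalg.solve(KKT, np.concatenate([-g, np.zeros(p)]))[:n]
        lam2 = dx @ H @ dx                   # Newton decrement squared
        if lam2 / 2 <= eps:                  # stopping criterion
            return x
        # backtracking line search (also keeps x + t*dx in the domain)
        t = 1.0
        while np.any(x + t * dx <= 0) or f(x + t * dx) > f(x) + alpha * t * g @ dx:
            t *= beta
        x = x + t * dx
    return x

# Usage: analytic centering over {x > 0 : sum(x) = 5}, feasible start
A = np.ones((1, 5)); b = np.array([5.0])
print(newton_eq(A, b, np.array([0.5, 0.5, 1.0, 1.5, 1.5])))  # -> all ones
```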

10.2.3 Newton’s method and elimination

10.2.4 Convergence analysis

In particular, the practical performance of Newton's method with equality constraints is exactly like the performance of Newton's method for unconstrained problems. Once $x^{(k)}$ is near $x^\star$, convergence is extremely rapid, with very high accuracy obtained in only a few iterations.

10.3 Infeasible start Newton method

10.3.1 Newton step at infeasible points

We substitute $x + \Delta x$ for $x^\star$ and $w$ for $\nu^\star$ in the optimality conditions, and use the first-order approximation

$$\nabla f(x+\Delta x) \approx \nabla f(x) + \nabla^2 f(x)\Delta x$$

for the gradient to obtain

$$A(x + \Delta x) = b, \ \ \ \ \nabla f(x) + \nabla^2 f(x)\Delta x + A^T w = 0.$$

This is a set of linear equations for $\Delta x$ and $w$,

$$\left[ \begin{array}{cc} \nabla^2f(x) & A^T \\ A & 0 \end{array} \right] \left[ \begin{array}{c} \Delta x_{\rm nt} \\ w \end{array} \right] = -\left[ \begin{array}{c} \nabla f(x) \\ Ax-b \end{array} \right]$$

Note that the minus sign applies to the whole right-hand side: the second block row reads $A\Delta x_{\rm nt} = b - Ax$, the residual of the equality constraints.
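This step has a convenient primal-dual interpretation. Writing $y = (x, \nu)$ and defining the residual

$$r(y) = (r_{\rm dual}(y),\ r_{\rm pri}(y)) = (\nabla f(x) + A^T\nu,\ Ax - b),$$

the same KKT matrix applied to $(\Delta x_{\rm nt}, \Delta\nu_{\rm nt})$ with right-hand side $-r(y)$, where $\Delta\nu_{\rm nt} = w - \nu$, gives exactly the Newton step for solving $r(y) = 0$. This is the form used in the convergence analysis and in the sketch below.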

10.3.2 Infeasible start Newton method

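The algorithm box is again lost to extraction; below is a minimal numpy sketch in the primal-dual form above, with backtracking on the residual norm. The objective (analytic centering again), the starting points, and the parameters `alpha`, `beta`, `eps` are illustrative assumptions:

```python
import numpy as np

# Infeasible start Newton method for the stand-in problem
#   minimize -sum(log x_i) subject to Ax = b
# x0 only needs to lie in the domain (x0 > 0), not satisfy Ax0 = b.
def infeasible_newton_eq(A, b, x, nu, alpha=0.1, beta=0.5, eps=1e-10, max_iter=50):
    n, p = x.size, A.shape[0]
    r = lambda x, nu: np.concatenate([-1.0 / x + A.T @ nu, A @ x - b])
    for _ in range(max_iter):
        res = r(x, nu)
        if np.allclose(A @ x, b) and np.linalg.norm(res) <= eps:
            return x, nu
        # primal-dual Newton step: KKT matrix, right-hand side -r(x, nu)
        KKT = np.block([[np.diag(1.0 / x**2), A.T], [A, np.zeros((p, p))]])
        d = np.linalg.solve(KKT, -res)
        dx, dnu = d[:n], d[n:]
        # backtracking on the residual norm, staying in the domain
        t = 1.0
        while np.any(x + t * dx <= 0) or \
              np.linalg.norm(r(x + t * dx, nu + t * dnu)) > (1 - alpha * t) * np.linalg.norm(res):
            t *= beta
        x, nu = x + t * dx, nu + t * dnu
    return x, nu

# Usage: same analytic centering problem, started at an infeasible point
A = np.ones((1, 5)); b = np.array([5.0])
x, nu = infeasible_newton_eq(A, b, np.full(5, 2.0), np.zeros(1))
print(x)  # converges to the analytic center (all ones)
```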

10.3.3 Convergence analysis

We show that once the norm of the residual is small enough, the algorithm takes full steps (which implies that feasibility is achieved), and convergence is subsequently quadratic. We also show that the norm of the residual is reduced by at least a fixed amount in each iteration before the region of quadratic convergence is reached.

10.3.4 Convex-concave games

We can use the infeasible start Newton method to compute a solution of a convex-concave game with a twice differentiable payoff function. We define the residual as

$$r(u,v) = \nabla f(u,v) = \left[ \begin{array}{c} \nabla_u f(u,v) \\ \nabla_v f(u,v) \end{array} \right]$$
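To make this concrete, here is a small numpy sketch for a made-up quadratic convex-concave payoff $f(u,v) = \frac{1}{2}u^TPu + u^TCv - \frac{1}{2}v^TQv$. The Newton step solves $\nabla^2 f(u,v)\,\Delta y = -r(u,v)$, with the same backtracking on $\|r\|$ as in the infeasible start method:

```python
import numpy as np

# Made-up quadratic convex-concave game:
#   f(u, v) = (1/2) u^T P u + u^T C v - (1/2) v^T Q v,  P, Q positive definite
# The saddle point is the solution of r(u, v) = grad f(u, v) = 0.
rng = np.random.default_rng(1)
m = 3
P, Q = 2.0 * np.eye(m), 3.0 * np.eye(m)
C = rng.standard_normal((m, m))

r = lambda u, v: np.concatenate([P @ u + C @ v, C.T @ u - Q @ v])
H = np.block([[P, C], [C.T, -Q]])         # full (indefinite) Hessian of f

u, v = np.ones(m), np.ones(m)
for _ in range(20):
    res = r(u, v)
    if np.linalg.norm(res) < 1e-10:
        break
    d = np.linalg.solve(H, -res)          # Newton step for r(u, v) = 0
    t, alpha, beta = 1.0, 0.1, 0.5        # backtracking on ||r||
    while np.linalg.norm(r(u + t * d[:m], v + t * d[m:])) > (1 - alpha * t) * np.linalg.norm(res):
        t *= beta
    u, v = u + t * d[:m], v + t * d[m:]

print(u, v)  # for a quadratic payoff, one full Newton step suffices
```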

10.3.5 Examples

10.4 Implementation

10.4.1 Elimination

To implement the elimination method, we have to calculate a full rank matrix $F$ and an $\tilde{x}$ such that

$$\{ x \mid Ax=b \} = \{ Fz + \tilde{x} \mid z \in \mathbf{R}^{n-p} \}$$
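One standard construction (this is one valid choice among several; the data below are made up) uses a full QR factorization of $A^T$: the last $n-p$ columns of $Q$ form an orthonormal basis of the nullspace of $A$, and a particular solution comes from a single triangular solve:

```python
import numpy as np

# Made-up full-row-rank A and right-hand side b
rng = np.random.default_rng(2)
p, n = 2, 5
A = rng.standard_normal((p, n))
b = rng.standard_normal(p)

# Full QR of A^T: A^T = Q R, with Q (n x n) orthogonal, R (n x p)
Q, R = np.linalg.qr(A.T, mode='complete')
Q1, Q2 = Q[:, :p], Q[:, p:]
R1 = R[:p, :]                               # p x p, upper triangular

F = Q2                                      # range(F) = nullspace of A
x_tilde = Q1 @ np.linalg.solve(R1.T, b)     # particular solution: A x_tilde = b

print(np.allclose(A @ F, 0), np.allclose(A @ x_tilde, b))
```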

10.4.2 Solving KKT systems

10.4.3 Examples