Detailed Notes on Numerical Computation (2): Solving Nonlinear Equations

2. Nonlinear Equation f(x) = 0

2.1 Bisection Method

2.1.1 Root-Finding Problem

Root-Finding or Zero-Finding Problem

Given a function $f(x)$ of one variable $x$, find a root of an equation of the form $f(x)=0$.

A solution $x^*$ is called a root of the equation $f(x)=0$, or a zero of the function $f(x)$.

2.1.2 Bisection Algorithm

Prejudgment

By the Intermediate Value Theorem, if

$$f\in C[a,b] \quad \text{and} \quad f(a)f(b)<0,$$

then there exists at least one point $x^*\in (a,b)$ such that $f(x^*)=0$.

Algorithm Description

  • INPUT: endpoints $a,b$; tolerance $TOL$; maximum number of iterations $N$.

  • OUTPUT: approximate solution $c$ or a message of failure.

  • Step 1: Set $k = 1$, $FA=f(a)$.

  • Step 2: While $k\leq N$, do Steps 3~6.

    • Step 3: Set $c= \frac{a+b}{2}$ and compute $FC=f(c)$.
    • Step 4: If $FC=0$ or $\frac{|b-a|}{2}<TOL$, then OUTPUT $c$. (Procedure completed successfully.) STOP!
    • Step 5: If $FA\cdot FC<0$, then set $b=c$; else set $a=c$, $FA=FC$.
    • Step 6: Set $k=k+1$.
  • Step 7: OUTPUT “Method failed after $N$ iterations.” STOP!
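The steps above translate directly into code. A minimal Python sketch; the test equation $x^3+4x^2-10=0$ on $[1,2]$ is a standard illustration chosen here, not part of the notes:

```python
def bisect(f, a, b, tol=1e-8, max_iter=100):
    """Bisection: assumes f is continuous on [a, b] with f(a) * f(b) < 0."""
    fa = f(a)
    if fa * f(b) > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        c = (a + b) / 2                    # Step 3: midpoint
        fc = f(c)
        if fc == 0 or (b - a) / 2 < tol:   # Step 4: stopping test
            return c
        if fa * fc < 0:                    # Step 5: keep the sign-changing half
            b = c
        else:
            a, fa = c, fc
    raise RuntimeError("Method failed after max_iter iterations")

root = bisect(lambda x: x**3 + 4 * x**2 - 10, 1.0, 2.0)
```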

Other Stopping Criteria

Other stopping criteria for iterative procedures, given a tolerance $\varepsilon >0$:

$$|p_n-p_{n-1}|<\varepsilon, \qquad \frac{|p_n-p_{n-1}|}{|p_n|} < \varepsilon, \qquad |f(p_n)|< \varepsilon.$$

2.1.3 Convergence Analysis

Theorem

Suppose that $f\in C[a,b]$ and $f(a)f(b)<0$. The Bisection method generates a sequence $\{p_n\}_{n=1}^\infty$ approximating a zero $p$ of $f$ with

$$|p_n-p|\leq \frac{b-a}{2^n},\quad n\geq1.$$

Proof

By the procedure, we know that

$$|b_1-a_1|=|b-a|,\quad |b_2-a_2|=\frac{|b_1-a_1|}{2}=\frac{|b-a|}{2},\quad \ldots,\quad |b_n-a_n|=\frac{|b_{n-1}-a_{n-1}|}{2}=\frac{|b-a|}{2^{n-1}}.$$

Since $p_n=\frac{a_n+b_n}{2}$ and $p\in (a_n,p_n]$ or $p\in [p_n,b_n)$ for all $n\geq 1$, it follows that

$$|p_n-p|\leq \frac{|b_n-a_n|}{2}=\frac{b-a}{2^n}.$$

Convergence Rate

Since

$$|p_n-p|\leq \frac{|b_n-a_n|}{2}=\frac{|b-a|}{2^n},$$

the sequence $\{p_n\}_{n=1}^\infty$ converges to $p$ with rate of convergence $O\!\left(\frac{1}{2^n}\right)$, that is,

$$p_n=p+O\!\left(\frac{1}{2^n}\right).$$

Other Property

Bisection is certain to converge, but does so slowly.

Given a starting interval $[a,b]$, the length of the interval after $k$ iterations is $\frac{b-a}{2^k}$, so achieving an error tolerance $\varepsilon$ (i.e., $\frac{b-a}{2^k}<\varepsilon$) requires $k\approx\left\lceil \log_2\frac{b-a}{\varepsilon} \right\rceil$ iterations, regardless of the function $f$ involved.
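For instance, with $[a,b]=[1,2]$ and $\varepsilon=10^{-6}$ (values chosen only for illustration), the required iteration count is:

```python
import math

# Smallest k with (b - a) / 2^k < eps, independent of the function f
a, b, eps = 1.0, 2.0, 1e-6
k = math.ceil(math.log2((b - a) / eps))
```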

2.2 Fixed Point Method

2.2.1 Introduction

  • A fixed point of a given function $g:\mathbb{R}\rightarrow \mathbb{R}$ is a value $x^*$ such that $x^*=g(x^*)$.
  • Many iterative methods for solving nonlinear equations use a fixed-point iteration scheme of the form
    $$x_{k+1}=g(x_k),$$
    where the fixed points of $g$ are solutions of $f(x)=0$.

Example

If $f(x)=x^2-x-2$, it has two roots, $x^*=2$ and $x^*=-1$. The fixed points of each of the functions
$$g(x)=x^2-2, \qquad g(x) = \sqrt{x+2}, \qquad g(x) = 1+\frac{2}{x}, \qquad g(x) = \frac{x^2+2}{2x-1}$$
are solutions of the equation $f(x)=0$.

2.2.2 Algorithm

Definition

  • To approximate the fixed point of a function $g(x)$, we choose an initial approximation $p_0$ and generate the sequence $\{p_n\}^\infty_{n=0}$ by letting
    $$\begin{cases} \text{given } p_0, \\ p_n = g(p_{n-1}), \quad n=1,2,\ldots \end{cases}$$
  • If the sequence $\{p_n\}^\infty_{n=0}$ converges to $p$ and $g(x)$ is continuous, then we have
    $$p = \lim_{n\rightarrow \infty}p_n=\lim_{n\rightarrow \infty}g(p_{n-1})=g\left(\lim_{n\rightarrow \infty}p_{n-1}\right)=g(p),$$
    and a solution of $x=g(x)$ is obtained.
  • The technique described above is called fixed-point iteration (or functional iteration).


Pseudo-Code

  • INPUT: initial approximation $p_0$; tolerance $TOL$; maximum number of iterations $N$.

  • OUTPUT: approximate solution $p$ or a message of failure.

  • Step 1: Set $n = 1$.

  • Step 2: While $n\leq N$, do Steps 3~5.

    • Step 3: Set $p= g(p_0)$.
    • Step 4: If $|p-p_0|<TOL$, then OUTPUT $p$. (Procedure completed successfully.) STOP!
    • Step 5: Set $n=n+1$, $p_0=p$.
  • Step 6: OUTPUT “Method failed after $N$ iterations.” STOP!
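A compact Python version of this iteration, applied here to $g(x)=\sqrt{x+2}$ from the earlier example, which contracts near the fixed point $p=2$ (the starting value $1.5$ is an illustrative choice):

```python
import math

def fixed_point(g, p0, tol=1e-10, max_iter=100):
    """Fixed-point iteration p_n = g(p_{n-1}), stopping when |p - p0| < tol."""
    for _ in range(max_iter):
        p = g(p0)            # Step 3
        if abs(p - p0) < tol:  # Step 4
            return p
        p0 = p               # Step 5
    raise RuntimeError("Method failed after max_iter iterations")

root = fixed_point(lambda x: math.sqrt(x + 2), 1.5)
```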

2.2.3 Existence and Uniqueness

Theorem: Sufficient Conditions for the Existence and Uniqueness of a Fixed Point

  1. 【Existence】If $g(x)\in C[a,b]$ and $g(x)\in [a,b]$ for all $x\in [a,b]$, then $g(x)$ has a fixed point in $[a,b]$. (Here $g(x)\in [a,b]$ means $\max\{g(x)\}\leq b$ and $\min\{g(x)\}\geq a$.)
  2. 【Uniqueness】If, in addition, $g'(x)$ exists on $(a,b)$ and a positive constant $k<1$ exists with $|g'(x)|\leq k$ for all $x\in (a,b)$, then the fixed point in $[a,b]$ is unique.

Proof for Existence

  1. If $g(a)=a$ or $g(b)=b$, then $g(x)$ has a fixed point at an endpoint.
  2. If not, then it must be true that $g(a)>a$ and $g(b)<b$.
  3. The function $h(x)=g(x)-x$ is continuous on $[a,b]$, and we have
    $$h(a)=g(a)-a>0, \qquad h(b)=g(b)-b<0.$$
  4. The Intermediate Value Theorem implies that there exists $p\in (a,b)$ with $h(p)=0$.
  5. Thus $g(p)-p=0$, and $p$ is a fixed point of $g(x)$.

The proof above replaces $g(x)=x$ with $h(x)=g(x)-x$, turning the fixed-point problem into a zero-finding problem.

Proof for Uniqueness

We prove this by contradiction.

  1. Suppose, in addition, that $|g'(x)|\leq k< 1$ and that $p$ and $q$ are both fixed points in $[a,b]$ with $p\not=q$.
  2. By the Mean Value Theorem, a number $\xi$ exists between $p$ and $q$, hence in $[a,b]$, with
    $$\frac{g(p)-g(q)}{p-q}=g'(\xi).$$
  3. Then
    $$|p-q|=|g(p)-g(q)|=|g'(\xi)|\,|p-q|\leq k|p-q|<|p-q|,$$
    which is a contradiction.
  4. So $p=q$, and the fixed point in $[a,b]$ is unique.

2.2.4 Convergence Analysis

Theorem

Under the two conditions above, for any number $p_0\in[a,b]$, the sequence $\{p_n\}^\infty_0$ defined by
$$p_n=g(p_{n-1}),\quad n\geq 1,$$
converges to the unique fixed point $p$ in $[a,b]$.

Proof

The Intermediate Value Theorem gives the existence of the fixed point, as in the previous proof.

Using the fact that $|g'(x)|\leq k$ and the Mean Value Theorem, we have
$$|p_n-p|=|g(p_{n-1})-g(p)|=|g'(\xi)|\,|p_{n-1}-p|\leq k|p_{n-1}-p|,$$
where $\xi \in (a,b)$.

Applying this inequality inductively shows
$$|p_n-p|\leq k|p_{n-1}-p|\leq k^2|p_{n-2}-p|\leq \ldots\leq k^n|p_{0}-p|.$$

Since $k<1$,
$$\lim_{n\rightarrow\infty}|p_n-p|\leq \lim_{n\rightarrow\infty}k^n|p_0-p|=0,$$
and $\{p_n\}_0^\infty$ converges to $p$.

Convergence Rate

Since
$$|p_n-p|\leq k^n|p_0-p|\leq k^n|b-a|,$$
the sequence $\{p_n\}_{n=0}^\infty$ converges to $p$ with rate of convergence $O(k^n)$, $k< 1$; that is,
$$p_n=p+O(k^n),\quad k<1.$$

2.2.5 Bounds for the Error

Corollary

If the solution of $g(x)=x$ exists and is unique, then the bounds for the error involved in using $p_n$ to approximate $p$ are given by
$$|p_n-p|\leq k^n\max\{p_0-a,\,b-p_0\}$$
and
$$|p_n-p|\leq \frac{k^n}{1-k}|p_1-p_0|,$$
for all $n\geq 1$.

Proof

$$|p_n-p_{n-1}|= |g(p_{n-1})-g(p_{n-2})|\leq k|p_{n-1}-p_{n-2}|\leq \ldots\leq k^{n-1}|p_1-p_0|.$$

For $m>n$, using the triangle inequality $|a+b|\leq |a|+|b|$, we have
$$|p_m-p_n|\leq |p_m-p_{m-1}|+|p_{m-1}-p_{m-2}|+\ldots+|p_{n+1}-p_n|\leq (k^{m-1}+k^{m-2}+\ldots+k^n)|p_1-p_0|,$$
so
$$|p_m-p_n|\leq k^n(1+k+\ldots+k^{m-n-1})|p_1-p_0|\leq k^n\,\frac{1-k^{m-n}}{1-k}\,|p_1-p_0|.$$

Letting $m\rightarrow\infty$, we have
$$\lim_{m\rightarrow\infty}|p_m-p_n|=|p-p_n|\leq \frac{k^n}{1-k}|p_1-p_0|.$$

2.3 Newton’s Method

2.3.1 Introduction

Status

The Newton-Raphson (or simply Newton’s) method is one of the most powerful and well-known numerical methods for solving a root-finding problem
$$f(x)=0.$$

Rough Description

(1) Suppose that $f\in C^2[a,b]$, and $x^*$ is a solution of $f(x)=0$.

(2) Let $\hat{x}\in [a,b]$ be an approximation to $x^*$ such that $f'(\hat{x})\not=0$ and $|\hat{x}-x^*|$ is “small”.

Consider the first Taylor polynomial for $f(x)$ expanded about $\hat{x}$:
$$f(x)=f(\hat{x})+(x-\hat{x})f'(\hat{x})+\frac{(x-\hat{x})^2}{2}f''(\xi(x)),$$
where $\xi(x)$ lies between $x$ and $\hat{x}$.

Setting $x=x^*$, using $f(x^*)=0$, and dropping the second-order term (since $|x^*-\hat{x}|$ is small) gives
$$0=f(x^*)\approx f(\hat{x})+(x^*-\hat{x})f'(\hat{x}).$$

Solving for $x^*$ gives
$$x^*\approx \hat{x}-\frac{f(\hat{x})}{f'(\hat{x})}.$$

2.3.2 Algorithm

Definition

  • Start with an initial approximation $x_0$.
  • Define the iteration scheme by
    $$x_n=x_{n-1}-\frac{f(x_{n-1})}{f'(x_{n-1})},\quad n\geq 1.$$
  • This scheme generates the sequence $\{x_n\}_0^\infty$.


Pseudo-Code

The function $f$ is differentiable and $p_0$ is an initial approximation.

  • INPUT: initial approximation $p_0$; tolerance $TOL$; maximum number of iterations $N$.

  • OUTPUT: approximate solution $p$ or a message of failure.

  • Step 1: Set $n = 1$.

  • Step 2: While $n\leq N$, do Steps 3~5.

    • Step 3: Set $p= p_0-\frac{f(p_0)}{f'(p_0)}$.
    • Step 4: If $|p-p_0|<TOL$, then OUTPUT $p$. (Procedure completed successfully.) STOP!
    • Step 5: Set $n=n+1$, $p_0=p$.
  • Step 6: OUTPUT “Method failed after $N$ iterations.” STOP!
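A direct Python rendering of these steps; the example equation $\cos x - x = 0$ and the starting point $p_0=1$ are illustrative choices, not from the notes:

```python
import math

def newton(f, df, p0, tol=1e-10, max_iter=50):
    """Newton's method; requires df(p) != 0 near the root."""
    for _ in range(max_iter):
        p = p0 - f(p0) / df(p0)   # Step 3: Newton update
        if abs(p - p0) < tol:     # Step 4: stopping test
            return p
        p0 = p                    # Step 5
    raise RuntimeError("Method failed after max_iter iterations")

root = newton(lambda x: math.cos(x) - x, lambda x: -math.sin(x) - 1, 1.0)
```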

2.3.3 Convergence

Theorem

(1) $f\in C^2[a,b]$.

(2) $p\in [a,b]$ is such that $f(p)=0$ and $f'(p)\not=0$.

Then there exists a $\delta>0$ such that Newton’s method generates a sequence $\{p_n\}_1^\infty$ converging to $p$ for any initial approximation $p_0\in[p-\delta,p+\delta]$.

Proof

$$p_n=g(p_{n-1}), \qquad g(x)=x-\frac{f(x)}{f'(x)}.$$

What needs to be proved: following the proof for the fixed-point method, we need to find an interval $[p-\delta,p+\delta]$ that $g$ maps into itself and on which $|g'(x)|\leq k<1$. (Existence and uniqueness.)

Proving process:

  1. $\exists \delta_1>0$ such that $g(x) \in C[p-\delta_1,p+\delta_1]$

    Since $f'(p)\not=0$ and $f'$ is continuous, there exists $\delta_1>0$ such that $f'(x)\not= 0$ for all $x\in [p-\delta_1,p+\delta_1]$. Hence $g$ is defined and continuous on $[p-\delta_1,p+\delta_1]$.

  2. $g'(x) \in C[p-\delta_1,p+\delta_1]$

    $g'(x)=\frac{f(x)f''(x)}{(f'(x))^2}$ for all $x\in [p-\delta_1,p+\delta_1]$. Since $f\in C^2[a,b]$, $g'$ is continuous on $[p-\delta_1,p+\delta_1]$.

  3. $|g'(x)|\leq k< 1$ near $p$

    Since $f(p)=0$, we have $g'(p)=0$.

    Since $g'\in C[p-\delta_1,p+\delta_1]$, there exists a $\delta$ with $0<\delta<\delta_1$ and
    $$|g'(x)|\leq k<1, \quad \forall x\in [p-\delta,p+\delta].$$

  4. $g$ maps $[p-\delta,p+\delta]$ into $[p-\delta,p+\delta]$

    By the Mean Value Theorem, if $x\in[p-\delta,p+\delta]$, there exists $\xi$ between $x$ and $p$ with $|g(x)-g(p)|=|g'(\xi)|\,|x-p|$, so
    $$|g(x)-p|=|g(x)-g(p)|=|g'(\xi)|\,|x-p|\leq k|x-p|<|x-p|\leq\delta.$$
    Thus $g:[p-\delta,p+\delta]\rightarrow [p-\delta,p+\delta]$.

By the argument above, all the hypotheses of the Fixed-Point Theorem are satisfied for $g(x)=x-\frac{f(x)}{f'(x)}$. Therefore, the sequence $\{p_n\}_{n=1}^\infty$ defined by
$$p_n=g(p_{n-1}),\quad n\geq 1,$$
converges to $p$ for any $p_0\in[p-\delta,p+\delta]$.

2.3.4 Secant Method

Background

For Newton’s method, each iteration requires evaluation of both the function $f(x_k)$ and its derivative $f'(x_k)$, which may be inconvenient or expensive.

Improvement

The derivative is approximated by a finite difference using two successive iterates, so the iteration becomes
$$x_{k+1}=x_k-f(x_k)\,\frac{x_k-x_{k-1}}{f(x_k)-f(x_{k-1})}.$$
This method is known as the secant method.


Pseudo-Code

  • INPUT: initial approximations $p_0,p_1$; tolerance $TOL$; maximum number of iterations $N$.

  • OUTPUT: approximate solution $p$ or a message of failure.

  • Step 1: Set $n = 2$, $q_0=f(p_0)$, $q_1=f(p_1)$.

  • Step 2: While $n\leq N$, do Steps 3~5.

    • Step 3: Set $p= p_1-q_1\frac{p_1-p_0}{q_1-q_0}$. (Compute $p_n$.)
    • Step 4: If $|p-p_1|<TOL$, then OUTPUT $p$. (Procedure completed successfully.) STOP!
    • Step 5: Set $n=n+1$, $p_0=p_1$, $p_1=p$, $q_0=q_1$, $q_1=f(p)$.
  • Step 6: OUTPUT “Method failed after $N$ iterations.” STOP!
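The same steps in Python; again the test equation $\cos x - x = 0$ and the two starting guesses are illustrative choices:

```python
import math

def secant(f, p0, p1, tol=1e-10, max_iter=50):
    """Secant method: the derivative is replaced by a finite difference."""
    q0, q1 = f(p0), f(p1)
    for _ in range(max_iter):
        p = p1 - q1 * (p1 - p0) / (q1 - q0)   # Step 3
        if abs(p - p1) < tol:                 # Step 4
            return p
        p0, q0 = p1, q1                       # Step 5: shift the iterates
        p1, q1 = p, f(p)
    raise RuntimeError("Method failed after max_iter iterations")

root = secant(lambda x: math.cos(x) - x, 0.5, 1.0)
```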

2.3.5 False Position Method

Definition

To find a solution of $f(x)=0$ for a given continuous function $f$ on the interval $[p_0,p_1]$, where $f(p_0)$ and $f(p_1)$ have opposite signs:
$$f(p_0)f(p_1)<0.$$

The approximation $p_2$ is chosen in the same manner as in the secant method, as the $x$-intercept of the line joining $(p_0,f(p_0))$ and $(p_1,f(p_1))$.

To decide which secant line to use to compute $p_3$, we check the sign of $f(p_2)f(p_1)$ (or of $f(p_2)f(p_0)$).

If $f(p_2)f(p_1)$ is negative, we choose $p_3$ as the $x$-intercept of the line joining $(p_1,f(p_1))$ and $(p_2,f(p_2))$.

In a similar manner, we get a sequence $\{p_n\}_2^\infty$ which approximates the root.


Pseudo-Code

  • INPUT: initial approximations $p_0,p_1$; tolerance $TOL$; maximum number of iterations $N$.

  • OUTPUT: approximate solution $p$ or a message of failure.

  • Step 1: Set $n = 2$, $q_0=f(p_0)$, $q_1=f(p_1)$.

  • Step 2: While $n\leq N$, do Steps 3~6.

    • Step 3: Set $p= p_1-q_1\frac{p_1-p_0}{q_1-q_0}$. (Compute $p_n$.)
    • Step 4: If $|p-p_1|<TOL$, then OUTPUT $p$. (Procedure completed successfully.) STOP!
    • Step 5: Set $n=n+1$, $q=f(p)$.
    • Step 6: If $q\cdot q_1<0$, set $p_0=p_1$, $q_0=q_1$. Then set $p_1=p$, $q_1=q$, so that $[p_0,p_1]$ always brackets the root.
  • Step 7: OUTPUT “Method failed after $N$ iterations.” STOP!
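One way to realize the bracketing update in Python (the newest point always replaces $p_1$, and the old $p_1$ becomes $p_0$ when the sign changes); the test equation $\cos x - x = 0$ is an illustrative choice:

```python
import math

def false_position(f, p0, p1, tol=1e-10, max_iter=100):
    """False position: secant update, but a sign-changing bracket is kept."""
    q0, q1 = f(p0), f(p1)
    if q0 * q1 >= 0:
        raise ValueError("f(p0) and f(p1) must have opposite signs")
    for _ in range(max_iter):
        p = p1 - q1 * (p1 - p0) / (q1 - q0)
        if abs(p - p1) < tol:
            return p
        q = f(p)
        if q * q1 < 0:           # root between p and p1: old p1 becomes p0
            p0, q0 = p1, q1
        p1, q1 = p, q            # the newest point always replaces p1
    raise RuntimeError("Method failed after max_iter iterations")

root = false_position(lambda x: math.cos(x) - x, 0.5, 1.0)
```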

2.4 Error Analysis for Iteration Methods

2.4.1 The Rate of Sequence Convergence

Definition

Suppose $\{p_n\}_{n=0}^\infty$ is a sequence that converges to $p$, with $p_n\not=p$ for all $n$.

If positive constants $\lambda$ and $\alpha$ exist with
$$\lim_{n\rightarrow\infty} \frac{|p_{n+1}-p|}{|p_n-p|^\alpha}=\lambda,$$

then $\{p_n\}_{n=0}^\infty$ converges to $p$ of order $\alpha$, with asymptotic error constant $\lambda$.

Properties

  1. A sequence with a high order of convergence converges more rapidly than a sequence with a lower order.
  2. The asymptotic error constant affects the speed of convergence but is not as important as the order.

Example

  1. If $\alpha=1$, the sequence is linearly convergent.
  2. If $\alpha =2$, the sequence is quadratically convergent.

Summary

Use the Mean Value Theorem to prove linear convergence, and Taylor’s Theorem to prove quadratic convergence when $g'(p)=0$.

2.4.2 Convergent Order of Fixed-Point Iteration (Improved)

Convergent Oder of Fixed-Point Iteration

(1) $g\in C[a,b]$ and $g(x)\in[a,b]$ for all $x\in[a,b]$.
(2) $g'(x)$ is continuous on $(a,b)$ and a positive constant $k<1$ exists with $|g'(x)|\leq k$ for all $x\in(a,b)$.

If $g'(p)\not=0$, then for any number $p_0$ in $[a,b]$, the sequence $p_n=g(p_{n-1})$, for $n\geq 1$, converges only linearly to the unique fixed point $p$ in $[a,b]$.

Proof

$$p_{n+1}-p=g(p_n)-g(p)=g'(\xi_n)(p_n-p),$$
where $\xi_n$ is between $p_n$ and $p$.

Since $\{p_n\}_{n=0}^\infty$ converges to $p$, and $\xi_n$ is between $p_n$ and $p$, the sequence $\{\xi_n\}_{n=0}^\infty$ also converges to $p$.

Thus,
$$\lim_{n\rightarrow\infty}\frac{|p_{n+1}-p|}{|p_n-p|}=\lim_{n\rightarrow\infty}|g'(\xi_n)|=|g'(p)|,$$
so fixed-point iteration exhibits linear convergence with asymptotic error constant $|g'(p)|$ whenever $g'(p)\not=0$. This also implies that higher-order convergence for fixed-point methods can occur only when $g'(p)=0$.

Quadratical Convergence

Let $p$ be a solution of the equation $x=g(x)$ with:

(1) $g'(p)=0$;

(2) $g''$ continuous and strictly bounded by $M$ on an open interval $I$ containing $p$.

Then there exists a $\delta >0$ such that, for $p_0\in [p-\delta, p+\delta]$, the sequence defined by $p_n=g(p_{n-1})$, $n\geq 1$, converges at least quadratically to $p$.

Moreover, for sufficiently large values of $n$,
$$|p_{n+1}-p|<\frac{M}{2}|p_n-p|^2.$$

Proof

Due to the two conditions above, expanding $g$ about $p$,
$$g(x)=g(p)+g'(p)(x-p)+\frac{g''(\xi)}{2}(x-p)^2=p+\frac{g''(\xi)}{2}(x-p)^2,$$
where $\xi$ lies between $x$ and $p$.

Thus,
$$p_{n+1}=g(p_n)=p+\frac{g''(\xi_n)}{2}(p_n-p)^2, \qquad p_{n+1}-p=\frac{g''(\xi_n)}{2}(p_n-p)^2,$$
and since $\xi_n\rightarrow p$,
$$\lim_{n\rightarrow\infty}\frac{|p_{n+1}-p|}{|p_n-p|^2}=\lim_{n\rightarrow\infty}\frac{|g''(\xi_n)|}{2}=\frac{|g''(p)|}{2}.$$

Since $g''$ is strictly bounded by $M$ on the interval $[p-\delta,p+\delta]$, for sufficiently large values of $n$,
$$|p_{n+1}-p|<\frac{M}{2}|p_n-p|^2$$
is also derived.

Construct a quadratically convergent fixed-point problem

Let
$$g(x)=x-\phi(x)f(x),$$
so that
$$g'(x)=1-\phi'(x)f(x)-\phi(x)f'(x).$$

The condition for quadratic convergence is $g'(p)=0$; since $f(p)=0$, this requires $\phi(p)=\frac{1}{f'(p)}$.

A reasonable approach is to let $\phi(x)=\frac{1}{f'(x)}$, which gives Newton’s method.

Remarks

  1. The convergence rate of fixed-point iteration is usually linear, with constant $C=|g'(p)|$.
  2. But if $g'(p)=0$, then the convergence rate is at least quadratic.

2.4.3 Zero of Multiplicity

Definition

A solution $p$ of $f(x)=0$ is a zero of multiplicity $m$ of $f(x)$ if for $x\not=p$ we can write
$$f(x)=(x-p)^mq(x),$$
where
$$\lim_{x\rightarrow p}q(x)\not=0.$$

Theorem

  1. $f\in C^1[a,b]$ has a simple zero at $p$ in $(a,b)$ if and only if $f(p)=0$ but $f'(p)\not=0$.
  2. The function $f\in C^m[a,b]$ has a zero of multiplicity $m$ at $p$ if and only if
    $$0=f(p)=f'(p)=f''(p)=\ldots=f^{(m-1)}(p), \quad \text{but } f^{(m)}(p)\not=0.$$

Proof

If $f$ has a simple zero at $p$, then
$$f(p)=0, \qquad f(x)=(x-p)q(x), \qquad \lim_{x\rightarrow p}q(x)\not=0.$$

Since $f\in C^1[a,b]$,
$$f'(p)=\lim_{x\rightarrow p}f'(x)=\lim_{x\rightarrow p}[q(x)+(x-p)q'(x)]=\lim_{x\rightarrow p}q(x)\not=0.$$

2.4.4 Convergence of Newton’s Method

Property

Newton’s method transforms the nonlinear equation $f(x)=0$ into the fixed-point problem $x=g(x)$ with $g(x)=x-\frac{f(x)}{f'(x)}$.

  1. If $p$ is a simple root, then $f(p)=0$, $f'(p)\not=0$, and $g'(p)=0$, so the convergence rate is quadratic. (Iterations must start close enough to the root.)

  2. If $p$ is a root of multiplicity $m>1$, then
    $$f(x)=(x-p)^mq(x), \qquad g'(p)=1-\frac{1}{m}\not=0,$$
    so the convergence rate is linear.

Method for handling multiple roots

$$f(x)=(x-p)^mq(x), \qquad u(x)=\frac{f(x)}{f'(x)}=(x-p)\,\frac{q(x)}{mq(x)+(x-p)q'(x)}, \qquad u'(p)=\frac{1}{m}\not=0.$$

Thus $p$ is a simple root of $u(x)$. Applying Newton’s method to $u$ changes the iteration function into
$$g(x)=x-\frac{u(x)}{u'(x)}=x-\frac{f(x)f'(x)}{f'(x)^2-f(x)f''(x)},$$
whose convergence rate is again quadratic.
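A sketch comparing the two schemes on $f(x)=(x-1)^2(x+2)$, which has a double root at $p=1$; the test polynomial and starting point are illustrative choices:

```python
def newton(f, df, p0, tol=1e-12, max_iter=200):
    """Plain Newton; only linear at a multiple root."""
    for n in range(1, max_iter + 1):
        p = p0 - f(p0) / df(p0)
        if abs(p - p0) < tol:
            return p, n
        p0 = p
    return p0, max_iter

def modified_newton(f, df, d2f, p0, tol=1e-12, max_iter=200):
    """Newton applied to u = f/f'; quadratic even at multiple roots."""
    for n in range(1, max_iter + 1):
        den = df(p0) ** 2 - f(p0) * d2f(p0)
        if den == 0:              # numerically at the root already
            return p0, n
        p = p0 - f(p0) * df(p0) / den
        if abs(p - p0) < tol:
            return p, n
        p0 = p
    return p0, max_iter

f   = lambda x: (x - 1) ** 2 * (x + 2)
df  = lambda x: 3 * (x - 1) * (x + 1)   # f'(x) = 3(x-1)(x+1)
d2f = lambda x: 6 * x                   # f''(x) = 6x
r_plain, n_plain = newton(f, df, 2.0)
r_mod, n_mod = modified_newton(f, df, d2f, 2.0)
```

On this example the modified scheme needs far fewer iterations than plain Newton, which only halves the error each step at the double root.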

2.4.5 Convergence rate of Secant Method

  1. The convergence rate of the secant method is normally superlinear, with order $r\approx1.618$, which is lower than Newton’s method.
  2. The secant method reuses the two previous function values per iteration, so only one new function evaluation is needed and no derivative evaluation is required.
  3. Its disadvantage is that it needs two starting guesses that are close enough to the solution in order to converge.

2.5 Accelerating Convergence

2.5.1 Aitken’s method

Background

Accelerate the convergence of a sequence that is linearly convergent, regardless of its origin or application:

$$\lim_{n\rightarrow \infty} \frac{p_{n+1}-p}{p_n-p}=\lambda,\quad\lambda\not=0.$$

Thus, when $n$ is sufficiently large,
$$\frac{p_{n+1}-p}{p_n-p}\approx \frac{p_{n+2}-p}{p_{n+1}-p},$$
and solving this approximate relation for $p$ gives
$$p\approx\frac{p_np_{n+2}-p_{n+1}^2}{p_{n+2}-2p_{n+1}+p_n}=p_n-\frac{(p_{n+1}-p_{n})^2}{p_{n+2}-2p_{n+1}+p_n}.$$

Aitken’s $\Delta^2$ method defines a new sequence $\{\hat{p}_n\}_{n=0}^\infty$:
$$\hat{p}_n=p_n-\frac{(p_{n+1}-p_{n})^2}{p_{n+2}-2p_{n+1}+p_n},$$
which converges faster than the original sequence $\{p_n\}_{n=0}^\infty$.

Definition

Given the sequence $\{p_n\}_{n=0}^\infty$, the forward difference $\Delta p_n$ is defined by
$$\Delta p_n=p_{n+1}-p_n,\quad n\geq 0.$$
Higher powers $\Delta^kp_n$ are defined recursively by
$$\Delta^k p_n=\Delta(\Delta^{k-1}p_{n}),\quad k\geq 2.$$
For example,
$$\Delta^2p_n=\Delta(\Delta p_{n})=\Delta(p_{n+1}-p_n)=\Delta p_{n+1}-\Delta p_n=p_{n+2}-2p_{n+1}+p_n.$$
Thus, $\hat{p}_n=p_n-\frac{(\Delta p_n)^2}{\Delta^2p_n}$.
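A small Python illustration of $\hat{p}_n$, using the linearly convergent iteration $p_{n+1}=\cos p_n$ toward $p\approx0.739085$ (an illustrative choice of sequence):

```python
import math

def aitken(seq):
    """Aitken's delta-squared sequence: p_n - (Δp_n)^2 / Δ²p_n."""
    return [p0 - (p1 - p0) ** 2 / (p2 - 2 * p1 + p0)
            for p0, p1, p2 in zip(seq, seq[1:], seq[2:])]

# Build a few linearly convergent iterates of p = cos(p)
p, seq = 0.5, [0.5]
for _ in range(6):
    p = math.cos(p)
    seq.append(p)

accel = aitken(seq)   # accelerated sequence, two terms shorter
```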

Theorem

Condition:
$$\lim_{n\rightarrow \infty} \frac{p_{n+1}-p}{p_n-p}=\lambda,\quad\lambda\not=0, \qquad (p_n-p)(p_{n+1}-p)>0 \text{ for all sufficiently large } n.$$

Result: the sequence $\{\hat{p}_n\}_{n=0}^\infty$ converges to $p$ faster than $\{p_n\}_{n=0}^\infty$, in the sense that
$$\lim_{n\rightarrow\infty}\frac{\hat{p}_n-p}{p_n-p}=0.$$


2.5.2 Steffensen’s method

Definition

The function is $p=g(p)$ and the initial approximation is $p_0$; the first accelerated value is $\hat{p}_0=p_0-\frac{(\Delta p_0)^2}{\Delta^2p_0}$.

Assume that $\hat{p}_0$ is a better approximation than $p_2$, so we apply fixed-point iteration to $\hat{p}_0$ instead of $p_2$, and then repeat: every third term is produced by an Aitken step, and the two terms after it by ordinary fixed-point evaluation.

Theorem

Suppose that $x=g(x)$ has the solution $p$ with $g'(p)\not=1$.

If there exists a $\delta>0$ such that $g\in C^3[p-\delta,p+\delta]$,

then Steffensen’s method gives quadratic convergence for any $p_0\in [p-\delta,p+\delta]$.

Pseudo-Code

  • INPUT: initial approximation $p_0$; tolerance $TOL$; maximum number of iterations $N$.

  • OUTPUT: approximate solution $p$ or a message of failure.

  • Step 1: Set $n = 1$.

  • Step 2: While $n\leq N$, do Steps 3~5.

    • Step 3: Set $p_1=g(p_0)$, $p_2=g(p_1)$, $p= p_0-\frac{(p_1-p_0)^2}{p_2-2p_1+p_0}$.
    • Step 4: If $|p-p_0|<TOL$, then OUTPUT $p$. (Procedure completed successfully.) STOP!
    • Step 5: Set $n=n+1$, $p_0=p$.
  • Step 6: OUTPUT “Method failed after $N$ iterations.” STOP!
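The steps above in Python, with a guard for the degenerate denominator; the fixed-point problem $p=\cos p$ is an illustrative choice:

```python
import math

def steffensen(g, p0, tol=1e-10, max_iter=50):
    """Steffensen's method: one Aitken step per two fixed-point evaluations."""
    for _ in range(max_iter):
        p1 = g(p0)
        p2 = g(p1)
        denom = p2 - 2 * p1 + p0
        if denom == 0:                     # iterates numerically converged
            return p2
        p = p0 - (p1 - p0) ** 2 / denom    # Step 3: Aitken update
        if abs(p - p0) < tol:              # Step 4
            return p
        p0 = p                             # Step 5
    raise RuntimeError("Method failed after max_iter iterations")

root = steffensen(math.cos, 0.5)
```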

2.6 Zeros of Polynomials and Muller’s Method

2.6.1 Polynomial Theorem

(The theorem statements in this subsection were given as figures, which are omitted here.)

2.6.2 Horner’s Method

Background

A more efficient method to calculate $P(x_0)$ and $P'(x_0)$ for a polynomial $P(x)$.

Theorem

Let
$$P(x)=\sum_{i=0}^{n}a_ix^i.$$

  • Construction process for $P(x_0)$ (substitute the formulas one by one to verify):

If $b_n=a_n$ and
$$b_k=a_k+b_{k+1}x_0,\quad k=n-1,n-2,\ldots,0,$$
then $b_0=P(x_0)$.

Moreover, if
$$Q(x)=\sum_{i=1}^{n}b_ix^{i-1},$$
then
$$P(x)=(x-x_0)Q(x)+b_0.$$

  • Construction process for $P'(x_0)$ (substitute the formulas one by one to verify):

$$P(x)=(x-x_0)Q(x)+b_0, \qquad P'(x)=Q(x)+(x-x_0)Q'(x), \qquad P'(x_0)=Q(x_0).$$

Let $Q(x)=\sum_{i=1}^{n}b_ix^{i-1}=(x-x_0)R(x)+c_1$, where $R(x)=\sum_{i=2}^{n}c_ix^{i-2}$. Then
$$c_n=b_n, \qquad c_k=b_k+c_{k+1}x_0,\quad k=n-1,\ldots,1, \qquad Q(x_0)=c_1=P'(x_0).$$

Pseudo-Code

To compute the values $P(x_0)$ and $P'(x_0)$ for a polynomial $P(x)=\sum_{i=0}^{n}a_ix^i$:

  • INPUT: degree $n$; coefficients $a_0,a_1,\ldots,a_n$ of the polynomial $P(x)$; point $x_0$.

  • OUTPUT: values of $P(x_0)$ and $P'(x_0)$.

  • Step 1: Set $y = a_n$ ($b_n$ for $Q$), $z=0$ ($c_{n+1}$ for $R$).

  • Step 2: For $j=n-1,n-2,\ldots,0$, set

    • $z=y+zx_0$ ($c_{j+1}$ for $R$),
    • $y =a_j+yx_0$ ($b_j$ for $Q$).
  • Step 3: OUTPUT $y=P(x_0)$ and $z=P'(x_0)$.
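Horner’s scheme in Python; the check value uses $P(x)=2x^4-3x^2+3x-4$ at $x_0=-2$, a standard textbook example chosen here for illustration:

```python
def horner(coeffs, x0):
    """Evaluate P(x0) and P'(x0) from coefficients [a_n, ..., a_1, a_0]."""
    y = coeffs[0]           # b_n
    z = 0.0                 # c_{n+1}
    for a in coeffs[1:]:
        z = y + z * x0      # c_{j+1} for R (uses the old y = b_{j+1})
        y = a + y * x0      # b_j for Q
    return y, z             # y = P(x0), z = P'(x0)

value, deriv = horner([2, 0, -3, 3, -4], -2.0)
```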

2.6.3 Deflation Method

Newton’s method combined with Horner’s method

At each Newton step, Horner’s method supplies both $P(x_k)$ and $P'(x_k)$ in a single pass, so the update $x_{k+1}=x_k-\frac{P(x_k)}{P'(x_k)}$ costs only one synthetic-division sweep per iteration.

Deflation Method

Suppose that $x_N$, the $N$th iterate of the Newton-Raphson procedure, is an approximate zero of $P(x)$; then
$$P(x)=(x-x_N)Q(x)+b_0=(x-x_N)Q(x)+P(x_N)\approx (x-x_N)Q(x).$$

Let $\hat{x}_1=x_N$ be the approximate zero of $P$ and $Q_1(x)=Q(x)$ the approximate factor, so that
$$P(x)\approx (x-\hat{x}_1)Q_1(x).$$

To find a second approximate zero of $P(x)$, apply the same procedure to $Q_1(x)$, giving $Q_1(x)\approx(x-\hat{x}_2)Q_2(x)$, where $Q_2(x)$ is a polynomial of degree $n-2$. Thus $P(x)\approx (x-\hat{x}_1)Q_1(x)\approx (x-\hat{x}_1)(x-\hat{x}_2)Q_2(x)$.

Repeat this procedure until reaching $Q_{n-2}(x)$, a quadratic polynomial that can be solved by the quadratic formula. In this way we obtain all approximate zeros of $P(x)$. This technique is called the deflation method.
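One deflation step is just the synthetic division from Horner’s method: the $b$ coefficients (minus the remainder $b_0$) are the quotient $Q(x)$. A sketch, using the illustrative cubic $x^3-6x^2+11x-6=(x-1)(x-2)(x-3)$ deflated by the root $x=1$:

```python
def deflate(coeffs, r):
    """Divide P (coefficients [a_n, ..., a_0]) by (x - r); return the quotient."""
    q = [coeffs[0]]                 # b_n = a_n
    for a in coeffs[1:-1]:
        q.append(a + q[-1] * r)     # b_k = a_k + b_{k+1} * r
    return q                        # remainder b_0 ≈ P(r) is dropped

q = deflate([1, -6, 11, -6], 1.0)   # quotient should be x^2 - 5x + 6
```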

2.6.4 Muller’s Algorithm

Muller’s method generalizes the secant method: it fits the quadratic polynomial through the three points $(p_0,f(p_0))$, $(p_1,f(p_1))$, $(p_2,f(p_2))$ and takes the root of that parabola closest to $p_2$ as the next iterate $p_3$. Unlike Newton’s or the secant method, it can also approximate complex roots of polynomials. (The detailed derivation was given as figures, which are omitted here.)