Since $p_n = \frac{a_n + b_n}{2}$ and $p \in (a_n, p_n]$ or $p \in [p_n, b_n)$ for all $n \ge 1$, it follows that
$$|p_n - p| \le \frac{1}{2}|b_n - a_n| = \frac{b-a}{2^n}.$$
Convergence Rate
Since
$$|p_n - p| \le \frac{1}{2}|b_n - a_n| = \frac{b-a}{2^n},$$
the sequence $\{p_n\}_{n=1}^\infty$ converges to $p$ with rate of convergence $O\!\left(\frac{1}{2^n}\right)$, that is,
$$p_n = p + O\!\left(\frac{1}{2^n}\right).$$
Other Properties
Bisection is certain to converge, but does so slowly.
Given a starting interval $[a,b]$, the length of the interval after $k$ iterations is $\frac{b-a}{2^k}$, so achieving an error tolerance of $\varepsilon$ (that is, $\frac{b-a}{2^k} < \varepsilon$) requires $k \approx \left\lceil \log_2 \frac{b-a}{\varepsilon} \right\rceil$ iterations, regardless of the function $f$ involved.
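A minimal Python sketch of the bisection procedure described above (the function name `bisect` and the test function are illustrative, not from the notes):

```python
import math

def bisect(f, a, b, tol=1e-8):
    """Bisection: repeatedly halve [a, b] until its length is below tol."""
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    # k iterations suffice once (b - a) / 2**k < tol
    k = math.ceil(math.log2((b - a) / tol))
    for _ in range(k):
        p = (a + b) / 2        # p_n = (a_n + b_n) / 2
        if f(a) * f(p) <= 0:
            b = p              # root lies in (a_n, p_n]
        else:
            a = p              # root lies in [p_n, b_n)
    return (a + b) / 2

# Example: f(x) = x^2 - x - 2 has the root 2 in [1, 4]
print(bisect(lambda x: x**2 - x - 2, 1, 4))
```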
2.2 Fixed Point Method
2.2.1 Introduction
A fixed point of a given function $g:\mathbb{R}\to\mathbb{R}$ is a value $x^*$ such that $x^* = g(x^*)$.
Many iterative methods for solving nonlinear equations use a fixed-point iteration scheme of the form
$$x_{k+1} = g(x_k),$$
where the fixed points of $g$ are the solutions of $f(x) = 0$.
Example
If $f(x) = x^2 - x - 2$, it has two roots, $x^* = 2$ and $x^* = -1$. Then the fixed points of each of the functions
$$g(x) = x^2 - 2, \qquad g(x) = \sqrt{x + 2}, \qquad g(x) = 1 + \frac{2}{x}, \qquad g(x) = \frac{x^2 + 2}{2x - 1}$$
are solutions to the equation $f(x) = 0$.
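As a quick numerical check (an illustrative script, not part of the original notes), each of the four functions fixes $x^* = 2$:

```python
import math

gs = [
    lambda x: x**2 - 2,                  # g(x) = x^2 - 2
    lambda x: math.sqrt(x + 2),          # g(x) = sqrt(x + 2)
    lambda x: 1 + 2 / x,                 # g(x) = 1 + 2/x
    lambda x: (x**2 + 2) / (2 * x - 1),  # g(x) = (x^2 + 2) / (2x - 1)
]
for g in gs:
    print(g(2.0))  # each prints 2.0, so x* = 2 is a fixed point of every g
```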
2.2.2 Algorithm
Definition
To approximate the fixed point of a function $g(x)$, we choose an initial approximation $p_0$ and generate the sequence $\{p_n\}_{n=0}^\infty$ by letting
$$\begin{cases} \text{Given } p_0, \\ p_n = g(p_{n-1}), & n = 1, 2, \ldots \end{cases}$$
If the sequence $\{p_n\}_{n=0}^\infty$ converges to $p$ and $g(x)$ is continuous, then we have
$$p = \lim_{n\to\infty} p_n = \lim_{n\to\infty} g(p_{n-1}) = g\!\left(\lim_{n\to\infty} p_{n-1}\right) = g(p),$$
and a solution to $x = g(x)$ is obtained.
The technique described above is called fixed-point iteration (or functional iteration).
Pseudo-Code
INPUT: initial approximation $p_0$; tolerance $TOL$; maximum number of iterations $N$.
OUTPUT: approximate solution $p$ or a message of failure.
Step 1: Set $n = 1$.
Step 2: While $n \le N$, do Steps 3~5.
Step 3: Set $p = g(p_0)$.
Step 4: If $|p - p_0| < TOL$, then OUTPUT $p$ (procedure completed successfully); STOP!
Step 5: Set $n = n + 1$, $p_0 = p$.
Step 6: OUTPUT "Method failed after $N$ iterations." STOP!
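A direct Python transcription of this pseudo-code (a sketch; the name `fixed_point` and the sample iteration are illustrative):

```python
import math

def fixed_point(g, p0, tol=1e-8, N=100):
    """Fixed-point iteration p_n = g(p_{n-1}), following the pseudo-code."""
    for _ in range(N):
        p = g(p0)                 # Step 3
        if abs(p - p0) < tol:     # Step 4
            return p
        p0 = p                    # Step 5
    raise RuntimeError(f"Method failed after {N} iterations")

# g(x) = sqrt(x + 2) has the fixed point 2, a root of x^2 - x - 2
print(fixed_point(lambda x: math.sqrt(x + 2), 1.0))
```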
2.2.3 Existence and Uniqueness
Theorem: Sufficient Conditions for the Existence and Uniqueness of a Fixed Point
【Existence】If $g \in C[a,b]$ and $g(x) \in [a,b]$ for all $x \in [a,b]$, then $g$ has a fixed point in $[a,b]$. (Here $g(x) \in [a,b]$ means $\max g(x) \le b$ and $\min g(x) \ge a$.)
【Uniqueness】If, in addition, $g'(x)$ exists on $(a,b)$ and a positive constant $k < 1$ exists with $|g'(x)| \le k$ for all $x \in (a,b)$, then the fixed point in $[a,b]$ is unique.
Proof for Existence
If g(a)=a or g(b)=b, then g(x) has a fixed point at an endpoint.
Suppose not; then it must be that $g(a) > a$ and $g(b) < b$.
The function $h(x) = g(x) - x$ is continuous on $[a,b]$, with $h(a) = g(a) - a > 0$ and $h(b) = g(b) - b < 0$.
The Intermediate Value Theorem then implies that there exists $p \in (a,b)$ with $h(p) = 0$.
Thus $g(p) - p = 0$, and $p$ is a fixed point of $g$.
The proof above converts the fixed-point problem $g(x) = x$ into the root-existence problem $h(x) = g(x) - x = 0$.
Proof for Uniqueness
The proof is by contradiction.
Suppose, in addition, that $|g'(x)| \le k < 1$ and that $p$ and $q$ are both fixed points in $[a,b]$ with $p \ne q$.
By the Mean Value Theorem, a number $\xi$ exists between $p$ and $q$, hence in $[a,b]$, with
$$\frac{g(p) - g(q)}{p - q} = g'(\xi).$$
Then
$$|p - q| = |g(p) - g(q)| = |g'(\xi)|\,|p - q| \le k\,|p - q| < |p - q|,$$
which is a contradiction.
So p=q, and the fixed point in [a,b] is unique.
2.2.4 Convergence Analysis
Theorem
Under the two conditions above, for any number $p_0 \in [a,b]$, the sequence $\{p_n\}_{n=0}^\infty$ defined by $p_n = g(p_{n-1})$, $n \ge 1$,
converges to the unique fixed point $p$ in $[a,b]$.
Proof
Using the Intermediate Value Theorem, we can prove the existence.
Using the fact that $|g'(x)| \le k$ and the Mean Value Theorem, we have
$$|p_n - p| = |g(p_{n-1}) - g(p)| = |g'(\xi)|\,|p_{n-1} - p| \le k\,|p_{n-1} - p|,$$
where $\xi \in (a,b)$.
Applying this inequality inductively shows
$$|p_n - p| \le k\,|p_{n-1} - p| \le k^2\,|p_{n-2} - p| \le \cdots \le k^n\,|p_0 - p|.$$
Since $k < 1$,
$$\lim_{n\to\infty} |p_n - p| \le \lim_{n\to\infty} k^n\,|p_0 - p| = 0,$$
and $\{p_n\}_{n=0}^\infty$ converges to $p$.
Convergence Rate
Since
$$|p_n - p| \le k^n\,|p_0 - p| \le k^n\,|b - a|,$$
the sequence $\{p_n\}_{n=0}^\infty$ converges to $p$ with rate of convergence $O(k^n)$, $k < 1$; that is,
$$p_n = p + O(k^n), \quad k < 1.$$
2.2.5 Bounds for the Error
Corollary
If the solution of $g(x) = x$ exists and is unique, then the bounds for the error involved in using $p_n$ to approximate $p$ are given by
$$|p_n - p| \le k^n \max\{p_0 - a,\; b - p_0\}$$
and
$$|p_n - p| \le \frac{k^n}{1 - k}\,|p_1 - p_0|,$$
for all $n \ge 1$.
For the second bound, let $m > n$; using the triangle inequality $|a + b| \le |a| + |b|$, we have
$$|p_m - p_n| \le |p_m - p_{m-1}| + |p_{m-1} - p_{m-2}| + \cdots + |p_{n+1} - p_n| \le (k^{m-1} + k^{m-2} + \cdots + k^n)\,|p_1 - p_0|,$$
so
$$|p_m - p_n| \le k^n (1 + k + \cdots + k^{m-n-1})\,|p_1 - p_0| = k^n\,\frac{1 - k^{m-n}}{1 - k}\,|p_1 - p_0|.$$
Letting $m \to \infty$, we have
$$\lim_{m\to\infty} |p_m - p_n| = |p - p_n| \le \frac{k^n}{1 - k}\,|p_1 - p_0|.$$
2.3 Newton’s Method
2.3.1 Introduction
Status
The Newton-Raphson (or simply Newton’s) method is one of the most powerful and well-known numerical methods for solving a root-finding problem f(x)=0.
Rough Description
(1) Suppose that $f \in C^2[a,b]$, and $x^*$ is a solution of $f(x) = 0$.
(2) Let $\hat{x} \in [a,b]$ be an approximation to $x^*$ such that $f'(\hat{x}) \ne 0$ and $|\hat{x} - x^*|$ is "small".
Consider the first Taylor polynomial for $f(x)$ expanded about $\hat{x}$:
$$f(x) = f(\hat{x}) + (x - \hat{x})\,f'(\hat{x}) + \frac{(x - \hat{x})^2}{2}\,f''(\xi(x)),$$
where $\xi(x)$ lies between $x$ and $\hat{x}$.
Setting $x = x^*$ and using $f(x^*) = 0$ (the quadratic term is negligible because $|x^* - \hat{x}|$ is small) gives
$$0 = f(x^*) \approx f(\hat{x}) + (x^* - \hat{x})\,f'(\hat{x}).$$
Solving for $x^*$ gives
$$x^* \approx \hat{x} - \frac{f(\hat{x})}{f'(\hat{x})}.$$
2.3.2 Algorithm
Definition
Newton's method starts with an initial approximation $x_0$.
The iteration scheme is defined by
$$x_n = x_{n-1} - \frac{f(x_{n-1})}{f'(x_{n-1})}, \quad \forall\, n \ge 1.$$
This scheme generates the sequence $\{x_n\}_{n=0}^\infty$.
Pseudo-Code
The function $f$ is differentiable and $p_0$ is an initial approximation.
INPUT: initial approximation $p_0$; tolerance $TOL$; maximum number of iterations $N$.
OUTPUT: approximate solution $p$ or a message of failure.
Step 1: Set $n = 1$.
Step 2: While $n \le N$, do Steps 3~5.
Step 3: Set $p = p_0 - \frac{f(p_0)}{f'(p_0)}$.
Step 4: If $|p - p_0| < TOL$, then OUTPUT $p$ (procedure completed successfully); STOP!
Step 5: Set $n = n + 1$, $p_0 = p$.
Step 6: OUTPUT "Method failed after $N$ iterations." STOP!
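The pseudo-code translates directly to Python (a sketch; the name `newton` and the example function are illustrative):

```python
def newton(f, df, p0, tol=1e-8, N=50):
    """Newton's method p = p0 - f(p0)/f'(p0), following the pseudo-code."""
    for _ in range(N):
        p = p0 - f(p0) / df(p0)   # Step 3
        if abs(p - p0) < tol:     # Step 4
            return p
        p0 = p                    # Step 5
    raise RuntimeError(f"Method failed after {N} iterations")

# Root of f(x) = x^2 - x - 2 starting from p0 = 3, with f'(x) = 2x - 1
print(newton(lambda x: x**2 - x - 2, lambda x: 2 * x - 1, 3.0))
```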
2.3.3 Convergence
Theorem
(1) $f \in C^2[a,b]$.
(2) $p \in [a,b]$ is such that $f(p) = 0$ and $f'(p) \ne 0$.
Then there exists a $\delta > 0$ such that Newton's method generates a sequence $\{p_n\}_{n=1}^\infty$ converging to $p$ for any initial approximation $p_0 \in [p - \delta, p + \delta]$.
Proof
Newton's method is fixed-point iteration $p_n = g(p_{n-1})$ with
$$g(x) = x - \frac{f(x)}{f'(x)}.$$
What we need to prove: following the proof for the fixed-point method, we need to find an interval $[p - \delta, p + \delta]$ that $g$ maps into itself, with $|g'(x)| \le k < 1$ for all $x \in [p - \delta, p + \delta]$ (existence and uniqueness).
Proving Process:
(1) $\exists\, \delta_1 > 0$ such that $g \in C[p - \delta_1, p + \delta_1]$.
Since $f'(p) \ne 0$ and $f'$ is continuous, there exists $\delta_1 > 0$ such that $f'(x) \ne 0$ for all $x \in [p - \delta_1, p + \delta_1] \subseteq [a,b]$. Thus $g$ is defined and continuous on $[p - \delta_1, p + \delta_1]$.
(2) $g' \in C[p - \delta_1, p + \delta_1]$.
$$g'(x) = \frac{f(x)\,f''(x)}{(f'(x))^2} \quad \text{for all } x \in [p - \delta_1, p + \delta_1].$$
Since $f \in C^2[a,b]$, we have $g' \in C[p - \delta_1, p + \delta_1]$.
(3) $|g'(x)| \le k < 1$.
Since $f(p) = 0$, we get $g'(p) = 0$. Because $g' \in C[p - \delta_1, p + \delta_1]$, there exists a $\delta$ with $0 < \delta < \delta_1$ and
$$|g'(x)| \le k < 1, \quad \forall\, x \in [p - \delta, p + \delta].$$
(4) $g$ maps $[p - \delta, p + \delta]$ into itself.
By the Mean Value Theorem, if $x \in [p - \delta, p + \delta]$, there exists $\xi$ between $x$ and $p$ with $|g(x) - g(p)| = |g'(\xi)|\,|x - p|$. Since $g(p) = p$,
$$|g(x) - p| = |g(x) - g(p)| = |g'(\xi)|\,|x - p| \le k\,|x - p| < |x - p| \le \delta.$$
Thus $g: [p - \delta, p + \delta] \mapsto [p - \delta, p + \delta]$.
According to the proving process above, all the hypotheses of the Fixed-Point Theorem are satisfied for $g(x) = x - \frac{f(x)}{f'(x)}$. Therefore, the sequence $\{p_n\}_{n=1}^\infty$ defined by
$$p_n = g(p_{n-1}), \quad \forall\, n \ge 1,$$
converges to $p$ for any $p_0 \in [p - \delta, p + \delta]$.
2.3.4 Secant Method
Background
For Newton's method, each iteration requires evaluation of both the function $f(x_k)$ and its derivative $f'(x_k)$, which may be inconvenient or expensive.
Improvement
The derivative is approximated by a finite difference using two successive iterates, so the iteration becomes
$$x_{k+1} = x_k - f(x_k)\,\frac{x_k - x_{k-1}}{f(x_k) - f(x_{k-1})}.$$
This method is known as the secant method.
Pseudo-Code
INPUT: initial approximations $p_0, p_1$; tolerance $TOL$; maximum number of iterations $N$.
OUTPUT: approximate solution $p$ or a message of failure.
Step 1: Set $n = 2$, $q_0 = f(p_0)$, $q_1 = f(p_1)$.
Step 2: While $n \le N$, do Steps 3~5.
Step 3: Set $p = p_1 - q_1\,\frac{p_1 - p_0}{q_1 - q_0}$. (Compute $p_n$.)
Step 4: If $|p - p_1| < TOL$, then OUTPUT $p$ (procedure completed successfully); STOP!
Step 5: Set $n = n + 1$, $p_0 = p_1$, $p_1 = p$, $q_0 = q_1$, $q_1 = f(p)$.
Step 6: OUTPUT "Method failed after $N$ iterations." STOP!
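A Python sketch of the same steps (names illustrative):

```python
def secant(f, p0, p1, tol=1e-8, N=50):
    """Secant method: the derivative is replaced by a finite difference."""
    q0, q1 = f(p0), f(p1)                     # Step 1
    for _ in range(N):
        p = p1 - q1 * (p1 - p0) / (q1 - q0)   # Step 3
        if abs(p - p1) < tol:                 # Step 4
            return p
        p0, q0 = p1, q1                       # Step 5: shift the iterates;
        p1, q1 = p, f(p)                      # only one new f-evaluation
    raise RuntimeError(f"Method failed after {N} iterations")

print(secant(lambda x: x**2 - x - 2, 1.0, 3.0))  # converges to the root 2
```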
2.3.5 False Position Method
Definition
The goal is to find a solution of $f(x) = 0$ for a given continuous function $f$ on the interval $[p_0, p_1]$, where $f(p_0)$ and $f(p_1)$ have opposite signs: $f(p_0) \cdot f(p_1) < 0$.
The approximation $p_2$ is chosen in the same manner as in the secant method, as the x-intercept of the line joining $(p_0, f(p_0))$ and $(p_1, f(p_1))$.
To decide which secant line to use to compute $p_3$, we check the sign of $f(p_2) \cdot f(p_1)$ (or $f(p_2) \cdot f(p_0)$).
If $f(p_2) \cdot f(p_1)$ is negative, the root lies between $p_2$ and $p_1$, so we choose $p_3$ as the x-intercept of the line joining $(p_1, f(p_1))$ and $(p_2, f(p_2))$; otherwise we use the line joining $(p_0, f(p_0))$ and $(p_2, f(p_2))$.
In a similar manner, we obtain a sequence $\{p_n\}_{n=2}^\infty$ that approximates the root.
Pseudo-Code
INPUT: initial approximations $p_0, p_1$; tolerance $TOL$; maximum number of iterations $N$.
OUTPUT: approximate solution $p$ or a message of failure.
Step 1: Set $n = 2$, $q_0 = f(p_0)$, $q_1 = f(p_1)$.
Step 2: While $n \le N$, do Steps 3~6.
Step 3: Set $p = p_1 - q_1\,\frac{p_1 - p_0}{q_1 - q_0}$. (Compute $p_n$.)
Step 4: If $|p - p_1| < TOL$, then OUTPUT $p$ (procedure completed successfully); STOP!
Step 5: Set $n = n + 1$, $q = f(p)$.
Step 6: If $q \cdot q_1 < 0$, set $p_0 = p_1$, $q_0 = q_1$. Then set $p_1 = p$, $q_1 = q$ (so the bracket always keeps a sign change).
Step 7: OUTPUT "Method failed after $N$ iterations." STOP!
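A Python sketch following Steps 3~6 above (names illustrative):

```python
def false_position(f, p0, p1, tol=1e-8, N=100):
    """Secant-style steps that always keep the root bracketed in [p0, p1]."""
    q0, q1 = f(p0), f(p1)
    for _ in range(N):
        p = p1 - q1 * (p1 - p0) / (q1 - q0)   # x-intercept of the secant line
        if abs(p - p1) < tol:
            return p
        q = f(p)
        if q * q1 < 0:        # sign change between p and p1: keep p1 as p0
            p0, q0 = p1, q1
        p1, q1 = p, q         # p always becomes the new p1
    raise RuntimeError(f"Method failed after {N} iterations")

print(false_position(lambda x: x**2 - x - 2, 1.0, 3.0))  # converges to 2
```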
2.4 Error Analysis for Iteration Methods
2.4.1 The Rate of Sequence Convergence
Definition
Suppose $\{p_n\}_{n=0}^\infty$ is a sequence that converges to $p$, with $p_n \ne p$ for all $n$.
If positive constants $\lambda$ and $\alpha$ exist with
$$\lim_{n\to\infty} \frac{|p_{n+1} - p|}{|p_n - p|^\alpha} = \lambda,$$
then $\{p_n\}_{n=0}^\infty$ converges to $p$ of order $\alpha$, with asymptotic error constant $\lambda$.
Properties
A sequence with a high order of convergence converges more rapidly than a sequence with a lower order.
The asymptotic constant affects the speed of convergence but is not as important as the order.
Example
If α=1, the sequence is linearly convergent.
If α=2, the sequence is quadratically convergent.
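A small numerical illustration of the two orders (an illustrative script; the sequences are synthetic):

```python
# p_n -> 0 linearly (lambda = 0.5) and quadratically (lambda = 1)
linear = [0.5**n for n in range(1, 10)]   # p_{n+1} = 0.5 * p_n
quad = [0.5]
for _ in range(4):
    quad.append(quad[-1] ** 2)            # p_{n+1} = p_n ** 2

# the ratios |p_{n+1} - p| / |p_n - p|**alpha approach lambda
print([linear[i + 1] / linear[i] for i in range(8)])    # all 0.5
print([quad[i + 1] / quad[i] ** 2 for i in range(4)])   # all 1.0
```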
Summary
The Mean Value Theorem is used to prove linear convergence, and Taylor's Theorem is used to prove quadratic convergence when $g'(p) = 0$.
2.4.2 Convergent Order of Fixed-Point Iteration (Improved)
Convergence Order of Fixed-Point Iteration
(1) $g \in C[a,b]$ and $g(x) \in [a,b]$ for all $x \in [a,b]$.
(2) $g'(x)$ is continuous on $(a,b)$ and a positive constant $k < 1$ exists with $|g'(x)| \le k$ for all $x \in (a,b)$.
If $g'(p) \ne 0$, then for any number $p_0$ in $[a,b]$, the sequence $p_n = g(p_{n-1})$, for $n \ge 1$, converges only linearly to the unique fixed point $p$ in $[a,b]$.
Proof
$$p_{n+1} - p = g(p_n) - g(p) = g'(\xi_n)(p_n - p),$$
where $\xi_n$ is between $p_n$ and $p$.
Since $\{p_n\}_{n=0}^\infty$ converges to $p$ and $\xi_n$ is between $p_n$ and $p$, the sequence $\{\xi_n\}_{n=0}^\infty$ also converges to $p$.
Thus,
$$\lim_{n\to\infty} \frac{|p_{n+1} - p|}{|p_n - p|} = \lim_{n\to\infty} |g'(\xi_n)| = |g'(p)|,$$
so fixed-point iteration exhibits linear convergence with asymptotic error constant $|g'(p)|$ whenever $g'(p) \ne 0$. This also implies that higher-order convergence for fixed-point methods can occur only when $g'(p) = 0$.
Quadratic Convergence
Let p be a solution of the equation x=g(x).
(1) g′(p)=0
(2) g′′ is continuous and strictly bounded by M on an open interval I containing p.
Then there exists a $\delta > 0$ such that, for $p_0 \in [p - \delta, p + \delta]$, the sequence defined by $p_n = g(p_{n-1})$, $n \ge 1$, converges at least quadratically to $p$.
Moreover, for sufficiently large values of $n$,
$$|p_{n+1} - p| < \frac{M}{2}\,|p_n - p|^2.$$
Proof
Due to the two conditions described above, expanding $g$ about $p$ gives
$$g(x) = g(p) + g'(p)(x - p) + \frac{g''(\xi)}{2}(x - p)^2 = p + \frac{g''(\xi)}{2}(x - p)^2,$$
where $\xi$ lies between $x$ and $p$. Setting $x = p_n$ yields $p_{n+1} - p = \frac{g''(\xi_n)}{2}(p_n - p)^2$, and the bound $|g''| < M$ gives the stated inequality.
Application to multiple roots: if $p$ is a root of $f$ of multiplicity greater than one, Newton's method converges only linearly. Define $u(x) = \frac{f(x)}{f'(x)}$; then $p$ is a simple root of $u(x)$. Applying Newton's method to $u$ changes the iteration function into
$$g(x) = x - \frac{u(x)}{u'(x)} = x - \frac{f(x)\,f'(x)}{[f'(x)]^2 - f(x)\,f''(x)},$$
whose convergence rate is also quadratic.
2.4.5 Convergence Rate of the Secant Method
The convergence rate of the secant method is normally superlinear, with order $r = \frac{1 + \sqrt{5}}{2} \approx 1.618$, which is lower than that of Newton's method.
The secant method uses the two most recent function values per iteration, so only one new function evaluation is needed and there is no requirement to evaluate the derivative.
Its disadvantage is that it needs two starting guesses that are close enough to the solution in order to converge.
2.5 Accelerating Convergence
2.5.1 Aitken’s method
Background
Aitken's method accelerates the convergence of any linearly convergent sequence, regardless of its origin or application.
Suppose $\{p_n\}$ converges linearly to $p$, so that
$$\lim_{n\to\infty} \frac{p_{n+1} - p}{p_n - p} = \lambda, \quad \lambda \ne 0.$$
Thus, when $n$ is sufficiently large,
$$\frac{p_{n+1} - p}{p_n - p} \approx \frac{p_{n+2} - p}{p_{n+1} - p}.$$
Solving for $p$ gives
$$p \approx \frac{p_n\,p_{n+2} - p_{n+1}^2}{p_{n+2} - 2p_{n+1} + p_n} = p_n - \frac{(p_{n+1} - p_n)^2}{p_{n+2} - 2p_{n+1} + p_n}.$$
Aitken's $\Delta^2$ method defines a new sequence $\{\hat{p}_n\}_{n=0}^\infty$ by
$$\hat{p}_n = p_n - \frac{(p_{n+1} - p_n)^2}{p_{n+2} - 2p_{n+1} + p_n},$$
whose convergence rate is faster than that of the original sequence $\{p_n\}_{n=0}^\infty$.
Definition
Given the sequence $\{p_n\}_{n=0}^\infty$, the forward difference $\Delta p_n$ is defined by
$$\Delta p_n = p_{n+1} - p_n, \quad n \ge 0.$$
Higher powers $\Delta^k p_n$ are defined recursively by
$$\Delta^k p_n = \Delta(\Delta^{k-1} p_n), \quad k \ge 2.$$
For example,
$$\Delta^2 p_n = \Delta(\Delta p_n) = \Delta(p_{n+1} - p_n) = \Delta p_{n+1} - \Delta p_n = p_{n+2} - 2p_{n+1} + p_n.$$
Thus,
$$\hat{p}_n = p_n - \frac{(\Delta p_n)^2}{\Delta^2 p_n}.$$
Result: the sequence $\{\hat{p}_n\}_{n=0}^\infty$ converges to $p$ faster than $\{p_n\}_{n=0}^\infty$ in the sense that
$$\lim_{n\to\infty} \frac{\hat{p}_n - p}{p_n - p} = 0.$$
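A minimal sketch of Aitken's $\Delta^2$ acceleration applied to a linearly convergent fixed-point sequence (the helper name `aitken` and the sample sequence are illustrative):

```python
import math

def aitken(seq):
    """hat p_n = p_n - (Delta p_n)^2 / (Delta^2 p_n) for each valid n."""
    return [
        seq[n] - (seq[n + 1] - seq[n]) ** 2
        / (seq[n + 2] - 2 * seq[n + 1] + seq[n])
        for n in range(len(seq) - 2)
    ]

# Linearly convergent fixed-point sequence p_n = sqrt(p_{n-1} + 2) -> 2
p, seq = 1.0, []
for _ in range(8):
    seq.append(p)
    p = math.sqrt(p + 2)
print(seq[4], aitken(seq)[4])  # the accelerated term is much closer to 2
```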
2.5.2 Steffensen’s method
Definition
Given the fixed-point problem $p = g(p)$ and an initial approximation $p_0$, compute $p_1 = g(p_0)$ and $p_2 = g(p_1)$, then apply Aitken's $\Delta^2$ formula:
$$\hat{p}_0 = p_0 - \frac{(\Delta p_0)^2}{\Delta^2 p_0} = p_0 - \frac{(p_1 - p_0)^2}{p_2 - 2p_1 + p_0}.$$
Assuming that $\hat{p}_0$ is a better approximation than $p_2$, we continue the fixed-point iteration from $\hat{p}_0$ instead of $p_2$; the computation proceeds as in the pseudo-code below.
Theorem
Suppose that $x = g(x)$ has the solution $p$ with $g'(p) \ne 1$.
If there exists a $\delta > 0$ such that $g \in C^3[p - \delta, p + \delta]$,
then Steffensen's method gives quadratic convergence for any $p_0 \in [p - \delta, p + \delta]$.
Pseudo-Code
INPUT: initial approximation $p_0$; tolerance $TOL$; maximum number of iterations $N$.
OUTPUT: approximate solution $p$ or a message of failure.
Step 1: Set $n = 1$.
Step 2: While $n \le N$, do Steps 3~5.
Step 3: Set $p_1 = g(p_0)$, $p_2 = g(p_1)$, $p = p_0 - \frac{(p_1 - p_0)^2}{p_2 - 2p_1 + p_0}$.
Step 4: If $|p - p_0| < TOL$, then OUTPUT $p$ (procedure completed successfully); STOP!
Step 5: Set $n = n + 1$, $p_0 = p$.
Step 6: OUTPUT "Method failed after $N$ iterations." STOP!
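A Python sketch of the pseudo-code (names illustrative):

```python
import math

def steffensen(g, p0, tol=1e-8, N=50):
    """Steffensen's method: one Aitken step per iteration, then restart."""
    for _ in range(N):
        p1 = g(p0)
        p2 = g(p1)
        p = p0 - (p1 - p0) ** 2 / (p2 - 2 * p1 + p0)   # Step 3
        if abs(p - p0) < tol:                          # Step 4
            return p
        p0 = p                                         # Step 5
    raise RuntimeError(f"Method failed after {N} iterations")

# Quadratic convergence to the fixed point 2 of g(x) = sqrt(x + 2)
print(steffensen(lambda x: math.sqrt(x + 2), 1.0))
```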
2.6 Zeros of Polynomials and Muller’s Method
2.6.1 Polynomial Theorem
2.6.2 Horner’s Method
Background
Horner's method is a more efficient way to evaluate $P(x_0)$ and $P'(x_0)$ for a polynomial $P(x)$.
Theorem
Let $P(x) = \sum_{i=0}^{n} a_i x^i$.
Construction process for $P(x_0)$ (substitute the formulas one by one to verify):
If $b_n = a_n$ and $b_k = a_k + b_{k+1} x_0$ for $k = n-1, n-2, \ldots, 0$,
then $b_0 = P(x_0)$.
Moreover, if
$$Q(x) = \sum_{i=1}^{n} b_i x^{i-1},$$
then
$$P(x) = (x - x_0)\,Q(x) + b_0.$$
Construction process for $P'(x_0)$ (substitute the formulas one by one to verify):
Differentiating $P(x) = (x - x_0)Q(x) + b_0$ gives $P'(x) = Q(x) + (x - x_0)Q'(x)$, so $P'(x_0) = Q(x_0)$.
Let $Q(x) = \sum_{i=1}^{n} b_i x^{i-1} = (x - x_0)R(x) + c_1$, where $R(x) = \sum_{i=2}^{n} c_i x^{i-2}$. Then
$$c_n = b_n, \quad c_k = b_k + c_{k+1} x_0 \ \ (k = n-1, \ldots, 1), \quad Q(x_0) = c_1 = P'(x_0).$$
Pseudo-Code
To compute the values $P(x_0)$ and $P'(x_0)$ for a polynomial $P(x) = \sum_{i=0}^{n} a_i x^i$.
INPUT: degree $n$; coefficients $a_0, a_1, \ldots, a_n$ of the polynomial $P(x)$; point $x_0$.
OUTPUT: values of $P(x_0)$ and $P'(x_0)$.
Step 1: Set $y = a_n$ ($b_n$ for $Q$), $z = 0$ ($c_{n+1}$ for $R$).
Step 2: For $j = n-1, n-2, \ldots, 0$, set
$z = y + z x_0$ ($c_{j+1}$ for $R$),
$y = a_j + y x_0$ ($b_j$ for $Q$).
Step 3: OUTPUT $y = P(x_0)$ and $z = P'(x_0)$.
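The same algorithm in Python (a sketch; the name `horner` and the sample polynomial are illustrative):

```python
def horner(coeffs, x0):
    """Return P(x0), P'(x0); coeffs = [a_n, ..., a_1, a_0], highest first."""
    y = coeffs[0]          # y = a_n (b_n for Q)
    z = 0.0                # z = 0   (c_{n+1} for R)
    for a in coeffs[1:]:   # j = n-1, n-2, ..., 0
        z = y + z * x0     # c_{j+1} = b_{j+1} + c_{j+2} * x0
        y = a + y * x0     # b_j = a_j + b_{j+1} * x0
    return y, z            # b_0 = P(x0), c_1 = P'(x0)

# P(x) = 2x^4 - 3x^2 + 3x - 4 at x0 = -2: P(-2) = 10, P'(-2) = -49
print(horner([2, 0, -3, 3, -4], -2))
```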
2.6.3 Deflation Method
Newton's method combined with Horner's method.
Deflation Method
Suppose that $x_N$, obtained in the $N$th iteration of the Newton-Raphson procedure, is an approximate zero of $P(x)$. Then
$$P(x) = (x - x_N)Q(x) + b_0 = (x - x_N)Q(x) + P(x_N) \approx (x - x_N)Q(x).$$
Let $\hat{x}_1 = x_N$ be the approximate zero of $P$, and $Q_1(x) = Q(x)$ the approximate factor; then we have $P(x) \approx (x - \hat{x}_1)\,Q_1(x)$.
To find a second approximate zero of $P(x)$, we can apply the same procedure to $Q_1(x)$, giving $Q_1(x) \approx (x - \hat{x}_2)\,Q_2(x)$, where $Q_2(x)$ is a polynomial of degree $n - 2$. Thus $P(x) \approx (x - \hat{x}_1)\,Q_1(x) \approx (x - \hat{x}_1)(x - \hat{x}_2)\,Q_2(x)$.
Repeating this procedure until we reach $Q_{n-2}(x)$, which is a quadratic polynomial that can be solved by the quadratic formula, we obtain all approximate zeros of $P(x)$. This method is called the deflation method.
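A compact sketch of the deflation procedure (the example polynomial and all names are illustrative; a real implementation must also guard against inaccurate roots polluting the later factors):

```python
def poly_newton(coeffs, p0, tol=1e-10, N=100):
    """Newton's method on a polynomial, evaluating P and P' by Horner's rule."""
    for _ in range(N):
        y, z = coeffs[0], 0.0
        for a in coeffs[1:]:
            z = y + z * p0            # accumulates P'(p0)
            y = a + y * p0            # accumulates P(p0)
        p = p0 - y / z
        if abs(p - p0) < tol:
            return p
        p0 = p
    return p0

def deflate(coeffs, r):
    """Synthetic division: P(x) = (x - r) Q(x) + b0; return Q's coefficients."""
    q = [coeffs[0]]                   # b_n = a_n
    for a in coeffs[1:-1]:
        q.append(a + q[-1] * r)       # b_k = a_k + b_{k+1} * r
    return q

# All roots of P(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
coeffs, roots = [1.0, -6.0, 11.0, -6.0], []
while len(coeffs) > 3:                # deflate until a quadratic remains
    r = poly_newton(coeffs, 0.0)
    roots.append(r)
    coeffs = deflate(coeffs, r)
a, b, c = coeffs                      # solve ax^2 + bx + c = 0 directly
d = (b * b - 4 * a * c) ** 0.5
roots += [(-b + d) / (2 * a), (-b - d) / (2 * a)]
print(roots)                          # approximately [1.0, 3.0, 2.0]
```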