The Proof of Hoeffding's Inequality

Hoeffding’s Inequality:

Suppose x1,x2,xNx_1,x_2,……x_N are independent random variables, and xi[ai,bi]x_i∈[a_i, b_i], i=1,2,,Ni=1,2,……,N; Xˉ\bar{X} is the expirical mean of x1,x2,xNx_1,x_2,……x_N, namely, Xˉ=1Ni=1Nxi\bar{X}=\frac{1}{N}\sum_{i=1}^{N}{x_i}, then for any t>0t>0, the following inequality holds:
P[XˉE(Xˉ)t]exp(2N2t2i=1N(biai)2)P[ \bar{X}-E(\bar{X})\geq t ]\leq exp\left(-\frac{2N^2t^2}{\sum_{i=1}^{N}{(b_i-a_i)^2}} \right)
P[E(Xˉ)Xˉt]exp(2N2t2i=1N(biai)2)P[E(\bar{X})- \bar{X}\geq t ]\leq exp\left(-\frac{2N^2t^2}{\sum_{i=1}^{N}{(b_i-a_i)^2}} \right)

Hoeffding’s lemma:

Suppose xx is an random variable, and x[a,b]x∈[a, b], and E(x)=0E(x)=0, then for any t>0t>0,the following inequality holds:
E(etx)expt2(ba)28E(e^{tx})\leq exp\frac{t^2(b-a)^2}{8}

We prove the lemma first:

Obviously, f(x)=etxf(x)=e^{tx} is a convex function, so for any α[0,1]α∈[0,1], we have:
f(αx1+(1α)x2)αf(x1)+(1α)f(x2)f(αx_1+(1-α)x_2)\le αf(x_1)+(1-α)f(x_2)
The Proof of Hoeffding's Inequality
Let α=bxba,x[a,b]α=\frac{b-x}{b-a}, \forall x∈[a,b], then αx1+(1α)x2=xαx_1+(1-α)x_2=x, so we have:
etxbxbaeta+xabaetb,x[a,b]e^{tx} \leq \frac{b-x}{b-a} e^{ta}+\frac{x-a}{b-a} e^{tb}, \forall x \in[a, b]
take expectations on both sides, and because of E(x)=0E(x)=0, there is:
E(etx)bbaetaabaetbE(e^{tx}) \leq \frac{b}{b-a} e^{ta}-\frac{a}{b-a} e^{tb}
Try to simplify it.
Let p=abap=-\frac{a}{b-a}, then right=(1p)et(ba)p+pet(ba)(p1)right=(1-p)e^{-t(b-a)p}+pe^{-t(b-a)(p-1)},
let h=t(ba)h=t(b-a), then right=(1p)ehp+peh(1p)=ehp(1p+peh)right=(1-p)e^{-hp}+pe^{h(1-p)}=e^{-hp}(1-p+pe^h), note the function L(h)=hp+ln(1p+peh),h>0L(h)=-hp+ln(1-p+pe^h),h>0, now we need to prove L(h)t2(ba)28=h28L(h) \le \frac{t^{2}(b-a)^{2}}{8}=\frac{h^{2}}{8}.

Give two ways to evident this inequality above:

  1. Construct function, g(h)=L(h)h28=hp+ln(1p+peh)h28,h>0g(h)=L(h)-\frac{h^{2}}{8}=-hp+ln(1-p+pe^h)-\frac{h^{2}}{8},h>0
    Obviously, g(0)=0,g(0)=0,g(h)=(1p)peh[(1p)+(peh)]2140g(0)=0, g'(0)=0,g''(h)=\frac{(1-p) p e^{h}}{[(1-p)+(p e^{h})]^2}-\frac{1}{4} \le0, according to the monotonicity, g(h)0g(h)\le0, namely L(h)h28=t2(ba)28L(h) \le \frac{h^{2}}{8}=\frac{t^{2}(b-a)^{2}}{8}.

  2. Apply taylor expansion to function g(h)g(h), g(h)=0+0+g(0)2!(h0)2+o(h2)=((1p)peh[(1p)+(peh)]214)h2+o(h2)0g(h)=0+0+\frac{g''(0)}{2!}(h-0)^2+o(h^2)=(\frac{(1-p) p e^{h}}{[(1-p)+(p e^{h})]^2}-\frac{1}{4})h^2 +o(h^2)\le0, namely L(h)h28=t2(ba)28L(h) \le \frac{h^{2}}{8}=\frac{t^{2}(b-a)^{2}}{8}.

So the hoeffding’s lemma is true.

To be continue……

reference:

https://www.cnblogs.com/kolmogorov/p/9518867.html
http://dict.cnki.net/
李航《统计学习方法》 北京:清华大学出版社,2019