Smooth Sensitivity and Sampling in Private Data Analysis
Paper: Nissim, Kobbi, Sofya Raskhodnikova, and Adam Smith. "Smooth sensitivity and sampling in private data analysis." Proceedings of the 39th ACM Symposium on Theory of Computing (STOC), 2007, pp. 75-84.
Note: some of the proofs below are my own derivations, so they may differ slightly from those in the paper; Section 5 is not included.
Abstract
- Framework: release a function f of the data with instance-specific additive noise.
\[ \text{noise magnitude} \leftarrow \text{function } f\ +\ \text{database } x \nonumber \]
- Smooth sensitivity (of f on database x): a measure of variability of f in the neighborhood of x.
- Work:
- How to compute or approximate the smooth sensitivity of f for several functions
- A generic procedure based on sampling
1. Intro
- Data privacy
- Data analysis
- Challenges: release aggregate information about the databases while protecting the privacy of individual contributors.
- Differential Privacy
- Output perturbation
- Global sensitivity
- Noise magnitude depends only on the function f
- Instance-specific Noise
- Smooth sensitivity
- Noise magnitude depends on both the function f and the database x
- Pros: release f(x) with less noise --> broader applicability
- Real numbers in a bounded range
- Graphs where individuals hold information about edges
- Cons: computing smooth sensitivity is non-trivial and not automated
- Sample and Aggregate
- fully automated, efficient
- even when f is a black box
- E.g. k-means clustering, learning mixtures of Gaussians
1.1 Privacy Definition
- Def 1.1 -- Differential Privacy
- Lemma 1.2 -- Composition
1.2 Calibrating Noise to Sensitivity
- Def 1.3 -- Global Sensitivity
- Def 1.4 -- Laplacian Distribution
\[ \begin{align} Lap(\lambda): & \nonumber \\ & \text{pdf:}\quad h(z) = \frac 1 {2\lambda} e^{-|z|/\lambda} \nonumber \\ & \text{mean:}\quad E[Z] = 0 \nonumber \\ & \text{std:}\quad std(Z) = \sqrt{2}\,\lambda \nonumber \end{align} \]
- Claim 1.5 -- Laplacian Mechanism
- Two approaches:
(1) GSf is low --> add noise Lap(GSf / ε) (sketched below)
(2) express f in terms of functions g1, g2, ... with low sensitivity --> add noise to each gi --> combine via the composition property (Lemma 1.2)
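A minimal Python sketch of approach (1); the counting-query example, function name, and parameters are my own illustration, not from the paper:

```python
import numpy as np

def laplace_mechanism(f_value, gs_f, eps, rng=None):
    """Release f(x) + Lap(GS_f / eps) -- the mechanism of Claim 1.5."""
    rng = np.random.default_rng(rng)
    return f_value + rng.laplace(scale=gs_f / eps)

# Example: a counting query ("how many records are positive?") has GS_f = 1.
db = np.array([0, 1, 1, 0, 1])
noisy_count = laplace_mechanism(db.sum(), gs_f=1.0, eps=0.5)
```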
1.3 Smooth Sensitivity
- Def 1.6 -- Local sensitivity
- Relationship with global sensitivity
\[ GS_f = \max_x LS_f(x) \nonumber \]
- Problem: the noise magnitude itself can reveal information about the database
--> the noise magnitude must be an insensitive function of the database
--> a class of smooth upper bounds Sf on LSf --> adding noise ∝ Sf is safe
--> to add as little noise as possible, use the optimal (smallest) smooth bound Sf*: every smooth bound satisfies Sf ≥ Sf*
1.4 The Sample and Aggregate Framework
- Idea: Replace f with a related function f̄ whose smooth sensitivity is
- low
- efficiently computable
- Steps
- evaluate f on several random subsamples of x
- combine the results with an aggregation function (the center of attention)
Intuition: each entry xi can be changed without significantly affecting the value of f̄
- Applications:
- k-means cluster centers (when the data is well separated)
- accurate parameters of a mixture of k spherical Gaussian distributions
2. Instance-Specific Additive Noise
- Interactive framework:
\[ {\cal A}(x) = f(x)+N(x)Z \nonumber \]
Z: a random variable drawn from a noise distribution on R^d with standard deviation 1
N(x): noise magnitude ∝ sensitivity
2.1 Smooth Upper Bounds on LSf and Smooth Sensitivity
- Example:
fmed(x) = median(x1, ..., xn), xi ∈ D = [0, Λ]
Simplify: n is odd and x1 ≤ x2 ≤ ... ≤ xn --> m = (n+1)/2, fmed(x) = xm
Global sensitivity: GSfmed = Λ --> N(x) ∝ GSf is far too large
- Local sensitivity:
\[ LS_{f_{med}}(x) = \max(x_m - x_{m-1},\ x_{m+1} - x_m) \nonumber \]
--> N(x) ∝ LSf reveals information about the database
--> E.g.
\[ \begin{align} \text{database } x&: x_1 = \dots = x_{m+1} = 0,\ x_{m+2} = \dots = x_n = \Lambda \ \rightarrow\ LS_f(x) = 0 \ \rightarrow\ \Pr[A(x) \neq 0] = 0 \nonumber \\ \text{database } x'&: x_1 = \dots = x_m = 0,\ x_{m+1} = \dots = x_n = \Lambda \ \rightarrow\ LS_f(x') = \Lambda \ \rightarrow\ \Pr[A(x') \neq 0] > 0 \nonumber \end{align} \]
--> differential privacy fails: for the output set S = {z : z ≠ 0} it would require
\[ \Pr[A(x') \neq 0] \leq e^\epsilon\, \Pr[A(x) \neq 0] + \delta \nonumber \]
but the left side is positive while the right side is only δ (and 0 when δ = 0).
Lesson: N(x) has to be an insensitive function
- Def 2.1 -- A Smooth Upper Bound on LSf
(1) upper bounds LSf: S(x) ≥ LSf(x)
(2) insensitive: S(x) ≤ e^β S(y) for neighboring x, y
- Def 2.2 -- Smooth Sensitivity
- Lemma 2.3
- proof
Def 2.1 (1):
\[ \begin{align} S_{f,\beta}^*(x) &= \max\left(LS_f(x),\ \mathop{\max}\limits_{y \neq x;\ y \in D^n} LS_f(y)e^{-\beta d(x,y)}\right) \nonumber \\ &\geq LS_f(x) \nonumber \end{align} \]
Def 2.1 (2):
\[ \begin{align} \text{Suppose: } & \nonumber \\ & S_{f,\beta}^*(x) = LS_f(x')e^{-\beta d(x,x')} \nonumber \\ & d(x,y)=1 \rightarrow d(x',y) \leq d(x',x)+d(x,y) = d(x,x')+1 \nonumber \end{align} \]
\[ \begin{align} S_{f,\beta}^*(y) &\geq LS_f(x')e^{-\beta d(x',y)} \nonumber \\ &\geq LS_f(x')e^{-\beta \left(d(x,x')+1\right)} \nonumber \\ &= e^{-\beta} \cdot LS_f(x')e^{-\beta d(x,x')} \nonumber \\ &= e^{-\beta}S_{f,\beta}^*(x) \nonumber \end{align} \]
- Optimality: every smooth bound S satisfies S(x) ≥ S_{f,β}*(x)
\[ S(x) \geq S_{f,\beta}^*(x) \text{ for all } x \in D^n \iff S(x) \geq LS_f(y)e^{-\beta d(x,y)} \text{ for all } x,y \in D^n \nonumber \]
Proof by induction on d(x,y):
\[ \begin{align} \text{Suppose: } & \nonumber \\ & S(x') \geq LS_f(y)e^{-\beta d(x',y)} \quad \text{for all } x', y \text{ with } d(x',y)=k \nonumber \\ \text{Consider: } & \nonumber \\ & d(x,y)=k+1 \rightarrow \text{there exists } x' \text{ with } d(x,x')=1,\ d(x',y)=k \nonumber \end{align} \]
\[ \begin{align} S(x)&\geq e^{-\beta}S(x') \nonumber \\ &\geq LS_f(y) e^{-\beta(d(x',y)+1)} \nonumber \\ &= LS_f(y) e^{-\beta d(x,y)} \nonumber \end{align} \]
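To make Definitions 1.6 and 2.2 concrete, a brute-force sketch (my own illustration; it enumerates all of D^n, so it is only a sanity check on tiny discrete domains):

```python
import itertools
import numpy as np

def local_sensitivity(f, x, domain):
    """LS_f(x): the max change of f when one entry of x is replaced."""
    ls = 0.0
    for i in range(len(x)):
        for v in domain:
            y = x[:i] + (v,) + x[i + 1:]
            ls = max(ls, abs(f(x) - f(y)))
    return ls

def smooth_sensitivity_bruteforce(f, x, domain, beta):
    """S*_{f,beta}(x) = max_y LS_f(y) * exp(-beta * d(x, y))."""
    best = 0.0
    for y in itertools.product(domain, repeat=len(x)):
        d = sum(a != b for a, b in zip(x, y))   # Hamming distance d(x, y)
        best = max(best, local_sensitivity(f, y, domain) * np.exp(-beta * d))
    return best

# e.g. the median on a tiny binary domain:
# smooth_sensitivity_bruteforce(lambda t: sorted(t)[len(t) // 2],
#                               (0, 0, 1), domain=(0, 1), beta=0.1)
```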
2.2 Calibrating Noise
- Noise magnitude:
\[ N(x) = {S_f(x) \over \alpha} \nonumber \]
- Corollary 2.4
2.2.1 Admissible Noise Distribution
- Notation:
\[ S+\Delta \triangleq \{z+\Delta \mid z \in S\},\qquad e^\lambda \cdot S \triangleq \{e^\lambda z \mid z \in S\},\qquad a \pm b \triangleq [a-b,\ a+b] \nonumber \]
- Def 2.5 -- Admissible Noise Distribution
- Lemma 2.6 -- (α, β)-admissible vs. (ε, δ)-differential privacy
- Proof.
Assumption:
\[ N(x) = {S(x) \over \alpha},\qquad N(y) = {S(y) \over \alpha} \nonumber \]
Proof of the conclusion:
\[ \begin{align} Pr[A(x) \in S] &= Pr\left[Z \in {S-f(x) \over N(x)}\right] \nonumber \\ &\leq e^{\epsilon/2}\, Pr\left[Z \in {S-f(y) \over N(x)}\right] + \delta/2 \nonumber \\ &\leq e^{\epsilon}\, Pr\left[Z \in {S-f(y) \over N(y)}\right] + \delta \nonumber \\ &= e^{\epsilon}\, Pr[f(y) + N(y)Z \in S] + \delta \nonumber \\ &= e^{\epsilon}\, Pr[A(y) \in S] + \delta \nonumber \end{align} \]
Proof of the conditions:
\[ \begin{align} ||\Delta||_1 &= ||f(x)-f(y)||_1 / N(x) \nonumber \\ &= \alpha\, ||f(x)-f(y)||_1 / S(x) \nonumber \\ &\leq \alpha\, ||f(x)-f(y)||_1 / LS_f(x) \nonumber \\ &\leq \alpha \nonumber \end{align} \]
\[ \begin{align} |\lambda| &= \left|\ln{N(x) \over N(y)}\right| = \left|\ln{S(x) \over S(y)}\right| \nonumber \\ &\leq \beta \qquad \left(\text{since } {S(x) \over S(y)} \in e^{\pm\beta}\right) \nonumber \end{align} \]
2.2.2 Examples of Admissible Noise Distributions
- Proof idea: to show a distribution yields (ε, δ)-differential privacy, verify that it is (α, β)-admissible by bounding the density ratios for the
sliding property:
\[ \ln{h(z) \over h(z+\Delta)} \nonumber \]
and the dilation property:
\[ \ln{h(z) \over e^{d\lambda}\,h(e^\lambda z)} \nonumber \]
- Three families:
- a generalization of the Cauchy distribution
- Laplacian distribution
- Gaussian distribution
- Lemma 2.7 -- Generalization of the Cauchy Distribution
- proof.
dilation:
\[ \begin{align} \ln{h(z)\over e^{\lambda}h(e^\lambda z)} &= \ln{1+(e^\lambda|z|)^\gamma \over e^\lambda(1+|z|^\gamma)} \qquad (\lambda>0) \nonumber \\ &\leq \ln{e^{\lambda\gamma}\left(1+|z|^\gamma\right) \over e^\lambda(1+|z|^\gamma)} = \lambda(\gamma-1) \leq \epsilon/2 \nonumber \\ \lambda &\leq {\epsilon \over 2(\gamma-1)} \nonumber \end{align} \]
sliding:
\[ \begin{align} \ln{h(z) \over h(z+\Delta)} &= \varPhi(|z+\Delta|)-\varPhi(|z|) \qquad \varPhi(x) = \ln(1+x^\gamma) \nonumber \\ &\leq |\Delta|\,\varPhi'(\xi) \nonumber \\ \varPhi'(x) &= {\gamma x^{\gamma-1} \over x^\gamma+1} = {\gamma \over x+x^{1-\gamma}} \leq \gamma \nonumber \\ \ln{h(z) \over h(z+\Delta)} &\leq |\Delta|\,\gamma \leq \epsilon/2 \nonumber \\ ||\Delta||_1 &\leq {\epsilon \over 2\gamma} \nonumber \end{align} \]
- d-dim:
\[ \begin{align} \ln{h(z)\over e^{d\lambda}h(e^\lambda z)} &= \ln{\prod_{i=1}^d \left(1+(e^\lambda|z_i|)^\gamma\right) \over e^{d\lambda}\prod_{i=1}^d \left(1+|z_i|^\gamma\right)} \qquad (\lambda>0) \nonumber \\ &\leq d\lambda(\gamma-1) \leq \epsilon/2 \nonumber \\ \lambda &\leq {\epsilon \over 2d(\gamma-1)} \nonumber \end{align} \]
\[ \begin{align} \ln{h(z) \over h(z+\Delta)} &= \sum_{i=1}^d \left(\varPhi(|z_i+\Delta_i|)-\varPhi(|z_i|)\right) \nonumber \\ &\leq \sum_{i=1}^d |\Delta_i|\,\varPhi'(\xi_i) \leq \gamma\left(|\Delta_1|+\dots+|\Delta_d|\right) \nonumber \\ &= \gamma\,||\Delta||_1 \leq \epsilon/2 \nonumber \\ ||\Delta||_1 &\leq {\epsilon \over 2\gamma} \nonumber \end{align} \]
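Putting the constants just derived to work (α = ε/(2γ), β = ε/(2(γ−1)) per my derivation above, so treat them as a sketch): for γ = 2 the noise is a standard Cauchy sample and the mechanism is ε-DP with δ = 0.

```python
import numpy as np

def cauchy_mechanism(f_value, s_star, eps, rng=None):
    """Release f(x) + (S*/alpha) * Z with Z ~ standard Cauchy (gamma = 2).

    s_star must be a beta-smooth upper bound computed with beta = eps / 2;
    with alpha = eps / 4 this gives eps-DP with delta = 0 (per the
    constants derived in these notes).
    """
    rng = np.random.default_rng(rng)
    alpha = eps / 4
    return f_value + (s_star / alpha) * rng.standard_cauchy()
```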
- Condition for δ = 0:
- Assume:
\[ h(z) = e^{-f(z)},\quad f(z) > 0 \nonumber \]
Sliding:
\[ \begin{align} \ln{h(z) \over h(z+\Delta)} &= f(z+\Delta)-f(z) \leq \epsilon/2 \nonumber \\ \Delta\, f'(\xi) &\leq \epsilon/2 \nonumber \\ \Rightarrow f'(\xi) &\leq C \nonumber \end{align} \]
Dilation:
\[ \begin{align} \ln{h(z)\over e^{\lambda}h(e^\lambda z)} &= f(e^\lambda z)-f(z)-\lambda \leq \epsilon/2 \nonumber \\ g(\lambda) &= f(e^\lambda z) = g(0)+g'(0)\lambda+O(\lambda^2) \nonumber \\ &= f(z)+zf'(z)\lambda+O(\lambda^2) \nonumber \\ \Rightarrow f'(z) &\leq C/z \nonumber \end{align} \]
- Def 2.8 -- Quantile
- Lemma 2.9 -- Laplacian Distribution
- proof.
- Sliding:
\[ \begin{align} \ln{h(z) \over h(z+\Delta)} &= ||z+\Delta||_1 - ||z||_1 \leq ||\Delta||_1 \leq \epsilon/2 \nonumber \\ ||\Delta||_1 &\leq \epsilon/2 \nonumber \end{align} \]
- Dilation:
\[ \begin{align} \ln{h(z)\over e^{d\lambda}h(e^\lambda z)} &= ||z||_1(e^{\lambda}-1) - d\lambda \approx ||z||_1\lambda - d\lambda \quad (\text{small } \lambda) \nonumber \\ \text{when } ||z||_1 \leq \rho_{\delta/2}:&\quad \ln{h(z)\over e^{d\lambda}h(e^\lambda z)} \leq \epsilon/2 \nonumber \\ (\rho_{\delta/2}-d)\lambda &\leq \epsilon/2 \nonumber \\ \lambda &\leq {\epsilon \over 2(\rho_{\delta/2}-d)} \nonumber \\ \text{sufficient to use } \beta = {\epsilon \over 2\rho_{\delta/2}} &\leq {\epsilon \over 2(\rho_{\delta/2}-d)} \nonumber \end{align} \]
- Lemma 2.10 -- Gaussian Distribution
3. Computing Smooth Sensitivity
- Def 3.1 -- A^{(k)}(x): the sensitivity of f at distance k
To compute the smooth sensitivity of f at x --> compute A^{(k)}(x) for k = 0, ..., n.
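For reference, Def 3.1 and the resulting formula as I read them in the paper:
\[ A^{(k)}(x) = \max_{y \in D^n:\ d(x,y) \leq k} LS_f(y), \qquad S^*_{f,\epsilon}(x) = \max_{k=0,\dots,n} e^{-k\epsilon} A^{(k)}(x) \nonumber \]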
3.1 Smooth Sensitivity of the Median
- Proposition 3.4 -- Smooth Sensitivity of fmed
Assume:
\[ n \text{ is odd},\quad m = \frac{n+1}{2},\quad 0 \leq x_1 \leq \dots \leq x_n \leq \Lambda \nonumber \]
(with the convention x_i = 0 for i < 1 and x_i = Λ for i > n)
\[ A^{(k)}(x) = \max_{0 \leq t \leq k+1}\left(x_{m+t} - x_{m+t-k-1}\right) \rightarrow O(k) \text{ time} \nonumber \]
\[ S_{f_{med},\epsilon}^*(x) = \max_{k=0..n} e^{-k\epsilon}A^{(k)}(x) \rightarrow O(n^2) \text{ time naively} \nonumber \]
Divide-and-conquer algorithm --> O(n log n)
- Algorithm
Idea:
\[ S_{f_{med}}^*(x) = \max_{i \leq m \leq j}\left(x_j - x_i\right) e^{-\epsilon(j-i-1)} \nonumber \]
\[ \text{computing } S_{f_{med}}^*(x) \rightarrow \text{for each } i,\ \text{find } j^*(i) = \mathop{argmax}_{j \geq m}\,(x_j-x_i)e^{-\epsilon(j-i-1)} \nonumber \]
Recursive divide-and-conquer over i (the paper's algorithm) --> O(n log n); a plain O(n²) version is sketched below.
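A direct O(n²) sketch of Proposition 3.4 (the naive variant, not the paper's O(n log n) recursion; the function name is mine):

```python
import numpy as np

def smooth_sensitivity_median(x, eps, lam):
    """S*_{f_med, eps}(x) for the median on [0, lam], in O(n^2) time.

    Uses A^(k)(x) = max_{0 <= t <= k+1} (x_{m+t} - x_{m+t-k-1}) with the
    convention x_i = 0 for i < 1 and x_i = lam for i > n (n odd).
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    m = (n + 1) // 2                       # 1-indexed median position

    def xi(i):                             # 1-indexed access with boundaries
        return 0.0 if i < 1 else (lam if i > n else x[i - 1])

    best = 0.0
    for k in range(n + 1):
        a_k = max(xi(m + t) - xi(m + t - k - 1) for t in range(k + 2))
        best = max(best, np.exp(-k * eps) * a_k)
    return best
```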
3.2 Smooth Sensitivity of the Minimum
- Claim 3.6 -- Smooth Sensitivity of fmin
Assume:
\[ 0 \leq x_1 \leq \dots \leq x_n \leq \Lambda \nonumber \]
3.3 Smooth Sensitivity of the Cost of a Minimum Spanning Tree
- Notation:
graph:
\[ G=(V,E) \nonumber \]
edge weights:
\[ w(e) \in [0, \Lambda] \nonumber \]
cut: a cut in G is a partition of the vertex set V into two nonempty subsets S and V∖S:
\[ S \subset V \rightarrow (S,\ V \setminus S) \nonumber \]
- edge (i, j) crosses the cut S:
\[ \{(i,j) \mid i \in S,\ j \in V \setminus S\} \nonumber \]
\[ w_t(S) = t^{th} \text{ element of } \{w((i,j)) \mid i \in S,\ j \in V \setminus S\} \text{ sorted in non-decreasing order};\quad w_t(S) = \Lambda \text{ if the cut } S \text{ has fewer than } t \text{ crossing edges} \nonumber \]
\[ \ell(S) = |\{(i,j) \mid i \in S,\ j \in V \setminus S\}| \nonumber \]
- Lemma 3.7 -- Smooth Sensitivity of fMST
- Lemma 3.8 -- Smooth Sensitivity of fMST: computation time
3.4 Smooth Sensitivity of the Number of Triangles in a Graph
- Notation:
adjacency matrix x --> undirected graph G on n nodes
x_{i,j} = x_{j,i} --> edge indicator
Hamming distance:
\[ d(x,x') = \sum_{i > j} I\left(x_{i,j} \neq x'_{i,j}\right) \nonumber \]
fΔ(x) --> the number of triangles in the graph represented by x
a_{i,j} --> the number of triangles involving a potential edge (i, j):
\[ a_{i,j} = \sum_{k \in [n]} x_{i,k} \cdot x_{j,k} \nonumber \]
- b_{i,j} --> the number of half-built triangles involving a potential edge (i, j):
\[ b_{i,j} = \sum_{k \in [n]} x_{i,k} \oplus x_{j,k} \nonumber \]
Local and global sensitivity:
\[ LS_{f_\Delta}(x) = \max_{i \neq j} a_{i,j}, \qquad GS_{f_\Delta} = n-2 \nonumber \]
- Claim 3.13 -- Smooth Sensitivity of fΔ
- proof:
\[ y = x + s \text{ edges} \rightarrow d(x,y) = s \nonumber \]
\[ \text{original triangles at } (i,j) \rightarrow a_{i,j}; \qquad \text{new triangles} \rightarrow \begin{cases} s & \text{if } s \leq b_{i,j} \\ \left\lfloor \frac{s + b_{i,j}}{2} \right\rfloor & \text{if } s > b_{i,j} \end{cases} \nonumber \]
\[ LS_{f_\Delta}(y) \leq \max_{i \neq j} c_{i,j}(s), \qquad c_{i,j}(s) = \min\left(a_{i,j} + \left\lfloor \frac{s + \min(s, b_{i,j})}{2} \right\rfloor,\ n-2\right) \nonumber \]
\[ A^{(s)}(x) = \max_{y:\, d(x,y) \leq s} LS_{f_\Delta}(y) = \max_{i \neq j} c_{i,j}(s) \nonumber \]
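A brute-force numpy sketch of Claim 3.13 built from the quantities above (my own illustration; it scans s upward until c_{i,j}(s) saturates at n − 2):

```python
import numpy as np

def smooth_sensitivity_triangles(adj, eps):
    """S* for the triangle count f_tri, from a_ij, b_ij and c_ij(s).

    adj: symmetric 0/1 adjacency matrix without self-loops.
    """
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    a = (adj @ adj).astype(float)             # a_ij = sum_k x_ik * x_jk
    b = deg[:, None] + deg[None, :] - 2 * a   # b_ij = sum_k x_ik XOR x_jk
    off = ~np.eye(n, dtype=bool)              # only pairs with i != j

    best = 0.0
    for s in range(2 * (n - 2) + 1):          # c_ij(s) = n - 2 beyond this
        c = np.minimum(a + np.floor((s + np.minimum(s, b)) / 2), n - 2)
        best = max(best, np.exp(-eps * s) * c[off].max())
    return best
```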
4. Sample-Aggregate Framework
4.1 A Motivating Example: Clustering
- Example:
k-squared-error-distortion (k-SED) clustering (also called "k-means")
input:
\[ x = \{x_1, \dots, x_n\},\ x_i \in \Bbb{R}^l, \text{ lying in a ball of diameter } \Lambda \nonumber \]
output:
\[ f_{cc}(x) = \{c_1, \dots, c_k\} \text{ with minimum cost}, \qquad cost_x(c_1, \dots, c_k) = \frac{1}{n} \sum_{i=1}^n \min_j ||x_i - c_j||_2^2 \nonumber \]
Wasserstein distance (between sets of centers):
\[ d_W\left((c_1, \dots, c_k),\ (\hat{c}_1, \dots, \hat{c}_k)\right) = \left(\min_\pi \sum_{j=1}^k ||c_j - \hat{c}_{\pi(j)}||_2^2\right)^{\frac{1}{2}} \nonumber \]
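The matching distance above, as brute force over permutations (a sketch; fine for small k):

```python
import itertools
import numpy as np

def wasserstein_centers(c, c_hat):
    """d_W between two sets of k centers: minimise the summed squared
    distances over all matchings pi, then take the square root."""
    k = len(c)
    best = min(
        sum(float(np.sum((c[j] - c_hat[pi[j]]) ** 2)) for j in range(k))
        for pi in itertools.permutations(range(k))
    )
    return np.sqrt(best)
```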
- Global sensitivity:
\[ GS_{f_{cc}} = \Omega(\Lambda) \nonumber \]
- Problem:
In "well-clustered" instances the local sensitivity LSf should be low, but we do not know how to approximate a smooth bound on LSf efficiently.
4.2 Basic Framework
- Metric space:
\[ \text{metric } {\cal M} \text{ with distance function } d_{\cal M}(\cdot,\cdot) \text{ and diameter } \Lambda \nonumber \]
- Idea: Replace f with a related function f̄ whose smooth sensitivity is
- low
- efficiently computable
- Framework:
Randomly partition the database x into m small databases:
\[ x|_{U_1}, \dots, x|_{U_m} \nonumber \]
Evaluate f on each small database:
\[ z_i = f(x|_{U_i}) \nonumber \]
Aggregate with the center-of-attention aggregation g:
\[ \bar{f}(x) = g\left(f(x|_{U_1}), \dots, f(x|_{U_m})\right) \nonumber \]
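A minimal 1-d sketch of the framework (my own simplification: the aggregation g is the plain median with its smooth sensitivity, reusing `smooth_sensitivity_median` from the Section 3.1 sketch and the γ = 2 Cauchy constants from Section 2.2.2, rather than the paper's center of attention):

```python
import numpy as np

def sample_and_aggregate(x, f, m, eps, lam, rng=None):
    """Partition x into m blocks, evaluate the black box f on each,
    aggregate with the median, and add instance-specific Cauchy noise.

    Since the blocks are disjoint, changing one entry of x changes at
    most one z_i, so the median's smooth sensitivity over z applies.
    f must return a value in [0, lam]; use m odd.
    """
    rng = np.random.default_rng(rng)
    blocks = np.array_split(rng.permutation(x), m)    # x|_{U_1}, ..., x|_{U_m}
    z = np.array([f(block) for block in blocks])      # z_i = f(x|_{U_i})
    beta, alpha = eps / 2, eps / 4                    # gamma = 2 Cauchy constants
    s_star = smooth_sensitivity_median(z, beta, lam)  # from the Section 3.1 sketch
    return np.median(z) + (s_star / alpha) * rng.standard_cauchy()
```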
- Def 4.1 -- Good Approximation
- Theorem 4.2
4.3 Good Aggregations for General Metric Spaces
- Idea:
Independently select m small databases of size n/m (each chosen uniformly without replacement)
Chain argument (Lemma 4.11): with high probability, each point appears in at most s of the small databases, hence affects at most s of the inputs of g
\[ \begin{align} 1 \text{ point} &\rightarrow \text{(at most) } s \text{ databases (with high probability)} \nonumber \\ &\rightarrow \text{at most } s \text{ inputs } z_i=f(x|_{U_i}) \text{ of } g \nonumber \end{align} \]
- Def 4.4 -- Smooth Upper Bound with Step s
- Def 4.5 -- Good Aggregation
- Example: the median as aggregation function
Sensitivity:
\[ g(z) = median(z),\quad z \in [0,\Lambda]^m \nonumber \]
\[ S(z) = \max_{k \geq 0} LS_g^{(s(k+1))}(z) \cdot e^{-\beta k} \nonumber \]
- proof.
Smooth Upper Bound (a):
\[ S(z) = \max\left(LS_g^{(s)}(z),\ \max_{k>0} LS_g^{(s(k+1))}(z) \cdot e^{-\beta k}\right) \geq LS_g^{(s)}(z) \nonumber \]
Smooth Upper Bound (b): suppose the max defining S(z) is attained at index k, i.e. S(z) = LS_g^{(s(k+1))}(z) e^{-βk}; for d(z,z') ≤ s, any y within s(k+1) of z is within s(k+2) of z', so
\[ \begin{align} S(z') &\geq LS_g^{(s(k+2))}(z')\,e^{-\beta(k+1)} \nonumber \\ &\geq LS_g^{(s(k+1))}(z)\,e^{-\beta(k+1)} \nonumber \\ &= e^{-\beta}S(z) \nonumber \end{align} \]
2(a):
\[ \frac{2m}{3} \text{ entries of } z \text{ in } {\cal B}(c,r) \rightarrow g(z) \in {\cal B}(c,2r) \nonumber \]
2(b):
\[ \begin{align} \text{Case 1}&: k < \frac{m}{6s} \rightarrow LS_g^{(s(k+1))}(z) \leq 2r \rightarrow S(z) \leq 2r \nonumber \\ \text{Case 2}&: k \geq \frac{m}{6s} \rightarrow LS_g^{(s(k+1))}(z) \leq \Lambda \rightarrow S(z) \leq \Lambda e^{-\beta\frac{m}{6s}} \nonumber \end{align} \]
\[ S(z) \leq \max\left(2r,\ \Lambda e^{-\beta\frac{m}{6s}}\right) \nonumber \]
Computing such an aggregation in a general metric space is NP-hard --> motivates the center of attention (next section)
4.4 The Center of Attention is a Good Aggregation
- Def 4.6 -- Unconstrained Center of Attention
- when any s points are removed from z, a majority of the remaining points is still contained inside the ball
It always exists, but may not be unique.
- Smooth sensitivity:
\[ S_0(z) = 2\,\max_{k\geq0}\left(r^{(z)}(t_0+(k+1)s)\,e^{-\beta k}\right) \nonumber \]
\[ r^{(z)}(t) = \text{the minimum } t\text{-radius of any point in } {\cal M} \nonumber \]
- Proof.
- Smooth Upper Bound (a):
\[ \text{(a) ball } {\cal B}\left(g_0(z), r^{(z)}(t_0)\right), \qquad \text{(b) ball } {\cal B}\left(g_0(z'), r^{(z')}(t_0)\right) \nonumber \]
Each ball contains a majority of the points common to z and z', so the two balls intersect:
\[ d(g_0(z), g_0(z')) \leq r^{(z)}(t_0) + r^{(z')}(t_0) \nonumber \]
\[ r^{(z')}(t_0) \leq r^{(z)}(t_0+s) \rightarrow LS_{g_0}^{(s)}(z) \leq r^{(z)}(t_0) + r^{(z)}(t_0+s) \leq 2r^{(z)}(t_0+s) \nonumber \]
- Smooth Upper Bound (b): for d(z,z') ≤ s,
\[ \begin{align} S_0(z') &\leq 2\,\max_{k\geq0}\left(r^{(z)}(t_0+(k+2)s)\,e^{-\beta k}\right) \nonumber \\ &\leq e^\beta \cdot 2\,\max_{k'\geq1}\left(r^{(z)}(t_0+(k'+1)s)\,e^{-\beta k'}\right) \leq e^\beta S_0(z) \nonumber \end{align} \]
- Suppose:
\[ \frac{2m}{3} \text{ entries of } z \text{ lie in } {\cal B}(c,r) \rightarrow r^{(z)}(t_0) \leq r \nonumber \]
- 2(a):
\[ \frac{2m}{3} \text{ entries of } z \text{ in } {\cal B}(c,r) \rightarrow {\cal B}\left(g_0(z), r^{(z)}(t_0)\right) \cap {\cal B}(c,r) \neq \varnothing \nonumber \]
\[ d(c, g_0(z)) \leq r + r^{(z)}(t_0) \leq 2r \nonumber \]
- 2(b):
\[ \begin{align} \text{Case 1}&: k < \frac{m}{6s} \rightarrow r^{(z)}(t_0+(k+1)s) \leq r \rightarrow S_0(z) \leq 2r \nonumber \\ \text{Case 2}&: k \geq \frac{m}{6s} \rightarrow r^{(z)}(t_0+(k+1)s) \leq \Lambda \rightarrow S_0(z) \leq 2\Lambda e^{-\beta\frac{m}{6s}} \nonumber \end{align} \]
\[ S_0(z) \leq 2\max\left(r,\ \Lambda e^{-\beta\frac{m}{6s}}\right) \nonumber \]
- Def 4.8 -- Center of Attention
- Smooth sensitivity
\[ S(z) = 2\ \mathop{max}_{k\geq0}(\rho(t_0+(k+1)s)e^{-\beta k}) \]
\[ \rho(t) = \frac1a \sum_{i=1}^ar_i(t) \]
- Theorem 4.9
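A 1-d sketch of the constrained center of attention (my own reading of Def 4.8: candidate centers restricted to the points of z, with the t-radius of z_i being the distance to its t-th nearest point of z):

```python
import numpy as np

def t_radii(z, t):
    """r_i(t): distance from each z_i to its t-th nearest point of z (1-d)."""
    z = np.asarray(z, dtype=float)
    d = np.sort(np.abs(z[:, None] - z[None, :]), axis=1)
    return d[:, t - 1]                 # z_i itself counts as the 1st nearest

def center_of_attention(z, t0):
    """Return a point of z with minimal t0-radius, and that radius."""
    z = np.asarray(z, dtype=float)
    r = t_radii(z, t0)
    i = int(np.argmin(r))
    return z[i], r[i]
```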
4.5 Using a Good Aggregation: Proof of Theorem 4.2
- Lemma 4.11 -- Chain's Probability
- Proof.
Binomial distribution:
\[ f(k;n,p) = \binom{n}{k} p^{k}(1-p)^{n-k}, \qquad \binom{n}{k} = \frac{n!}{(n-k)!\,k!} \nonumber \]
\[ \text{for a fixed point } x_i:\quad \Pr[x_i \text{ lands in} \geq s \text{ sets}] \leq \sum_{k \geq s} \binom{m}{k} \left(\frac{1}{m}\right)^{k}\left(1-\frac{1}{m}\right)^{m-k} \leq \sum_{k \geq s} \frac{1}{k!} \leq \frac{2}{s!} \quad (s > 1) \nonumber \]
\[ \Pr[\text{every point lands in fewer than } s \text{ sets}] \geq 1 - \frac{2n}{s!} \quad \text{(union bound over the } n \text{ points)} \nonumber \]
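For a feel of the numbers (my own illustrative instantiation, not from the paper): with n = 10^6 points and s = 12,
\[ \frac{2n}{s!} = \frac{2 \cdot 10^6}{479001600} \approx 0.0042 \nonumber \]
so the chain condition holds for every point except with probability below half a percent.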
- Protocol
Proof.
\[ s = \sqrt{m}, \qquad S(z) = O(r) + \Lambda e^{-\Omega(\beta m/s)} \nonumber \]
\[ \begin{align} N(x) &= S(z)/\alpha \qquad \text{with } \alpha = \frac{\epsilon}{2},\ \beta = \frac{\epsilon}{4(d+\ln(2/\delta))} \nonumber \\ &= O\!\left(\frac{r}{\epsilon}\right) + \frac{2\Lambda}{\epsilon}\, e^{-\Omega\left(\frac{\epsilon\sqrt{m}}{4(d+\ln(2/\delta))}\right)} \nonumber \\ &= O\!\left(\frac{r}{\epsilon}\right) + \frac{\Lambda}{\epsilon}\, e^{-\Omega\left(\frac{\epsilon\sqrt{m}}{d}\right)} \nonumber \end{align} \]