## Saturday, August 17, 2019

### Estimating a sum with Probability

Let's say we want to find the sum
$S=\displaystyle\sum_{k=1}^nf(k)$
but we don't have a closed form or any easier way to calculate it. One approach is to mimic the way we approximate an integral.

We can choose $m$ distinct random values $X_1,X_2,\cdots,X_m$ between $1$ and $n$, set $X_0=0$, and label them so that $0=X_0<X_1<X_2<\cdots<X_m\leq n$. We then use the following expression to get an approximation.

$S_r=\displaystyle\sum_{i=1}^m f(X_i)(X_i-X_{i-1})$

But how good is this approximation? If we define the error as $\Delta=S-S_r$, what is $\mathbb{E}(\Delta)$? In this post, we are precisely trying to answer that.
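As a quick sketch, the estimator is easy to implement (a minimal version; the name `random_sum` is mine):

```python
import random

def random_sum(f, n, m, rng=random):
    """Estimate S = f(1) + ... + f(n) from m sample points drawn
    without replacement from {1, ..., n}, with X_0 = 0."""
    xs = sorted(rng.sample(range(1, n + 1), m))
    prev, total = 0, 0
    for x in xs:
        total += f(x) * (x - prev)  # f(X_i) * (X_i - X_{i-1})
        prev = x
    return total

# e.g. estimate the sum of squares up to n = 1000 from 100 samples
print(random_sum(lambda k: k * k, 1000, 100))
```

Note that when $m=n$ the sample is all of $\{1,\dots,n\}$, every gap $X_i-X_{i-1}$ is $1$, and the estimator returns $S$ exactly.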

Let's first calculate $\mathbb{E}(S_r)$. Using linearity of expectation, we can find the expected value of each term in the expression for $S_r$.

$\mathbb{E}(f(X_i)(X_i-X_{i-1}))=\displaystyle\frac{1}{\binom{n}{m}}\sum_{x_1=1}^{x_2-1}\sum_{x_2=1}^{x_3-1}\cdots\sum_{x_{i-1}=1}^{x_i-1}\sum_{x_i=1}^n\sum_{x_{i+1}=x_i+1}^n\cdots\sum_{x_{m-1}=x_{m-2}+1}^n\sum_{x_{m}=x_{m-1}+1}^nf(x_i)(x_i-x_{i-1})$

To simplify this sum, we need the following beautiful result.

$\displaystyle\sum_{x_1=1}^{x_2-1}\sum_{x_2=1}^{x_3-1}\cdots\sum_{x_k=1}^{x_{k+1}-1}1=\frac{(x_{k+1}-1)_{(k)}}{k!}$ and $\displaystyle\sum_{x_2=x_1+1}^n\sum_{x_3=x_2+1}^n\cdots\sum_{x_{k+1}=x_k+1}^n1=\frac{(n-x_1)_{(k)}}{k!}$

Note how they mimic repeated integration of the constant function. Using the above results, we first simplify the expectation to
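Both identities are easy to sanity-check by brute force for small values (the helper names here are mine; recall $(x)_{(k)}=k!\binom{x}{k}$):

```python
from itertools import combinations
from math import comb

def left_sum(x_top, k):
    # counts tuples with 1 <= x_1 < x_2 < ... < x_k <= x_top - 1
    return sum(1 for _ in combinations(range(1, x_top), k))

def right_sum(x1, n, k):
    # counts tuples with x_1 < x_2 < ... < x_{k+1} <= n
    return sum(1 for _ in combinations(range(x1 + 1, n + 1), k))

assert left_sum(10, 3) == comb(9, 3)      # (x_{k+1} - 1)_(k) / k!
assert right_sum(4, 12, 3) == comb(8, 3)  # (n - x_1)_(k) / k!
```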

\begin{align} \mathbb{E}(f(X_i)(X_i-X_{i-1}))&=\displaystyle\frac{1}{\binom{n}{m}}\sum_{x_{i-1}=1}^{x_i-1}\sum_{x_i=1}^nf(x_i)(x_i-x_{i-1})\frac{(x_{i-1}-1)_{(i-2)}}{(i-2)!}\frac{(n-x_i)_{(m-i)}}{(m-i)!}\\ &=\frac{1}{\binom{n}{m}}\sum_{x_i=1}^nf(x_i)\frac{(x_i)_{(i)}}{i!}\frac{(n-x_i)_{(m-i)}}{(m-i)!}\\ &=\frac{1}{\binom{n}{m}}\sum_{k=1}^nf(k)\frac{(k)_{(i)}}{i!}\frac{(n-k)_{(m-i)}}{(m-i)!}\\ &=\frac{1}{(n)_{(m)}}\sum_{k=1}^nf(k)\binom{m}{i}(k)_{(i)}(n-k)_{(m-i)}\\ \end{align}

Therefore,
\begin{align} \mathbb{E}(S_r)&=\displaystyle\frac{1}{(n)_{(m)}}\sum_{i=1}^m\sum_{k=1}^nf(k)\binom{m}{i}(k)_{(i)}(n-k)_{(m-i)}\\ &=\frac{1}{(n)_{(m)}}\sum_{k=1}^nf(k)\bigl((n)_{(m)}-(n-k)_{(m)}\bigr)\\ &=S-\frac{1}{(n)_{(m)}}\sum_{k=1}^nf(k)(n-k)_{(m)}\\ \end{align}

where the second equality uses the falling-factorial analogue of the binomial theorem (the Chu–Vandermonde identity), $\displaystyle\sum_{i=0}^m\binom{m}{i}(k)_{(i)}(n-k)_{(m-i)}=(n)_{(m)}$; the missing $i=0$ term is exactly the subtracted $(n-k)_{(m)}$. We now have the result for the expected value of the error:

\begin{align} \mathbb{E}(\Delta)&=\displaystyle\frac{1}{(n)_{(m)}}\sum_{k=1}^{n-m}f(k)(n-k)_{(m)}=\frac{1}{\binom{n}{m}}\sum_{k=1}^{n-m}f(k)\binom{n-k}{m} \end{align}

That's pretty nice, isn't it?
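The closed form can also be checked numerically: for small $n$ and $m$, averaging $\Delta$ over all $\binom{n}{m}$ equally likely samples reproduces the formula exactly. A sketch (function names are mine):

```python
from itertools import combinations
from math import comb

def mean_error_bruteforce(f, n, m):
    """Average Delta = S - S_r over all C(n, m) equally likely samples."""
    S = sum(f(k) for k in range(1, n + 1))
    total = 0
    for xs in combinations(range(1, n + 1), m):
        prev, sr = 0, 0
        for x in xs:
            sr += f(x) * (x - prev)
            prev = x
        total += S - sr
    return total / comb(n, m)

def mean_error_formula(f, n, m):
    """E(Delta) = (1 / C(n, m)) * sum_{k=1}^{n-m} f(k) * C(n - k, m)."""
    return sum(f(k) * comb(n - k, m) for k in range(1, n - m + 1)) / comb(n, m)

f = lambda k: k * k
print(mean_error_bruteforce(f, 10, 4), mean_error_formula(f, 10, 4))  # the two agree
```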

I find these random sums extremely good in certain cases. In particular, I was surprised when I tried to approximate the sum of totients and the divisor function. Considering that these number-theoretic functions are themselves apparently random, the expected relative errors are surprisingly small. You should try them out for yourselves.
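For instance, a quick totient experiment might look like this (a sketch, not careful benchmarking; `totients` is a standard sieve helper of my own naming):

```python
import random

def totients(n):
    """Euler's phi for 0..n via a sieve."""
    phi = list(range(n + 1))
    for p in range(2, n + 1):
        if phi[p] == p:  # p is prime
            for q in range(p, n + 1, p):
                phi[q] -= phi[q] // p
    return phi

n, m = 10000, 500
phi = totients(n)
S = sum(phi[1:])                             # exact summatory totient
xs = sorted(random.sample(range(1, n + 1), m))
prev, Sr = 0, 0
for x in xs:
    Sr += phi[x] * (x - prev)
    prev = x
print(f"relative error: {abs(S - Sr) / S:.4f}")
```

Try varying $m$ to see how the relative error behaves.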

Until then
Yours Aye
Me