Saturday, December 30, 2023

Is $e$ less than $(1+\sqrt{21})/2$?

One of my friends recently posed the problem of finding the expected number of uniform random variables needed to get a sum greater than $1$ without using calculus.

Even though I've seen this problem before, I was not able to recollect it and the calculus proof was all I could come up with until he eventually showed the me the solution. But this time, something different struck me.

The solution to this problem uses that fact that

$\displaystyle \mathbb{P}(N>n)=\frac{1}{n!}$

where $N$ is the random variable indicating the number of Uniform random variables needed. But given this, we could do more than the expected value. We know that,

$\displaystyle \mathbb{E}(N)=\sum_{n=0}^\infty\mathbb{P}(N>n)=\sum_{n=0}^\infty \frac{1}{n!}=e$

which follows from the definition of Euler's number.

But suppose, we don't know the exact value of $e$. With elementary methods, we can show that $2<e<3$.

In the problem that we were discussing, we can use the expression for probability to find $\mathbb{V}(N)$ as well. We know that

$\displaystyle \mathbb{E}(N^2)=\sum_{n=0}^\infty(2n+1)\mathbb{P}(N>n)=\sum_{n=0}^\infty \frac{2n+1}{n!}=3e$

Therefore, $\mathbb{V}(N)=\mathbb{E}(N^2)-\mathbb{E}(n)=e(e-3)$

As variance of any random variable $\geq 0$ and we know that $e>0$, this shows us that $e<3$ which is something we already know. But this gives us a nice idea.

Let $N_{(2)}$ be the random variable denoting the number of Uniform random variables needed to get a sum a sum greater than $2$. Uniform Sum Distribution shows us that,

$\displaystyle \mathbb{P}(N_{(2)}=n)=\frac{(n-2)(2^{n-1}-n)}{n!}$

With this, we can see that

$\displaystyle \mathbb{E}(N_{(2)})=e(e-1)$ and $\displaystyle \mathbb{E}(N_{(2)}^2)=5e(e-1)$

Therefore, $\displaystyle \mathbb{V}(N_{(2)})=e(e-1)(5-e^2+e)$

Using the same idea as before, we see that $5-e^2+e>0$ which shows us

$\displaystyle \frac{1}{2}(1-\sqrt{21}) \leq e \leq \frac{1}{2}(1+\sqrt{21})$

Numerically, this shows that $e \leq 2.791288$!!

As you may have already guessed, this could be extended for higher moments and / or sum-greater-than-3 and so on. But the resulting expressions gives higher degree polynomials which, even though gives better estimates for $e$, cannot be easily solved in closed form. For example, using

$\displaystyle \mathbb{E}(N_{(3)})=\frac{1}{2}(2e^3-4e^2+e)$ and $\displaystyle \mathbb{E}(N_{(3)}^2)=\frac{7}{2}(2e^3-4e^2+e)$

Solving the resulting expression for variance, we get to see that $2e^3-4e^2+e-14 \leq 0$. Solving this with WA, we see that

$\displaystyle e < \frac{1}{6}(4+\sqrt[3]{784+18\sqrt{1894}}+\sqrt[3]{784-18\sqrt{1894}}) \approx 2.74615$

Hope you enjoyed this. See ya soon.

Until then, Yours Aye

Me

A nice property on the recurrence of Eulerian Numbers

I was recently looking at Eulerian numbers for a problem and I noticed a nice, and probably lesser/never known, property on their recursive definition. In summary, it seemed like an exercise problem out of a Combinatorics textbook except I couldn't find one.

Eulerian numbers $A(n,k)$ count the number of permutation of numbers from $1$ to $n$ in which exactly $k$ elements are smaller than the previous element. In other, words they count the number of permutation with exactly $k$ descents. 

With this interpretation, we can setup the following recurrence as shown in this Proposition 3 of this lecture.

$A(n,k)=(n-k)A(n-1,k-1)+(k+1)A(n-1,k)$

The first term here corresponds to the number of permutations in which the element $n$ contributes to the descent count and the second term to those where it doesn't.

The problem that I was interested in was that of counting the number of permutation with $k$ descents and end with a descent.

After some thought, it seemed easy to setup a recurrence for this count as well.

Let $A_d(n,k)$ be the number of permutations of $n$ elements with exactly $k$ descents and end with a descent. Let $A_a(n,k)$ be the same count but those that end with an ascent. Then,

$A_d(n,k)=k A_d(n-1,k)+A_a(n-1,k-1)+(n-k)A_d(n-1,k-1)$

The first term here corresponds to inserting $n$ in an existing descent. The second term is where we insert $n$ as second to last position thereby simultaneously making the permutation end with an descent and increasing the descent count.

The last one is where we take a permutation with $n-1$ elements. There are '$n-1$' gaps where we can insert $n$ (we don't $n$ to be in the last position as it would make the permutation to end with an ascent). Of the $n-1$ gaps, $k-1$ are already descents which means there are $n-k$ gaps that correspond to an ascent. Inserting $n$ here would increase descent count by one giving us the requisite permutation.

Similarly, we can write the following recurrence for the $A_a(n-k)$

$A_a(n,k)=A_d(n-1,k)+(n-k-1)A_a(n-1,k-1)+(k+1)A_a(n-1,k)$

Using the fact that $A(n,k)=A_a(n,k)+A_d(n,k)$ and some simple manipulations, we get

$A_d(n,k)=(n-k)A(n-1,k-1)$ and $A_a(n,k)=(k+1)A(n-1,k)$

I hope you can see now why this surprised me. The recurrence usually written for Eulerian numbers also has another simpler combinatorial representation without any change!!

 That is, the number of permutations with $k$ descents in which $n$ contributes to the descent count is equinumerous with the permutations with $k$ descents that end with a descent.

Hope you enjoyed this. See ya soon.

Until then, Yours Aye

Me

Sunday, December 24, 2023

The Rounding Error Conundrum

Like all managers, I too, from time to time, will be in a position to present a bunch of numbers in an Excel Sheet. To bring clarity to the sheet, I follow the etiquette of rounding the numbers to (MS-Excel's default) two decimal places.

The problem this brings is that when you sum of list of numbers and round everything to, say, two decimal places, the numbers doesn't always add up and someone or the other points out that the numbers are not adding up in the last decimal place. I got tired of pointing out that it is a result of rounding.

At first, I thought rounding with more decimal places might bring some resolution to this situation. But then, it became apparent (and intuitively obvious) that no matter the rounding, the problem persists. Being an amateur mathematician at heart, I wanted to solve this problem mathematically.

Let $X_1,X_2,\cdots,X_n$ be standard uniform random variables $U(0,1)$ and $Y=X_1+X_2+\cdots+X_n$. We are then interested in the probability that $[Y]=[X_1]+[X_2]+\cdots+[X_n]$ where $[x]$ denotes the value of $x$ rounded to $r$ places after the decimal point.

As I started thinking about this problem, the only progress I could make is that $r$ doesn't affect the probability and we could as well consider it to be $0$. Frustrated, I posted in my puzzle group and Atul Gupta from the group was able to quickly solve this.

The key idea is to use the periodicity of the 'round' function. That is, $[n+x]=n+[x]$ for integer $n$ and real $x$.

The way to solve this problem then is transform the $X$'s such that $X_k=B_k+Z_k$ where $B_k$'s are either $0$ or $1$ and $Z_k \sim U(-0.5,0.5)$.

Because we take $r=0$, we can see that $[Z_k]=0$ for all $k$. Therefore, $[X_k]=[B_k+Z_k]=B_k+[Z_k]=B_k$ which means $[X_1]+[X_2]+\cdots+[X_n]=B_1+B_2+\cdots+B_n=\text{(say)}B$.

On the other hand,

$\begin{align} Y&=X_1+X_2+\cdots+X_n\\ &=B_1+Z_1+B_2+Z_2+\cdots+B_n+Z_n\\ &=B+Z_1+Z_2+\cdots+Z_n \end{align}$

Using the periodicity of 'round' function, $[Y]=B+[Z_1+Z_2+\cdots+Z_n]$.

Hence, the condition we are interested in simplifies to $[Z_1+Z_2+\cdots+Z_n]=0$ which is equivalent to saying $-0.5 \leq Z_1+Z_2+\cdots+Z_n \leq 0.5$

Transforming the $Z_k$'s to standard Uniform variables $U_k=Z_k+1/2$, this becomes $(n-1)/2 \leq U_1+U_2+\cdots+U_n \leq (n+1)/2$

It is well known that the sum of $n$ standard uniform random variables defines the Irwin-Hall distribution (also known as Uniform Sum distribution) $IH_n$.

Irwin-Hall distribution and Eulerian numbers are closely related. For example, this should already be apparent from the PDF of $IH_n$ in the Wikipedia page and the integral identity of Eulerian Numbers.

While the CDF of $IH_n$ also nearly resembles Eulerian numbers, it is not quite there. To make Eulerian numbers more useful to our cause, we slightly modify the formula given in Wiki page.

$\displaystyle A(n,x)=\sum_{i=0}^{x+1}(-1)^i\binom{n+1}{i}(x+1-i)^n$

where $x$ can be a real number. With this definition, we have the following useful result.

$\displaystyle\mathbb{P}(t \leq IH_n \leq t+1)=\frac{1}{n!}A(n,t)$

Therefore, we finally have what we were looking for.

$\displaystyle \mathbb{P}([Y]=[X_1]+[X_2]+\cdots+[X_n])=\mathbb{P}\left(\frac{n-1}{2} \leq \sum_{k=1}^nU_k \leq \frac{n+1}{2}\right)=\frac{1}{n!}A(n,(n-1)/2)$

For large $n$, we can use the normal approximation to show that 

$\displaystyle \mathbb{P}([Y]=[X_1]+[X_2]+\cdots+[X_n]) \approx \sqrt{\frac{2}{\pi}}\sin\left(\sqrt{\frac{3}{n}}\right)$

which shows the probability of the sum matching only worsens as $n$ increases.

On an unrelated note, I also noticed that

$\displaystyle \mathbb{E}(IH_n|t \leq IH_n \leq t+1)=(t+1)\frac{n}{n+1}$

Given the simple expression for the expected value, I suspect there is a very nice reason for this but am yet to see it. In case you can, please do share it with us in the comments.

Hope you enjoyed this. See ya soon.

Until then, Yours Aye

Me