Saturday, December 30, 2023

Is $e$ less than $(1+\sqrt{21})/2$?

One of my friends recently posed the problem of finding the expected number of uniform random variables needed to get a sum greater than $1$ without using calculus.

Even though I've seen this problem before, I was not able to recollect it, and the calculus proof was all I could come up with until he eventually showed me the solution. But this time, something different struck me.

The solution to this problem uses the fact that

$\displaystyle \mathbb{P}(N>n)=\frac{1}{n!}$

where $N$ is the random variable denoting the number of uniform random variables needed. But given this, we can do more than compute the expected value. We know that

$\displaystyle \mathbb{E}(N)=\sum_{n=0}^\infty\mathbb{P}(N>n)=\sum_{n=0}^\infty \frac{1}{n!}=e$

which follows from the definition of Euler's number.
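As a sanity check, here is a quick Monte Carlo sketch in Python (the seed, trial count, and function name are my own arbitrary choices) that estimates $\mathbb{E}(N)$ directly:

```python
import random

def draws_to_exceed_one(rng):
    """Count how many U(0,1) draws are needed for the running sum to exceed 1."""
    total, n = 0.0, 0
    while total <= 1.0:
        total += rng.random()
        n += 1
    return n

rng = random.Random(0)
trials = 200_000
avg = sum(draws_to_exceed_one(rng) for _ in range(trials)) / trials
print(avg)  # should be close to e = 2.71828...
```

With 200,000 trials the standard error is about $0.002$, so the estimate lands comfortably near $e$.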

But suppose we don't know the exact value of $e$. With elementary methods, we can show that $2<e<3$.

In the problem that we were discussing, we can use the expression for probability to find $\mathbb{V}(N)$ as well. We know that

$\displaystyle \mathbb{E}(N^2)=\sum_{n=0}^\infty(2n+1)\mathbb{P}(N>n)=\sum_{n=0}^\infty \frac{2n+1}{n!}=3e$

Therefore, $\mathbb{V}(N)=\mathbb{E}(N^2)-\mathbb{E}(N)^2=3e-e^2=e(3-e)$

As the variance of any random variable is $\geq 0$ and we know that $e>0$, this shows us that $e\leq3$, which is something we already knew. But this gives us a nice idea.

Let $N_{(2)}$ be the random variable denoting the number of uniform random variables needed to get a sum greater than $2$. The Uniform Sum (Irwin-Hall) distribution shows us that

$\displaystyle \mathbb{P}(N_{(2)}=n)=\frac{(n-2)(2^{n-1}-n)}{n!}$

With this, we can see that

$\displaystyle \mathbb{E}(N_{(2)})=e(e-1)$ and $\displaystyle \mathbb{E}(N_{(2)}^2)=5e(e-1)$

Therefore, $\displaystyle \mathbb{V}(N_{(2)})=e(e-1)(5-e^2+e)$
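The two moments above are easy to check numerically by truncating the series $\sum_n \mathbb{P}(N_{(2)}>n)$ and $\sum_n (2n+1)\,\mathbb{P}(N_{(2)}>n)$, using $\mathbb{P}(N_{(2)}>n)=(2^n-n)/n!$ from the Irwin-Hall CDF. A quick Python sketch:

```python
import math

# Truncated series for E(N_(2)) = sum P(N_(2) > n) and
# E(N_(2)^2) = sum (2n+1) P(N_(2) > n), with P(N_(2) > n) = (2^n - n)/n!.
E1 = sum((2**n - n) / math.factorial(n) for n in range(60))
E2 = sum((2*n + 1) * (2**n - n) / math.factorial(n) for n in range(60))
e = math.e
print(E1, e * (e - 1))      # both ≈ 4.670774
print(E2, 5 * e * (e - 1))  # both ≈ 23.353871
```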

Using the same idea as before, we see that $5-e^2+e \geq 0$ which shows us

$\displaystyle \frac{1}{2}(1-\sqrt{21}) \leq e \leq \frac{1}{2}(1+\sqrt{21})$

Numerically, this shows that $e \leq 2.791288$!!

As you may have already guessed, this could be extended to higher moments and/or sums greater than $3$ and so on. But the resulting expressions give higher-degree polynomials which, even though they give better estimates for $e$, cannot be easily solved in closed form. For example, using

$\displaystyle \mathbb{E}(N_{(3)})=\frac{1}{2}(2e^3-4e^2+e)$ and $\displaystyle \mathbb{E}(N_{(3)}^2)=\frac{7}{2}(2e^3-4e^2+e)$

Requiring the resulting variance to be non-negative, we see that $2e^3-4e^2+e-14 \leq 0$. Solving this with Wolfram Alpha, we see that

$\displaystyle e < \frac{1}{6}(4+\sqrt[3]{784+18\sqrt{1894}}+\sqrt[3]{784-18\sqrt{1894}}) \approx 2.74615$
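If we don't want to lean on Wolfram Alpha for the numerical value, a simple bisection recovers the same root; a Python sketch (the bracket $[2,3]$ works because $f(2)=-12<0$ and $f(3)=7>0$):

```python
def f(x):
    # the cubic forced to be <= 0 at x = e by the non-negativity of the variance
    return 2*x**3 - 4*x**2 + x - 14

# f(2) < 0 < f(3), so bisection on [2, 3] converges to the unique root there
lo, hi = 2.0, 3.0
for _ in range(60):
    mid = (lo + hi) / 2
    if f(mid) < 0:
        lo = mid
    else:
        hi = mid
print(lo)  # ≈ 2.74615
```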

Hope you enjoyed this. See ya soon.

Until then, Yours Aye

Me

A nice property on the recurrence of Eulerian Numbers

I was recently looking at Eulerian numbers for a problem and I noticed a nice, and probably lesser-known, property of their recursive definition. In summary, it seemed like an exercise problem out of a combinatorics textbook, except I couldn't find one.

Eulerian numbers $A(n,k)$ count the number of permutations of the numbers from $1$ to $n$ in which exactly $k$ elements are smaller than the previous element. In other words, they count the number of permutations with exactly $k$ descents.

With this interpretation, we can set up the following recurrence, as shown in Proposition 3 of this lecture.

$A(n,k)=(n-k)A(n-1,k-1)+(k+1)A(n-1,k)$

The first term here corresponds to the number of permutations in which the element $n$ contributes to the descent count and the second term to those where it doesn't.

The problem that I was interested in was that of counting the number of permutations with $k$ descents that end with a descent.

After some thought, it seemed easy to setup a recurrence for this count as well.

Let $A_d(n,k)$ be the number of permutations of $n$ elements with exactly $k$ descents and end with a descent. Let $A_a(n,k)$ be the same count but those that end with an ascent. Then,

$A_d(n,k)=k A_d(n-1,k)+A_a(n-1,k-1)+(n-k)A_d(n-1,k-1)$

The first term here corresponds to inserting $n$ into an existing descent. The second term is where we insert $n$ in the second-to-last position, thereby simultaneously making the permutation end with a descent and increasing the descent count.

The last one is where we take a permutation of $n-1$ elements. There are $n-1$ gaps where we can insert $n$ (we don't want $n$ in the last position, as that would make the permutation end with an ascent). Of the $n-1$ gaps, $k-1$ are already descents, which means there are $n-k$ gaps that correspond to an ascent. Inserting $n$ there increases the descent count by one, giving us the requisite permutation.

Similarly, we can write the following recurrence for $A_a(n,k)$:

$A_a(n,k)=A_d(n-1,k)+(n-k-1)A_a(n-1,k-1)+(k+1)A_a(n-1,k)$

Using the fact that $A(n,k)=A_a(n,k)+A_d(n,k)$ and some simple manipulations, we get

$A_d(n,k)=(n-k)A(n-1,k-1)$ and $A_a(n,k)=(k+1)A(n-1,k)$

I hope you can now see why this surprised me. The recurrence usually written for Eulerian numbers also has another, simpler combinatorial interpretation without any change!!

That is, the number of permutations with $k$ descents in which $n$ contributes to the descent count is equinumerous with the permutations with $k$ descents that end with a descent.
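The claim is easy to verify by brute force for small $n$; here is a Python sketch (function names are mine) that checks both closed forms against the definitions, which is feasible up to about $n=6$:

```python
from itertools import permutations

def descents(p):
    """Number of positions i with p[i] > p[i+1]."""
    return sum(p[i] > p[i + 1] for i in range(len(p) - 1))

def A(n, k):
    """Eulerian number: permutations of 1..n with exactly k descents."""
    return sum(descents(p) == k for p in permutations(range(1, n + 1)))

def A_d(n, k):
    """Permutations of 1..n with exactly k descents that end with a descent."""
    return sum(descents(p) == k and p[-2] > p[-1]
               for p in permutations(range(1, n + 1)))

for n in range(2, 7):
    for k in range(n):
        assert A_d(n, k) == (n - k) * A(n - 1, k - 1)
        assert A(n, k) - A_d(n, k) == (k + 1) * A(n - 1, k)
print("verified for n up to 6")
```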

Hope you enjoyed this. See ya soon.

Until then, Yours Aye

Me

Sunday, December 24, 2023

The Rounding Error Conundrum

Like all managers, I too, from time to time, find myself in a position to present a bunch of numbers in an Excel sheet. To bring clarity to the sheet, I follow the etiquette of rounding the numbers to (MS-Excel's default) two decimal places.

The problem this brings is that when you sum a list of numbers and round everything to, say, two decimal places, the numbers don't always add up, and someone or other points out that the numbers are not adding up in the last decimal place. I got tired of pointing out that it is a result of rounding.

At first, I thought rounding with more decimal places might bring some resolution to this situation. But then, it became apparent (and intuitively obvious) that no matter the rounding, the problem persists. Being an amateur mathematician at heart, I wanted to solve this problem mathematically.

Let $X_1,X_2,\cdots,X_n$ be standard uniform random variables $U(0,1)$ and $Y=X_1+X_2+\cdots+X_n$. We are then interested in the probability that $[Y]=[X_1]+[X_2]+\cdots+[X_n]$ where $[x]$ denotes the value of $x$ rounded to $r$ places after the decimal point.

As I started thinking about this problem, the only progress I could make was that $r$ doesn't affect the probability, so we might as well take it to be $0$. Frustrated, I posted it in my puzzle group, and Atul Gupta from the group was able to quickly solve it.

The key idea is to use the translation property of the 'round' function. That is, $[n+x]=n+[x]$ for integer $n$ and real $x$.

The way to solve this problem, then, is to write $X_k=B_k+Z_k$, where the $B_k$'s are either $0$ or $1$ and $Z_k \sim U(-0.5,0.5)$.

Because we take $r=0$, we can see that $[Z_k]=0$ for all $k$. Therefore, $[X_k]=[B_k+Z_k]=B_k+[Z_k]=B_k$ which means $[X_1]+[X_2]+\cdots+[X_n]=B_1+B_2+\cdots+B_n=\text{(say)}B$.

On the other hand,

$\begin{align} Y&=X_1+X_2+\cdots+X_n\\ &=B_1+Z_1+B_2+Z_2+\cdots+B_n+Z_n\\ &=B+Z_1+Z_2+\cdots+Z_n \end{align}$

Using the translation property of the 'round' function, $[Y]=B+[Z_1+Z_2+\cdots+Z_n]$.

Hence, the condition we are interested in simplifies to $[Z_1+Z_2+\cdots+Z_n]=0$ which is equivalent to saying $-0.5 \leq Z_1+Z_2+\cdots+Z_n \leq 0.5$

Transforming the $Z_k$'s to standard Uniform variables $U_k=Z_k+1/2$, this becomes $(n-1)/2 \leq U_1+U_2+\cdots+U_n \leq (n+1)/2$

It is well known that the sum of $n$ standard uniform random variables defines the Irwin-Hall distribution (also known as Uniform Sum distribution) $IH_n$.

Irwin-Hall distribution and Eulerian numbers are closely related. For example, this should already be apparent from the PDF of $IH_n$ in the Wikipedia page and the integral identity of Eulerian Numbers.

While the CDF of $IH_n$ also nearly resembles Eulerian numbers, it is not quite there. To make Eulerian numbers more useful to our cause, we slightly modify the formula given in Wiki page.

$\displaystyle A(n,x)=\sum_{i=0}^{\lfloor x+1 \rfloor}(-1)^i\binom{n+1}{i}(x+1-i)^n$

where $x$ can be a real number. With this definition, we have the following useful result.

$\displaystyle\mathbb{P}(t \leq IH_n \leq t+1)=\frac{1}{n!}A(n,t)$

Therefore, we finally have what we were looking for.

$\displaystyle \mathbb{P}([Y]=[X_1]+[X_2]+\cdots+[X_n])=\mathbb{P}\left(\frac{n-1}{2} \leq \sum_{k=1}^nU_k \leq \frac{n+1}{2}\right)=\frac{1}{n!}A(n,(n-1)/2)$

For large $n$, we can use the normal approximation to show that 

$\displaystyle \mathbb{P}([Y]=[X_1]+[X_2]+\cdots+[X_n]) \approx \sqrt{\frac{2}{\pi}}\sin\left(\sqrt{\frac{3}{n}}\right)$

which shows the probability of the sum matching only worsens as $n$ increases.
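Both the exact formula and the approximation are easy to check in Python; since $x=(n-1)/2$ is rational, the real-argument $A(n,x)$ can be evaluated exactly with rational arithmetic (the function name is mine):

```python
from fractions import Fraction
from math import comb, factorial, pi, sin, sqrt

def match_probability(n):
    """P([Y] = [X1]+...+[Xn]) = A(n, (n-1)/2) / n!, evaluated exactly."""
    x = Fraction(n - 1, 2)
    upper = int(x + 1)  # floor(x + 1), valid since x >= 0
    A = sum((-1)**i * comb(n + 1, i) * (x + 1 - i)**n for i in range(upper + 1))
    return A / factorial(n)

for n in (1, 2, 10, 50):
    exact = float(match_probability(n))
    approx = sqrt(2 / pi) * sin(sqrt(3 / n))
    print(n, exact, approx)
```

For $n=2$ this gives the easily checked value $3/4$, and for large $n$ the normal approximation matches the exact value closely.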

On an unrelated note, I also noticed that

$\displaystyle \mathbb{E}(IH_n|t \leq IH_n \leq t+1)=(t+1)\frac{n}{n+1}$

Given the simple expression for the expected value, I suspect there is a very nice reason for this but am yet to see it. In case you can, please do share it with us in the comments.

Hope you enjoyed this. See ya soon.

Until then, Yours Aye

Me

Friday, October 27, 2023

Generalising a famous Problem in Probability

As always, when I was looking for newer problems in probability, I came across the famous problem of determining the probability that $n$ points chosen uniformly randomly on a circle of unit circumference all lie in one semicircle.

This problem is quite famous, with a neat solution as posted in the SE post linked above. However, I was intrigued by the choice of a semicircle in the question. I mean, what if we choose a quarter-circle? Or a three-fourths circular arc? Or any arc of length $x$ from that circle, for that matter?

It should be obvious that the argument given in the SE post applies verbatim whenever $x \leq 1/2$. It's the other case that makes for an interesting question.

This complication arises primarily because the $x \leq 1/2$ case exploits the fact that the events 'all points lie on the $x$-circular arc starting at a given chosen point' are mutually exclusive across the chosen points (which in fact contributes the factor of $n$ in the numerator). And this exclusivity breaks when $x>1/2$.

A better way to solve this problem is to realize that if the largest gap between two consecutive chosen points is greater than $1-x$, the selected points can all be covered by a circular arc of length $x$.

This is great because it takes us into the realm of order statistics on a circle, a problem that has been studied intensively. There was so much literature on this that I did not even find the motivation to work it out myself.

For example, in On the Lengths of the Pieces of a Stick Broken at Random, Lars Holst shows, among other results, that

$\displaystyle\mathbb{P}(\text{largest spacing created by }n\text{ points chosen on a circle} \leq x)=\sum_{k=0}^n (-1)^k \binom{n}{k}(1-kx)_+^{n-1}$

where $x_+=\text{max}(x,0)$.
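Holst's formula is easy to check against a direct simulation; a Python sketch (the seed, sample size, and test values $n=5$, $x=0.4$ are arbitrary choices of mine):

```python
import random
from math import comb

def p_largest_spacing_le(n, x):
    """Holst's formula for P(largest of the n circular spacings <= x)."""
    return sum((-1)**k * comb(n, k) * max(1 - k * x, 0.0)**(n - 1)
               for k in range(n + 1))

def simulate(n, x, trials, rng):
    hits = 0
    for _ in range(trials):
        pts = sorted(rng.random() for _ in range(n))
        # n gaps on the circle: n-1 interior gaps plus the wrap-around gap
        gaps = [b - a for a, b in zip(pts, pts[1:])] + [1 - pts[-1] + pts[0]]
        hits += max(gaps) <= x
    return hits / trials

n, x = 5, 0.4
print(p_largest_spacing_le(n, x))                 # ≈ 0.368
print(simulate(n, x, 100_000, random.Random(1)))  # should agree closely
```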

Therefore,

$\displaystyle \mathbb{P}(n \text{ points can be covered by }x \text{-circle})=\sum_{k=1}^n (-1)^{k-1} \binom{n}{k}(1-k(1-x))_+^{n-1}$

We now turn our attention towards the following problem: points are selected one at a time on a circle of unit circumference till we get a configuration of points which cannot be covered by a circular arc of length $x$. What is the expected number of points that will be chosen, given $x$?

If we let $N$ be the random variable denoting the number of points needed, then

$\begin{align}\displaystyle\mathbb{E}(N|x) &= \sum_{n=0}^\infty\mathbb{P}(N>n)\\ &= 1+\sum_{n=1}^\infty\mathbb{P}(n\text{ points lie on circular segment of length }x) \\ &= 1+\sum_{n=1}^\infty\sum_{k=1}^n (-1)^{k-1} \binom{n}{k}(1-k(1-x))_+^{n-1}\\ \end{align}$

With WA's help, we can simplify this expression to give

$\displaystyle \mathbb{E}(N|x)=1+\frac{1}{(1-x)^2}\sum_{k=1}^{\lfloor 1 / (1-x) \rfloor}\frac{1}{k^2}\left(1-\frac{1}{k(1-x)}\right)^{k-1}$


Hope you enjoyed this. See ya soon.

Until then, Yours Aye

Me

Saturday, July 15, 2023

Vechtmann's theorem and more on Bernoulli Lemniscate

The Bernoulli Lemniscate is one of the most beautiful curves in mathematics. The curve, which in some sense kick-started the study of elliptic functions, has a lot of interesting properties, some of which will be the subject of this post.

The Bernoulli Lemniscate can be defined in at least two ways. It is the locus of a point $P$ such that the product of its distances from two fixed points, say $P_1$ and $P_2$, stays constant. Or, as the inverse of the hyperbola $x^2-y^2=1$ w.r.t the unit circle, which is the definition we will be using in this post.

$DG \cdot DF = \text{constant}$
$F \equiv (1/\sqrt{2},0)$, $G\equiv (-1/\sqrt{2},0)$

One of the first things that we might want to do is to construct a tangent at a given point on the Lemniscate.

A useful property of the hyperbola that comes in handy here is the fact that the line from the origin drawn perpendicular to a tangent of the hyperbola intersects the hyperbola again at a point which is the reflection of the tangent point w.r.t the $x$-axis. Also, the point of intersection of the tangent and the perpendicular line is the inverse of the reflected point w.r.t the unit circle.

$I$ and $D'$ are reflections w.r.t $x$ axis
$I$ and $H$ are reflections w.r.t the unit circle

The inverse of the tangent line shown above w.r.t the unit circle becomes a circle passing through the origin. Because the tangent line and the blue line intersect at $H$, their corresponding inverses intersect at $I$ - which is the inverse of $H$ - showing that the inverted circle also passes through $I$. Note that $BI$ is a diameter of the circle.

 


For a given point $D$ on the lemniscate, we can extend the radial line $BD$ so that it intersects the hyperbola at $D'$. As the green line is the tangent to the hyperbola at $D'$, we know that the green circle must be tangent to the lemniscate at $D$. As $BD$ is a chord of the circle, the point of intersection of its perpendicular bisector (not shown in the picture below) with the diameter $BI$ gives the center $E$ of the circle.

As the lemniscate and the green circle are tangent at $D$, they share a common tangent at that point. With the circle's center known, it is now trivial to draw both the tangent and normal to the lemniscate at $D$.

In summary, to draw the tangent at point $D$ on the lemniscate, extend the radial line to find $D'$ on the hyperbola. Reflect it w.r.t the $x$-axis to get $I$. We can then locate $E$ as the intersection of $BI$ and the perpendicular bisector of $BD$. Then $ED$ is the normal to the lemniscate at $D$ from which the tangent follows trivially.

Let $D \equiv (r,\theta)$ be the polar coordinates in the following figure. Because $D$ and $D'$ are inverses w.r.t the unit circle, we have $BD'=1/r$. As $D'$ and $I$ are reflections, we have $BI=BD'=1/r$ and $\angle D'BI=2\theta$.


Because $BI$ is the diameter of the circle, we know that $BD \perp DI$. Therefore, using this right-angled triangle, we can see that $\cos 2\theta=\frac{BD}{BI}=\frac{r}{1/r}=r^2$, thereby showing the polar equation of the lemniscate is $r^2=\cos 2\theta$.

Also, as $DE$ (which is a normal to the lemniscate at $D$) and $EB$ are radii of the same circle, we have $\angle BDE=2\theta$, showing that the angle between the radial line and the normal is twice the radial angle. This result is called Vechtmann's theorem and is usually proved using calculus.

Vechtmann theorem shows that the normal to the lemniscate makes an angle of $3\theta$ with the $x$-axis i.e. three times the radial angle. Therefore, we can see that the angle between two normals (and hence two tangents) at two given points is three times the difference between the radial angle of those points.

Time to go infinitesimal. Consider two infinitesimally close points $D \equiv (r,\theta)$ and $J$. Let $DM$ be a line perpendicular to the radial line $BJ$.

From the infinitesimal right $\triangle DMN$ in the figure above, we easily see that $DM \approx r d\theta$, $DN \approx ds$ and $MN \approx dr$. Also, because the angle between the radial line and the normal at $D$ is $2\theta$, we have $\angle MDN=2\theta$. Therefore,

$\displaystyle \cos2\theta=\frac{DM}{DN}=\frac{r d\theta}{ds} \implies r \cdot ds=d\theta$ using the polar equation of the lemniscate.

The radius of curvature $R$ for a curve is given by the relation $ds=R\cdot d\varphi$ where $\varphi$ is the tangential angle (the angle a tangent makes with a common reference line). Therefore, change in tangential angle between two points is just the angle between the tangents.

We just saw that for the lemniscate, the tangential angle is thrice the polar angle. Therefore, $d\varphi=3\,d\theta$. The above three relations clearly show that the curvature $\kappa$, the inverse of the radius of curvature $R$, is $3r$.

Going back again to the infinitesimal triangle, we have

$\displaystyle \sin 2\theta=\frac{dr}{ds} \implies ds=\frac{dr}{\sqrt{1-r^4}}$

again using the polar equation of the lemniscate. It is this relation that made the lemniscate an extremely important, and the first, curve to be studied in the context of elliptic integrals.

If we think about it, the idea that makes this all work is the fact that polar equation of the lemniscate is of a very specific form. In fact, we can generalize the same to see that curves defined by the polar equation

$r^n=\cos n\theta$

are all amenable to a similar approach with which we can see that their tangential angles is $(n+1)\theta$, their curvature is $(n+1)r^{n-1}$ and their differential arc length is

$\displaystyle ds=\frac{dr}{\sqrt{1-r^{2n}}}$

These curves are called Sinusoidal spirals, which include circles, parabolas, and hyperbolas among others as special cases, and have many other interesting properties as well, which I leave for the readers to explore.
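These claims can be checked numerically against the standard polar curvature formula $\kappa = |r^2+2r'^2-r r''|/(r^2+r'^2)^{3/2}$, with the derivatives taken by finite differences; a Python sketch (function names are mine):

```python
from math import cos

def kappa_numeric(n, theta, h=1e-4):
    """Curvature of r^n = cos(n*theta) via the polar curvature formula,
    using central differences for r' and r''."""
    r = lambda t: cos(n * t) ** (1.0 / n)
    r0 = r(theta)
    r1 = (r(theta + h) - r(theta - h)) / (2 * h)
    r2 = (r(theta + h) - 2 * r0 + r(theta - h)) / h**2
    return abs(r0**2 + 2 * r1**2 - r0 * r2) / (r0**2 + r1**2) ** 1.5

for n in (1, 2, 3):
    theta = 0.3 / n  # stay where cos(n*theta) > 0
    r0 = cos(n * theta) ** (1.0 / n)
    # numeric curvature vs the claimed (n+1) r^(n-1)
    print(n, kappa_numeric(n, theta), (n + 1) * r0 ** (n - 1))
```

The $n=1$ case is the circle $r=\cos\theta$ of radius $1/2$, whose curvature is exactly $2$, as the formula predicts.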

Until then
Yours Aye
Me

Thursday, June 29, 2023

'Leftmost' random points on Circles and Spheres

As usual, I was randomly (pun unintended) thinking about some problems in Geometric Probability and thought I would share some of those.

Consider a unit circle centered at the origin. If we select $n$ points uniformly randomly in this circle, how far 'left' will the leftmost point be?

Mathematically, I'm looking for the expected value of the abscissa of the point with the 'minimum' x-coordinate. This way, this becomes an order statistic problem on a circle.

If we parametrize the chords perpendicular to the x-axis by the central angle $2\theta$ they subtend, then the probability that a given point lies to the right of the chord is

$\displaystyle\mathbb{P}(\text{point lies on the right side of chord})=\frac{\theta-\sin\theta\cos\theta}{\pi}$

In other words, the above is the complementary CDF $\bar{F}(x)$ for a given value of $x=\cos\theta$.

Now, the expected value of the minimum of $n$ i.i.d. random variables $X_1,X_2,\cdots,X_n$, each with complementary CDF $\bar{F}(x)$, is given by

$\displaystyle\mathbb{E}(\text{min.})=\int x \,d(1-\bar{F}(x)^n)=x(1-\bar{F}(x)^n)-\int (1-\bar{F}(x)^n)\,dx$

The expected value of the min. abscissa is given by

$\displaystyle\mathbb{E}(\text{min. abscissa})=\int_\pi^0 \cos\theta \,d(1-\bar{F}(\cos\theta)^n)$

Therefore, in our case, we have

$\begin{align}\displaystyle\mathbb{E}(\text{min.})&=\cos 0 (1-\bar{F}(\cos 0)^n)-\cos\pi(1-\bar{F}(\cos\pi)^n)-\int_\pi^0 \left(1-\bar{F}(\cos\theta)^n\right)\,d(\cos\theta)\\ &= 1 - \int_0^\pi \sin\theta(1-\bar{F}(\cos\theta)^n)\,d\theta \\ &= \int_0^\pi \sin\theta\left(\frac{\theta-\sin\theta\cos\theta}{\pi}\right)^n\,d\theta -1 \end{align}$

From this, we can see

$\displaystyle \mathbb{E}(\text{min. abscissa among two points})=-\frac{128}{45\pi^2} \approx -0.2882$ using WA and

$\displaystyle \mathbb{E}(\text{min. abscissa among three points})=-\frac{64}{15\pi^2} \approx -0.4323$ using WA with

further values having slightly more complicated expressions. However, we can get somewhat decent approximations by noting that the graph of $(\theta-\sin\theta\cos\theta)/\pi$ closely matches that of $\sin^2(\theta/2)$ over $[0,\pi]$.


Therefore,

$\displaystyle \mathbb{E}(\text{min. abscissa of }n\text{ points}) \approx \int_0^\pi \sin\theta \sin^{2n}(\theta/2)\,d\theta -1 = \frac{1-n}{1+n}$

which unfortunately only gets worse as $n$ gets larger.
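The exact values above can be reproduced by straightforward numerical quadrature of the integral; a Python sketch using Simpson's rule (function names are mine):

```python
from math import sin, cos, pi

def f_bar(theta):
    """Complementary CDF at x = cos(theta): area right of the chord over pi."""
    return (theta - sin(theta) * cos(theta)) / pi

def expected_min_abscissa(n, steps=10_000):
    """Simpson's rule for E = int_0^pi sin(t) f_bar(t)^n dt - 1 (steps even)."""
    h = pi / steps
    g = lambda t: sin(t) * f_bar(t) ** n
    s = g(0) + g(pi)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * g(i * h)
    return s * h / 3 - 1

print(expected_min_abscissa(2), -128 / (45 * pi**2))  # both ≈ -0.28820
print(expected_min_abscissa(3), -64 / (15 * pi**2))   # both ≈ -0.43230
```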

A very similar approach can be used in the case of a sphere as well, just by noting that

$\displaystyle \bar{F}(x)=\frac{1}{4}(2+\cos\theta)(1-\cos\theta)^2$

for $x=\cos\theta$ using the volume of a spherical cap.

Therefore,

$\displaystyle\mathbb{E}(\text{min. abscissa of }n\text{ points})=\int_0^\pi \sin\theta \left(\frac{(2+\cos\theta)(1-\cos\theta)^2}{4}\right)^n\,d\theta - 1=-1+\int_{-1}^1 \left(\frac{(2+x)(1-x)^2}{4}\right)^n\,dx$

Because of the polynomial integrand, the expected values in the case of a sphere all turn out to be rational. In fact, by expanding the integral range (and adjusting later), we can apply Laplace's method to show that

$\displaystyle \mathbb{E}(\text{min. abscissa of }n\text{ points}) \approx \sqrt{\frac{\pi}{3n}}-1$

which gets better for large $n$.
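To see the rationality concretely, we can carry out the polynomial integration exactly with rational arithmetic; a Python sketch (helper names are mine):

```python
from fractions import Fraction
from math import pi, sqrt

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists (constant term first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def sphere_min_abscissa(n):
    """E(min abscissa of n points) = int_{-1}^{1} ((2+x)(1-x)^2/4)^n dx - 1."""
    base = poly_mul([Fraction(2), Fraction(1)], poly_mul([1, -1], [1, -1]))
    base = [c / 4 for c in base]  # coefficients of (2 + x)(1 - x)^2 / 4
    p = [Fraction(1)]
    for _ in range(n):
        p = poly_mul(p, base)
    # integral of x^k over [-1, 1] is 0 for odd k and 2/(k+1) for even k
    integral = sum(c * Fraction(2, k + 1) for k, c in enumerate(p) if k % 2 == 0)
    return integral - 1

print(sphere_min_abscissa(2))                                   # -9/35
print(float(sphere_min_abscissa(20)), sqrt(pi / (3 * 20)) - 1)  # exact vs Laplace
```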

In fact, using the power series of the log of the integrand at its maximum $x=-1$ and more terms in Laplace's method, like we did here, we can show that

$\displaystyle \int_{-2}^1 \left(\frac{(2+x)(1-x)^2}{4}\right)^n\,dx \approx \sqrt{\frac{4\pi}{3n}}\left(1-\frac{17}{72n}\right)$

which can be used to give a slightly better approximation.

More generally, by this point it should be clear that for the $d$-dimensional case, the expected value can be written as

$\displaystyle \mathbb{E}(\text{min. abscissa of }n\text{ points in }d\text{ dimensions})=\int_{-1}^1 \bar{F}_d(x)^n\,dx-1$

where $\bar{F}_d(x)$ is the ratio of the volume of the $d$-ball cap to the volume of the $d$-ball. By symmetry, the expected value for the 'rightmost' point will be the negative of this.

An associated question could be how far is the 'leftmost' point from the center of the $d$-ball?

Let $L_d$ denote the random variable we are interested in. In the circle case, we can construct the density function of $L_2$ by proceeding as follows:

First, any of the $n$ points could be the 'leftmost' point.

Second, the probability that the point lies at a distance $r$ from the center is $2rdr$.

Third, we note that this point will be uniformly distributed on the circle of radius $r$. The probability that the line joining the point to the origin makes an angle $\theta$ with the $x$-axis is $d\theta / \pi$ (considering only the upper half of the circle).

Finally, if we want this point to have the min. abscissa, the remaining points should have abscissae greater than $r\cos\theta$.

Therefore, the pdf of $L_2$ would be

$\displaystyle f(l_2) = n \cdot 2r dr \cdot \frac{d\theta}{\pi}\cdot \left(\frac{\cos^{-1}(r\cos\theta)-r\cos\theta\sqrt{1-r^2\cos^2\theta}}{\pi}\right)^{n-1}$

with $0 \le r \le 1$ and $0 \le \theta \le \pi$

We can verify that the integral of this pdf over the interval is indeed $1$ using WA.

We then have

$\displaystyle \mathbb{E}(L_2)=\int_0^1\int_0^\pi r\cdot n \cdot 2r \cdot \frac{1}{\pi}\cdot \left(\frac{\cos^{-1}(r\cos\theta)-r\cos\theta\sqrt{1-r^2\cos^2\theta}}{\pi}\right)^{n-1}\,d\theta\,dr$

Using WA, we can see

$\displaystyle \mathbb{E}(L_2\text{ with 2 points})=\frac{2}{3}$ and $\displaystyle \mathbb{E}(L_2\text{ with 3 points})=\frac{4200\pi^2+3360\log{2}-457}{6300\pi^2}\approx 0.69677$

We can do something similar in the Sphere case as well. The density function of $L_3$ would be

$\displaystyle f(l_3) = n \cdot 3r^2 dr \cdot \frac{2\pi r\sin\theta\cdot r d\theta}{4\pi r^2}\cdot \left(\frac{(2+r\cos\theta)(1-r\cos\theta)^2}{4}\right)^{n-1}$

using the idea that the first point will be uniformly distributed on the surface of a sphere with radius $r$ which could then be sliced into rings (as shown here) parametrized by $\theta$.

Again using WA, we can see

$\displaystyle \mathbb{E}(L_3\text{ with 2 points})=\frac{3}{4}$ and $\displaystyle \mathbb{E}(L_3\text{ with 3 points})=\frac{1719}{2240}\approx 0.7674$

Hope you enjoyed this. See ya later.

Until then
Yours Aye
Me

Saturday, June 17, 2023

Certain Inverse Sums involving Binomial Coefficients

Consider the PMF of the binomial distribution, denoted here by $f(p,k,n)$. We know what happens to this function when it is summed over $k$ or integrated over $p$ in the appropriate interval. That is,

$\displaystyle \sum_{k=0}^n\binom{n}{k}p^kq^{n-k}=1$ and $\displaystyle \int_0^1 \binom{n}{k}p^kq^{n-k} \,dp=\frac{1}{n+1}$

The first follows from the story of the binomial and the second from Bayes' billiard argument. But what is the sum of this PMF over all possible $n$? If we let $S$ denote the sum, we want

$\displaystyle S=\sum_{n=k}^\infty \binom{n}{k}p^kq^{n-k}$

This is not too hard. If we multiply the above by $p$, we can see that the summand becomes the PMF of the Negative binomial distribution and must therefore sum to $1$ by definition.  That clearly shows $S=1/p$.
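A quick numerical check of $S=1/p$, truncating the sum (the parameter values are arbitrary choices of mine):

```python
from math import comb

# Truncated sum of C(n, k) p^k q^(n-k) over n >= k; should approach 1/p.
p, q, k = 0.3, 0.7, 4
S = sum(comb(n, k) * p**k * q**(n - k) for n in range(k, 500))
print(S, 1 / p)  # both ≈ 3.333333
```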

We can try the same for the PMF of the Hypergeometric distribution as well. That is, we are interested in

$\displaystyle S=\sum_{N=n+K-k}^\infty \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}$

I wasn't able to solve this directly but with some luck and Wolfram Alpha, I was able to guess that

$\displaystyle S=\sum_{N=n+K-k}^\infty \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}=\frac{nK}{k(k-1)}$
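The guessed closed form can be checked numerically by partial sums; the terms decay like $N^{-k}$, so this converges for $k\geq 2$. A Python sketch (function name and parameters are mine):

```python
from math import comb

def hyper_sum(K, k, n, terms=100_000):
    """Partial sum of C(K,k) C(N-K, n-k) / C(N, n) over N >= n + K - k.
    Terms decay like N^(-k), so convergence needs k >= 2."""
    start = n + K - k
    return sum(comb(K, k) * comb(N - K, n - k) / comb(N, n)
               for N in range(start, start + terms))

K, k, n = 4, 3, 5
print(hyper_sum(K, k, n), n * K / (k * (k - 1)))  # both ≈ 3.333333
```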

At about this time, I saw the following two identities in the Pi Mu Epsilon Journal, proved using telescoping sums.

$\displaystyle \sum_{n=k}^\infty\frac{1}{\binom{n}{k}}=\frac{k}{k-1}$ and $\displaystyle \sum_{n=m}^\infty\frac{1}{\binom{n}{k}}=\frac{k}{k-1}\binom{m-1}{k-1}^{-1}$

But with these many binomials, I was sure there must be some probabilistic interpretation for them. And I'm glad I found one: we are going to solve both of these using probability in this post.

Consider a Pólya urn containing $k$ marbles of which one is white and the rest are black. Our task is to pick a black marble from this urn in a maximum of $m$ draws. Because this is a Pólya urn, every time we draw a marble, we put it back in the urn along with an additional marble of the same color.

Let $I_j$ ($0\leq j<m$) be an indicator random variable for the event of picking a black marble for the first time in the $(j+1)$th draw. Then,

$\displaystyle\mathbb{P}(I_j=1)=\frac{1}{k}\cdot\frac{2}{k+1}\cdot\frac{3}{k+2}\cdots\frac{j}{k+j-1}\cdot\frac{k-1}{k+j}=\frac{k-1}{k}\binom{k+j}{j}^{-1}$

Therefore,

$\displaystyle\mathbb{P}(\text{picking a black marble in }m\text{ draws})=\sum_{j=0}^{m-1}\mathbb{P}(I_j=1)=\frac{k-1}{k}\sum_{j=0}^{m-1}\binom{k+j}{j}^{-1}$

On the other hand, the probability of picking a black marble is the complement of the probability of not picking a black marble in any of the $m$ draws.

$\displaystyle\mathbb{P}(\text{Not picking a black marble in }m\text{ draws})=\frac{1}{k}\frac{2}{k+1}\cdots\frac{m}{k+m-1}=\binom{k+m-1}{m}^{-1}$

This clearly shows that

$\displaystyle\sum_{j=0}^{m-1}\binom{k+j}{j}^{-1}=\frac{k}{k-1}\left[1-\binom{k+m-1}{m}^{-1}\right]$

The above after suitable relabelling gives,

$\displaystyle\sum_{n=k}^{k+m-1}\binom{n}{k}^{-1}=\frac{k}{k-1}\left[1-\binom{k+m-1}{k-1}^{-1}\right]$

Both the PiMuEpsilon identities given above are easy corollaries of the above identity.
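The finite identity can be verified exactly with rational arithmetic; a Python sketch (function names are mine):

```python
from fractions import Fraction
from math import comb

def lhs(k, m):
    """Sum of 1/C(n, k) for n from k to k+m-1."""
    return sum(Fraction(1, comb(n, k)) for n in range(k, k + m))

def rhs(k, m):
    """Closed form k/(k-1) * (1 - 1/C(k+m-1, k-1))."""
    return Fraction(k, k - 1) * (1 - Fraction(1, comb(k + m - 1, k - 1)))

for k in range(2, 8):
    for m in range(1, 12):
        assert lhs(k, m) == rhs(k, m)
print("identity verified")
```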

We could have also started with $a$ white marbles and $b$ black marbles. In this case we would have arrived at the following result.

$\displaystyle \frac{a}{a+b}\sum_{k=0}^{c-1}\binom{b-1+k}{b-1}\binom{a+b+k}{a+b}^{-1}=1-\binom{b+c-1}{c}\binom{a+b+c-1}{c}^{-1}$

The above after some relabelling can also be written as

$\displaystyle \frac{a}{a+b}\sum_{n=b}^{c-1}\binom{n-1}{b-1}\binom{n+a}{a+b}^{-1}=1-\binom{c-1}{b-1}\binom{a+c-1}{a+b-1}^{-1}$


Hope you enjoyed this. See ya later.

Until then
Yours Aye
Me

Saturday, May 6, 2023

Gaps between cards after a random shuffle

This post is more of a continuation of Probability on a standard deck of cards. I was randomly thinking about some probability problems when I came across The probability of cards meeting after a shuffle. The author of this paper discusses two problems, both of which are solved by simulation. This post aims at giving explicit closed form solutions to both of the said problems.

The first problem deals with the probability of finding cards of the same rank adjacent to each other after a random shuffle.

Before I could even give it a thought, I remembered the connection between this problem and the AMBALLS problem on CodeChef. The following Mathematica code, based on the tutorial, immediately confirms the author's simulation results.

Clear["Global`*"];
(* https://discuss.codechef.com/t/amballs-editorial/1953 *)
n = 13; m = 4;
vals[x_, y_] := 0;
vals[0, 0] = 1;
Do[
    Do[
        Do[
            If[vals[x - 1, y] > 0,
                Do[
                    i = ij - j;
                    z = 1 + 4 (x - 1) - y;
                    vals[x, y - j + m - ij] += vals[x - 1, y] Binomial[y, j] Binomial[z, i] Binomial[m - 1, ij - 1];
                , {j, 0, Min[ij, y]}
                ];
            ];
        , {y, 0, 1 + 4 (x - 1)}
        ];
    , {ij, 0, m}
    ];
, {x, n}
];
vals[n, 0] * Power[Factorial[m], n] / Factorial[m n]
N[%]
1 - %

I searched for a few other answers to the same question when I found this Stack Exchange post. The generating function result in the SE post took me by surprise.

Based on the post, we can see that when we have $c_j$ balls of color $j$, encoded in the array $\mathbf{c}=(c_1,c_2,c_3,\cdots)$, the number of ways $W(\mathbf{c})$ of arranging the balls linearly such that no two balls of the same color are adjacent to each other is given by

$\displaystyle W(\mathbf{c})=\sum_{k_1=1}^{c_1} \sum_{k_2=1}^{c_2}\cdots (-1)^{c-k}\binom{k}{\mathbf{k}}\prod_i \binom{c_i-1}{k_i-1}$

where $\mathbf{k}=(k_1,k_2,k_3,\cdots)$, $k = k_1+k_2+\cdots$, $c=c_1+c_2+\cdots$ and $\binom{k}{\mathbf{k}}$ is the multinomial coefficient.

This gives us a closed form solution. I don't yet understand the reasoning behind the generating function approach, and that will be the subject of a later post. In any case, all these approaches confirm that the probability is about $94.45\%$.

We now come to the second problem of finding the probability that the Kings and Queens are separated by more than one card after a random shuffle. The author (possibly via simulations) suspects that the complementary probability is above 70% without giving a final result.

This question seemed very interesting to me, and I couldn't find a clear idea with a casual Google search. Anyway, stripping the problem of labels, this question boils down to counting the number of words over the alphabet $\{a,b,c\}$ such that the $a$'s and $b$'s are at least two $c$'s apart.

After trying multiple approaches, I realized that the best way to solve this problem is to first solve the following sub-problem: find the number of words that can be formed with $a$ and $b$ such that we have $k$ changes of 'runs'.

For example, $aabbb$ has one change of run, whereas $baba$ and $aaabbbbaabbb$ both have three changes of run.

Fortunately, this can be easily solved with Generating functions. Let $\mathcal{G}$ be the language of such words. Then, it is easy to see that

$\displaystyle \mathcal{G}=\mathcal{W}_{a,b}+\mathcal{W}_{b,a}$

where $\displaystyle \mathcal{W}_{a,b}=\mathrm{SEQ}_{\geq 1}(a)\cdot \mathrm{SEQ}(t\mathrm{SEQ}_{\geq 1}(b)t\mathrm{SEQ}_{\geq 1}(a))\cdot(1+t\mathrm{SEQ}_{\geq 1}(b))$

Therefore, the generating function is given by

$\displaystyle G(a,b)=\frac{a+b+2ab-2abt}{1-a-b+ab-abt^2}$

Now $g(m,n,k)=[a^mb^nt^k]G(a,b)$ gives the number of words having $m$ a's, $n$ b's and $k$ change of runs. Note that the special case of $m=n$ is given in A152659.

But the approach above is not easily generalizable. A better way is to use the idea of Smirnov words which counts the number of words in which no adjacent letters are the same. The generating function of Smirnov words is given by

$\displaystyle S(v_1,v_2,v_3,\cdots)=\left(1- \sum_i \frac{v_i}{1+v_i}\right)^{-1}$

If we take a non-empty Smirnov word and use the substitution $v_i \mapsto t v_i/(1-v_i)$, we get the generating function we need with one extra $t$ (which counts the zeroth run). Therefore, we have

$\displaystyle G(v_1,v_2,v_3,\cdots,t)=\sum_i \frac{v_i}{1-v_i+tv_i}\cdot\left(1-t\sum_i \frac{v_i}{1-v_i+tv_i}\right)^{-1}$
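To spell out the substitution step: under $v_i \mapsto tv_i/(1-v_i)$, each letter of the Smirnov word expands into a non-empty run carrying a marker $t$, and

$\displaystyle \frac{v_i}{1+v_i} \mapsto \frac{tv_i/(1-v_i)}{1+tv_i/(1-v_i)}=\frac{tv_i}{1-v_i+tv_i}$

A word with $k$ changes of runs has $k+1$ runs and so picks up $t^{k+1}$; dividing the substituted non-empty Smirnov series by the extra $t$ gives the $G$ above.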

We can use the above to solve our problem in greater generality. Let $g_p(m,n,r)$ be the number of words with $m$ a's, $n$ b's and $r$ c's such that there are at least $p$ c's between the a's and the b's. Then,

$\displaystyle g_p(m,n,r)=\sum_{k=1}^{m+n-1}g(m,n,k)\binom{r-pk+m+n+1-1}{m+n+1-1}=\sum_{k=1}^{m+n-1}g(m,n,k)\binom{r-pk+m+n}{m+n}$

That is, we first arrange the a's and b's such that there are $k$ changes of runs. In each of these $k$ changes, we insert $p$ c's. We are then left with $r-pk$ c's, which we distribute among the $m+n+1$ available gaps between the a's and b's.
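The counting argument can be checked mechanically. The Python sketch below (function names are mine, not the author's) computes $g$ by brute force, evaluates the formula for $g_p$, and validates it against direct enumeration on small cases:

```python
from itertools import permutations
from math import comb

def run_changes(word):
    """Number of adjacent positions where the letter changes."""
    return sum(1 for x, y in zip(word, word[1:]) if x != y)

def g(m, n, k):
    """Brute-force g(m, n, k): a/b words with m a's, n b's and k changes of runs."""
    return sum(1 for w in set(permutations("a" * m + "b" * n))
               if run_changes(w) == k)

def g_p(m, n, r, p):
    """The formula above: buffer each of the k changes with p c's, then
    distribute the remaining r - p*k c's over the m + n + 1 gaps."""
    return sum(g(m, n, k) * comb(r - p * k + m + n, m + n)
               for k in range(1, m + n) if r - p * k >= 0)

def separated(word, p):
    """True if every a-to-b or b-to-a transition has at least p c's between."""
    last, cs = None, 0
    for ch in word:
        if ch == "c":
            cs += 1
        else:
            if last is not None and ch != last and cs < p:
                return False
            last, cs = ch, 0
    return True

def g_p_direct(m, n, r, p):
    """Direct enumeration over all words, for validation on small cases."""
    return sum(1 for w in set(permutations("a" * m + "b" * n + "c" * r))
               if separated(w, p))

# The formula agrees with direct enumeration:
print(g_p(2, 2, 4, 2), g_p_direct(2, 2, 4, 2))  # equal
```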

Therefore,

$\displaystyle \mathbb{P}(\text{Ks and Qs separated by at least two cards})=\frac{g_2(4,4,44)4!4!44!}{52!}\approx 26.4\%$

This approach also answers our question from the previous post. We have,

$\displaystyle \mathbb{P}(\text{No Ks and Qs adjacent})=\frac{g_1(4,4,44)4!4!44!}{52!}\approx 51.37\%$
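As an independent cross-check of both values (a Python sketch; the post itself uses the Mathematica code below), the same two probabilities can be reproduced directly from the formula:

```python
from itertools import permutations
from math import comb, factorial

def run_changes(word):
    return sum(1 for x, y in zip(word, word[1:]) if x != y)

def g(m, n, k):
    # Brute force is instant here: only C(8, 4) = 70 words for m = n = 4.
    return sum(1 for w in set(permutations("a" * m + "b" * n))
               if run_changes(w) == k)

def prob(p, m=4, n=4, r=44):
    """P(every K/Q transition is buffered by at least p other cards)."""
    gp = sum(g(m, n, k) * comb(r - p * k + m + n, m + n)
             for k in range(1, m + n) if r - p * k >= 0)
    return gp * factorial(m) * factorial(n) * factorial(r) / factorial(m + n + r)

print(prob(2))  # ~ 0.2640
print(prob(1))  # ~ 0.5137
```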

The following Mathematica code was used to calculate the above values.

Clear["Global`*"];
(* W_{a,b} from the text: SEQ>=1(a) SEQ(t SEQ>=1(b) t SEQ>=1(a)) (1 + t SEQ>=1(b)) *)
f[a_, b_] := a / (1 - a) (1 + t b / (1 - b)) / (1 - t b / (1 - b) * t a / (1 - a));
g = f[a, b] + f[b, a];
k = 4; q = 4; c = 44;
(* SeriesCoefficient[g, {a, 0, k}, {b, 0, q}, {t, 0, c}] *)
(* Ways[m, n, t]: memoized count of a/b words with m a's, n b's and t changes of runs *)
Ways[m0_, n0_, t0_] := Ways[m0, n0, t0] = Module[{m = m0, n = n0, t = t0},
    If[t <= 0, Return[Boole[Xor[m <= 0, n <= 0]]];];
    If[t <= 1, Return[2 Boole[And[m > 0, n > 0]]];];
    If[Or[m <= 0, n <= 0], Return[0];];
    Ways[m - 1, n, t] + Ways[m, n - 1, t] - Ways[m - 1, n - 1, t] + Ways[m - 1, n - 1, t - 2]
];
(* g_2(4, 4, 44) via the formula in the text *)
Sum[Ways[k, q, j] Binomial[c - 2 * j + k + q, k + q], {j, k + q - 1}]
% * Factorial[4] * Factorial[4] * Factorial[44] / Factorial[52]
N[%]
1 - %
(* the 'no K adjacent to Q' count from the previous post *)
Sum[Binomial[44 + 1, m] Binomial[4 - 1, m - 1] Binomial[4 + 44 - m, 4], {m, 4}]
% * Factorial[4] * Factorial[4] * Factorial[44] / Factorial[52]
N[%]
1 - %
Clear["Global`*"];
Factor[(a / (1 - a + a t) + b / (1 - b + b t)) / (1 - t (a / (1 - a + a t) + b / (1 - b + b t)))]

Finally, a more direct generating function would be the following.

$\displaystyle G^*(v_1,v_2,\cdots,c)=\frac{1}{1-c}\cdot G\left(\frac{v_1}{1-c},\frac{v_2}{1-c},\cdots,c^p\right)$

Clear["Global`*"];
(* G* with p = 2: substitute v -> v/(1-c) and t -> c^2, as in the formula above *)
f = a / (1 - c - a + a c c) + b / (1 - c - b + b c c);
f = f / (1 - c c f) / (1 - c);
SeriesCoefficient[f, {a, 0, 4}, {b, 0, 4}, {c, 0, 44}]

Until then
Yours Aye
Me

Saturday, February 18, 2023

Tangent-Asymptote triangle of a Hyperbola, Part 1

While exploring a different area of math, I came across the fact that a tangential segment of a hyperbola between the asymptotes is bisected at the tangent point.

Several attempts and internet searches later, I realised that this was already known to Apollonius of Perga. Reading the Conics, it became clear that Apollonius proves this through a series of propositions.

While it was a great read, I was a little disappointed at the length of the proof and wanted to find a shorter one. The content of this post consequently is the proof that I found in this regard.

Consider the hyperbola $y^2-x^2=1$. Take any point $B$ on the upper branch and draw the tangent at that point. Draw a line parallel to the tangent such that it cuts the upper branch of the hyperbola at two points as shown in the following figure. We show that the midpoint of $DE$ coincides with that of $FG$.

Figure 1

The idea of the proof is to use the third dimension as well. We know that the hyperbola is the intersection curve of a cone and a perpendicular plane. 

Figure 2

If we now extend the line through $EF$ into a plane cutting the cone, we get an ellipse as shown in the following figure. Note that this ellipse also cuts the hyperbola.

Figure 3

Looking from the $x$-axis, this arrangement would look like the following figure.

Figure 4

Looking along the line perpendicular to the plane containing the ellipse, it is clear that the tangent segment caught between the asymptotes is the major axis of the ellipse and the segment caught between the hyperbola is a chord parallel to the major axis. It is clear then that the midpoints of both segments lie on the minor axis.

Figure 5

But looking along the line perpendicular to the plane containing the hyperbola, the minor axis is just a point, proving that the midpoints of both segments coincide.

Moving back to 2D, this property holds for every line cutting the hyperbola. In the limiting case of a tangent line to the hyperbola, the segment caught between the hyperbola collapses to a point, which is its own midpoint and therefore the midpoint of the tangential segment caught between the asymptotes.
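In coordinates, both the chord statement and its limiting case are easy to verify numerically. The sketch below (Python, with an arbitrary parameter choice; labels $F$, $G$ are borrowed from the figure) uses the hyperbola $y^2-x^2=1$, whose asymptotes are $y=\pm x$:

```python
import math

t = 0.7                           # arbitrary; B = (sinh t, cosh t) lies on y^2 - x^2 = 1
B = (math.sinh(t), math.cosh(t))

# Tangent at B: y*cosh(t) - x*sinh(t) = 1.  Intersections with y = x and y = -x:
F = (math.exp(t), math.exp(t))        # x*(cosh t - sinh t) = 1  =>  x = e^t
G = (-math.exp(-t), math.exp(-t))     # -x*(cosh t + sinh t) = 1  =>  x = -e^-t
mid_FG = ((F[0] + G[0]) / 2, (F[1] + G[1]) / 2)
print(mid_FG, B)  # the midpoint of the asymptote segment is B itself

# A parallel line y*cosh(t) - x*sinh(t) = c (c > 1) cuts the hyperbola where
# x^2 - 2*c*sinh(t)*x + cosh(t)^2 - c^2 = 0, and the asymptotes at x = c*e^t
# and x = -c*e^-t; both midpoints have the same x-coordinate, c*sinh(t).
c = 1.5
mid_chord_x = 2 * c * math.sinh(t) / 2            # average of the two roots (Vieta)
mid_asym_x = (c * math.exp(t) - c * math.exp(-t)) / 2
print(mid_chord_x, mid_asym_x)  # equal
```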

This is very nice as we can use it to give a geometric proof of $x^{-1}$, which I wasn't able to find even after an extensive internet search.

Note that Apollonius showed something more. He also proved that, for a given tangent, the midpoints of all the chords parallel to the tangent lie on a straight line passing through the center of the hyperbola, and that too for all the conics. Truly spectacular!

Until then
Yours Aye
Me


Friday, February 3, 2023

An Inversion problem from PiMuEpsilon

Oftentimes in math, a given problem can be solved in multiple ways using algebra, geometry, calculus, etc., and we try to find the shortest proof among them. But if we are interested in geometry, say, we would still look for a geometrical proof even if it is more cumbersome. This post is a perfect example of such an instance.

Let's say we are given a semicircle $C_1$ with diameter $AB$ and a circle $C_2$ with center $P$ internally tangent to $C_1$. Let the two (half) chords of $C_1$ tangent to $C_2$ and perpendicular to the diameter $AB$ meet $AB$ at $R$ and $S$ respectively. Let the line perpendicular to $AB$ and passing through $P$ meet $AB$ at $Q$. We are asked to prove that $PQ$ is the geometric mean of $AR$ and $SB$.
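Before the constructions, here is a quick numerical sanity check of the claim (an arbitrary configuration in coordinates, a Python sketch; not part of the geometric proof):

```python
import math

# Unit semicircle C1 with A = (-1, 0), B = (1, 0), center O at the origin.
# Pick an internally tangent circle C2 (these numbers are an arbitrary test case):
rho, px = 0.3, 0.2
py = math.sqrt((1 - rho)**2 - px**2)   # internal tangency: |OP| = 1 - rho

AR = (px - rho) - (-1)   # R = (px - rho, 0), foot of the left tangent chord
SB = 1 - (px + rho)      # S = (px + rho, 0), foot of the right tangent chord
PQ = py                  # Q is the foot of the perpendicular from P to AB

print(PQ**2, AR * SB)    # both ~ 0.45: PQ is the geometric mean of AR and SB
```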



An algebraic proof is easy to find but we attempt to find a geometric one. To do that, we begin by constructing a few circles.

We first construct a circle centered at $B$ that is orthogonal to the blue circle; it can be constructed by drawing a tangent from $B$ to the blue circle and using the distance from $B$ to the tangent point as the radius. This is shown in orange in the next figure.

We construct another circle that is internally tangent to the green circle at $B$ and externally tangent to the blue circle. We also reflect this circle w.r.t line $PQ$. Both these are shown in violet in the following figure.

A little thought shows that the violet circle and the green circle must be inversions of each other under the orange circle.

We finally construct another circle through the following three points: (i) the intersection of line $PQ$ and the orange circle, and (ii) the two points of intersection of the green circle and the mirrored violet circle. This is shown in red in the following figure.


We will now show that the mirrored violet circle and the green circle are inversion of each other under the red circle.

To this effect, we construct an inversion circle with center at the (lower) point of intersection of line $PQ$ and the violet circle, and make the inversion circle orthogonal to the violet circles. The inversion circle is shown in black in the following figure.


The violet circles and line $PQ$ are all invariant under this inversion.

Both the orange and the red circles pass through the center of inversion. Therefore they become straight lines. These lines can be drawn using the points of intersection of the respective circles with the inversion circle. Note that the point of intersection of the two lines must lie on line $PQ$.

Line $PQ$ and the orange circle are both orthogonal to the blue circle. Therefore, their inversions will also be so. This shows that the point of intersection of the orange line and line $PQ$ is the center of the blue circle's inverse. Using the fact that the blue circle is tangent to both the (invariant) violet circles, we can also construct the 'inverted' blue circle.

Because the red line also passes through the point of intersection of the orange line and line $PQ$ (because the red circle passes through the intersection of the orange circle and line $PQ$), we see that the red line is orthogonal to the 'inverted' blue circle. That means, the red circle and the blue circle are actually orthogonal to each other.


As we know that the green circle and the violet circle are inversions of each other under the orange circle, we see that the inverse of the green circle must be the reflection of the (invariant) violet circle w.r.t. the orange line.



Also, the red circle passes through the points of intersection of the green circle and the mirrored violet circle, which must also be the case after inversion. Therefore, we see that the 'inverted' green circle and the mirrored violet circle are reflections of each other w.r.t. the red line.

That clearly means that the green circle and the mirrored violet circle are actually inversions of each other under the red circle.


Because of the said inversion, we see that $FR^2=RA \cdot RA'$

But $FR=PQ$. Also, $RA'=SB$ by symmetry. Therefore, $PQ^2=AR \cdot SB$ as claimed.

Until then
Yours Aye
Me

Thursday, January 19, 2023

Expected area of Routh's triangle

Hi everyone! Wish you a very happy New Year! Now onto my first post of the year...

I recently came upon a problem in geometric probability with a stunning result that really took me by surprise.

Consider a triangle $ABC$. Choose points $D$, $E$ and $F$ uniformly at random on sides $BC$, $CA$ and $AB$ respectively. The three cevians $AD$, $BE$ and $CF$ form a smaller central triangle between them, which we call Routh's triangle, as shown in the figure below.


Now what might be the expected value of the ratio of the area of $\triangle PQR$ to that of $\triangle ABC$?

Let the random variable $T$ denote the ratio we are interested in. Then given the positions of $D$, $E$ and $F$, Routh's theorem tells us,

$\displaystyle \frac{\triangle PQR}{\triangle ABC}=\frac{(uvw-(1-u)(1-v)(1-w))^2}{(1-u+uv)(1-v+vw)(1-w+wu)}$

where $DC=u\cdot BC$, $EA=v\cdot CA$ and $FB=w\cdot AB$.
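Before integrating, a quick sanity check of the formula at symmetric points (a Python sketch; the one-seventh case is the classical Routh value):

```python
from fractions import Fraction

def routh_ratio(u, v, w):
    """Area ratio [PQR]/[ABC] from Routh's theorem, as stated above."""
    num = (u * v * w - (1 - u) * (1 - v) * (1 - w)) ** 2
    den = (1 - u + u * v) * (1 - v + v * w) * (1 - w + w * u)
    return num / den

half, third = Fraction(1, 2), Fraction(1, 3)
print(routh_ratio(half, half, half))    # 0: the medians are concurrent
print(routh_ratio(third, third, third)) # 1/7: the classical one-seventh triangle
```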

Because $u$, $v$ and $w$ are IID uniform random variables by definition,

$\displaystyle \mathbb{E}(T)=\int_0^1\int_0^1\int_0^1 \frac{\triangle PQR}{\triangle ABC} \,dudvdw$

This integral is very unwieldy despite the symmetry between the variables, and I can't see a clear way of solving the triple integral directly. We could solve it with Wolfram Alpha, but let's see how we can do it without a CAS.

The trick to solving this problem is to shift our focus to triangles $ARC$, $BPA$ and $CQB$. In fact, over the course of proving Routh's theorem, we get as an intermediate result that

$\displaystyle \frac{\triangle{ARC}}{\triangle{ABC}}=\frac{u(1-w)}{1-w+wu}$

and similar results for $\triangle{BPA}$ and $\triangle{CQB}$. Because,

$\triangle{ABC}=\triangle{PQR}+\triangle{ARC}+\triangle{BPA}+\triangle{CQB}$

we can now use the linearity of expectation and the symmetry between the variables to get

$\displaystyle \mathbb{E}(T)=1-3\int_0^1\int_0^1\frac{u(1-w)}{1-w+wu}\,dudw=1-3\int_0^1\int_0^1\frac{(1-w)(1-u)}{1-wu}\,dudw$

Now, using the geometric series and term-by-term integration, it is easy to see that

$\displaystyle \int_0^1\int_0^1\frac{1}{1-wu}\,dwdu=\zeta(2)$, $\displaystyle \int_0^1\int_0^1\frac{wu}{1-wu}\,dwdu=\zeta(2)-1$ and
$\displaystyle \int_0^1\int_0^1\frac{u}{1-wu}\,dwdu=\int_0^1\int_0^1\frac{w}{1-wu}\,dwdu=1$

Plugging in all these values, we get the amazing result that

$\displaystyle \mathbb{E}(T)=1-3\left(\zeta(2)-1-1+\zeta(2)-1\right)=10-6\zeta(2)=10-\pi^2$
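This striking value is easy to sanity-check numerically. The sketch below (Python, not from the original post) estimates $\mathbb{E}(T)$ by Monte Carlo directly from Routh's formula:

```python
import math
import random

random.seed(1)

def routh_ratio(u, v, w):
    """Area ratio from Routh's theorem for cevian foot fractions u, v, w."""
    num = (u * v * w - (1 - u) * (1 - v) * (1 - w)) ** 2
    den = (1 - u + u * v) * (1 - v + v * w) * (1 - w + w * u)
    return num / den

N = 200_000
mean = sum(routh_ratio(random.random(), random.random(), random.random())
           for _ in range(N)) / N
print(mean, 10 - math.pi**2)  # both about 0.1304
```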

Because we've calculated an expected value of a non-negative random variable, we know that this quantity is greater than zero and hence $10>\pi^2$.

Such a beautiful result that is at least as remarkable as A cute sum of Ramanujan!!

Until then
Yours Aye
Me