I write, therefore I am: June 2023

As usual, I was randomly (pun unintended) thinking about some problems in Geometric Probability and thought would share some of those.

Consider a unit circle centered at the origin. If we select $n$ points uniformly randomly in this circle, how 'left' the leftmost point be?

Mathematically, I'm looking for the expected value of the abscissa of the point with the 'minimum' x-coordinate. This way, this becomes an order statistic problem on a circle.

If we parametrize the chords perpendicular to the x-axis by the central angle 2$\theta$ they subtend, then the probability that a given point lies on the right of the chord is

$\displaystyle\mathbb{P}(\text{point lies on the right side of chord})=\frac{\theta-\sin\theta\cos\theta}{\pi}$

In other words, the above is the complementary CDF $\bar{F}(x)$ for a given value of $x=\cos\theta$.

Now the expected value of min. of $n$ i.i.d. random variables $X_1,X_2,\cdots,X_n$ each with a Complementary CDF of $\bar{F}(x)$ is given by

$\displaystyle\mathbb{E}(\text{min.})=\int x \,d(1-\bar{F}(x)^n)=x(1-\bar{F}(x)^n)-\int (1-\bar{F}(x)^n)\,dx$

The expected value of the min. abscissa is given by

$\displaystyle\mathbb{E}(\text{min. abscissa})=\int_\pi^0 \cos\theta \,d(1-\bar{F}(\cos\theta)^n)$

Therefore, in our case, we have

$\begin{align}\displaystyle\mathbb{E}(\text{min.})&=\cos 0 (1-\bar{F}(\cos 0)^n)-\cos\pi(1-\bar{F}(\cos\pi)^n)-\int_\pi^0 \left(1-\bar{F}(\cos\theta)^n\right)\,d(\cos\theta)\\ &= 1 - \int_0^\pi \sin\theta(1-\bar{F}(\cos\theta)^n)\,d\theta \\ &= \int_0^\pi \sin\theta\left(\frac{\theta-\sin\theta\cos\theta}{\pi}\right)^n\,d\theta -1 \end{align}$

From this, we can see

$\displaystyle \mathbb{E}(\text{min. abscissa among two points})=-\frac{128}{45\pi^2} \approx -0.2882$ using WA and

$\displaystyle \mathbb{E}(\text{min. abscissa among three points})=-\frac{64}{15\pi^2} \approx -0.4323$ using WA with

further values having a slightly complicated expressions. However, we can get somewhat decent approximations by noting the following graph.

Therefore,

$\displaystyle \mathbb{E}(\text{min. abscissa of }n\text{ points}) \approx \int_0^\pi \sin\theta \sin^{2n}(\theta/2)\,d\theta -1 = \frac{1-n}{1+n}$

which unfortunately only gets worse as $n$ gets larger.

A very similar approach can be used in case of a sphere as well just by noting that

$\displaystyle F(x)=\frac{1}{4}(2+\cos\theta)(1-\cos\theta)^2$

for $x=\cos\theta$ using the volume of a spherical cap.

Therefore,

$\displaystyle\mathbb{E}(\text{min. abscissa of }n\text{ points})=\int_0^\pi \sin\theta \left(\frac{(2+\cos\theta)(1-\cos\theta)^2}{4}\right)^n\,d\theta - 1=-1+\int_{-1}^1 \left(\frac{(2+x)(1-x)^2}{4}\right)^n\,dx$

Because of the polynomial integrand, the expected values in case of a sphere all turn out to be rationals. In fact by expanding the integral range (and adjusting later), we will be able to apply Laplace's method we can show that

$\displaystyle \mathbb{E}(\text{min. abscissa of }n\text{ points}) \approx \sqrt{\frac{\pi}{3n}}-1$

which get better for large $n$.

Infact using the power series of the log of integrand at its maximum of $x=-1$ and more terms for Laplace method like we did here, we can show that

$\displaystyle \int_{-2}^1 \left(\frac{(2+x)(1-x)^2}{4}\right)^n\,dx \approx \sqrt{\frac{4\pi}{3n}}\left(1-\frac{17}{72n}\right)$

which can be used to give a slightly better approximation.

More generally, by this point, it should be clear that for $d$-dimensional case, the expected value can be written as

$\displaystyle \mathbb{E}(\text{min. abscissa of }n\text{ points in }d\text{ dimensions})=\int_{-1}^1 \bar{F}_d(x)^n\,dx-1$

where $\bar{F}_d(x)$ is the ratio between the volume of $d$-ball cap to the volume of $d$-ball. By symmetry, the expected value for the 'rightmost' point will be the negative of this.

An associated question could be how far is the 'leftmost' point from the center of the $d$-ball?

Let $L_d$ denote the random variable we are interested in. In the circle case, we could construct the density function of $L_d$ by proceeding as follows:

First, any of the $n$ points could be the 'leftmost' points.

Second, the probability that the point lies at a distance $r$ from the center is $2rdr$.

Third, we note that this point will be uniformly distributed on the circle of radius $r$. The probability that the line joining the point to the origin makes an angle $\theta$ with the $x$-axis is $d\theta / \pi$ (considering only the upper half of the circle).

Finally, If we want this point to have the min. abscissa, the remaining points should have abscissa greater than $r\cos\theta$.

Therefore, the pdf of $L_2$ would be

$\displaystyle f(l_2) = n \cdot 2r dr \cdot \frac{d\theta}{\pi}\cdot \left(\frac{\cos^{-1}(r\cos\theta)-r\cos\theta\sqrt{1-r^2\cos^2\theta}}{\pi}\right)^{n-1}$

with $0 \le r \le 1$ and $0 \le \theta \le \pi$

We can verify that the integral of this pdf over the interval is indeed $1$ using WA.

We then have

$\displaystyle \mathbb{E}(L_2)=\int_0^1\int_0^\pi r\cdot n \cdot 2r \cdot \frac{1}{\pi}\cdot \left(\frac{\cos^{-1}(r\cos\theta)-r\cos\theta\sqrt{1-r^2\cos^2\theta}}{\pi}\right)^{n-1}\,d\theta\,dr$

Using WA, we can see

$\displaystyle \mathbb{E}(L_2\text{ with 2 points})=\frac{2}{3}$ and $\displaystyle \mathbb{E}(L_2\text{ with 3 points})=\frac{4200\pi^2+3360\log{2}-457}{6300\pi^2}\approx 0.69677$

We can do something similar in the Sphere case as well. The density function of $L_3$ would be

$\displaystyle f(l_3) = n \cdot 3r^2 dr \cdot \frac{2\pi r\sin\theta\cdot r d\theta}{4\pi r^2}\cdot \left(\frac{(2+r\cos\theta)(1-r\cos\theta)^2}{4}\right)^{n-1}$

using the idea that the first point will be uniformly distributed on the surface of a sphere with radius $r$ which could then be sliced into rings (as shown here) parametrized by $\theta$.

Again using WA, we can see

$\displaystyle \mathbb{E}(L_3\text{ with 2 points})=\frac{3}{4}$ and $\displaystyle \mathbb{E}(L_3\text{ with 3 points})=\frac{1719}{2240}\approx 0.7674$

Hope you enjoyed this. See ya later.

Until then
Yours Aye
Me

Consider the PMF of the binomial function denoted here by $f(p,k,n)$. We know what happens to this function when it summed over $k$ or integrated over $p$ in the appropriate interval. That is,

$\displaystyle \sum_{k=0}^n\binom{n}{k}p^kq^{n-k}=1$ and $\displaystyle \int_0^1 \binom{n}{k}p^kq^{n-k} \,dp=\frac{1}{n+1}$

The first by the story of the binomial and the second by Bayes' billiard argument. But what is the sum of this PMF summed over all possible $n$? If we let $S$ denote the sum, we want

$\displaystyle S=\sum_{n=k}^\infty \binom{n}{k}p^kq^{n-k}$

This is not too hard. If we multiply the above by $p$, we can see that the summand becomes the PMF of the Negative binomial distribution and must therefore sum to $1$ by definition. That clearly shows $S=1/p$.

We can the same for the PMF of the Hypergeometric distribution as well. That is we are interested in,

$\displaystyle S=\sum_{N=n+K-k}^\infty \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}$

I wasn't able to solve this directly but with some luck and Wolfram Alpha, I was able to guess that

$\displaystyle S=\sum_{N=n+K-k}^\infty \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}=\frac{nK}{k(k-1)}$

At about this time, I saw the following two identities in PiMuEpsilon proved using telescoping sums.

$\displaystyle \sum_{n=k}^\infty\frac{1}{\binom{n}{k}}=\frac{k}{k-1}$ and $\displaystyle \sum_{n=m}^\infty\frac{1}{\binom{n}{k}}=\frac{k}{k-1}\binom{m-1}{k-1}^{-1}$

But with these many binomials, I was sure there must be some probabilistic interpretation for the same. Am I'm gladI found we are going to solve both of these using probability in this post.

Consider a Polya Urn containing $k$ marbles of which one is white and the rest are black. Our task is to pick up one black marble from this Urn in a maximum of $m$ draws. Because this a Polya Urn, everytime we draw a ball, we replace with that ball back in the Urn along with an additional ball of the same color.

Let $I_j$ ($0\leq j<m$)be an Indicator random variable for the event of picking a black marble in the '$j+1$'th draw. Then,

$\displaystyle\mathbb{P}(I_j=1)=\frac{1}{k}\cdot\frac{2}{k+1}\cdot\frac{3}{k+2}\cdots\frac{j}{k+j-1}\cdot\frac{k-1}{k+j}=\frac{k-1}{k}\binom{k+j}{j}^{-1}$

Therefore,

$\displaystyle\mathbb{P}(\text{picking a black marble in }m\text{ draws})=\sum_{j=0}^{m-1}\mathbb{P}(I_j=1)=\frac{k-1}{k}\sum_{j=0}^{m-1}\binom{k+j}{j}^{-1}$

On the other hand, probability of picking a black marble is the complementary probability of not picking a black marble in $m$ draws.

$\displaystyle\mathbb{P}(\text{Not picking a black marble in }m\text{ draws})=\frac{1}{k}\frac{2}{k+1}\cdots\frac{m}{k+m-1}=\binom{k+m-1}{m}^{-1}$

This clearly shows that

$\displaystyle\sum_{j=0}^{m-1}\binom{k+j}{j}^{-1}=\frac{k}{k-1}\left[1-\binom{k+m-1}{m}^{-1}\right]$

The above after suitable relabelling gives,

$\displaystyle\sum_{n=k}^{k+m-1}\binom{n}{k}^{-1}=\frac{k}{k-1}\left[1-\binom{k+m-1}{k-1}^{-1}\right]$

Both the PiMuEpsilon identities given above are easy corollaries of the above identity.

We could have also started with $a$ white marbles and $b$ black marbles. In this case we would have arrived at the following result.

$\displaystyle \frac{a}{a+b}\sum_{k=0}^{c-1}\binom{b-1+k}{b-1}\binom{a+b+k}{a+b}^{-1}=1-\binom{b+c-1}{c}\binom{a+b+c-1}{c}^{-1}$

The above after some relabelling can also be written as

$\displaystyle \frac{a}{a+b}\sum_{n=b}^{c-1}\binom{n-1}{b-1}\binom{n+a}{a+b}^{-1}=1-\binom{c-1}{b-1}\binom{a+c-1}{a+b-1}^{-1}$

Hope you enjoyed this. See ya later.

Until then
Yours Aye
Me

I write, therefore I am

Thursday, June 29, 2023

'Leftmost' random points on Circles and Spheres

Saturday, June 17, 2023

Certain Inverse Sums involving Binomial Coefficients