Friday, January 28, 2022

Matt Parker and the London Lottery

In his aptly named video 'Matt Explains: The Lottery', Matt walks through the probabilities associated with the London lottery. In the first video he made on this topic, he worked out the probability of winning the Jackpot under the new system, where we have a chance to win an extra lottery ticket.

To make things clear, here are the winnings as I understood them from his video. Six numbers are drawn (without repetition) from a pool of 59 numbers. If all 6 of our numbers match, we win the Jackpot. If we match 5 numbers, a bonus number comes into play: matching it wins us 50000 pounds, and missing it wins us 1000 pounds. Matching 4 numbers wins us 100 pounds, and matching 3 numbers wins us 25 pounds. Finally, under the new system, matching 2 numbers gives us no cash prize but wins us a free lottery ticket which we can use the next time.

The number of matches is given by the Hypergeometric distribution, as explained (and calculated) by Matt in the video. In his first video, Matt ignores all cash prizes and focuses only on the Jackpot and the extra lottery.

If we let $p_k$ denote the probability of $k$ matches and $p$ denote the probability of ultimately winning the Jackpot, we have

$p=p_6+p_2p$ which upon simplification gives $\displaystyle p = \frac{p_6}{1-p_2}$

Substituting the known values gives a probability of $p=2.459111 \times 10^{-8}$, or odds of 1 in 40665098.

As we will be dealing with extremely small probabilities, let's define 1 micro-bps as a millionth of a basis point. In this sense, the probability in the above case is approx. 245.91 micro-bps.

At this point, as per a subscriber's suggestion, Matt attacks the problem where we repeatedly buy lotteries using all our winnings until we win the Jackpot. Here Matt calculates the expected number of lotteries won per lottery played.

If we let $l$ denote the expected number of lotteries won per lottery, then

$l=p_2 \cdot 1 + p_3 \cdot 12.5 + p_4 \cdot 50 + p_{5,1} \cdot 500 + p_{5,2} \cdot 25000 \approx 0.257191$

where we have used the fact that each lottery costs 2 pounds. Matt now uses the following equation (indirectly) to get the probability of winning the Jackpot.

$p=p_6+l \cdot p \tag{1} \label{wrongeqn}$ which gives $\displaystyle p = p_6/(1-l)$.

Using the known values, we get $p=298.7830655$ micro-bps, or odds of about 1 in 33469098.

But something is definitely amiss in $\eqref{wrongeqn}$. For example, we can easily construct a case where $l > 1$, which would then give negative probabilities. What then is the error, if any, in Matt's reasoning? Or is there a hidden assumption that was not explicitly stated here?

To understand what's going on here, we should use the idea that the probability of getting at least one win in $n$ trials is $1-(1-p)^n$ and interpret $p$ as the probability of winning at least one Jackpot. Using this, a better expression is given by,

$p=p_6+(1 - p_6)(1 - (1 - p)^{\bar{l}}) \tag{2} \label{crudeeqn}$

where $\bar{l}=l/(1-p_6)$ is the expected number of lotteries won conditional on not winning the Jackpot directly. The above equation then means that we either win the Jackpot or get some lotteries to continue our pursuit of winning the Jackpot. Not so surprisingly, this equation can be solved in closed form. We have,

$\displaystyle p = 1 - (1 - p_6)^{1 / (1 - \bar{l})} \tag{3} \label{cruderes}$

This gives a probability of $p=298.7830666$ micro-bps, or odds of about 1 in 33469097.

If we assume $p \approx 0$ in $\eqref{crudeeqn}$, then the approximation $(1+x)^n \approx 1+nx$ for small $x$ gives $(1-p)^{\bar{l}} \approx 1-\bar{l}p$, so that $p \approx p_6+(1-p_6)\bar{l}p=p_6+lp$, which is exactly Matt's result. This clearly shows the hidden assumption in the result derived by Matt.

But even $\eqref{crudeeqn}$ is still not exactly correct. It can still produce a value less than 0 or greater than 1 in certain cases, which makes no sense for a probability.

This problem occurs because, on the RHS of $\eqref{crudeeqn}$, we casually assume that $\mathbb{E}(1-(1-p)^L)=1-(1-p)^{\mathbb{E}(L)}$, where $L$ is the random number of lotteries won (conditional on not winning the Jackpot directly). But this is clearly wrong because of Jensen's Inequality.

Therefore, the correct way to do this would be to solve the following equation.

$p=p_6+p_{5,2}(1-(1-p)^{25000})+p_{5,1}(1-(1-p)^{500})+p_4(1-(1-p)^{50})+p_3(1-(1-p)^{12.5})+p_2(1-(1-p))$

where we explicitly condition on the number of tickets each prize buys. Solving this, we get $p=298.782539$ micro-bps, or odds of about 1 in 33469156.

This is good but there is still one final hiccup here. In case of 3 matches, we get 25 pounds, which we have assumed is equal to 12.5 lotteries. But non-integer tickets don't make much sense.

Had 3 matches won us 24 pounds instead, we would have $p=296.70581679$ micro-bps, or odds of about 1 in 33703416. Had it been 26 pounds, we would have $p=300.888537999$ micro-bps, or odds of about 1 in 33234897.

The precise way to model this is to split the problem into two cases: one in which we have just a lottery ticket, and the other in which we have the ticket and a pound. If we let $p$ and $p_e$ be the probabilities of winning at least one Jackpot in the respective cases, then we have

$p=p_6+p_{5,2}(1-(1-p)^{25000})+p_{5,1}(1-(1-p)^{500})+p_4(1-(1-p)^{50})+p_3(1-(1-p_e)(1-p)^{11})+p_2(1-(1-p)) \tag{4} \label{firstsimul}$
and
$p_e=p_6+p_{5,2}(1-(1-p_e)(1-p)^{24999})+p_{5,1}(1-(1-p_e)(1-p)^{499})+p_4(1-(1-p_e)(1-p)^{49})+p_3(1-(1-p)^{13})+p_2(1-(1-p_e)) \tag{5} \label{secondsimul} $

Solving these simultaneous equations gives $p=296.75282794$ micro-bps, or odds of about 1 in 33698077, and $p_e=300.13462527$ micro-bps, or odds of about 1 in 33318380.

These equations have to be solved numerically, which may be compute-intensive. A compromise would be to let Jensen's inequality slide and consider the following equations.

$p=p_6+(1-p_6)(1-(1-p)^{\bar{l}_{11}}(1-p_e)^{\bar{l}_{12}})$ and $p_e=p_6+(1-p_6)(1-(1-p)^{\bar{l}_{21}}(1-p_e)^{\bar{l}_{22}})$

where $l_{11}=25000p_{5,2}+500p_{5,1}+50p_4+11p_3+p_2$, $l_{12}=p_3$, $l_{21}=24999p_{5,2}+499p_{5,1}+49p_4+13p_3$ and $l_{22}=p_{5,2}+p_{5,1}+p_4+p_2$ are the (unconditional) expected numbers of plain tickets and ticket-plus-pound pairs won in the respective cases, and $\bar{l}_k=l_k/(1-p_6)$ for $k \in \{11, 12, 21, 22\}$.

Like before, this has the advantage of being solvable in closed form. We have,

$p=1-(1-p_6)^{\bar{m}}$ and $p_e=1-(1-p_6)^{\bar{m}_e}$

where $\displaystyle \bar{m} = \frac{1-\bar{l}_{22}+\bar{l}_{12}}{(1-\bar{l}_{11})(1-\bar{l}_{22})-\bar{l}_{12}\bar{l}_{21}}$ and $\displaystyle \bar{m}_e=\frac{1-\bar{l}_{11}+\bar{l}_{21}}{(1-\bar{l}_{11})(1-\bar{l}_{22})-\bar{l}_{12}\bar{l}_{21}}$

Using the known values, we get $p=296.7533437$ micro-bps and $p_e=300.1351482$ micro-bps. Like before, if we assume $p_6$ is also small, then we get an even simpler linear relation.

Or we could assume both $p$ and $p_e$ are small in $\eqref{firstsimul}$ and $\eqref{secondsimul}$, which will give us a pair of linear simultaneous equations. Solving them would give us,

$\displaystyle p = mp_6$ and $\displaystyle p_e = m_ep_6$

where $m$ (and $m_e$) is the same as $\bar{m}$ (and $\bar{m}_e$) except with unconditional expectations. Either way, these approximations become susceptible to absurd probabilities if not applied carefully.

Now, for the next part, let's say we solve everything exactly. If we have 4 pounds, we could buy 2 lotteries. If we assume that we don't mix the proceeds from one with the other, then the probability of winning at least one Jackpot is $1-(1-p)^2$, which with the small-value approximation gives $2p$. Similarly, with 5 pounds, the probability of winning at least one Jackpot is $p+p_e$.

But with 6 pounds, we have a choice. We could either play three individual lotteries with winning probability $3p$ or two lottery-pounds with winning probability $2p_e$. With the values at hand, playing three individual lotteries is the clear choice.

This shows that it is better to convert all cash into lotteries to the fullest extent possible. Therefore, for integer $n$, we have,

$p_{2n}\approx np$ and $p_{2n+1}\approx(n-1)p+p_e$
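
In code, this is a small helper (a sketch reusing the $p$ and $p_e$ values computed above):

(* approximate Jackpot probability with n pounds in hand, per the formulas above *)
p = 2.9675282794*^-8; pe = 3.0013462527*^-8;
pn[n_] := If[EvenQ[n], Quotient[n, 2] p, (Quotient[n, 2] - 1) p + pe];
Table[{n, pn[n]}, {n, 2, 9}]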

All of the above were calculated using the following Mathematica code.

Clear["Global`*"];
e1 = Binomial[6, 1] Binomial[53, 5] / Binomial[59, 6];
e2 = Binomial[6, 2] Binomial[53, 4] / Binomial[59, 6];
e3 = Binomial[6, 3] Binomial[53, 3] / Binomial[59, 6];
e4 = Binomial[6, 4] Binomial[53, 2] / Binomial[59, 6];
e51 = Binomial[6, 5] Binomial[53, 1] / Binomial[59, 6] * 52 / 53;
e52 = Binomial[6, 5] Binomial[53, 1] / Binomial[59, 6] * 1 / 53;
e6 = Binomial[6, 6] Binomial[53, 0] / Binomial[59, 6];

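(* f[p]: fixed-point residual of the single-variable equation (with the fractional 12.5-ticket term);
   g[p1, p2]: right-hand sides of the simultaneous equations (4) and (5), with p1 = p, p2 = pe *)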
f[p_] := e6 + e52 (1 - Power[1 - p, 25000]) + e51 (1 - Power[1 - p, 500]) + e4 (1 - Power[1 - p, 50]) + e3 (1 - Power[1 - p, 25 / 2]) + e2 p - p;
g[p1_, p2_] := {e6 + e52 (1 - Power[1 - p1, 25000]) + e51 (1 - Power[1 - p1, 500]) + e4 (1 - Power[1 - p1, 50]) + e3 (1 - (1 - p2) Power[1 - p1, 11]) + e2 p1,
e6 + e52 (1 - (1 - p2) Power[1 - p1, 24999]) + e51 (1 - (1 - p2) Power[1 - p1, 499]) + e4 (1 - (1 - p2) Power[1 - p1, 49]) + e3 (1 - Power[1 - p1, 13]) + e2 p2};

(* NSolve[p == e6 + e52 (1 - Power[1 - p, 25000]) + e51 (1 - Power[1 - p, 500]) + e4 (1 - Power[1 - p, 50]) + e3 (1 - Power[1 - p, 12]) + e2 p && 0 < p < 1, p] 
NSolve[p == e6 + e52 (1 - Power[1 - p, 25]) + e51 (1 - Power[1 - p, 5]) + e4 (1 - Power[1 - p, 3]) + e3 (1 - Power[1 - p, 2]) + e2 p && 0 < p < 1, p] *)

(* n = 1; nmax = 1000;
a = N[0, 50]; b = N[1 / 1024 / 1024, 50]; tol = Power[10, -75];
While[n <= nmax,
    c = (a + b) / 2; temp = f[c];
    If[Or[(b - a) / 2 < tol, Abs[temp] < tol tol], Break[];];
    n += 1;
    If[temp > 0, a = c;, b = c;];
];
c *)

(* With only Jackpot and extra lottery *)
p = e6 / (1 - e2);
N[p, 15]
Floor[(1 - p) / p]

(* Matt's solution *)
l = e2 + (25 / 2) e3 + 50 e4 + 500 e51 + 25000 e52;
p = e6 / (1 - l);
N[p, 20]
Floor[(1 - p) / p]

(* Ignoring Jensen's inequality *)
lb = l / (1 - e6);
p = 1 - Power[1 - e6, 1 / (1 - lb)];
N[p, 20]
Floor[(1 - p) / p]

(* FindRoot[f[p1], {p1, 1}, WorkingPrecision -> 25] *)

(* With 25 units *)
p = 2.9878253945 Power[10, -8];
N[p, 20]
Floor[(1 - p) / p]

(* With 24 units *)
p = 2.9670581679 Power[10, -8];
N[p, 20]
Floor[(1 - p) / p]

(* With 26 units *)
p = 3.00888537999 Power[10, -8];
N[p, 20]
Floor[(1 - p) / p]

(* With 25 units but tracking the extra pound *)
(* FindRoot[g[p1, p2] == {p1, p2}, {p1, 1}, {p2, 1}, WorkingPrecision -> 25] *)
p = 2.9675282794 Power[10, -8];
Floor[(1 - p) / p]
p = 3.0013462527 Power[10, -8];
Floor[(1 - p) / p]

(* Ignoring Jensen's inequality *)
l11 = 25000 e52 + 500 e51 + 50 e4 + 11 e3 + e2; l12 = e3;
l21 = 24999 e52 + 499 e51 + 49 e4 + 13 e3;l22 = e52 + e51 + e4 + e2;
l11b = l11 / (1 - e6); l12b = l12 / (1 - e6); l21b = l21 / (1 - e6); l22b = l22 / (1 - e6);
bdeno = (1 - l11b) (1 - l22b) - l12b l21b;
mb = (1 - l22b + l12b) / bdeno; meb = (1 - l11b + l21b) / bdeno;
p = 1 - Power[1 - e6, mb];
N[p, 15]
Floor[(1 - p) / p]
pe = 1 - Power[1 - e6, meb];
N[pe, 15]
Floor[(1 - pe) / pe]

(* Small value approximations *)
deno = (1 - l11)(1 - l22) - l12 l21;
N[e6 (1 + (l12 - l22)) / deno, 10];
N[e6 (1 + (l21 - l11)) / deno, 10];
m = (1 - l22 + l12) / deno;
me = (1 - l11 + l21) / deno;
N[e6 m, 10]
N[e6 me, 10]

(* temp[n_] := temp[n] = If[n == 2, p, pe];
Calc[n_] := Module[{res = {}, m = IntegerPartitions[n, All, {2, 3}]},
    If[n < 2, Return[0];];
    Do[
        res = Join[res, {1 - Times @@ (1 - temp /@ k)}];
    , {k, m}];
    Max[res]
];
Power[10, 10] Table[Calc[n], {n, 30}] *)


Hope you enjoyed the discussion. See you in the next post.

Until then
Yours Aye
Me

Wednesday, November 10, 2021

Some more problems on a Deck of cards

We have already seen how a standard deck of cards forms a nice setting for studying both the Coupon Collector problem without replacement and the Birthday problem without replacement. Here we extend those to some more problems, just for fun!

The first problem asks for the expected number of cards drawn until we have at least one matching suit or one matching rank. This problem is relatively easy, and we proceed using the same idea as before.

Let $X$ be the random variable denoting the number of cards needed to get at least one matching suit or rank. Then the probability that we need more than $k$ cards is

$\displaystyle \mathbb{P}(X>k)=\frac{k!\binom{13}{k}\binom{4}{k}}{\binom{52}{k}}$

That is, we are finding the probability that we have neither a matching suit nor a matching rank after we have drawn $k$ cards without replacement. Therefore,

$\displaystyle \mathbb{E}(X)=\sum_{k\geq 0}\mathbb{P}(X>k)$

So, we need, on average, $\approx 3.0799$ cards to get either a matching suit or a matching rank when drawing cards without replacement from a randomly shuffled standard deck of cards.

To simplify the text, let $S_k$ denote the event of having at least one matching suit after having drawn $k$ cards without replacement, and let $R_k$ denote the same event for rank. Then we know $\mathbb{P}(S^c_k)$, $\mathbb{P}(R^c_k)$, and $\mathbb{P}(S^c_k\text{ and }R^c_k)$.

Using the Inclusion-Exclusion principle, we have

$\displaystyle \mathbb{P}(S^c_k \text{ or } R^c_k)=\mathbb{P}(S^c_k)+\mathbb{P}(R^c_k)-\mathbb{P}(S^c_k\text{ and }R^c_k)$

This allows us to calculate the expected number of cards needed to get both a matching suit and a matching rank when drawing cards without replacement. If $X$ now denotes this random variable, then

$\mathbb{P}(X>k)=1-\mathbb{P}(S_k\text{ and }R_k)=\mathbb{P}(S^c_k \text{ or } R^c_k)$

This shows that

$\displaystyle \mathbb{E}(S \text{ and } R)=\mathbb{E}(S)+\mathbb{E}(R)-\mathbb{E}(S\text{ or }R)\approx 3.2678 + 5.6965 - 3.0799=5.8844$

Therefore, we need, on average, $\approx 5.8844$ cards to get both a matching suit and a matching rank when drawing cards without replacement from a randomly shuffled standard deck of cards.
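
These numbers are easy to check numerically. Here is a short Mathematica snippet (in the same spirit as the code in the other posts) that evaluates all four expectations; the sums truncate once the binomial coefficients vanish.

(* expected draws for: a matching suit, a matching rank, either, and both *)
pNoSuit[k_] := Binomial[4, k] 13^k / Binomial[52, k];
pNoRank[k_] := Binomial[13, k] 4^k / Binomial[52, k];
pNeither[k_] := k! Binomial[13, k] Binomial[4, k] / Binomial[52, k];
eS = Sum[pNoSuit[k], {k, 0, 4}];
eR = Sum[pNoRank[k], {k, 0, 13}];
eEither = Sum[pNeither[k], {k, 0, 4}];
N[{eS, eR, eEither, eS + eR - eEither}]
(* ≈ {3.2679, 5.6965, 3.0799, 5.8845}, matching the values above *)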

Now we ask a similar question in the Coupon collector problem. That is, we seek the expected number of cards needed to collect all suits and all ranks when drawing cards without replacement from a well shuffled pack of cards.

Like we did before, we will be using the idea of the negative Geometric random variable and the Min-Max identity. Let $X_k$ ($k \in \{\text{Spades}, \text{Clubs}, \text{Hearts}, \text{Diamonds}\}$) be the random number of draws (without replacement) needed to get the first card of suit $k$. Similarly, let $Y_k$ for $1\leq k \leq 13$ be the corresponding random variables for the ranks.

If we let $Z$ denote the random number of draws needed to collect all suits and ranks, and $A$ be the set $\{X_S, X_C,\cdots,Y_1,Y_2,\cdots,Y_{13}\}$, then by the Min-Max identity, we know

$\displaystyle \mathbb{E}(Z)=\mathbb{E}(\text{max }A)=\sum_{M \subset_\emptyset A}(-1)^{|M|+1}\mathbb{E}(\text{min }M)$

where we have used $\subset_\emptyset$ to denote a non-empty subset.

It just remains to calculate $\mathbb{E}(\text{min }M)$ for each subset. This is not very difficult, and we can proceed as we did in the earlier post. For example, let's solve for $M=\{X_S,X_C,Y_5,Y_6,Y_7\}$.

Then $\mathbb{E}(\text{min }M)$ is the expected number of cards needed to get either a Spade, a Club, or any card of rank $5$, $6$ or $7$. This is again a Negative Geometric variable with population size $N=52$ and "successes" $K=13+13+4+4+4-6=32$. We subtract the six cards that are the 5, 6 and 7 of Spades and Clubs to avoid double counting.

Now using the expectation of Negative Geometric variable, we easily see that $\mathbb{E}(\text{min }M)=(52+1)/(32+1)$.

Expanding the idea, the expected value of $Z$ can be seen to be

$\begin{align}\displaystyle \mathbb{E}(Z)&=(52+1)\left[1-\sum_{i=0}^4\sum_{j=0}^{13}\binom{4}{i}\binom{13}{j}\frac{(-1)^{i+j}}{13i+4j-ij+1}\right]\\ &=(52+1)\left[1-(-1)^{4+13}\sum_{i=0}^4\sum_{j=0}^{13}\binom{4}{i}\binom{13}{j}\frac{(-1)^{i+j}}{52-ij+1}\right]\end{align}$

Evaluating this with Wolfram Alpha, we finally see that the number of cards needed, on average, to collect at least one card from each suit and each rank, when drawing cards without replacement from a shuffled deck, is $\approx 27.9998$, very close to an integer.
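
If Wolfram Alpha is not at hand, the double sum is a one-liner in Mathematica:

(* expected draws to collect every suit and every rank *)
N[(52 + 1) (1 - Sum[Binomial[4, i] Binomial[13, j] (-1)^(i + j) / (13 i + 4 j - i j + 1), {i, 0, 4}, {j, 0, 13}])]
(* ≈ 27.9998, as quoted above *)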

Until then
Yours Aye
Me

Tuesday, November 9, 2021

Estimating the height of a Mountain (or a Building)

I recently watched Matt Parker's attempt to measure the Earth's radius with Hannah Fry. As always, Matt made it entertaining, but there was something missing that I wanted to share in this post.

I remember seeing the idea used in the video in the History of Geodesy, along with the related idea ancient mathematicians used for finding the height of a mountain. What I find most problematic in the video's approach is the use of a sextant in finding the tower's height.

In my opinion, the parallax error induced in handmade sextants is difficult to control and can completely throw off the estimates. While we can't do away with the sextant in finding the Earth's circumference, we certainly can do better in the other case.

All we need is a pole of known length and two measurements using that pole, as shown in the figure above. That is, we first place the pole at some point and move ourselves into a position such that the top of the pole and the top of the mountain (or building) coincide in our line of sight. We then repeat the same at a different point.

Let $OH$ be the height we are trying to measure. Let $AC=BD$ be the pole, whose length we take as unity to simplify calculations. Now, using the similarity of triangles $OA'H$ and $OB'H$, we have

$$\displaystyle \frac{OH}{AC}=\frac{OB'}{BB'}=\frac{OA'}{AA'}=\frac{B'A'}{BB'-AA'}=\frac{B'A'}{B'A'-BA}=\frac{AB+BB'-AA'}{BB'-AA'}$$

which is an expression purely in terms of length measurements. Not only do we avoid sextants here, we also don't have to worry about measuring lengths from the mountain's (or building's) base.

Note that $A'$ and $B'$ are the only measurements in the expression that are prone to error because of parallax. If we let $OH=y$ and $BB'-AA'=z$, then we can rewrite the above as

$$OH=y=AC\cdot \left(1+\frac{AB}{z}\right)$$

Differentiating the above expression, we see that

$$\displaystyle \frac{dy}{dz}=-AC\cdot \frac{AB}{z^2}$$

This shows that if $z$ is too small (which means $AA'$ and $BB'$ are almost equal), then even a small error in $z$ results in a large change in the estimated height. So the best thing is to make the first measurement close to the mountain and the second far from it, so as to reduce this effect.

Because we rely on the line of sight in finding both $A'$ and $B'$, we are interested in some sort of interval estimate for the height. To this end, we assume that any measurement involving the points $A'$ and/or $B'$ is an independent Normal variable with variance $\sigma^2$. Now,

$$\displaystyle OH=AC \cdot \left(1+\frac{AB}{Z}\right)$$

where $Z =Z_{BB'}-Z_{AA'}\sim \mathcal{N}(BB'-AA',2\sigma^2)$.

Now to attain a confidence level of $\alpha$, we have

$\displaystyle \mathbb{P}(OH^{-} \leq OH \leq OH^{+})=\alpha$ where $\displaystyle OH^{\pm}=AC\cdot\left(1+\frac{AB}{BB'-AA' \mp z_\alpha \sqrt{2}\sigma}\right)$


Note that $z=BB'-AA'$ needs two measurements. We could alternately use the single measurement $A'B'$, which is also random with variance $2\sigma^2$. Practically though, as the two measuring points are far apart, $A'B'$ will be long. In this case, we have,

$\displaystyle \mathbb{P}(OH^{-} \leq OH \leq OH^{+})=\alpha$ where $\displaystyle OH^{\pm}=AC\cdot\left(1-\frac{BA}{B'A' \mp z_\alpha \sqrt{2}\sigma}\right)^{-1}$
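
As an illustration, here is a small Mathematica sketch of the point estimate and the first interval above; the pole length, distances, confidence level and $\sigma$ are made-up numbers, not actual measurements.

(* hypothetical inputs: 2 m pole, AB = 30 m, z = BB' - AA' = 1.5 m, sigma = 0.05 m *)
ac = 2; ab = 30; z = 1.5; sigma = 0.05;
za = InverseCDF[NormalDistribution[], 0.975];   (* two-sided 95% quantile, ≈ 1.96 *)
oh = ac (1 + ab/z)                              (* point estimate, = 42 m here *)
{ac (1 + ab/(z + za Sqrt[2] sigma)), ac (1 + ab/(z - za Sqrt[2] sigma))}
(* {OH-, OH+} ≈ {38.6, 46.1} m *)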

I plan to use this idea to estimate the height of a temple tower near my place. I'll update if I find anything.


Until then
Yours Aye
Me

Saturday, September 4, 2021

Animating the Spherical Tautochrone and Isochrone

I recently made a submission to 3Blue1Brown's Summer of Math Exposition based on the idea of Spherical Tautochrones. It's a short clip and if interested you can watch the video on my channel.

While making the video, I wanted to know the exact location of the particle at any given time so as to make the animation scientifically correct. But rather than making this about the spherical case, let's keep it general. Then, using the Conservation of Energy, we first have

$\displaystyle\frac{ds}{dt}=\sqrt{2g(y_0-y)}$

where $y_0$ denotes the y-coordinate of the sliding particle at time $t=0$.

Using the condition we derived for the Tautochrone curve in the video,

$\displaystyle \frac{ds}{dy}=\frac{T_0}{\pi}\sqrt{\frac{2g}{y}}$

Using the above two equations, we get

$\displaystyle \frac{dy}{dt}=\frac{\pi}{T_0}\sqrt{y(y_0-y)}$

In my python code for the video, I used python's odeint to numerically solve this equation. But thinking about it, there was a much easier way to do this.

Using the pendulum analogy we saw in the video, we can multiply the 'force equation' of the pendulum by $l$, the length of the pendulum to get

$\displaystyle \frac{d^2s}{dt^2}+\frac{g}{l}s=0$

which is just the equation of Simple Harmonic Motion, as noted in the video. Solving this with the initial conditions $s=s_0$ at $t=0$ and $s=0$ at $t=T_0$, we easily see that

$\displaystyle s=s_0\cos\left(\frac{\pi}{2}\frac{t}{T_0}\right)$

where we have used the expression $T_0=(\pi/2)\sqrt{l/g}$ to eliminate the $g/l$ term.

Also from the condition that we derived for tautochrones, we know that $s$ is directly proportional to $\sqrt{y}$. This immediately gives

$\displaystyle y=y_0\cos^2\left(\frac{\pi}{2}\frac{t}{T_0}\right)$

which gives the position of the particle at any time $t$. For the spherical co-ordinate, this reduces to

$\displaystyle \cos\left(\frac{\theta}{2}\right)=\cos\left(\frac{\theta_0}{2}\right)\cos\left(\frac{\pi}{2}\frac{t}{T_0}\right)$

This means we could have simply used this explicit expression, rather than numerically solving a differential equation, in the animation!!
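
As a sanity check, the closed form can be verified symbolically (here in Mathematica, though the animation itself was in python). Squaring the differential equation keeps the check free of sign conventions.

(* verify that y = y0 cos^2(pi t / (2 T0)) satisfies (dy/dt)^2 = (pi/T0)^2 y (y0 - y) *)
ClearAll[y, y0, T0, t];
y[t_] := y0 Cos[Pi t / (2 T0)]^2;
Simplify[D[y[t], t]^2 == (Pi/T0)^2 y[t] (y0 - y[t])]
(* True *)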

Suddenly, I realized that this would not work when animating the (polar) Isochrone. But then it dawned on me that the polar velocity is constant by the very definition of the polar isochrone. Great!!


Until then
Yours always
Me

Wednesday, August 18, 2021

Birds on a Wire - a Probability puzzle

This post is about Birds on a Wire, a nice probability puzzle that I recently saw on the cut-the-knot website. The site gives five solutions, all of which are somewhat informal.

While all the solutions were pretty, I was particularly struck by the elegant solution of Stuart Anderson. Studying his solution made me realize that the condition that '$n$ is large' is not very significant. In fact, we'll use his idea to solve the finite version of the problem.

Let the length of the wire be $1$ unit. Let $X_k$ ($1\leq k \leq n$) denote the position of the $k$th bird to sit on the wire. That is, all the $X_k$'s are independent standard uniform random variables.

Let $X_{(k)}$ be the random variable denoting the $k$th Order statistic. Now let's define new random variables $Y_k$ for $0 < k \leq n+1$ such that

$Y_k=X_{(k)}-X_{(k-1)}$ with $X_{(0)}=0$ and $X_{(n+1)}=1$.

Note that the $n$ birds divide the wire into $n+1$ segments, and the random variable $Y_k$ denotes the length of the $k$th segment.

Using the joint distribution of the Order statistics of the Uniform distribution, we can see that the $Y_k$s are identical (but not necessarily independent) $Beta(1,n)$ variables.

Therefore, the cumulative distribution for any $Y_k$ is $F(y)=\mathbb{P}(Y_k \leq y)=1-(1-y)^n$ and the expected value is $\mathbb{E}(Y_k)=1/(n+1)$.

Now, if $n$ were large, we could use the fact that the above CDF is approximated by $1-e^{-ny}$ and the expected value by $1/n$. But as noted earlier, we don't have to assume $n$ is large and can proceed directly.

Let's define new variables $C_k$ such that

$\displaystyle C_k=\begin{cases}0, & \text{if }k \text{th segment is not coloured}\\  Y_k, & \text{otherwise} \end{cases}$

It is straightforward to see that

(i) $\mathbb{E}(C_1)=\mathbb{E}(C_{n+1})=0$ because those segments never get coloured, and
(ii) $\mathbb{E}(C_2)=\mathbb{E}(Y_2)=\mathbb{E}(C_n)=\mathbb{E}(Y_n)=1/(n+1)$ because those segments always get coloured.

For the cases $2<k<n$, consider the segment $Y_k$ along with its neighbours $Y_{k-1}$ and $Y_{k+1}$. A minute's thought shows that the $k$th segment is not coloured if and only if it is the biggest among the three variables.

If we let $Z_k=\max(Y_{k-1},Y_k,Y_{k+1})$, we can rewrite $C_k$ as

$\displaystyle C_k=\begin{cases}0, & Y_k=Z_k\\  Y_k, & \text{otherwise} \end{cases}$

But $\displaystyle \mathbb{P}(Y_k=Z_k)=\frac{1}{3}$ by symmetry. Therefore,

$\displaystyle \mathbb{E}(C_k)=\mathbb{P}(Y_k=Z_k)\cdot 0+\mathbb{P}(Y_k \ne Z_k) \mathbb{E}(Y_k|Y_k \ne Z_k)=\frac{2}{3}\mathbb{E}(Y_k|Y_k \ne Z_k)$

To compute this, we condition on whether $Y_k=Z_k$. This gives,

$\displaystyle \mathbb{E}(Y_k)=\frac{1}{3}\mathbb{E}(Y_k|Y_k=Z_k)+\frac{2}{3}\mathbb{E}(Y_k|Y_k \ne Z_k)$

Rearranging this, and noting that $\mathbb{E}(Y_k|Y_k=Z_k)=\mathbb{E}(Z_k|Y_k=Z_k)=\mathbb{E}(Z_k)$ by the exchangeability of the three segments, we have

$\displaystyle \mathbb{E}(C_k)=\frac{2}{3}\mathbb{E}(Y_k|Y_k \ne Z_k)=\mathbb{E}(Y_k)-\frac{1}{3}\mathbb{E}(Y_k|Y_k=Z_k)=\mathbb{E}(Y_k)-\frac{1}{3}\mathbb{E}(Z_k)$

To compute the expected value of $Z_k$, we proceed in two parts.

First, consider any distribution with CDF $F(v)$ and the order statistics $V_{(i)}$, $V_{(j)}$ and $V_{(k)}$ with $i<j<k$.

It can be seen intuitively that the conditional distribution of $V_{(j)}$ given $V_{(i)}$ and $V_{(k)}$ is the same as the $(j-i)$th order statistic obtained from a sample of size $k-i-1$ whose distribution is $F(v)$ truncated on the left at $V_{(i)}$ and on the right at $V_{(k)}$.

For example, if we know that $V_{(5)}=0.4$ and $V_{(10)}=0.9$ from the standard uniform distribution $U(0,1)$, then $V_{(7)}$ has the same distribution as $W_{(2)}$ from a sample of size $4(=10-5-1)$ from the distribution $W\sim U(0.4,0.9)$.

This result can be obtained by using Theorems 2 and 3 given here.

Second, it is a well-known result in probability (see cut-the-knot here and here, to quote a few) that if a line of length $1$ unit is broken at $n-1$ points giving $n$ segments, then the expected length of the $k$th largest segment is $(1/k+\cdots+1/n)/n$. Specifically, the expected length of the largest segment in the 3-segment case is $11/18$.

Therefore,

$\displaystyle \mathbb{E}(Z_k)=\mathbb{E}(\mathbb{E}(Z_k|X_{(k-2)},X_{(k+1)}))=\mathbb{E}\left(\frac{11}{18}(X_{(k+1)}-X_{(k-2)})\right)=\frac{11}{18}\frac{3}{n+1}$

Using the above and substituting the known values, we get

$\displaystyle \mathbb{E}(C_k)=\frac{1}{n+1}-\frac{1}{3}\frac{11}{18}\frac{3}{n+1}=\frac{7}{18n+18}$

If we now let $C=C_1+C_2+\cdots+C_n+C_{n+1}$, then using the linearity of expectations, we get

$\displaystyle \mathbb{E}(C)=\frac{2}{n+1}+(n-3)\frac{7}{18n+18} =\frac{7n+15}{18n+18}$

Note that this applies only when $n\geq 3$ (cases $n<3$ are trivial to calculate). More importantly, this shows that the expected value tends to $7/18$ for large $n$.
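
A quick Monte Carlo run (a Mathematica sketch; the choice of $n$ and the trial count are arbitrary) agrees with this formula:

(* n birds at uniform positions; end segments are never coloured, the 2nd and nth always are,
   and an interior segment is coloured unless it is the largest of itself and its two neighbours *)
n = 10; trials = 200000;
mc = Mean[Table[
    y = Differences[Join[{0}, Sort[RandomReal[1, n]], {1}]];
    y[[2]] + y[[n]] + Sum[If[y[[k]] < Max[y[[k - 1]], y[[k + 1]]], y[[k]], 0], {k, 3, n - 1}],
    {trials}]];
{mc, N[(7 n + 15)/(18 n + 18)]}
(* both should be close to 0.4293 *)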

I would like to thank Abhilash Unnikrishan for pointing out the error in my original solution (I had assumed that the $Y_k$s are independent) and for suggesting the use of the 'expected value of the largest segment in a line' to simplify the calculations in arriving at the final result.

Hope you enjoyed the discussion.


Until then
Yours Aye
Me

Sunday, July 11, 2021

Doubling the Cube

It's always interesting when we get to visit one of the three classical problems in Geometry. As the title of this post says, we are going to talk about Doubling the Cube.

It is well known that doubling the cube is impossible with only a compass and straightedge, but it is tractable with a neusis ruler. One of the simplest such constructions is given on Wikipedia. We use a slightly modified version in this post.


Construct an equilateral triangle $ABC$. Draw a line through $B$ perpendicular to $BC$. We now use the neusis to find a point $D$ on (the extension of) line $AB$ such that the line $DC$ meets the perpendicular at a point $E$ with $ED=AB$. Then, $CE=\sqrt[3]{2}AB$.

The wiki article quoted above does not prove this, so I decided to try it on my own. To begin with, because $\angle CBE$ is a right angle, we see that $BE=\sqrt{CE^2-CB^2}$.

Now we construct a point $E'$ on $BD$ such that $EE'$ is parallel to $CB$.



Because $\triangle DEE' \sim \triangle DCB$, their sides are in proportion. Therefore,

$\displaystyle \frac{EE'}{CB} = \frac{DE}{DC} \implies EE'=BC \cdot \frac{BC}{BC+CE}$.

As $\triangle BE'E$ is a $30-60-90$ triangle, $BE=\sqrt{3}EE'$. Therefore,

$\displaystyle \sqrt{3}\frac{BC^2}{BC+CE}=\sqrt{CE^2-BC^2} \implies (CE^2-BC^2)(CE+BC)^2=3BC^4$.

If we let $CE=xBC$, then the above expression reduces to $(x^2-1)(x+1)^2=3$.

By the rational root theorem, we can guess that $x=-2$ is a root of the equation. Taking that factor out, $(x^2-1)(x+1)^2-3=x^4+2x^3-2x-4=(x+2)(x^3-2)$, so the only positive root satisfies $x^3=2$, thus proving that $CE=\sqrt[3]{2}BC$.

This last part of the proof is the one I find a little unsatisfactory. For one of the classical problems in Geometry, this proof is more algebraic than geometric.

I went in search of a more geometric proof myself. I couldn't find one, so I went on an internet search and, to my surprise, there aren't a lot. Finally, I found Nicomedes' proof on this page, which gave me what I was looking for. I also got to know about Crockett Johnson's painting of Doubling the Cube.

Luckily, I was able to modify (and simplify) the argument to get something that looks elegant to my eyes. This proof starts by constructing a line through $D$ perpendicular to $CB$. Let this line meet (the extension of) $CB$ at $J$.



As $\triangle CBE \sim \triangle CJD$,

$\displaystyle \frac{CB}{BJ}=\frac{CE}{ED} \implies CE \cdot BJ=CB \cdot ED=CB \cdot AB$

Note that $\triangle DBJ$ is a $30-60-90$ triangle. Therefore, $BD=2BJ$. This finally shows us that

$CE \cdot BD=2CB \cdot AB \tag{1} \label{reqn}$

Now comes the second part of our construction. Extend $AC$ and mark a point $F$ on it such that $CF=CD$. Join $D$ and $F$. Draw a line through $B$ perpendicular to $AD$ and mark its intersection with $AF$ as $G$. Draw a line through $G$ parallel to $AB$ and let it meet $DF$ at $H$. Join $H$ and $B$.



$\triangle GAB$ is a $30-60-90$ triangle. Therefore, $GA=2AB=2CB$.

Because $CD=CF$ by construction, $CE=GF$; indeed, $GF=AF-GA=(AC+CF)-2AB=CF-AB=CD-AB=CE$. Using this, we recast $\eqref{reqn}$ to

$GF \cdot BD=GA \cdot AB \tag{2} \label{rreqn}$

Our plan now is to prove that $GH=AB$. As $\triangle FGH \sim \triangle FAD$, we have

$\displaystyle \frac{GH}{GF}=\frac{AD}{AF} \implies GH \cdot AF=GF \cdot ( AB+BD)=GF\cdot AB+GF\cdot BD$

Using $\eqref{rreqn}$ now, we have,

$\displaystyle GH \cdot AF=GF\cdot AB+GA\cdot AB=AB\cdot (GF+GA)=AB \cdot AF \implies GH = AB$

This also shows that $BH$ is parallel to $AF$. Therefore,

$\displaystyle\frac{GH}{GF}=\frac{AD}{AF}=\frac{BD}{BH}  \tag{3} \label{feqn}$

because $\triangle FGH \sim \triangle FAD \sim \triangle HBD$.

We now construct a point $C'$ on $AB$ such that $CC'$ is perpendicular to $AB$. Note that $C'$ will also be the midpoint of $AB$. Now,

$DB\cdot DA=(DC'-C'B)(DC'+C'B)=DC'^2-C'B^2$

Now $CC'^2=CD^2-C'D^2=CB^2-C'B^2$ both by Pythagoras theorem on respective $\triangle CC'D$ and $\triangle CC'B$. Using this,

$DB \cdot DA=CD^2-CB^2=CF^2-CA^2=(CF-CA)(CF+CA)=GF\cdot AF$

Therefore,

$\displaystyle \frac{GF}{DB}=\frac{DA}{AF}$

(Alternately, we can construct a circle centered at $C$ with $CA$ as radius. Because both $D$ and $F$ are equidistant from $C$, tangents from $D$ and $F$ to the circle will be of the same length. In other words, both points have the same power w.r.t the circle. But it is apparent from the configuration that $\mathcal{P}(C,D)=DB\cdot DA$ and $\mathcal{P}(C,F)=FG \cdot FA$)

Using $\eqref{feqn}$, we finally have $\displaystyle \frac{GH}{GF}=\frac{GF}{BD}=\frac{BD}{BH}$

Because $ABHG$ is a parallelogram, $BH=AG=2GH$. Therefore,

$\displaystyle \left(\frac{GF}{GH}\right)^3=\frac{GF}{GH}\frac{GF}{GH}\frac{GF}{GH}=\frac{GF}{GH}\frac{BD}{GF}\frac{BH}{BD}=\frac{BH}{GH}=2$

Using the fact that $GF=CE$ and $GH=AB$, we finally have

$CE^3=2AB^3$ (or) $CE=\sqrt[3]{2}AB$
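
For the skeptical, here is a quick numeric check of the construction in Mathematica; the coordinates (taking $AB=1$) are my own parametrization of the figure.

(* B at the origin, C = (1, 0), A above BC; D slides along line AB beyond B *)
b = {0, 0}; c = {1, 0}; a = {1/2, Sqrt[3]/2};
dPt[t_] := b + t (b - a);
ePt[t_] := {0, -Sqrt[3] t/(t + 2)};       (* where line DC meets the perpendicular at B *)
dist[u_, v_] := Sqrt[(u - v).(u - v)];
tSol = t /. FindRoot[dist[ePt[t], dPt[t]] == 1, {t, 1.5}];   (* neusis: ED = AB = 1 *)
{dist[c, ePt[tSol]], N[2^(1/3)]}
(* {1.25992, 1.25992} *)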


Until then
Yours Aye
Me

Thursday, May 20, 2021

Birthday Problem without Replacement

In one of the previous posts, we saw the generalization of Coupon collector's problem without replacement. Not long after, I thought about the same generalization for the Birthday problem.

Because a deck of cards is a better platform for dealing with problems without replacement, we use the same setup here. That is, we seek the expected number of cards we need to draw (without replacement) from a well-shuffled pack to get two cards of the same rank.

This problem is a lot easier. We have $n=13$ ranks, each with $j=4$ instances (suits). Using the Hypergeometric distribution, the probability that we need more than $k$ cards is

$\displaystyle \mathbb{P}(X>k)=\frac{\binom{13}{k}\cdot 4^k}{\binom{52}{k}}$

That is, of the $k$ cards we select from the $52$, we can have at most one instance of each rank. Therefore, the $k$ cards should come from $k$ distinct ranks, and for each of these cards we have $4$ choices of suit.

Now the expected value is obtained easily.

$\displaystyle \mathbb{E}(X)=\sum_{k=0}^n \mathbb{P}(X>k)=\sum_{k=0}^n \frac{\binom{n}{k}\cdot j^k}{\binom{nj}{k}}=1+\sum_{k=1}^nj^k\binom{n}{k}\binom{nj}{k}^{-1}$

Needless to say, this agrees with the classical expression (that is, at $j \to \infty$) given on Pg 417 (or Pg 433) of 'Analytic Combinatorics'. We now attempt to find asymptotics when $n\to\infty$.

Let $\displaystyle a_k=j^k\binom{n}{k}\binom{nj}{k}^{-1}$ for $k\geq 1$

Then, using the beautiful idea I learned from Hirschhorn's page (for example, papers 176 and 197),

$\begin{align}\displaystyle a_k &= j^k\binom{n}{k}\binom{nj}{k}^{-1}\\ &= \frac{n}{nj}\frac{n-1}{nj-1}\cdots\frac{n-k+1}{nj-k+1}j^k \\ &= \frac{1-\frac{0}{n}}{1-\frac{0}{nj}}\frac{1-\frac{1}{n}}{1-\frac{1}{nj}}\cdots \frac{1-\frac{k-1}{n}}{1-\frac{k-1}{nj}} \\ &\approx \text{exp}\left\{-\frac{0}{n}-\frac{1}{n}-\cdots -\frac{k-1}{n}+\frac{0}{nj}+\frac{1}{nj}+\cdots +\frac{k-1}{nj}\right\} \\ &\approx \text{exp}\left\{-\frac{1-j^{-1}}{n}\frac{k^2}{2}\right\} \\ \end{align}$

This is nothing but Laplace's method widely discussed in literature and especially in my favorite book 'Analytic Combinatorics' (page 755).

Note that we have used the idea that $(1+x)^m=e^{m\log(1+x)}\approx e^{mx}$ for small $x$. Now,

$\displaystyle \mathbb{E}(X)=1+\sum_{k=1}^n a_k \approx 1+\int\limits_0^\infty \text{exp}\left\{-\frac{1-j^{-1}}{n}\frac{k^2}{2}\right\}\,dk=1+\sqrt{\frac{\pi}{2}}\sqrt{\frac{n}{1-j^{-1}}}$

where we've used the standard Normal integral. For large $j$, this clearly reduces to the classical asymptotic value.
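
To see why the extra terms below are worth chasing, we can compare the exact sum against this leading-order estimate for the deck case of the earlier post ($n=13$, $j=4$); at such a small $n$, the first-order formula overshoots noticeably.

n = 13; j = 4;
exact = N[1 + Sum[j^k Binomial[n, k]/Binomial[n j, k], {k, 1, n}]];
leading = N[1 + Sqrt[Pi/2] Sqrt[n/(1 - 1/j)]];
{exact, leading}
(* ≈ {5.6965, 6.2180} *)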

In fact, for large $n$, the asymptotic expansion of the binomial coefficient is

$\begin{align}\displaystyle \binom{n}{k} & \approx \frac{n^k}{k!}\text{exp}\left\{-\frac{S_1(k-1)}{n}-\frac{S_2(k-1)}{2n^2}-\frac{S_3(k-1)}{3n^3}\cdots\right\} \\ & \approx \frac{n^k}{k!}\text{exp}\left\{-\frac{k(k-1)}{2n}-\frac{k^3}{6n^2}-\frac{k^4}{12n^3}\cdots\right\}\\ & \approx \frac{n^k}{k!}\text{exp}\left\{-\frac{k^2}{2n}-\frac{k^3}{6n^2}-\frac{k^4}{12n^3}\cdots\right\}\\ \end{align}$

where $S_r(m)$ is the sum of the $r$-th powers of the first $m$ natural numbers.

Using the second expression and simplifying a bit more, we have

$\displaystyle a_k \approx \text{exp}\left\{\frac{1-j^{-1}}{8n}\right\}\text{exp}\left\{-\frac{(1-j^{-1})(k-\frac{1}{2})^2}{2n}\right\}\left(1-\frac{1-j^{-2}}{6n^2}k^3-\frac{1-j^{-3}}{12n^3}k^4+\frac{1}{2}\frac{(1-j^{-2})^2}{36n^4}k^6+\cdots\right)$

If we now use the Normal approximation, we get

$\begin{align}\displaystyle \mathbb{E}(X) &\approx 1+\text{exp}\left\{\frac{1-j^{-1}}{8n}\right\}\left(\sqrt{\frac{n\pi/2}{1-j^{-1}}}-\frac{1}{3}\frac{j+1}{j-1}-\frac{1}{4}\frac{1-j^{-3}}{(1-j^{-1})^{5/2}}\sqrt{\frac{\pi}{2n}}+\frac{5}{24}\frac{(1-j^{-2})^2}{(1-j^{-1})^{7/2}}\sqrt{\frac{\pi}{2n}}\right) \\ &= 1+\text{exp}\left\{\frac{1-j^{-1}}{8n}\right\}\left(\sqrt{\frac{n\pi/2}{1-j^{-1}}}-\frac{1}{3}\frac{j+1}{j-1}-\frac{1}{24}\frac{1-4j^{-1}+j^{-2}}{(1-j^{-1})^2}\sqrt{\frac{1-j^{-1}}{2n/\pi}} \right) \\ &\approx 1+\sqrt{\frac{n\pi/2}{1-j^{-1}}}-\frac{1}{3}\frac{j+1}{j-1}+\frac{1}{12}\frac{1-j^{-1}+j^{-2}}{(1-j^{-1})^2}\sqrt{\frac{1-j^{-1}}{2n/\pi}} \\ \end{align}$

which I think is an $O(n^{-1})$ approximation. Hope you liked the discussion. The numbers can be checked with the following Mathematica code.

Clear["Global`*"];
n = 10000; j = 7;
acc = 50;
res = N[1, acc];
k = 1; nume = N[n, acc]; deno = N[n, acc];
Monitor[Do[
     res += N[nume/deno, acc];
     nume *= (n - k); deno *= (n - k/j);
     , {k, n}];, {k, res}]; // AbsoluteTiming
res
N[1 + Exp[(1 - j^-1)/(8 n)] (Sqrt[(n \[Pi]/2)/(1 - j^-1)]
     - (1/3) (1 + j^-1)/(1 - j^-1)
     - (1/24) (1 - 4 j^-1 + j^-2)/(1 - j^-1)^2 Sqrt[(1 - j^-1)/(2 n/\[Pi])]), acc]


Until then
Yours Aye
Me