Sunday, April 24, 2022

The Problem of points - Without Replacement

Consider an urn with $A$ green balls and $B$ red balls. Two players, A and B, play a game where they draw balls from this urn one at a time (without replacement). If the drawn ball is green, Player A gets a point; otherwise, Player B gets a point.

If Player A needs $a$ points to win the game whereas Player B needs $b$ points, the question we address in this post is the probability that Player A wins the match.

As the title says, this is just the Problem of points but without replacement. We've already encountered the problem on this blog in the post titled A note on Partial fractions, where we saw how Pascal and Fermat independently solved it with different reasoning.

Interestingly, Pascal's reasoning involves the (CDF of the) Neg. Binomial distribution whereas Fermat's idea involves the (CDF of the) Binomial distribution. The central insight in both ideas is the realisation that a maximum of $a+b-1$ games suffices to conclude the match.

Let's consider the original Problem of points for a moment. If we let $p$ and $q$ denote the probabilities of A and B winning a single game, then using Pascal's idea,

$\displaystyle \mathbb{P}(\text{A wins the match})=\sum_{k=0}^{b-1}\binom{k+a-1}{k}p^aq^k$

where the individual terms are probabilities from the Neg. Binomial distribution, $NB(a,p)$.

Using Fermat's idea,

$\displaystyle \mathbb{P}(\text{B wins the match})=\sum_{k=b}^{a+b-1}\binom{a+b-1}{k}q^kp^{a+b-1-k}$

where the individual terms are probabilities from the Binomial distribution, $B(a+b-1,q)$.
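
Both routes give the same answer, and this is easy to confirm numerically. Here is a quick Python check with exact rational arithmetic (a sketch of my own; the function names `pascal` and `fermat` are not from the original discussion):

```python
from fractions import Fraction
from math import comb

def pascal(a, b, p):
    # Pascal: sum Neg. Binomial probabilities that A's a-th point arrives
    # while B still has at most b-1 points
    return sum(comb(k + a - 1, k) * p**a * (1 - p)**k for k in range(b))

def fermat(a, b, p):
    # Fermat: play all a+b-1 games regardless; B wins iff B takes b or more,
    # so A's winning probability is the complement
    n = a + b - 1
    q = 1 - p
    return 1 - sum(comb(n, k) * q**k * p**(n - k) for k in range(b, n + 1))
```

For instance, with $p=1/2$, $a=3$, $b=2$, both give $5/16$.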

Now, using our discussion of the Neg. Hypergeometric distribution in the previous post together with Pascal's idea, we could solve the without-replacement case of the Problem of points. Or we could go for simplicity and use the Hypergeometric distribution with Fermat's idea.

Either way, let $P(A,a,B,b)$ be the probability of Player A winning the game in the setup described at the start of the post, and let $M$ be a very large number. It should be clear that the classical case can be recovered from this setup as $P(pM,a,qM,b)$ with $p+q=1$ and $M\to\infty$.

Obviously, when $A=B$ and $a=b$, either player has an equal chance of winning, by symmetry. Similarly, when $A=B$ and $a<b$, Player A has an advantage; the same goes when $a=b$ and $A>B$.

But consider $P(50,25,40,20)$. Player A has 10 more balls than Player B but needs 5 more points to win the game. It isn't immediately clear whether this setup favours Player A or not. In fact, the probability of Player A winning the match is only about 49.07%.

Naturally, we might expect that as we break the symmetry, the match skews one way or the other. Perhaps the most surprising result about the without-replacement case of the Problem of points is the one family of parameters that forms an exception.

Consider $P(21, 11, 19, 10)$. Player A has two more balls than Player B but needs one extra point to win the match. Surprisingly, this match is evenly poised. Not just this: for every positive integer $n$,

$\displaystyle P(2n+1,n+1,2n-1,n)=\frac{1}{2}$

Clearly, the parameters that define the game lack symmetry, yet the game is evenly poised between the two players. I could come up with neither a reason nor a simple explanation of why this should be the case. If you can, please post it in the comments below.

(* pp and p: two equivalent forms for the probability that Player A wins *)
pp[A_, a_, B_, b_] := (A / (A + B)) Binomial[A - 1, a - 1] Sum[Binomial[B, j] / Binomial[A + B - 1, j + a - 1], {j, 0, b - 1}];
p[A_, a_, B_, b_] := Sum[Binomial[j + a - 1, j] Binomial[A - a + B - j, B - j], {j, 0, b - 1}] / Binomial[A + B, B];
k = 2;
ListPlot[Table[p[2n + k, n + k, 2n - k, n], {n, 100}]]
p[50, 25, 40, 20]
p[21, 11, 19, 10]
p[26, 13, 22, 12]
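
For what it's worth, the same computation can be cross-checked in Python with exact rational arithmetic (my own sketch mirroring the Mathematica `p` above; `p_win` is a made-up name):

```python
from fractions import Fraction
from math import comb

def p_win(A, a, B, b):
    # P(Player A wins): urn with A green and B red balls,
    # A needs a points, B needs b points, drawing without replacement
    num = sum(comb(j + a - 1, j) * comb(A - a + B - j, B - j)
              for j in range(b))
    return Fraction(num, comb(A + B, B))
```

It reproduces the values quoted above: `p_win(50, 25, 40, 20)` is about 0.4907 and `p_win(21, 11, 19, 10)` is exactly $1/2$.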

Until then
Yours Aye

Saturday, April 23, 2022

Expected distance between two points inside a Sphere

This post picks up where we left off earlier. As discussed before, we want to find the expected distance between two points randomly selected in an $n$-sphere.

Let the two points be $P$ and $Q$, and let $O$ be the centre. The first step is to apply the technique of projective reduction with $O$ as the scaling point. This shows that the expected length between two points inside an $n$-sphere is $2n/(2n+1)$ times the expected length between a point on the surface of the $n$-sphere and another point inside it.

WLOG, let $P$ be the point on the surface, and let's orient the sphere so that $P$ becomes the south pole. The next step is to apply another projective reduction with the south pole as the scaling point. This step gives another factor of $n/(n+1)$ for the expected length. Note that while the first reduction preserves the uniformity of the points on the respective 'surfaces', that is not the case with the second reduction. This is what makes the problem tricky.

To find the density of $Q$, consider a random chord from point $P$ (the south pole) along with the axis on which $P$ lies. Let $\theta$ be the angle between the chord and the axis.

Rotating the chord about this axis cuts out a (for lack of a better word) spherical cone, which consists of a cone with its apex at the south pole and a spherical cap on top (imagine the other end of the chord to be on the opposite side of $P$). With simple geometry, it is also apparent that the angle between $OQ$ and the axis is $2\theta$.

If we let $V_{\text{sph. cone}}(\theta, r)$ be the volume of the spherical cone of apex angle $\theta$ cut from an $n$-sphere of radius $r$, then we have,

$\begin{align} \displaystyle V_{\text{sph. cone}}(\theta, r) &= V_{\text{cone}} + V_{\text{sph. cap}}\\&=  \frac{1}{n}V_{n-1}(r \sin 2\theta)(r+r\cos2\theta)+\int_{0}^{2\theta}V_{n-1}(r \sin t) \,d(-r\cos t)\\ &= r^n\frac{v_{n-1}}{n}\sin^{n-1}2\theta \cdot 2  \cos^2\theta + r^nv_{n-1}\int_0^{2\theta}\sin^nt\,dt \\ &=\frac{2^nr^nv_{n-1}}{n}\sin^{n-1}\theta\cos^{n+1}\theta + r^nv_{n-1}\int_0^{2\theta}\sin^nt\,dt \end{align}$

We have used the fact that the volume of an $n$-dimensional cone is $1/n$ times the volume of the circumscribing 'cylinder'. For the volume of the cap, we have used the fact that it is the sum of the volumes of $(n-1)$-dimensional 'circles' of infinitesimal thickness (see here for more).

Now, $\frac{\partial^2 V_{\text{sph. cone}}}{\partial r\,\partial \theta}\,dr\,d\theta$ gives the volume between $V_{\text{sph. cone}}(\theta + d\theta, r+dr)$ and $V_{\text{sph. cone}}(\theta, r)$. When $Q$ lies in this infinitesimal volume, the distance between $P$ and $Q$ is just $2\cos\theta$.

Because we are only interested in the distribution of $Q$ on the surface of the $n$-sphere, we can integrate out $r$. Therefore, the density is given by,

$\displaystyle f(\theta)=\frac{1}{v_n}\int_{0}^1\frac{\partial^2 V_{\text{sph. cone}}}{\partial r\,\partial \theta}\,dr$

We now get,

$\displaystyle f(\theta)=\frac{2^nv_{n-1}}{nv_n}((n-1)\cos^{n+2}\theta\sin^{n-2}\theta-(n+1)\sin^n\theta\cos^n\theta)+\frac{v_{n-1}}{v_n}\cdot 2\sin^n2\theta$

where we have used Leibniz's rule for differentiation under the integral sign for the last term. Further simplification then gives,

$\displaystyle f(\theta)=\frac{2^nv_{n-1}}{nv_n}(n-1)\sin^{n-2}\theta\cos^n\theta$

This nice simplification allows us to express the expected length $e_n$ in terms of the Beta function.

$\displaystyle e_n=\int_0^{\pi/2}2\cos\theta\cdot f(\theta)\,d\theta=\frac{2^n}{n}\frac{v_{n-1}}{v_n}(n-1)B(n/2-1/2, n/2+1)$

Finally, we see that the expected length $L$ between two points randomly selected inside an $n$-sphere is,

$\displaystyle \mathbb{E}(L)=\frac{2n}{2n+1}\frac{n}{n+1}\frac{2^n(n-1)}{n}\frac{B(n/2-1/2,n/2+1)}{B(n/2+1/2,1/2)}$
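
As a sanity check (a quick Python sketch of my own; `mean_dist_ball` is a made-up name), this formula reproduces the classical values $128/(45\pi)$ for the disk ($n=2$) and $36/35$ for the ordinary ball ($n=3$):

```python
from math import gamma, pi

def beta(a, b):
    # Euler Beta function via Gamma
    return gamma(a) * gamma(b) / gamma(a + b)

def mean_dist_ball(n):
    # E(L) for two uniform points inside a unit n-ball, per the formula above
    return (2*n / (2*n + 1)) * (n / (n + 1)) * (2**n * (n - 1) / n) \
        * beta(n/2 - 1/2, n/2 + 1) / beta(n/2 + 1/2, 1/2)
```

For $n=100$ the value is already within a couple of percent of $\sqrt{2}$.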

Interestingly, as $n\to \infty$, irrespective of whether we choose the points on the surface or in the interior, the expected length goes to $\sqrt{2}$. I can't clearly see why, but if you have an idea, please do share it with us.

Until then
Yours Aye

Expected distance between two points on the Surface of a sphere

Problems in geometric probability are always fascinating. In this post, we're going to see two of the well-known and challenging problems in this topic.

The first problem asks for the expected distance between two points selected at random from the surface of an $n$-sphere. To clarify, I'm considering a circle to be a $2$-sphere, a sphere to be a $3$-sphere, and so on.

This problem is relatively easy, as can be seen from this StackExchange post. Let $S_n(R)$ and $V_n(R)$ denote the surface area and the volume of an $n$-sphere with radius $R$. We know that,

$S_n(R)=s_nR^{n-1}$ and $V_n(R)=v_nR^n$

where $s_n=\dfrac{2\pi^{n/2}}{\Gamma(n/2)}$ and $v_n=\dfrac{2}{n}\dfrac{\pi^{n/2}}{\Gamma(n/2)}$

For the problem under consideration, we can take one of the points to be the south pole of the $n$-sphere. Consider the axis that connects this point to the origin. A hyperplane perpendicular to this axis cuts the surface of the $n$-sphere in an $(n-1)$-spherical surface.

If we consider multiple such hyperplanes parallel to one another along the axis, the $n$-spherical surface is sliced into multiple rings. The picture below shows this in the case of a $3$-sphere (taken from this Stack Exchange post).

Because the second point is uniformly distributed on the $n$-spherical surface, the probability of it lying in a given ring is proportional to the ring's area.

Let $\theta$ be the angle subtended between the radius that contains the second point and radius containing the first point. Then the distance between the points is $2\sin(\theta/2)$.

If we consider these rings to be infinitesimally thin, the density of the second point at any of these rings is proportional to the area of the ring. But this is easy: it is just the surface area of the sliced $(n-1)$-sphere times $r\,d\theta$ (obviously $r=1$ for the unit sphere; it is kept here for illustrative purposes). Therefore,

$\displaystyle \mathbb{E}(L)=\frac{1}{s_n}\int_0^\pi 2\sin(\theta/2)\cdot S_{n-1}(r\sin\theta)\cdot r\,d\theta=\frac{s_{n-1}}{s_n}\int_0^\pi 2\sin(\theta/2)\cdot \sin^{n-2}\theta \,d\theta$

Using Wolfram Alpha, we then see that

$\displaystyle \mathbb{E}(L)=\frac{\Gamma(n/2)}{\sqrt{\pi}\Gamma((n-1)/2)}\cdot \frac{2\sqrt{\pi}\Gamma(n-1)}{\Gamma(n-1/2)}=2\frac{\Gamma(n/2)}{\Gamma((n-1)/2)}\cdot \frac{\Gamma(n-1)}{\Gamma(n-1/2)}$
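
As a quick numerical check (a Python sketch of my own; `mean_chord` is a made-up name), this reproduces $4/\pi$ for the circle ($n=2$) and $4/3$ for the ordinary sphere ($n=3$), and tends to $\sqrt{2}$ as $n$ grows:

```python
from math import exp, lgamma, log, pi

def mean_chord(n):
    # E(L) for two uniform points on the surface of a unit n-sphere
    # (the author's convention: n = 2 is a circle, n = 3 a sphere);
    # computed with log-Gamma so that large n does not overflow
    return exp(log(2) + lgamma(n / 2) - lgamma((n - 1) / 2)
               + lgamma(n - 1) - lgamma(n - 1 / 2))
```

Already at $n=1000$ the value agrees with $\sqrt{2}$ to three decimal places.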

Now, the second problem asks for the expected distance between a point randomly selected on the $n$-spherical surface and a point selected in the $n$-spherical volume. This problem is more involved and requires a little more trickery. We'll see it in the next post.

Until then
Yours Aye

Wednesday, March 23, 2022

Notes on the Hypergeometric Distribution and its Negative

I wanted to create a post about the Hypergeometric distribution, one of the most important distributions in probability theory, as a prelude to my next post.

The probability mass function of a Hypergeometric distribution with parameters $N$, $K$ and $n$ is given by

$\displaystyle f(k;N,K,n)=\mathbb{P}(X=k)=\frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}$

The distribution has several symmetries which will be the key content of this post.

To start with, the roles of successes and failures can be swapped. In terms of probability, this means that $f(k;N,K,n)=f(n-k;N,N-K,n)$. The binomial coefficients in the numerator of $f(k;N,K,n)$ get swapped under this transformation.

Next, the roles of drawn and not-drawn elements can be swapped. In terms of probability, this gives $f(k;N,K,n)=f(K-k;N,K,N-n)$. This amounts to invoking the symmetry of the binomial coefficient, $\binom{n}{k}=\binom{n}{n-k}$.

Finally, the roles of drawn elements and successes can be swapped as well. In terms of probability, this becomes $f(k;N,K,n)=f(k;N,n,K)$. In other words,

$\displaystyle \mathbb{P}(X=k)=\binom{n}{k}\frac{\binom{N-n}{K-k}}{\binom{N}{K}}$

I find this pretty for two reasons. Firstly, there are two 'nCk' terms with the lowercase pair in the numerator and the uppercase in the denominator which gives it a nice structure. Secondly, and most importantly, this brings out the parallel with the Binomial distribution.

Comparing the probability of $k$ successes above with that of a $\text{Bin}(n,K/N)$ variable, we have,

$\displaystyle \frac{\binom{N-n}{K-k}}{\binom{N}{K}} \approx \left(\frac{K}{N}\right)^k\left(1-\frac{K}{N}\right)^{n-k}$

where $N \gg n$ and the ratio $K/N$ is held fixed. This can be seen as an asymptotic expression for the ratio of two binomial coefficients.

To make better use of this asymptotic expression, let $K=pN$ and $p+q=1$. Then, the above can be rewritten as

$\displaystyle p^iq^j \approx \frac{\binom{N-i-j}{K-i}}{\binom{N}{K}}$

Lastly, all of the above symmetries can be brought under a single umbrella using a beautiful idea from Duality and Symmetry in the Hypergeometric distribution. Surprisingly, this idea is relatively unknown. The authors show that the probability associated with the Hypergeometric distribution can also be written as,

$\displaystyle \mathbb{P}(X=k)=\binom{N}{k,n-k,K-k}\bigg/ \binom{N}{n}\binom{N}{K}$

where the numerator should be understood as the multinomial coefficient.
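
All four identities are easy to check numerically. Here is a quick Python sketch of my own, with an arbitrarily chosen parameter set:

```python
from math import comb, factorial

def f(k, N, K, n):
    # Hypergeometric pmf
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

N, K, n, k = 20, 8, 6, 3

# the three pairwise swaps
assert abs(f(k, N, K, n) - f(n - k, N, N - K, n)) < 1e-12  # successes/failures
assert abs(f(k, N, K, n) - f(K - k, N, K, N - n)) < 1e-12  # drawn/not drawn
assert abs(f(k, N, K, n) - f(k, N, n, K)) < 1e-12          # drawn/successes

# the multinomial form
multi = factorial(N) // (factorial(k) * factorial(n - k)
                         * factorial(K - k) * factorial(N - n - K + k))
assert abs(f(k, N, K, n) - multi / (comb(N, n) * comb(N, K))) < 1e-12
```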

Finally, we come to the lesser-known relative of the Hypergeometric distribution: the Negative Hypergeometric distribution. Just as the Hypergeometric parallels the Binomial, the Neg. Hypergeometric parallels the Neg. Binomial distribution.

We could use the exact idea behind the Neg. Binomial to derive the probability expression for the Neg. Hypergeometric. But considering what we've seen so far, we can do better.

Let $Y$ be a Neg. Binomial variable that counts the number of failures encountered until the $r$th success, and let $W$ be the corresponding Neg. Hypergeometric variable. We know that

$\displaystyle \mathbb{P}(Y=k)=\binom{k+r-1}{k}p^rq^k$

Now we just have to apply the asymptotic expression above, with $i=r$ and $j=k$, to derive the same for the Neg. Hypergeometric case. We have,

$\displaystyle \mathbb{P}(W=k)=\binom{k+r-1}{k}\frac{\binom{N-k-r}{K-r}}{\binom{N}{K}}$

Neat, isn't it?
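
As a quick check of the formula (a Python sketch of mine), the probabilities sum to 1 over $k=0,\dots,N-K$, and the pmf approaches the Neg. Binomial pmf when $N$ is large with $K/N$ fixed:

```python
from math import comb

def neg_hyp(k, N, K, r):
    # P(W = k): k failures appear before the r-th success when drawing
    # without replacement from K successes among N items
    return comb(k + r - 1, k) * comb(N - k - r, K - r) / comb(N, K)

# the pmf sums to 1
N, K, r = 20, 8, 3
assert abs(sum(neg_hyp(k, N, K, r) for k in range(N - K + 1)) - 1) < 1e-12

# and approaches the Neg. Binomial pmf in the limit
N, p, r, k = 10**5, 0.3, 3, 2
nb = comb(k + r - 1, k) * p**r * (1 - p)**k
assert abs(neg_hyp(k, N, int(p * N), r) - nb) < 1e-4
```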

Until then
Yours Aye

Friday, January 28, 2022

Matt Parker and the London Lottery

In his (aptly) named video 'Matt Explains: The Lottery', Matt explains the probabilities associated with the London lottery. In the first video he made on this topic, he worked out the probability of winning the Jackpot under the new system, where we have a chance to win an extra lottery ticket.

To make things clear, here are the winnings as I understood them from his video. Six numbers are drawn (without repetition) from a pool of 59 numbers. If all 6 of our numbers match, we win the Jackpot. If we match 5 numbers, a bonus number is drawn: matching it wins us 50000 pounds, and 1000 pounds otherwise. Matching 4 numbers wins us 100 pounds, and 3 matches wins us 25 pounds. Finally, under the new system, matching 2 numbers gives us no cash prize but wins us a new lottery ticket which we can use the next time.

The number of matches is given by the Hypergeometric distribution, as explained (and calculated) by Matt in the video. In his first video, Matt ignores all cash prizes and focuses only on the Jackpot and the extra lottery.

If we let $p_k$ denote the probability of $k$ matches and $p$ denote the probability of ultimately winning the Jackpot, we have

$p=p_6+p_2p$ which upon simplification gives $\displaystyle p = \frac{p_6}{1-p_2}$

Substituting the known values gives a probability of $p=2.459111 \times 10^{-8}$, or odds of 1 in 40665098.
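
This is easy to reproduce exactly (a Python sketch of mine; `match_prob` is a made-up name):

```python
from fractions import Fraction
from math import comb

def match_prob(k):
    # probability that exactly k of our 6 numbers are among the 6 drawn
    # (hypergeometric, pool of 59)
    return Fraction(comb(6, k) * comb(53, 6 - k), comb(59, 6))

# win the Jackpot outright, or win a free ticket and try again
p = match_prob(6) / (1 - match_prob(2))
```

The exact value turns out to be $1/40665099$, which matches the odds above.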

As we will be dealing with extremely small probabilities, let's define 1 micro-bps as a millionth of a basis point. In these terms, the probability in the above case is approx. 245.91 micro-bps.

At this point, following a subscriber's suggestion, Matt attacks the problem where we repeatedly buy lotteries using all our winnings until we win the Jackpot. Here Matt calculates the expected number of lotteries won per lottery played.

If we let $l$ denote the expected number of lotteries won per lottery, then

$l=p_2 \cdot 1 + p_3 \cdot 12.5 + p_4 \cdot 50 + p_{5,1} \cdot 500 + p_{5,2} \cdot 25000 \approx 0.257191$

where we have used the fact that each lottery costs 2 pounds. Matt now (indirectly) uses the following equation to get the probability of winning the Jackpot.

$p=p_6+l \cdot p \tag{1} \label{wrongeqn}$ which gives $\displaystyle p = p_6/(1-l)$.

Using the known values, we get $p=298.7830655$ micro-bps, or odds of about 1 in 33469098.
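
These numbers can also be reproduced exactly in Python (a sketch of mine; the variable names are made up):

```python
from fractions import Fraction
from math import comb

C59 = comb(59, 6)
e = {k: Fraction(comb(6, k) * comb(53, 6 - k), C59) for k in range(2, 7)}
e51 = e[5] * Fraction(52, 53)    # 5 matches, bonus ball missed
e52 = e[5] * Fraction(1, 53)     # 5 matches plus the bonus ball

# expected tickets won per ticket, at 2 pounds a ticket
l = e[2] + Fraction(25, 2) * e[3] + 50 * e[4] + 500 * e51 + 25000 * e52
p = e[6] / (1 - l)               # Matt's (hidden-assumption) answer
```

The exact value of $p$ here is $1/33469099$.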

But something is definitely amiss in $\eqref{wrongeqn}$. For example, we can easily construct a case where $l > 1$, which would then give negative probabilities. What then is the error, if any, in Matt's reasoning? Or is there a hidden assumption that was not explicitly stated?

To understand what's going on here, we should use the idea that the probability of getting at least one win in $n$ trials is $1-(1-p)^n$, and interpret $p$ as the probability of winning at least one Jackpot. With this, a better expression is given by,

$p=p_6+(1 - p_6)(1 - (1 - p)^\bar{l}) \tag{2} \label{crudeeqn}$

where $\bar{l}=l/(1-p_6)$ is the expected number of lotteries won conditional on not winning the Jackpot directly. The above equation then means that we either win the Jackpot or get some lotteries to continue our pursuit of winning the Jackpot. Not so surprisingly, this equation can be solved in closed form. We have,

$\displaystyle p = 1 - (1 - p_6)^{1 / (1 - \bar{l})} \tag{3} \label{cruderes}$

This gives a probability of $p=298.7830666$ micro-bps, or odds of about 1 in 33469097.

If we assume $p \approx 0$ in $\eqref{crudeeqn}$, then using the approximation $(1+x)^n \approx 1+nx$ for small $x$, we end up with Matt's result. This clearly shows the hidden assumption in the result derived by Matt.

But even $\eqref{crudeeqn}$ is still not exactly correct. It can still produce a value less than 0 or greater than 1 in certain cases, which doesn't make any sense.

This problem occurs because, in the RHS of $\eqref{crudeeqn}$, we casually assume that $\mathbb{E}(1-(1-p)^L)=1-(1-p)^{\mathbb{E}(L)}$, where $L$ is the random number of lotteries won (conditional on not winning the Jackpot). But this is clearly wrong because of Jensen's Inequality.

Therefore, the correct way to do this would be to solve the following equation,

$\displaystyle p=p_6+p_{5,2}(1-(1-p)^{25000})+p_{5,1}(1-(1-p)^{500})+p_4(1-(1-p)^{50})+p_3(1-(1-p)^{12.5})+p_2\,p$

where we explicitly condition on the winnings of the current lottery. Solving this, we get $p=298.782539$ micro-bps, or odds of about 1 in 33469156.

This is good, but there is still one final hiccup. In the case of 3 matches, we win 25 pounds, which we have assumed equals 12.5 lotteries. But non-integer ticket counts don't make much sense.

Had 3 matches won us 24 pounds instead, we would have $p=296.70581679$ micro-bps, or odds of about 1 in 33703416. Had it been 26 pounds, we would have $p=300.888537999$ micro-bps, or odds of about 1 in 33234897.

The precise way to model this is to split the problem into two cases: one in which we have just a lottery, and the other in which we have a lottery and a pound. If we let $p$ and $p_e$ be the probabilities of winning at least one Jackpot in the respective cases, then we have

$p=p_6+p_{5,2}(1-(1-p)^{25000})+p_{5,1}(1-(1-p)^{500})+p_4(1-(1-p)^{50})+p_3(1-(1-p_e)(1-p)^{11})+p_2(1-(1-p)) \tag{4} \label{firstsimul}$
$p_e=p_6+p_{5,2}(1-(1-p_e)(1-p)^{24999})+p_{5,1}(1-(1-p_e)(1-p)^{499})+p_4(1-(1-p_e)(1-p)^{49})+p_3(1-(1-p)^{13})+p_2(1-(1-p_e)) \tag{5} \label{secondsimul} $

Solving these simultaneous equations gives $p=296.75282794$ micro-bps (odds of about 1 in 33698077) and $p_e=300.13462527$ micro-bps (odds of about 1 in 33318380).

These equations have to be solved numerically, which may be compute-intensive. A compromise would be to let Jensen's inequality slide and consider the following equations.

$p=p_6+(1-p_6)(1-(1-p)^{\bar{l}_{11}}(1-p_e)^{\bar{l}_{12}})$ and $p_e=p_6+(1-p_6)(1-(1-p)^{\bar{l}_{21}}(1-p_e)^{\bar{l}_{22}})$

where $l_{11}=25000p_{5,2}+500p_{5,1}+50p_4+11p_3+p_2$, $l_{12}=p_3$, $l_{21}=24999p_{5,2}+499p_{5,1}+49p_4+13p_3$ and $l_{22}=p_{5,2}+p_{5,1}+p_4+p_2$ are the (unconditional) expected lotteries and lottery-pounds in the respective cases, and $\bar{l}_k=l_k/(1-p_6)$ for $k \in \{11, 12, 21, 22\}$. 

Like before, this has the advantage of being solved in closed form. We have,

$p=1-(1-p_6)^{\bar{m}}$ and $p_e=1-(1-p_6)^{\bar{m}_e}$

where $\displaystyle \bar{m} = \frac{1-\bar{l}_{22}+\bar{l}_{12}}{(1-\bar{l}_{11})(1-\bar{l}_{22})-\bar{l}_{12}\bar{l}_{21}}$ and $\displaystyle \bar{m}_e=\frac{1-\bar{l}_{11}+\bar{l}_{21}}{(1-\bar{l}_{11})(1-\bar{l}_{22})-\bar{l}_{12}\bar{l}_{21}}$

Using the known values, we get $p=296.7533437$ micro-bps and $p_e=300.1351482$ micro-bps. As before, if we assume $p_6$ is also small, we get an even simpler linear relation.

Or we could assume both $p$ and $p_e$ are small in $\eqref{firstsimul}$ and $\eqref{secondsimul}$, which will give us a pair of linear simultaneous equations. Solving them would give us,

$\displaystyle p = mp_6$ and $\displaystyle p_e = m_ep_6$

where $m$ (and $m_e$) is the same as $\bar{m}$ (and $\bar{m}_e$) except with unconditional expectations. Either way, these approximations become susceptible to absurd probabilities if not applied carefully.

Now, for the next part, let's say we solve everything exactly. If we have 4 pounds, we can buy 2 lotteries. If we assume that we don't mix the proceeds from one with the other, then the probability of winning at least one Jackpot is $1-(1-p)^2$, which with the small-value approximation gives $2p$. Similarly, with 5 pounds, the probability of winning at least one Jackpot is approximately $p+p_e$.

But with 6 pounds, we have a choice. We could either play three individual lotteries with winning probability $3p$ or two lottery-pounds with winning probability $2p_e$. With the values at hand, playing three individual lotteries is the clear choice.

This shows that it is better to convert all cash into lotteries to the fullest extent possible. Therefore, for integer $n$, we have,

$p_{2n}\approx np$ and $p_{2n+1}\approx(n-1)p+p_e$

All of the above were calculated using the following Mathematica code.

e1 = Binomial[6, 1] Binomial[53, 5] / Binomial[59, 6];
e2 = Binomial[6, 2] Binomial[53, 4] / Binomial[59, 6];
e3 = Binomial[6, 3] Binomial[53, 3] / Binomial[59, 6];
e4 = Binomial[6, 4] Binomial[53, 2] / Binomial[59, 6];
e51 = Binomial[6, 5] Binomial[53, 1] / Binomial[59, 6] * 52 / 53;
e52 = Binomial[6, 5] Binomial[53, 1] / Binomial[59, 6] * 1 / 53;
e6 = Binomial[6, 6] Binomial[53, 0] / Binomial[59, 6];

f[p_] := e6 + e52 (1 - Power[1 - p, 25000]) + e51 (1 - Power[1 - p, 500]) + e4 (1 - Power[1 - p, 50]) + e3 (1 - Power[1 - p, 25 / 2]) + e2 p - p;
g[p1_, p2_] := {e6 + e52 (1 - Power[1 - p1, 25000]) + e51 (1 - Power[1 - p1, 500]) + e4 (1 - Power[1 - p1, 50]) + e3 (1 - (1 - p2) Power[1 - p1, 11]) + e2 p1,
e6 + e52 (1 - (1 - p2) Power[1 - p1, 24999]) + e51 (1 - (1 - p2) Power[1 - p1, 499]) + e4 (1 - (1 - p2) Power[1 - p1, 49]) + e3 (1 - Power[1 - p1, 13]) + e2 p2};

(* NSolve[p == e6 + e52 (1 - Power[1 - p, 25000]) + e51 (1 - Power[1 - p, 500]) + e4 (1 - Power[1 - p, 50]) + e3 (1 - Power[1 - p, 12]) + e2 p && 0 < p < 1, p] 
NSolve[p == e6 + e52 (1 - Power[1 - p, 25]) + e51 (1 - Power[1 - p, 5]) + e4 (1 - Power[1 - p, 3]) + e3 (1 - Power[1 - p, 2]) + e2 p && 0 < p < 1, p] *)

(* n = 1; nmax = 1000;
a = N[0, 50]; b = N[1 / 1024 / 1024, 50]; tol = Power[10, -75];
While[n <= nmax,
    c = (a + b) / 2; temp = f[c];
    If[Or[(b - a) / 2 < tol, Abs[temp] < tol tol], Break[];];
    n += 1;
    If[temp > 0, a = c;, b = c;];
];
c *)

(* With only Jackpot and extra lottery *)
p = e6 / (1 - e2);
N[p, 15]
Floor[(1 - p) / p]

(* Matts Solution *)
l = e2 + (25 / 2) e3 + 50 e4 + 500 e51 + 25000 e52;
p = e6 / (1 - l);
N[p, 20]
Floor[(1 - p) / p]

(* Ignoring Jensens *)
lb = l / (1 - e6);
p = 1 - Power[1 - e6, 1 / (1 - lb)];
N[p, 20]
Floor[(1 - p) / p]

(* FindRoot[f[p1], {p1, 1}, WorkingPrecision -> 25] *)

(* With 25 units *)
p = 2.9878253945 Power[10, -8];
N[p, 20]
Floor[(1 - p) / p]

(* With 24 units *)
p = 2.9670581679 Power[10, -8];
N[p, 20]
Floor[(1 - p) / p]

(* With 26 units *)
p = 3.00888537999 Power[10, -8];
N[p, 20]
Floor[(1 - p) / p]

(* With 25 units but with extra lottery*)
(* FindRoot[g[p1, p2] == {p1, p2}, {p1, 1}, {p2, 1}, WorkingPrecision -> 25] *)
p = 2.9675282794 Power[10, -8];
Floor[(1 - p) / p]
p = 3.0013462527 Power[10, -8];
Floor[(1 - p) / p]

(* Ignoring Jensens *)
l11 = 25000 e52 + 500 e51 + 50 e4 + 11 e3 + e2; l12 = e3;
l21 = 24999 e52 + 499 e51 + 49 e4 + 13 e3;l22 = e52 + e51 + e4 + e2;
l11b = l11 / (1 - e6); l12b = l12 / (1 - e6); l21b = l21 / (1 - e6); l22b = l22 / (1 - e6);
bdeno = (1 - l11b) (1 - l22b) - l12b l21b;
mb = (1 - l22b + l12b) / bdeno; meb = (1 - l11b + l21b) / bdeno;
p = 1 - Power[1 - e6, mb];
N[p, 15]
Floor[(1 - p) / p]
pe = 1 - Power[1 - e6, meb];
N[pe, 15]
Floor[(1 - pe) / pe]

(* Small value approximations *)
deno = (1 - l11)(1 - l22) - l12 l21;
N[e6 (1 + (l12 - l22)) / deno, 10];
N[e6 (1 + (l21 - l11)) / deno, 10];
m = (1 - l22 + l12) / deno;
me = (1 - l11 + l21) / deno;
N[e6 m, 10]
N[e6 me, 10]

(* temp[n_] := temp[n] = If[n == 2, p, pe];
Calc[n_] := Module[{res = {}, m = IntegerPartitions[n, All, {2, 3}]},
    If[n < 2, Return[0];];
    Do[
        res = Join[res, {1 - Times @@ (1 - temp /@ k)}];
    , {k, m}];
    Max[res]];
Power[10, 10] Table[Calc[n], {n, 30}] *)

Hope you enjoyed the discussion. See you in the next post.

Until then
Yours Aye

Wednesday, November 10, 2021

Some more problems on a Deck of cards

We have already seen how a standard deck of cards forms a nice setting for studying both the Coupon Collector problem without replacement and the Birthday problem without replacement. Here we extend those to some more problems, just for fun!

The first problem asks for the expected number of cards drawn until we have at least one matching suit or one matching rank. This problem is relatively easy. We proceed using the same idea as before.

Let $X$ be the random variable denoting the number of cards needed to get at least one matching suit or rank. Then the probability that we need more than $k$ cards is,

$\displaystyle \mathbb{P}(X>k)=\frac{k!\binom{13}{k}\binom{4}{k}}{\binom{52}{k}}$

That is, we are finding the probability that we have neither a matching suit nor a matching rank after we have drawn $k$ cards without replacement. Therefore,

$\displaystyle \mathbb{E}(X)=\sum_{k\geq 0}\mathbb{P}(X>k)$

So, we need, on average, $\approx 3.0799$ cards to get either a matching suit or a matching rank when drawing cards without replacement from a randomly shuffled standard deck of cards.
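
This is quick to verify (a small Python sketch of mine, using exact fractions):

```python
from fractions import Fraction
from math import comb, factorial

def p_gt(k):
    # P(X > k): the first k cards have all-distinct suits and all-distinct ranks
    return Fraction(factorial(k) * comb(13, k) * comb(4, k), comb(52, k))

ex = sum(p_gt(k) for k in range(5))   # terms vanish for k >= 5 (only 4 suits)
```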

To simplify the text, let $S_k$ denote the event of having at least one matching suit after having drawn $k$ cards without replacement, and let $R_k$ denote the same event for ranks. Then we know $\mathbb{P}(S^c_k)$, $\mathbb{P}(R^c_k)$, and $\mathbb{P}(S^c_k\text{ and }R^c_k)$.

Using Inclusion-Exclusion principle, we have

$\displaystyle \mathbb{P}(S^c_k \text{ or } R^c_k)=\mathbb{P}(S^c_k)+\mathbb{P}(R^c_k)-\mathbb{P}(S^c_k\text{ and }R^c_k)$

This allows us to calculate the expected number of cards needed to get both a matching suit and a matching rank when drawing cards without replacement. If $X$ now denotes this random variable, then

$\mathbb{P}(X>k)=1-\mathbb{P}(S_k\text{ and }R_k)=\mathbb{P}(S^c_k \text{ or } R^c_k)$

This shows that

$\displaystyle \mathbb{E}(S \text{ and } R)=\mathbb{E}(S)+\mathbb{E}(R)-\mathbb{E}(S\text{ or }R)\approx 3.2678 + 5.6965 - 3.0799=5.8844$

Therefore, we need, on average, $\approx 5.8844$ cards to get both a matching suit and a matching rank when drawing cards without replacement from a randomly shuffled standard deck of cards.
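
All three expectations quoted above can be computed the same way; here is a Python sketch (my own function names, using exact fractions throughout):

```python
from fractions import Fraction
from math import comb, factorial

def falling(n, k):
    # falling factorial n (n-1) ... (n-k+1)
    out = 1
    for i in range(k):
        out *= n - i
    return out

def e_first_repeat(groups, size):
    # E(draws until two cards share a group); suits: 4 groups of 13,
    # ranks: 13 groups of 4
    return sum(Fraction(comb(groups, k) * factorial(k) * size**k,
                        falling(52, k)) for k in range(groups + 1))

e_suit = e_first_repeat(4, 13)    # matching suit,  ~3.2678
e_rank = e_first_repeat(13, 4)    # matching rank,  ~5.6965
e_either = sum(Fraction(factorial(k) * comb(13, k) * comb(4, k), comb(52, k))
               for k in range(5))  # matching suit or rank, ~3.0799
e_both = e_suit + e_rank - e_either
```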

Now we ask a similar question for the Coupon Collector problem. That is, we seek the expected number of cards needed to collect all suits and all ranks when drawing cards without replacement from a well-shuffled pack of cards.

As we did before, we will use the idea of the Negative Geometric random variable and the Min-Max identity. Let $X_k$ (for $k \in \{\text{Spades}, \text{Clubs}, \text{Hearts}, \text{Diamonds}\}$) be the random number of draws needed, without replacement, to see the first card of suit $k$. Similarly, let $Y_k$ for $1\leq k \leq 13$ be the corresponding random variables for the ranks.

If we let $Z$ denote the random number of draws needed to collect all suits and ranks, and $A$ be the set $\{X_S, X_C,\cdots,Y_1,Y_2,\cdots,Y_{13}\}$, then by the Min-Max identity, we know

$\displaystyle \mathbb{E}(Z)=\mathbb{E}(\text{max }A)=\sum_{M \subset_\emptyset A}(-1)^{|M|+1}\mathbb{E}(\text{min }M)$

where we have used $\subset_\emptyset$ to denote a non-empty subset.

It just remains to calculate the expected minimums over the subsets. This is not very difficult, and we can proceed as in the earlier post. For example, let's solve for $M=\{X_S,X_C,Y_5,Y_6,Y_7\}$.

Then $\mathbb{E}(\text{min }M)$ is the expected number of cards needed to get either a Spade or a Club or any of the ranks $5$, $6$ or $7$. This is again a Negative Geometric variable with population size $N=52$ and "successes" $K=13+13+4+4+4-6=32$. We subtract the six cards (the 5, 6 and 7 of Clubs and Spades) to avoid double counting.

Now using the expectation of Negative Geometric variable, we easily see that $\mathbb{E}(\text{min }M)=(52+1)/(32+1)$.

Expanding the idea, the expected value of $Z$ can be seen to be

$\begin{align}\displaystyle \mathbb{E}(Z)&=(52+1)\left[1-\sum_{i=0}^4\sum_{j=0}^{13}\binom{4}{i}\binom{13}{j}\frac{(-1)^{i+j}}{13i+4j-ij+1}\right]\\ &=(52+1)\left[1-(-1)^{4+13}\sum_{i=0}^4\sum_{j=0}^{13}\binom{4}{i}\binom{13}{j}\frac{(-1)^{i+j}}{52-ij+1}\right]\end{align}$

Evaluating this with Wolfram Alpha, we finally see that the number of cards needed, on average, to collect at least one card from each suit and rank when drawing cards without replacement from a shuffled deck is $\approx 27.9998$, intriguingly close to an integer.
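
The double sum is also easy to evaluate exactly (a quick Python sketch of mine):

```python
from fractions import Fraction
from math import comb

# E(Z): expected cards needed to see every suit and every rank at least once,
# via the inclusion-exclusion sum above
ez = 53 * (1 - sum(Fraction(comb(4, i) * comb(13, j) * (-1)**(i + j),
                            13*i + 4*j - i*j + 1)
                   for i in range(5) for j in range(14)))
```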

Until then
Yours Aye

Tuesday, November 9, 2021

Estimating the height of a Mountain (or a Building)

I recently watched Matt Parker's attempt to measure the Earth's radius with Hannah Fry. As always, Matt made it entertaining, but there was something missing there which I wanted to share in this post.

I remember seeing the idea used in the video in the History of Geodesy, along with the related idea of finding the height of a mountain used by ancient mathematicians. What I find most problematic in the video's approach is the use of a sextant in finding the tower's height.

In my opinion, the parallax error induced in hand-made sextants is difficult to control and can completely throw off the estimates. While we can't do away with the sextant in finding the Earth's circumference, we certainly can do better in the other case.

All we need is a pole of known length and two measurements using that pole, as shown in the figure above. That is, we first place the pole at some point and move ourselves into a position such that the top of the pole and the top of the mountain (or building) coincide in our line of sight. We then repeat the same at a different point.

Let $OH$ be the height we are trying to measure, and let $AC=BD$ be the pole, whose length we take to be unity to simplify calculations. Now, using the similarity of triangles $OA'H$ and $OB'H$, we have

$$\displaystyle \frac{OH}{AC}=\frac{OB'}{BB'}=\frac{OA'}{AA'}=\frac{B'A'}{BB'-AA'}=\frac{B'A'}{B'A'-BA}=\frac{AB+BB'-AA'}{BB'-AA'}$$

which is an expression purely of length measurements along the ground. Not only do we avoid sextants here, we also don't have to worry about measuring distances from the mountain's (or building's) base.

Note that $A'$ and $B'$ are the only measurements in the expression that are prone to parallax error. If we let $OH=y$ and $BB'-AA'=z$, then we can rewrite the above as

$$OH=y=AC\cdot \left(1+\frac{AB}{z}\right)$$

Differentiating the above expression, we see that

$$\displaystyle \frac{dy}{dz}=-AC\cdot \frac{AB}{z^2}$$

This shows that if $z$ is too small (which means $AA'$ and $BB'$ are almost equal), then even a small change in $z$ results in a large change in the estimated height. So the best thing is to make the first measurement close to the mountain and the second far from it, so as to reduce this effect.

Because we rely on the line of sight in finding both $A'$ and $B'$, we are interested in some sort of interval estimate for the height. To this end, we treat any measurement involving the points $A'$ and/or $B'$ as independent Normal variables with variance $\sigma^2$. Now,

$$\displaystyle OH=AC \cdot \left(1+\frac{AB}{Z}\right)$$

where $Z =Z_{BB'}-Z_{AA'}\sim \mathcal{N}(BB'-AA',2\sigma^2)$.

Now to attain a confidence level of $\alpha$, we have

$\displaystyle \mathbb{P}(OH^{-} \leq OH \leq OH^{+})=\alpha$ where $\displaystyle OH^{\pm}=AC\cdot\left(1+\frac{AB}{BB'-AA' \mp z_\alpha \sqrt{2}\sigma}\right)$

Note that $z=BB'-AA'$ needs two measurements. We could alternatively use the single measurement $A'B'$, which is also random with variance $2\sigma^2$. Practically though, as the two measuring points are far apart, $A'B'$ will be long. In this case, we have,

$\displaystyle \mathbb{P}(OH^{-} \leq OH \leq OH^{+})=\alpha$ where $\displaystyle OH^{\pm}=AC\cdot\left(1-\frac{BA}{B'A' \mp z_\alpha \sqrt{2}\sigma}\right)^{-1}$
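
Here is a small numerical illustration in Python (every number below is invented for illustration; the pole length, distances and $\sigma$ are assumptions of mine, not measurements from this post):

```python
from statistics import NormalDist

# Hypothetical example: a 2 m pole (AC), sighting points 30 m apart (AB),
# and the parallax-prone difference z = BB' - AA' measured as 0.5 m.
AC, AB, z = 2.0, 30.0, 0.5
OH = AC * (1 + AB / z)                       # point estimate of the height

# 95% interval, treating each sighting error as N(0, sigma^2), sigma = 2 cm
sigma, alpha = 0.02, 0.95
z_a = NormalDist().inv_cdf((1 + alpha) / 2)  # standard Normal quantile
half = z_a * 2**0.5 * sigma                  # z_alpha * sqrt(2) * sigma
lo = AC * (1 + AB / (z + half))
hi = AC * (1 + AB / (z - half))
```

With these numbers the estimate is 122 m, with an interval of roughly 110 m to 137 m; a 2 cm sighting error moves the answer by more than 10 m, which is exactly the sensitivity the derivative above warns about when $z$ is small.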

I plan to use this idea to estimate the height of a temple tower near my place. I'll update if I find anything.

Until then
Yours Aye