Wednesday, November 10, 2021

Some more problems on a Deck of cards

We already saw how a standard deck of cards forms a nice setting for studying both Coupon Collector problem without replacement and Birthday problem without replacement. Here we extend those to some more problems for a fun!

The first problem is to ask for the expected number of cards such that we have atleast one matching suit or one matching rank. This problem is relatively easy. We proceed using the same idea as before.

Let $X$ be the random variable denoting the number of cards need to get atleast one matching suit or rank. Then, the probability that we need more than $k$ cards to get this,

$\displaystyle \mathbb{P}(X>k)=\frac{k!\binom{13}{k}\binom{4}{k}}{\binom{52}{k}}$

That is, we are finding the probability that we neither have a mathcing suit nor a matching rank after we have drawn $k$ cards without replacement. Therefore,

$\displaystyle \mathbb{E}(X)=\sum_{k>0}\mathbb{P}(X>k)$

So, we need, on average, $\approx 3.0799$ cards to get either a matching suit or a matching rank when drawing cards without replacement from a randomly shuffled standard deck of cards.

To simplify text, let $S_k$ denote the event of having atleast one matching suit after having drawn $k$ cards without replacement and $R_k$ denote the same event for rank. Then, we know $\mathbb{P}(S^c_k)$, $\mathbb{P}(R^c_k)$, and $\mathbb{P}(S^c_k\text{ and }R^c_k)$.

Using Inclusion-Exclusion principle, we have

$\displaystyle \mathbb{P}(S^c_k \text{ or } R^c_k)=\mathbb{P}(S^c_k)+\mathbb{P}(R^c_k)-\mathbb{P}(S^c_k\text{ and }R^c_k)$

This allows to calculate the expected number of cards need to get both a matching suit and a matching rank when drawing cards without replacement. If $X$ denotes this random variable, then

$\mathbb{P}(X>k)=1-\mathbb{P}(S_k\text{ and }R_k)=\mathbb{P}(S^c_k \text{ or } R^c_k)$

This shows that

$\displaystyle \mathbb{E}(S \text{ and } R)=\mathbb{E}(S)+\mathbb{E}(R)-\mathbb{E}(S\text{ or }R)\approx 3.2678 + 5.6965 - 3.0799=5.8844$

Therefore, we need, on average, $\approx 5.8844$ cards to get both a matching suit and a matching rank when drawing cards without replacement from a randomly shuffled standard deck of cards.

Now we ask a similar question in the Coupon collector problem. That is, we seek the expected number of cards needed to collect all suits and all ranks when drawing cards without replacement from a well shuffled pack of cards.

Like we did before, we will be using idea of negative Geometric random variable and the Min-Max identity. Let $X_k$ ($k \in \{\text{Spades}, \text{Clubs}, \text{Hearts}, \text{Diamonds}\}$)be the random number of draws needed without replacement to get first suit. Similarly, let $Y_k$ for $1\leq k \leq 13$ be the corresponding random variables for ranks.

If we let $Z$ denote the random number of draws needed to collect all suits and ranks, and $A$ be the set $\{X_S, X_C,\cdots,Y_1,Y_2,\cdots,Y_{13}\}$, then by the Min-Max identity, we know

$\displaystyle \mathbb{E}(Z)=\mathbb{E}(\text{max }A)=\sum_{M \subset_\emptyset A}(-1)^{|M|}\mathbb{E}(\text{min }M)$

where we have used $\subset_\emptyset$ to denote a non-empty subset.

It just remains to calculate the expected values of the subsets. This is not very difficult and we can proceed as we did in the earlier post. For example, let's solve for $M=\{X_S,X_C,Y_5,Y_6,Y_7\}$.

Then $\mathbb{E}(\text{min }M)$ is the expected number of cards needed to get either a Spade or a Clubs or any of ranks $5$, $6$ or $7$. This is again a Negative Geometric variable with population size $N=52$ and "successes" $K=13+13+4+4+4-6=32$. We are subtracting the six cards which are the 5, 6 and 7 of Clubs and Spades to avoid double counting.

Now using the expectation of Negative Geometric variable, we easily see that $\mathbb{E}(\text{min }M)=(52+1)/(32+1)$.

Expanding the idea, the expected value of $Z$ can be seen to be

$\begin{align}\displaystyle \mathbb{E}(Z)&=(52+1)\left[1-\sum_{i=0}^4\sum_{j=0}^{13}\binom{4}{i}\binom{13}{j}\frac{(-1)^{i+j}}{13i+4j-ij+1}\right]\\ &=(52+1)\left[1-(-1)^{4+13}\sum_{i=0}^4\sum_{j=0}^{13}\binom{4}{i}\binom{13}{j}\frac{(-1)^{i+j}}{52-ij+1}\right]\end{align}$

Evaluating this with Wolfram Alpha, we finally see that the number of cards needed, on average, to collect atleast one card from each suit and rank when drawing cards without replacement from a shuffled deck of cards is $\approx 27.9998$, very close to an integer.

Until then
Yours Aye
Me

Tuesday, November 9, 2021

Estimating the height of a Mountain (or a Building)

 I recently watched Matt Parker's attempt to measure the Earth's radius with Hannah Fry. As always, Matt made it entertaining but there was something missing there which I wanted to share in this post.

I remember seeing the idea used in the video in History of Geodesy and the related idea of finding the height of a mountain by ancient mathematicians. The idea that I find the most problematic in the approach used in the video is the use of sextant in finding the tower's height.

In my opinion, parallax error induced in hand made sextants is difficult to control and can completely throw off estimates. While we can't do away with the sextant in finding the Earth's circumference, we certainly can do better in the other case.

All we need now is a pole of known length and two measurements using that pole as shown in the figure above. That is we first place the pole at some point and move ourselves into a position such that the top of the pole and the top of the coincide in our line of sight. Now repeat the same at a different point.

Let $OH$ be the height we are tying to measure. Let $AC=BD$ be the pole whose length we consider unity to simplify calculations. Now, ssing the similarity of triangles $OA'H$ and $OB'H$, we have

$$\displaystyle \frac{OH}{AC}=\frac{OB'}{BB'}=\frac{OA'}{AA'}=\frac{B'A'}{BB'-AA'}=\frac{B'A'}{B'A'-BA}=\frac{AB+BB'-AA'}{BB'-AA'}$$

which is an expression of length measurements. Not only do we avoid sextants here, we also don't have to worry about measuring lengths from the mountain's (or building's) base.

Note that $A'$ and $B'$ are the only measurement in the expression which is prone to error because of the parallax. If we let $OH=y$ and $BB'-AA'=z$, then we can rewrite the above as

$$OH=y=AC\cdot \left(1+\frac{AB}{z}\right)$$

Differentiating the above expression, we see that

$$\displaystyle \frac{dy}{dz}=-AC\cdot \frac{AB}{z^2}$$

This shows that if $z$ is too small (which means $AA'$ and $BB'$ are almost equal), then even a small change in $z$ would result in a large change in the height. So the best thing is to make the first measurement close enough to the mountain and the second measurement far so as to reduce this effect.

Because we rely on line of sight in finding both $A'$ and $B'$, we are interested in some sort of interval estimate for the height. To this end, we consider any measurements involving points $A'$ and/or $B'$ are independent Normal variables with variance $\sigma^2$. Now,

$$\displaystyle OH=AC \cdot \left(1+\frac{AB}{Z}\right)$$

where $Z =Z_{BB'}-Z_{AA'}\sim \mathcal{N}(BB'-AA',2\sigma^2)$.

Now to attain a confidence level of $\alpha$, we have

$\displaystyle \mathbb{P}(OH^{-} \leq OH \leq OH^{+})=\alpha$ where $\displaystyle OH^{\pm}=AC\cdot\left(1+\frac{AB}{BB'-AA' \mp z_\alpha \sqrt{2}\sigma}\right)$


Note that $z=BB'-AA'$ needs two measurements. We could alternately use a single measurement $A'B'$ which also is random with variance $2\sigma^2$. Practically though, as the two measuring points are far away, $A'B'$ will be long. In this case, we have,

$\displaystyle \mathbb{P}(OH^{-} \leq OH \leq OH^{+})=\alpha$ where $\displaystyle OH^{\pm}=AC\cdot\left(1-\frac{BA}{B'A' \mp z_\alpha \sqrt{2}\sigma}\right)^{-1}$

I plan to use this idea to estimate the height of a temple tower near my place. I'll update if I find anything.


Until then
Yours Aye
Me