Wednesday, October 26, 2022

Asymptotic Probability of not finding a pair

Consider the problem of finding the probability of not finding a pair (not two consecutive cards of the same rank) from a well shuffled pack. We already discussed this problem in this post where I could this stackexchange post that uses Generating function to solve the problem.

I was eventually able to understand the Generating function approach in the SE post. I'll postpone that for a later post as we'll be looking at something equally interesting in this post. We'll be using the usual well known approach that is indispensable in this type of problems.

Using the ideas in the SE post, we see that the probability of not finding a pair in a pack of well shuffled deck is given by the integral,

$\displaystyle \mathbb{P}(\text{No pair})=\frac{1}{52!}\int_0^\infty e^{-z} (z^4-12z^3+36z^2-24z)^{13}\,dz$

Notice that this is not exactly in the form given in the SE post but we'll probe how the two expressions are equivalent in a later post. For now, let's focus just on the integral. Let,

$\displaystyle I_{4,n}=\int_0^\infty e^{-z} (z^4-12z^3+36z^2-24z)^n\,dz$

To apply Lapalce's method, we manipulate the above like as shown below.

$\displaystyle I_{4,n}=\int_0^\infty e^{n(-z/n+\ln(z^4-12z^3+36z^2-24z))}\,dz$

We now need the value of $z=z_0$ where the function $f(z)=-z/n+\ln(z^4-12z^3+36z^2-24z)$ is maximized. Here's where things get a little interesting.

Even though, it seems to be very hard to find the exact value of $z_0$, with some experimentation, it seems $z_0 \approx 4n+3$ for large $n$.

Expanding $f(z)$ at $z=4n+3$ and keeping terms upto $O(n^{-1})$, we see that

$\displaystyle f(z)\approx -4-\frac{3}{n}+\ln f(4n+3)-\frac{(z-4n-3)^2}{8n^2}$

Therefore,

$\displaystyle I_{4,n}\approx e^{-4n-3}(256n^4-288n^2-96n+9)^n \int_{-\infty}^\infty e^{\frac{-(z-4n-3)^2}{8n}}\,dz$

Using the Normal approximation, we then have,

$I_{4,n} \approx e^{-4n-3}(256n^4-288n^2-96n+9)^n\sqrt{2\pi}\sqrt{4n}$

Let $\mathbb{P}_{j,n}(\text{No Pair})$ be the probability of getting no pair in a pack of $nj$ cards where each card has an integer from $1$ to $n$ and each integer has $j$ cards. Then,

$\displaystyle\mathbb{P}_{j,n}(\text{No Pair})=\frac{I_{j,n}}{(nj)!}$

Using Stirling's approximation, we know that

$n!~\approx \sqrt{2\pi n}e^{-n}n^n$

Therefore,

$\displaystyle \mathbb{P}_{4,n}(\text{No Pair})\approx \frac{e^{-4n-3}(256n^4-288n^2-96n+9)^n\sqrt{2\pi}\sqrt{4n}}{\sqrt{2\pi}\sqrt{4n}e^{-4n}(4n)^{4n}}$

Simplifying further, we get

$\displaystyle \mathbb{P}_{4,n}(\text{No Pair})\approx e^{-3}\left(1-\frac{288n^2+96n-9}{256n^4}\right)^n\approx e^{-3}\left(1-\frac{9}{8n}\right)$

Doing this for different $j$, I was able to see that, in general,

$\displaystyle \mathbb{P}_{j,n}(\text{No Pair})\approx e^{1-j}\left(1-\frac{(j-1)^2}{2nj}\right)$

For example, this approximation gives $\mathbb{P}_{4,13}(\text{No Pair})\approx 0.0454785$ versus the exact value of $\mathbb{P}_{4,13}(\text{No Pair})=0.0454763$ which is extremely good.

Unfortunately, its not the case always. For example, the approximation above gives $\mathbb{P}_{13,4}(\text{No Pair})\approx 0.00000441615$ versus an actual value, given in the SE post quoted at the beginning of the post, of $0.00000118175$ which is understandable as $n$ in this case is very small.

But using the approximations $e^x\approx 1+x$ and $\ln (1-x)^{-1}\approx x+x^2/2$, we get refine our approximation to give,

$\displaystyle \mathbb{P}_{j,n}(\text{No Pair})\approx e^{-(j-1)}e^{\frac{-(j-1)^2}{2nj}}\approx \text{exp}\left(-nj \ln\left(\frac{1-j^{-1}}{n}\right)^{-1}\right)=\left(1-\frac{1-j^{-1}}{n}\right)^{nj}$

Now this 'hand-wavy' approximation gives

$\mathbb{P}_{4,13}(\text{No Pair})\approx 0.045501$ and $\mathbb{P}_{13,4}(\text{No Pair})\approx 0.00000118835$

which is much better in both the cases and gives us some us some reason to take the 'hand-waviness' seriously.

Until then
Yours Aye
Me