In one of the previous posts, we saw the generalization of Coupon collector's problem without replacement. Not long after, I thought about the same generalization for the Birthday problem.
Because a Deck of cards is a better platform for dealing with problems without replacement, we use the same setup here. That is, we seek the expected number cards we need to draw (without replacement) from a well shuffled pack of cards to get two cards of the same rank.
This problem is a lot easier. We have $n=13$ ranks each with $j=4$ instances (suits). Using Hypergeometric distribution, the probability that we need more than $k$ cards
$\displaystyle \mathbb{P}(X>k)=\frac{\binom{13}{k}\cdot 4^k}{\binom{52}{k}}$
That is, of the $k$ cards we select from $52$ cards, we can have atmost one instance for each rank. Therefore, the $k$ cards should come from the $13$ ranks and for each of the card, we have $4$ choices.
Now the expected value is obtained easily.
$\displaystyle \mathbb{E}(X)=\sum_{k=0}^n \mathbb{P}(X>k)=\sum_{k=0}^n \frac{\binom{n}{k}\cdot j^k}{\binom{nj}{k}}=1+\sum_{k=1}^nj^k\binom{n}{k}\binom{nj}{k}^{-1}$
Needless to say, this agrees with the classical expression (that is at $j \to \infty$) given in Pg 417 (or Pg 433) of 'Analytic Combinatorics'. We attempt to find asymptotics when $n\to\infty$.
Let $\displaystyle a_k=j^k\binom{n}{k}\binom{nj}{k}^{-1}$ for $k\geq 1$
Then, using the beautiful idea I learned from Hirschhorn's page (for example, papers 176 and 197),
$\begin{align}\displaystyle a_k &= j^k\binom{n}{k}\binom{nj}{k}^{-1}\\ &= \frac{n}{nj}\frac{n-1}{nj-1}\cdots\frac{n-k+1}{nj-k+1}j^k \\ &= \frac{1-\frac{0}{n}}{1-\frac{0}{nj}}\frac{1-\frac{1}{n}}{1-\frac{1}{nj}}\cdots \frac{1-\frac{k-1}{n}}{1-\frac{k-1}{nj}} \\ &\approx \text{exp}\left\{-\frac{0}{n}-\frac{1}{n}-\cdots -\frac{k-1}{n}+\frac{0}{nj}+\frac{1}{nj}+\cdots +\frac{k-1}{nj}\right\} \\ &\approx \text{exp}\left\{-\frac{1-j^{-1}}{n}\frac{k^2}{2}\right\} \\ \end{align}$
This is nothing but Laplace's method widely discussed in literature and especially in my favorite book 'Analytic Combinatorics' (page 755).
Note that we have used idea that $(1+x)^m=e^{m\log(1+x)}\approx e^{mx}$ for small $x$. Now,
$\displaystyle \mathbb{E}(X)=1+\sum_{k=1}^n a_k \approx 1+\int\limits_0^\infty \text{exp}\left\{-\frac{1-j^{-1}}{n}\frac{k^2}{2}\right\}\,dk=1+\sqrt{\frac{\pi}{2}}\sqrt{\frac{n}{1-j^{-1}}}$
where we've used the standard Normal integral. For large $j$, this clearly reduces to the classical asymptotic value.
In fact, for large $n$, the asymptotic expansion of the binomial coefficient is
$\begin{align}\displaystyle \binom{n}{k} & \approx \frac{n^k}{k!}\text{exp}\left\{-\frac{S_1(k-1)}{n}-\frac{S_2(k-1)}{2n^2}-\frac{S_3(k-1)}{3n^3}\cdots\right\} \\ & \approx \frac{n^k}{k!}\text{exp}\left\{-\frac{k(k-1)}{2n}-\frac{k^3}{6n^2}-\frac{k^4}{12n^3}\cdots\right\}\\ & \approx \frac{n^k}{k!}\text{exp}\left\{-\frac{k^2}{2n}-\frac{k^3}{6n^2}-\frac{k^4}{12n^3}\cdots\right\}\\ \end{align}$
where $S_r(m)$ is the sum of the $r$-th powers of the first $m$ natural numbers.
Using the second expression and simplifying a bit more, we have
$\displaystyle a_k \approx \text{exp}\left\{\frac{1-j^{-1}}{8n}\right\}\text{exp}\left\{-\frac{(1-j^{-1})(k-\frac{1}{2})^2}{2n}\right\}\left(1-\frac{1-j^{-2}}{6n^2}k^3-\frac{1-j^{-3}}{12n^3}k^4+\frac{1}{2}\frac{(1-j^{-2})^2}{36n^4}k^6+\cdots\right)$
If we now use the Normal approximation, we get
$\begin{align}\displaystyle \mathbb{E}(X) &\approx 1+\text{exp}\left\{\frac{1-j^{-1}}{8n}\right\}\left(\sqrt{\frac{n\pi/2}{1-j^{-1}}}-\frac{1}{3}\frac{j+1}{j-1}-\frac{1}{4}\frac{1-j^{-3}}{(1-j^{-1})^{5/2}}\sqrt{\frac{\pi}{2n}}+\frac{5}{24}\frac{(1-j^{-2})^2}{(1-j^{-1})^{7/2}}\sqrt{\frac{\pi}{2n}}\right) \\ &= 1+\text{exp}\left\{\frac{1-j^{-1}}{8n}\right\}\left(\sqrt{\frac{n\pi/2}{1-j^{-1}}}-\frac{1}{3}\frac{j+1}{j-1}-\frac{1}{24}\frac{1-4j^{-1}+j^{-2}}{(1-j^{-1})^2}\sqrt{\frac{1-j^{-1}}{2n/\pi}} \right) \\ &\approx 1+\sqrt{\frac{n\pi/2}{1-j^{-1}}}-\frac{1}{3}\frac{j+1}{j-1}+\frac{1}{12}\frac{1-j^{-1}+j^{-2}}{(1-j^{-1})^2}\sqrt{\frac{1-j^{-1}}{2n/\pi}} \\ \end{align}$
which I think is an $O(n^{-1})$ approximation. Hope you liked the discussion.
Clear["Global`*"];
n = 10000; j = 7;
acc = 50;
res = N[1, acc];
k = 1; nume = N[n, acc]; deno = N[n, acc];
Monitor[Do[
res += N[nume/deno, acc];
nume *= (n - k); deno *= (n - k/j);
, {k, n}];, {k, res}]; // AbsoluteTiming
res
N[1 + Exp[(1 - j^-1)/(
8 n)] (Sqrt[(n \[Pi]/2)/(1 - j^-1)] - 1/3 (1 + j^-1)/(1 - j^-1) -
1/24 (1 - 4 j^-1 + j^-2)/(1 - j^-1)^2 Sqrt[(1 - j^-1)/(
2 n/\[Pi])]), acc]
Until then
Yours Aye
Me