Monday, October 14, 2019

Nim Multiplication


I was recently working on a problem that needed the idea of nim-multiplication. While nim-addition plays a nice role in all 1D combinatorial games, nim-multiplication takes its place in their 2D counterparts. An excellent read on the subject of Impartial Combinatorial Games can be found in Thomas S. Ferguson's Game Theory.

Now, back to nim-multiplication. While nim-addition, denoted by $\oplus$, is easy to calculate in a computer with a simple BitXOR operation, nim-multiplication, denoted by $\otimes$, is not as simple. Every source I found on the Internet listed the following two rules to simplify nim-multiplication.

  • Rule 1: The nim-product of a Fermat power and any smaller number is just their ordinary product.
  • Rule 2: The nim-product of a Fermat power with itself is 3/2 times the Fermat power.
These two rules, together with the fact that nim-multiplication is commutative and distributes over nim-addition, certainly simplify the task.
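For example, since $4$ is a Fermat power and $2<4$, Rule 1 gives $2\otimes4=8$, while Rule 2 gives $2\otimes2=3$; hence $2\otimes6=2\otimes(4\oplus2)=(2\otimes4)\oplus(2\otimes2)=8\oplus3=11$.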

To further simplify the task, we can make use of the following three additional rules.
  • Rule 3: For any sequence of non-negative integers $p_0<p_1<p_2<\cdots$, we have $2^{2^{p_0}}\otimes2^{2^{p_1}}\otimes2^{2^{p_2}}\otimes\cdots=2^{2^{p_0}}\cdot2^{2^{p_1}}\cdot2^{2^{p_2}}\cdot\cdots$. This also shows that any power of 2 can be written as a nim-product of distinct Fermat powers by writing the exponent as a sum of powers-of-2.
  • Rule 4: If $x\text{ AND }y=0$, then $x\oplus y=x + y$
  • Rule 5: If $x\text{ AND }y=0$, then $2^x\otimes 2^y=2^x\cdot2^y$
Rule 3 works because we can repeatedly apply Rule 1 for nim-products starting from the left and convert them to ordinary products. Note that any resulting term on the left will always be smaller than the term that comes after.

Rule 4 works because the AND condition makes sure that no bits are identical in the binary representation of $x$ and $y$.

Rule 5 works because the nim-product can be converted to nim-products of Fermat powers. The AND condition makes sure that no two Fermat powers are equal and hence we can sort the Fermat powers and apply Rule 3.

Before solving the nim-multiplication problem, let's just revisit some properties of the $\text{AND}$ operation. It is easy to see that $\text{AND}$ satisfies the following three properties.

  • Idempotent: $x\text{ AND }x=x$
  • Associative: $x\text{ AND }(y \text{ AND }z)=(x\text{ AND }y)\text{ AND }z$
  • Commutative: $x\text{ AND }y=y\text{ AND }x$

Using these properties, we can show that

  • P1: $(x\text{ AND }y)\text{ AND }(x-(x\text{ AND }y))=0$
  • P2: $(x\text{ AND }y)\text{ AND }(y-(x\text{ AND }y))=0$
  • P3: $ (x-(x\text{ AND }y))\text{ AND }(y-(x\text{ AND }y)) = 0$

Now the nim-product of arbitrary numbers can be split into several nim-products of powers-of-2 because
$x\otimes y=(2^{x_0}\oplus2^{x_1}\oplus2^{x_2}\oplus\cdots)\otimes(2^{y_0}\oplus2^{y_1}\oplus2^{y_2}\oplus\cdots)=\displaystyle\sum_{i,j \geq 0}2^{x_i}\otimes 2^{y_j}$
where the summation symbol should be understood as a nim-sum, and the $x_i$'s and $y_j$'s are the positions of the set bits in the binary expansions of $x$ and $y$.
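For example, $5\otimes6=(4\oplus1)\otimes(4\oplus2)=(4\otimes4)\oplus(4\otimes2)\oplus(1\otimes4)\oplus(1\otimes2)=6\oplus8\oplus4\oplus2=8$.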

Let's first solve for nim-squares of powers-of-2, that is, the nim-product of a power-of-2 with itself.

\begin{align}
2^x\otimes 2^x & = (2^{x-(x\text{ AND }-x)}\cdot 2^{x\text{ AND }-x})\otimes (2^{x-(x\text{ AND }-x)}\cdot 2^{x\text{ AND }-x}) \\
& = (2^{x-(x\text{ AND }-x)}\otimes 2^{x\text{ AND }-x})\otimes (2^{x-(x\text{ AND }-x)}\otimes 2^{x\text{ AND }-x})  \\
& = (2^{x\text{ AND }-x}\otimes 2^{x\text{ AND }-x})\otimes (2^{x-(x\text{ AND }-x)}\otimes 2^{x-(x\text{ AND }-x)}) \\
& = (3/2)2^{x\text{ AND }-x}\otimes(2^{x-(x\text{ AND }-x)}\otimes 2^{x-(x\text{ AND }-x)}) \\
\end{align}
We could convert the ordinary products in the second equality into nim-products because the AND of the exponents can be clearly seen to be zero by using $y=-x$ in P1.
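For instance, with $x=3$ we have $x\text{ AND }-x=1$, so $2^3\otimes2^3=(2^1\otimes2^1)\otimes(2^2\otimes2^2)=3\otimes6=(2\oplus1)\otimes6=11\oplus6=13$, using $2\otimes6=11$ from the earlier example.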

Now for the nim-product of two distinct powers-of-2,

\begin{align}
2^x\otimes 2^y & = (2^{x-(x\text{ AND }y)}\cdot 2^{x\text{ AND }y})\otimes (2^{y-(x\text{ AND }y)}\cdot 2^{x\text{ AND }y}) \\
& = (2^{x-(x\text{ AND }y)}\otimes 2^{x\text{ AND }y})\otimes (2^{y-(x\text{ AND }y)}\otimes 2^{x\text{ AND }y})  \\
& = (2^{x-(x\text{ AND }y)}\otimes 2^{y-(x\text{ AND }y)})\otimes (2^{x\text{ AND }y}\otimes 2^{x\text{ AND }y}) \\
& = \left(2^{x-(x\text{ AND }y)}\cdot 2^{y-(x\text{ AND }y)}\right)\otimes (2^{x\text{ AND }y}\otimes 2^{x\text{ AND }y})\\
\end{align}
In the second equality, we have used both P1 and P2 to convert from ordinary products to nim-products. In the last equality, we have used P3 to do the opposite.
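As an example, take $x=3$ and $y=5$ so that $x\text{ AND }y=1$: $2^3\otimes2^5=(2^2\otimes2^4)\otimes(2^1\otimes2^1)=2^6\otimes3=(64\otimes2)\oplus64=128\oplus64=192$, where $2^2\otimes2^4=2^6$ and $64\otimes2=128$ follow from Rule 5.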

The two results above together form the recursion that can be used to solve the nim-multiplication problem. Together with the facts that the nim-product of any number with 0 is 0 and the nim-product of a number with 1 is the number itself, it's not so hard to code the above for nim-multiplication.

UPDATE: My Python implementation of the above algorithm.

def memoize(fn):
    """returns a memoized version of any function that can be called
    with the same list of arguments.
    Usage: foo = memoize(foo)"""

    def handle_item(x):
        if isinstance(x, dict):
            return make_tuple(sorted(x.items()))
        elif hasattr(x, '__iter__'):
            return make_tuple(x)
        else:
            return x

    def make_tuple(L):
        return tuple(handle_item(x) for x in L)

    def foo(*args, **kwargs):
        items_cache = make_tuple(sorted(kwargs.items()))
        args_cache = make_tuple(args)
        if (args_cache, items_cache) not in foo.past_calls:
            foo.past_calls[(args_cache, items_cache)] = fn(*args,**kwargs)
        return foo.past_calls[(args_cache, items_cache)]
    foo.past_calls = {}
    foo.__name__ = 'memoized_' + fn.__name__
    return foo

@memoize
def nimmult(x, y):
    if x < y:
        return nimmult(y, x)
    elif y == 0:
        return 0
    elif y == 1:
        return x
    elif x & (x - 1):
        # x is not a power of 2: split off its lowest set bit and
        # distribute the nim-product over the nim-sum (XOR)
        return nimmult(x & -x, y) ^ nimmult(x - (x & -x), y)
    elif y & (y - 1):
        # same split for y
        return nimmult(x, y & -y) ^ nimmult(x, y - (y & -y))
    elif x in [2, 4, 16, 256, 65536, 4294967296]:
        # x is a Fermat power: Rule 2 if x == y, Rule 1 otherwise (y < x here)
        return 3 * x // 2 if x == y else x * y
    elif x == y:
        # nim-square of a power of 2: peel off the lowest set bit of the exponent
        tempx, p = x, 0
        while tempx > 1:
            tempx >>= 1
            p += 1
        p = p & -p      # lowest set bit of the exponent of x
        p = 1 << p      # the corresponding Fermat power
        return nimmult(3 * p // 2, nimmult(x // p, y // p))
    else:
        # nim-product of two distinct powers of 2
        tempx, tempy, px, py = x, y, 0, 0
        while tempx > 1:
            tempx >>= 1
            px += 1
        while tempy > 1:
            tempy >>= 1
            py += 1
        temp = px & py  # common bits of the two exponents
        temp = 1 << temp
        return nimmult(x // temp * y // temp, nimmult(temp, temp))

I've included the memoize decorator that I usually use along with the code. In summary, yes, it's as simple as that :)
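As a sanity check, here is a minimal brute-force sketch (the helper name nimmult_slow is mine) that computes nim-products directly from the mex definition, $x\otimes y=\text{mex}\{(x'\otimes y)\oplus(x\otimes y')\oplus(x'\otimes y'):x'<x,\,y'<y\}$. It is hopelessly slow for large inputs, but handy for spot-checking the fast routine above on small values.

from functools import lru_cache

@lru_cache(maxsize=None)
def nimmult_slow(x, y):
    """Nim-product straight from the mex definition; only for small inputs."""
    options = {nimmult_slow(a, y) ^ nimmult_slow(x, b) ^ nimmult_slow(a, b)
               for a in range(x) for b in range(y)}
    mex = 0
    while mex in options:
        mex += 1
    return mex

# spot-check the fast routine against the definition on small values
assert all(nimmult(a, b) == nimmult_slow(a, b) for a in range(16) for b in range(16))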


Until then
Yours Aye
Me 

Saturday, September 21, 2019

A simple problem in Geometric probability


I recently came across a Stackexchange post that asks for the probability that a point chosen randomly inside an equilateral triangle is closer to the center than to any of its edges. The square case was Problem B1 in 50th Putnam 1989.

I wondered: what about a general $n$-gon? Obviously, for a circle, the answer should be 1/4. To tackle the general case, we can consider the center of the polygon to be the origin and orient the polygon in such a way that one of its edges is perpendicular to the positive $y$-axis.

We can scale the polygon such that this edge passes through $(0,1)$. In other words, we scale the polygon such that the inradius is $1$. With the polygon scaled to a unit-inradius, the side of the polygon becomes $2\tan(\pi/n)$.

Now, we know that the locus of the points that are equidistant from the origin and to the line $y=1$ is a parabola. Using the formulae for distance between two points and distance between a point and a line, we have,

$x^2+y^2=\displaystyle\frac{(y-1)^2}{1^2+0^2}$

The above formula is the same as the one given in this Wikipedia page. Simplifying the above, we find the equation of the parabola to be $y=(1-x^2)/2$.

Using the symmetry of the $n$-gon, we focus only on the triangle of the triangulation formed by the origin and the edge that lies along the line $y=1$. The area of one such triangle is easily seen to be $\tan(\frac{\pi}{n})$.

The lines through the origin bounding this edge are $y=\pm x \tan(\pi/2-\pi/n)=\pm x \cot(\frac{\pi}{n})$. From now on, we restrict ourselves to the positive sign, again using symmetry to our advantage.

The point of intersection of the line $y=\cot(\frac{\pi}{n})x$ and the parabola $y=(1-x^2)/2$ is found by solving the resulting quadratic, whose positive root simplifies to $x=\tan(\frac{\pi}{2n})$. Therefore the point of intersection is $(\tan(\frac{\pi}{2n}),\tan(\frac{\pi}{2n})\cot(\frac{\pi}{n}))$.

To simplify notation, let $t=\tan(\frac{\pi}{2n})$ from now on. Then the line becomes $y=\pm \frac{1-t^2}{2t}x$, the point of intersection becomes $(t,\frac{1-t^2}{2})$, and the area of the triangle becomes $\frac{2t}{1-t^2}$.

Now, the area under the parabola between $x=0$ and $x=t$ is given by,

$A=\displaystyle\int_0^t \frac{1-x^2}{2}\,dx=\frac{t}{6}(3-t^2)$

and

the area of the triangle formed by the origin, the point $(t,0)$, and the point of intersection of the parabola and the line is $T=\frac{1}{2}\cdot t \cdot \frac{1-t^2}{2}$.

In each of the triangles of the triangulation of the $n$-gon, the random point will be closer to the center than to the edge if it falls within the region bounded by the lines $y=\pm \frac{1-t^2}{2t}x$ and the parabola. If we denote the probability of such an event by $p_n$, then

$p_n=\displaystyle\frac{2(A-T)}{\tan(\frac{\pi}{n})}=\frac{(t^2+3)(1-t^2)}{12}$

Substituting for $t$ and simplifying, we have $p_n=\frac{1}{12}(4-\sec^4(\frac{\pi}{2n}))$
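As a quick check, $n=4$ gives $t=\tan(\frac{\pi}{8})=\sqrt{2}-1$, so that $\sec^4(\frac{\pi}{8})=(1+t^2)^2=24-16\sqrt{2}$ and $p_4=\frac{4\sqrt{2}-5}{3}\approx0.219$, which I believe is the answer to the Putnam problem mentioned above, while letting $n\to\infty$ gives $p_\infty=\frac{4-1}{12}=\frac{1}{4}$, the circle case.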


Until then
Yours Aye
Me

Saturday, August 17, 2019

Estimating a sum with Probability


Let's say we want to find the sum
$S=\displaystyle\sum_{k=1}^nf(k)$
but we don't have a closed form or any easier way to calculate this. One way of doing this is to use the idea of approximating an integral.

We can choose $m$ random values $X_1,X_2,\cdots,X_m$ between $1$ and $n$ such that $X_1<X_2<\cdots<X_m\leq n$, set $X_0=0$, and use the following expression to get an approximation.

$S_r=\displaystyle\sum_{i=1}^m f(X_i)(X_i-X_{i-1})$

But how good is this approximation? If we define the error as $\Delta=S-S_r$, what is $\mathbb{E}(\Delta)$? In this post, we are precisely trying to answer that.

Let's first calculate $\mathbb{E}(S_r)$. Using linearity of expectation, we can find the expected value of each term in the expression for $S_r$.

$\mathbb{E}(f(X_i)(X_i-X_{i-1}))=\displaystyle\frac{1}{\binom{n}{m}}\sum_{x_1=1}^{x_2-1}\sum_{x_2=1}^{x_3-1}\cdots\sum_{x_{i-1}=1}^{x_i-1}\sum_{x_i=1}^n\sum_{x_{i+1}=x_i+1}^n\cdots\sum_{x_{m-1}=x_{m-2}+1}^n\sum_{x_{m}=x_{m-1}+1}^nf(x_i)(x_i-x_{i-1})$

To simplify this sum, we need the following beautiful result.

$\displaystyle\sum_{x_1=1}^{x_2-1}\sum_{x_2=1}^{x_3-1}\cdots\sum_{x_k=1}^{x_{k+1}-1}1=\frac{(x_{k+1}-1)_{(k)}}{k!}$ and $\displaystyle\sum_{x_2=x_1+1}^n\sum_{x_3=x_2+1}^n\cdots\sum_{x_{k+1}=x_k+1}^n1=\frac{(n-x_1)_{(k)}}{k!}$

Note how they mimic repeated integration of the constant function. Using the above results, we first simplify the expectation to

$\begin{align}
\mathbb{E}(f(X_i)(X_i-X_{i-1}))&=\displaystyle\frac{1}{\binom{n}{m}}\sum_{x_{i-1}=1}^{x_i-1}\sum_{x_i=1}^nf(x_i)(x_i-x_{i-1})\frac{(x_{i-1}-1)_{(i-2)}}{(i-2)!}\frac{(n-x_i)_{(m-i)}}{(m-i)!}\\
&=\frac{1}{\binom{n}{m}}\sum_{x_i=1}^nf(x_i)\frac{(x_i)_{(i)}}{i!}\frac{(n-x_i)_{(m-i)}}{(m-i)!}\\
&=\frac{1}{\binom{n}{m}}\sum_{k=1}^nf(k)\frac{(k)_{(i)}}{i!}\frac{(n-k)_{(m-i)}}{(m-i)!}\\
&=\frac{1}{(n)_{(m)}}\sum_{k=1}^nf(k)\binom{m}{i}(k)_{(i)}(n-k)_{(m-i)}\\
\end{align}$

Therefore,
$\begin{align}
\mathbb{E}(S_r)&=\displaystyle\frac{1}{(n)_{(m)}}\sum_{i=1}^m\sum_{k=1}^nf(k)\binom{m}{i}(k)_{(i)}(n-k)_{(m-i)}\\
&=\frac{1}{(n)_{(m)}}\sum_{k=1}^nf(k)\bigl((n)_{(m)}-(n-k)_{(m)}\bigl)\\
&=S-\frac{1}{(n)_{(m)}}\sum_{k=1}^nf(k)(n-k)_{(m)}\\
\end{align}$

where we have used the falling factorial analogue of the binomial theorem (the Chu–Vandermonde identity) in the second equality. We now have the result for the expected value of the error:

$\begin{align}
\mathbb{E}(\Delta)&=\displaystyle\frac{1}{(n)_{(m)}}\sum_{k=1}^{n-m}f(k)(n-k)_{(m)}=\frac{1}{\binom{n}{m}}\sum_{k=1}^{n-m}f(k)\binom{n-k}{m}
\end{align}$

That's pretty nice, isn't it?
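Here is a minimal Python sketch of the estimator and of the error formula above (the function names and the toy choices of $f$, $n$, $m$ are mine, and math.comb needs Python 3.8+); it compares the empirical mean error over many draws with the predicted $\mathbb{E}(\Delta)$.

import math
import random

def random_sum_estimate(f, n, m):
    """One draw of S_r: pick X_1 < ... < X_m from {1, ..., n}, set X_0 = 0,
    and return the sum of f(X_i) * (X_i - X_{i-1})."""
    xs = sorted(random.sample(range(1, n + 1), m))
    prev, total = 0, 0
    for x in xs:
        total += f(x) * (x - prev)
        prev = x
    return total

def expected_error(f, n, m):
    """E(Delta) = (1 / C(n, m)) * sum_{k=1}^{n-m} f(k) * C(n - k, m)."""
    return sum(f(k) * math.comb(n - k, m) for k in range(1, n - m + 1)) / math.comb(n, m)

f = lambda k: k          # a toy choice of f; any function works
n, m, trials = 100, 10, 200000
S = sum(f(k) for k in range(1, n + 1))
avg = sum(random_sum_estimate(f, n, m) for _ in range(trials)) / trials
print(S - avg)                   # empirical mean error
print(expected_error(f, n, m))   # predicted E(Delta); should agree up to Monte Carlo noise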

I find these random sums extremely good in certain cases. In particular, I was surprised when I tried to approximate the sum of totients and the divisor function. Considering that these number-theoretic functions are themselves apparently random, the expected relative errors are surprisingly small. You should try them out for yourselves.


Until then
Yours Aye
Me

Thursday, June 6, 2019

(A family of) Integrals independent of a parameter


I recently came across a very nice wordpress page. There is plenty of nice math on that page that I thoroughly enjoyed, especially the Distances to a line from the vertices of a regular polygon. In this post, we look at one of the results, Integrals not being dependent on a parameter, from the same wordpress page.

The author shows that

$\displaystyle \int_0^\infty f(x^2)\,dx=\int_0^\infty f((x-a/x)^2)\,dx$,    $\text{Re}(a)\leq0$

This result can be further generalized. In fact, the presence of the square term in the integrand facilitates the proof of the result via differentiation under the integral sign (sadly, only via that route), but that doesn't mean we can't take a different route to the general result. Taking a cue from Problem B4 of the 29th Putnam (1968), we show that,

$\displaystyle \int_{-\infty}^\infty f(x)\,dx=\int_{-\infty}^\infty f(x-a/x)\,dx$ for $a>0$, provided one of them exists.

For example, here are the WolframAlpha results for $\sin x / x$ and $\sin(x-1/x)/(x-1/x)$. It's surprising that while Mathematica calculates the first case exactly as $\pi$ in the blink of an eye, it fails to evaluate the second.

Let $I=\displaystyle \int_{-\infty}^\infty f(x)\,dx$

Substitute $x=y-a/y$ with $y$ ranging over $(0,\infty)$. Now comes the cute part. Note that as $y\to0^+$, $x\to-\infty$ and as $y\to\infty$, $x\to\infty$. Also, $dx=\left(1+\frac{a}{y^2}\right)dy$.

Therefore,

$\begin{align}
I&=\displaystyle \int_0^\infty f(y-a/y)\left(1+\frac{a}{y^2}\right)\,dy\\
&=\displaystyle \int_0^\infty f(y-a/y)\,dy+\displaystyle \int_0^\infty f(y-a/y)\frac{a}{y^2}\,dy\\
\end{align}$

Now, we make the substitution $t=-\frac{a}{y}$ in the second integral. Therefore, $dt=\frac{a}{y^2}dy$. Also, as $y\to0^+$, $t\to-\infty$ and as $y\to\infty$, $t\to0^-$. Hence,

$\begin{align}
I&=\displaystyle \int_0^\infty f(y-a/y)\,dy+\displaystyle \int_{-\infty}^0 f(-a/t+t)\,dt\\
&=\displaystyle \int_{-\infty}^\infty f(x-a/x)\,dx
\end{align}$

which proves the result.
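As a quick numerical sanity check of the identity (a sketch assuming SciPy is available), take $f(x)=e^{-x^2}$ so that both sides should equal $\sqrt{\pi}$; the value $a=2$ and the splitting of the right-hand integral at $x=0$ are just for convenience.

import numpy as np
from scipy.integrate import quad

def f(x):
    return np.exp(-x ** 2)

def g(x, a):
    # integrand f(x - a/x); the value at x = 0 is taken as the (zero) limit
    return 0.0 if x == 0 else f(x - a / x)

a = 2.0  # any a > 0
lhs = quad(f, -np.inf, np.inf)[0]
rhs = quad(g, 0, np.inf, args=(a,))[0] + quad(g, -np.inf, 0, args=(a,))[0]
print(lhs, rhs, np.sqrt(np.pi))  # all three should agree closely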

In any case, just like the author of the wordpress page, I too was surprised by the fact that the transformation $x \to x-a/x$ changes the shape of the curve drastically while keeping the area constant. A non-calculus proof of this result would be a feast to the senses, I guess, but I haven't found one yet. See ya again with a nice post.


Until then
Yours Aye
Me

Saturday, October 27, 2018

German Tank problem


I was recently watching How to estimate a population using statisticians on YouTube and it reminded me of the German tank problem. The problem has a very nice history, and I started reading about it on Wikipedia after I watched the video.

I found it very interesting that the problem can be approached using either frequentist inference or Bayesian inference, with the two giving different results. Estimating the size of a Population gives a proof of the famous results below for the number of tanks.

Let the random variable $N$ denote the unknown number of tanks, $M$ the maximum serial number observed, and $K$ the sample size. Then, given $M=m$ and $K=k$,

$\displaystyle N \approx \frac{k+1}{k}m-1$ by frequentist analysis and
$\displaystyle \mathbb{E}(N|M=m,K=k)=(m-1)\frac{k-1}{k-2}$ by Bayesian analysis.

Both of these estimates are based on the probability that the maximum serial number observed is $m$ when a sample of size $K$ is drawn from $N$ tanks. That is,
$\displaystyle \mathbb{P}(M=m|N,K)=\frac{\binom{m-1}{K-1}}{\binom{N}{K}}$

Now let's think differently. The history of the German tank problem meant that the serial numbers of the tanks were noted after they were captured or destroyed. But what if they weren't? Assume that the serial numbers were noted without the tanks being captured or destroyed. In other words, there is a possibility of the observed numbers being repeated. Therefore,

$\displaystyle \mathbb{P}(M=m|N,K)=\frac{\binom{K+m-1}{m-1}-\binom{K+m-2}{m-2}}{\binom{K+N-1}{N-1}}=\frac{K}{m-1}\frac{\binom{K+m-2}{m-2}}{\binom{K+N-1}{N-1}}$

Surprisingly, even in this case the total likelihood function remains the same as in the original German tank problem. That is,

$\displaystyle \sum_{n}\mathcal{L}(n)=\frac{k}{k-1}$ using the sum $\displaystyle \sum_{n=m}^\infty \frac{1}{\binom{k+n-1}{n-1}}=\frac{k}{m}\frac{1}{\binom{k+m-2}{k-2}}$

Proceeding as in the wiki article, we have,

$\displaystyle N \approx \frac{k+1}{k}m-\frac{1}{m}$ by frequentist analysis and
$\displaystyle \mathbb{E}(N|M=m,K=k)=m\frac{k-1}{k-2}$ by Bayesian analysis.

Now, coming back to the video I quoted at the start of this post, it is clear that the population estimation is done with the famous capture-recapture method. This is basically the setting usually described in terms of the Hypergeometric distribution. The estimate of the total population in this case is given by,

$\displaystyle N \approx \frac{nK}{k}$

where $K$ is the initial sample size, $n$ is the second sample size and $k$ is the number of elements that are recaptured.

But this is based on frequentist inference. What about Bayesian inference? Turns out we would need the following sum to come up with the total likelihood function.

$\displaystyle \sum_{N \geq \text{max}(n,K)}\frac{\binom{N-K}{n-K}}{\binom{N-m}{n-m}}=\frac{n-m}{K-m-1}\frac{1}{\binom{K-m-2}{k-m-2}}$

Using this, we have $\displaystyle \sum_{n}\mathcal{L}(n)=\frac{nK}{k(k-1)}$ and therefore $\displaystyle \mathbb{E}(N)=\frac{(n-1)(K-1)}{k-2}$ as given here.

Now, the final part. Matt does something interesting in the video. He uses the mark-and-recapture method to estimate the number of candies in a bowl by repeated sampling. How do we go about that?

Let $n_0$ be the total number of candies in the bowl that we are trying to estimate. To this end, we capture a bunch of candies each time and note how many times each candy has been captured so far. Let's say we do this $k$ times and let $n_1,n_2,\cdots,n_k$ be the sample sizes. Let $X_j$ denote the number of candies that were captured exactly $j$ times. Now we have,

$\displaystyle \mathbb{E}(X_j)=n_0[t^j]\prod_{i=1}^k \left(1-\frac{n_i}{n_0}+ \frac{n_i}{n_0}t\right)$
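As a small illustration of the coefficient extraction above, here is a Python sketch (the function name and the sample sizes are hypothetical, not the data from the video) that expands the product and reads off $\mathbb{E}(X_j)$ for every $j$ at once.

import numpy as np

def expected_counts(n0, samples):
    """Coefficients of n0 * prod_i (1 - n_i/n0 + (n_i/n0) t); entry j is E[X_j]."""
    poly = np.array([1.0])  # polynomial in t, coefficients in increasing degree
    for n_i in samples:
        p = n_i / n0
        poly = np.convolve(poly, [1 - p, p])  # multiply by (1 - p) + p t
    return n0 * poly

print(expected_counts(170, [25, 30, 28, 22, 27]))  # hypothetical sample sizes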

Using the data given by Matt in the video, we observe that $2$ candies were captured $4$ times. Using this as an estimate for $\mathbb{E}(X_4)$, solving the resulting equation for $n_0$, and picking the maximal root, we get $n_0 \approx 164.3$, whereas the bowl originally had about $169$ candies.

The above formula is based on frequentist inference. It's not so hard to arrive at $\mathbb{P}(X_j=m)$ with a recurrence, but finding a closed form, which I think will be needed for Bayesian inference, remains a challenge. I'll update if I'm successful on that front.


Until then,
Yours Aye
Me

Monday, April 30, 2018

The Birthday Problem


The Birthday problem is one of the classic problems in Probability theory, concerning the probability that, in a set of $n$ randomly chosen people, some pair will have the same birthday. The idea in this post is to generalize this setup and find the expected value in this generalization.

The generalization here is this: Assume that there is a very long line of persons ready to enter a very large room one by one. Each person is let in and declares her birthday upon entering the room. How many people, on average, must enter in order to find $k$ people whose birthdays are within $d$ days of one another (given an $r$-day year)?

Non-Spoiler: Some readers may recognize that this is Problem 584 of Project Euler. But note that this post gives neither the answer nor the right way to solve that problem. All we are going to do here is find an approximation of the required value, which is far off from the correct answer.

Well, when we say 'approximation', we are immediately reminded of the Poisson approximation of the Birthday problem in Wikipedia. But most of the approximations I found on the web concern only the probability and not the expected value. A special case of this problem is solved in Strategic Practice 5 of the (amazing) Stat 110 course by the (extraordinary) Prof. Joseph Blitzstein. And I got curious as to how good the approximation really is.

Let $\mathbb{E}(k,d,r)$ be the expected number of people that have to enter the room before we have, for the first time, $k$ people with birthdays within $d$ days of one another.

Let $X_n$ be the number of $k$-set birthday matches when $n$ people have entered the room. By the Poisson paradigm, $X_n$ can be approximated by a Poisson variable. The parameter $\lambda_n$ for this variable is given by,

$\displaystyle\lambda_n=\displaystyle\binom{n}{k}\cdot\frac{1}{(k-1)!}\left(\frac{2d+1}{r}\right)^{k-1}$  (why?)

Therefore we immediately have, $\mathbb{P}(X_n=0)=e^{-\lambda_n}$

Now, by a simple trick common in evaluating expected values (writing the expectation as a sum of tail probabilities, and noting that the probability of needing more than $n$ people is exactly $\mathbb{P}(X_n=0)$), we have,

$\begin{align}
\displaystyle\mathbb{E}(k,d,r)&=k+\sum_{n=k}^\infty\mathbb{P}(X_n=0)=k+\displaystyle\sum_{n=k}^\infty e^{-\lambda_n}
\end{align}$

Though we have an infinite sum, the terms decrease rapidly because of the exponential factor. For the values given in the said Project Euler problem, this approximation gives $\mathbb{E}_{\text{Poisson}}(3,1,10)\approx 6.15092$ and $\mathbb{E}_{\text{Poisson}}(3,7,100)\approx 8.79751$

But we can take it a little further. When the parameter of a Poisson variable gets very large, the Normal variable with the same mean and variance approximates it better. Therefore,

$\begin{align}
\displaystyle\mathbb{E}(k,d,r)&=k+\sum_{n=k}^\infty\mathbb{P}(X_n=0)\\
&=k+\displaystyle\sum_{n=k}^\infty \mathbb{P}(-1/2 \leq X_n \leq 1/2)
\end{align}$

where $X_n \sim \mathcal{N}(\lambda_n,\lambda_n)$ along with the continuity correction.

Here again, the numbers decrease fast and the sum converges in about fifty terms. Now we have $\mathbb{E}_{\text{Normal}}(3,1,10)\approx 5.68554$ and $\mathbb{E}_{\text{Normal}}(3,7,100)\approx 8.04791$

In fact for the required answer, the normal approximation gives $\mathbb{E}_{\text{Normal}}(4,7,365)\approx 33.6396$ with a relative error of just about $2.5\%$. Not bad.

These values were calculated with the following Mathematica code.

Clear["Global`*"];
PoissonApprox[k_, d_, r_] :=
  k + Sum[N[
     Exp[-Binomial[n, k] Power[(1 + 2 d)/r, k - 1]/
        Factorial[k - 1]]], {n, k, 100}];
NormalApprox[k_, d_, r_] :=
 k + Sum[N[
    CDF[NormalDistribution[
       Binomial[n, k] Power[(1 + 2 d)/r, k - 1]/Factorial[k - 1],
       Sqrt[Binomial[n, k] Power[(1 + 2 d)/r, k - 1]/
          Factorial[k - 1]]], 1/2] -
     CDF[NormalDistribution[
       Binomial[n, k] Power[(1 + 2 d)/r, k - 1]/Factorial[k - 1],
       Sqrt[Binomial[n, k] Power[(1 + 2 d)/r, k - 1]/
          Factorial[k - 1]]], -1/2]], {n, k, 100}]
PoissonApprox[4, 7, 365]
NormalApprox[4, 7, 365]

Hope you enjoyed this. See you again with a nice problem.


Until then
Yours Aye
Me

Tuesday, April 24, 2018

Probability on a Chess Match


Consider two friends $A$ and $B$ playing a series of bullet chess games. As the player who lost the most recent game always wants to continue, they come up with an idea. They'll play the first game. After the game, they'll toss a fair coin. If the coin comes up heads, they'll continue with another game; else they end the match. They'll repeat this procedure until the coin turns up tails.

The question we are interested in here is the probability of either player winning the match, given the probability $\alpha$ of $A$ winning a game, the probability $\gamma$ of $B$ winning a game, and the remaining probability $\beta=1-\alpha-\gamma$ of a game ending in a draw. WLOG assume $\alpha<\gamma$.

All the equations referred to from now on in this post are from A list of Integrals and Identities.

Let $A_{n,k}$ denote the event of $A$ winning a match of $n$ games by $k$ more wins (ignoring the coin tosses). Then,

$\displaystyle\mathbb{P}(A_{n,k})=\sum_{a+b+c =n\\a-c=k}\binom{n}{a,b,c}\alpha^a\beta^b\gamma^c$

where $a$ denotes the number of games won by $A$, $c$ the number of games won by $B$ and $b$ the number of games that ended in a draw.

First, let's tackle an easy question: finding the probability of the match ending in a draw.

$\begin{align}
\mathbb{P}(\text{Draw})&=\displaystyle\sum_{n=1}^\infty\mathbb{P}(A_{n,0}\text{ | }n\text{ games played})\cdot\mathbb{P}(n\text{ games being played})\\
\end{align}$

As we know that the probability of the match lasting $n$ games is $2^{-n}$, using (1) we have,

$\begin{align}
\mathbb{P}(\text{Draw})&=\displaystyle\frac{\cot\phi}{\sqrt{\alpha\gamma}}-1\\
\end{align}$,   where $\phi=\cos^{-1}\displaystyle\left(\frac{2\sqrt{\alpha\gamma}}{1+\alpha+\gamma}\right)$


Similarly, we now look at the probability of $A$ not losing the match. This time we have,

$\begin{align}
\mathbb{P}(A \text{ not losing})&=\displaystyle\sum_{n=1}^\infty\sum_{k=0}^\infty\mathbb{P}(A_{n,k}\text{ | }n\text{ games played})\cdot\mathbb{P}(n\text{ games being played})\\
\end{align}$

Using (6), this reduces to,

$\begin{align}
\mathbb{P}(A \text{ not losing})&=\displaystyle\frac{\cot\phi}{\sqrt{\alpha\gamma}-\alpha(\sec\phi-\tan\phi)}-1\\
\end{align}$
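Before moving on to a biased coin, here is a small Monte Carlo sketch (the function names and the choice $\alpha=0.3$, $\gamma=0.4$ are mine) that simulates the fair-coin match and compares the simulated probabilities of a draw and of $A$ not losing with the two closed forms above; the outputs should agree up to simulation error.

import math
import random

def simulate(alpha, gamma, q=0.5, trials=200000):
    """Estimate P(draw) and P(A not losing) by simulating the coin-tossed match."""
    draws = not_losing = 0
    for _ in range(trials):
        a_wins = b_wins = 0
        while True:
            u = random.random()
            if u < alpha:
                a_wins += 1
            elif u < alpha + gamma:
                b_wins += 1
            # otherwise the game is a draw; now toss the coin
            if random.random() >= q:
                break
        draws += a_wins == b_wins
        not_losing += a_wins >= b_wins
    return draws / trials, not_losing / trials

def closed_forms(alpha, gamma):
    """The cot(phi) expressions above, for the fair-coin case."""
    phi = math.acos(2 * math.sqrt(alpha * gamma) / (1 + alpha + gamma))
    cot, sec, tan = 1 / math.tan(phi), 1 / math.cos(phi), math.tan(phi)
    p_draw = cot / math.sqrt(alpha * gamma) - 1
    p_not_lose = cot / (math.sqrt(alpha * gamma) - alpha * (sec - tan)) - 1
    return p_draw, p_not_lose

print(simulate(0.3, 0.4))
print(closed_forms(0.3, 0.4))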

In fact, instead of using a fair coin, if they decide to use a biased coin so that they continue playing with probability $q$ and end the match with probability $p=1-q$,

$\begin{align}
\displaystyle\frac{q}{1-q}\mathbb{P}(\text{Draw})&=\frac{\frac{1}{2}\cot\phi}{\sqrt{(q\alpha)(q\gamma)}}-1\\
\end{align}$

$\displaystyle\frac{q}{1-q}\mathbb{P}(A \text{ wins the match})=\displaystyle\frac{\frac{1}{2}\cot\phi}{(q\gamma)(\sec\phi+\tan\phi)-\sqrt{(q\alpha)(q\gamma)}}$

$\displaystyle\frac{q}{1-q}\mathbb{P}(B \text{ wins the match})=\displaystyle\frac{\frac{1}{2}\cot\phi}{(q\alpha)(\sec\phi+\tan\phi)-\sqrt{(q\alpha)(q\gamma)}}$

where $\displaystyle\phi=\cos^{-1}\left(\frac{2\sqrt{(q\alpha)(q\gamma)}}{(1-q)+q\alpha+q\gamma}\right)$

These expressions can be written algebraically without the use of trigonometric functions, but they are clumsy and I prefer this form. Also, there are a couple of nice details hidden in these expressions. First, with the right choice of $q$, the weaker player can decrease his opponent's chance of winning the match, which was surprising to me.

Second, in the interesting limiting case of $q\to1$, the LHSs of the above three expressions can be seen as the expected number of games after which the match is at a drawn stage, player $A$ is in the lead, and player $B$ is in the lead, respectively.

$\displaystyle\lim_{q\to1}\displaystyle\frac{q}{1-q}\mathbb{P}(\text{Draw})=\frac{1}{|\gamma-\alpha|}-1$,

$\displaystyle\lim_{q\to1}\displaystyle\frac{q}{1-q}\mathbb{P}(A \text{ wins the match})=\frac{\alpha}{\text{max}(0,\gamma-\alpha)^2}$ and

$\displaystyle\lim_{q\to1}\displaystyle\frac{q}{1-q}\mathbb{P}(B \text{ wins the match})=\frac{\gamma}{\text{max}(0,\alpha-\gamma)^2}$

ASIDE: An interesting discussion on a related problem is given in Joint probability of geometric random variables. However, there the distributions are independent. Our problem can be stated similarly using the result that if,

$\vec{\mathbf{X}}\sim \text{Mult}(N, \vec{\mathbf{p}})$ where $N \sim \text{NBin}(m,s)$, then $X_i \sim \displaystyle\text{NBin}\left(m,\frac{s}{s+(1-s)p_i}\right)$

But the $X_i$'s are not independent (they would be had $N$ been a Poisson variable).

Even though we started with the assumption $\alpha<\gamma$, fortunately the final results hold for all values of $\alpha$ and $\gamma$. The symmetry of the formulae is really enchanting to me, and I'm greatly satisfied that I've solved this. Hope you enjoyed it.

Until then
Yours Aye
Me