Saturday, December 30, 2017

A little math on Badminton

If you had read my first few posts, you would've known that am an amateur Badminton player. Now I don't find either the time or the energy to play the game, but it's a game that I've always enjoyed playing. This post is about combining my interest on Badminton and my passion for math.

Most of the important details of the post are borrowed from Score probabilities for serve and rally competitions. The paper was too good in analyzing a given serve-rally competition, be it tennis or Badminton or the likes, the paper did not stick to any one particular game and finish it off in style which is exactly what we plan to do here for Badminton.

Let's just list the rules to make sure we are all on the same page.

  • A rally cannot end in a draw
  • The winner of the current rally wins a point and serves next irrespective of who served the current rally
  • The first player to reach 21 points wins the game unless the score was ever tied at 20 apiece
  • If the score ever reached 20-20, then the first player to lead by two point or the reach 30 points wins the game

Let me stick to the notations used in the paper quoted above and call the two players $A$ and $B$. WLOG, we always consider $A$ to serve first. The game will be analyzed from $A$'s perspective and hence let $p_a$ denote the probability that $A$ wins the rally when $A$ serves and $p_b$ denote the probability that $A$ wins the rally when $B$ serves.

Then $q_x=1-p_x,\text{ }x=a,b$ be the corresponding winning probabilities for $B$.

Given these probabilities, we intend to find the probability of $A$ winning the game. But before we begin, assuming both the players are equally skilled, do you think that $A$ has an advantage because he is serving first?

Let $\mathbb{P}(X),\text{ } X=A,B$ denote the probability that $X$ wins the game and $\mathbb{P}(X_{i,j}),\text{ } X=A,B$ denote the probability that game is at a stage where $A$ has $i$ points, $B$ has $j$ points and $X$ has won the last point.

The paper doesn't go on to complete by taking into account the end conditions of the game. To do that, we extend it to cover the end cases as well.

0,& i \leq 0\\
p_a^i, &i\leq21,j=0\\
\displaystyle p_a^iq_b^j\sum_{r=1}^i\binom{i}{r}\binom{j-1}{r-1}\left(\frac{q_a p_b}{p_a q_b}\right)^r,&i < 21, j<21\text{ (or) }i=21,j<20\\
p_a\text{ }\mathbb{P}(A_{i-1,j})+p_b\text{ }\mathbb{P}(B_{i-1,j}),&0\leq i - j \leq 2, \text{ }i\leq30,j<30

0,& j \leq 0\\
q_a q_b^{j-1}, &j\leq21,i=0\\
q_a q_b^{j-1}p_a^i\left[1+\displaystyle\sum_{s=1}^{j-1}\sum_{r=1}^i\binom{i}{r}\binom{s-1}{r-1}\left(\frac{q_a p_b}{p_a q_b}\right)^r\right],&j < 21, i<21\text{ (or) }j=21,i<20\\
q_a\text{ }\mathbb{P}(A_{i,j-1})+q_b\text{ }\mathbb{P}(B_{i,j-1}),&0\leq j-i \leq 2, \text{ }j\leq30,i<30

Anything that doesn't fall in these categories is $0$.


$\mathbb{P}(A)=\displaystyle\sum_{r=0}^{19} \mathbb{P}(A_{21,r})+\sum_{r=22}^{30} \mathbb{P}(A_{r,r-2})+\mathbb{P}(A_{30,29})$ and $\mathbb{P}(B)=1-\mathbb{P}(A)$

Playing with this code, we now answer the question that I asked above. According to the model, when both the players are equally skilled (both while serving and defending), it makes no difference as to who serves first. This seemed a counter-intuitive to me because assuming everything else equal, I thought the first player has the initiative.

Let's look at the question of serve-advantage with the following graphic.

The probability of $A$ winning the game is plotted in the Y-axes. The orange curve is when both players are equally adept at winning the rallies when they serve. That 'adept-ness' is in the X-axis. The upper blue curve is when $A$ has a 'slight' serve-advantage over $B$ and vice verse for the lower green curve. (The 'slight' is 1% here)

The green curve clearly shows the second player can 'pull-down' $A$'s chances of winning the game by developing a serve-advantage though at professional levels this diminishes fast.

Yet another interesting graphic is the one where the first player has the same probability of winning a rally irrespective of who served.

Again the Y-axis shows the probability of $A$ winning the game and the X-axis, his probability of winning the rally (irrespective of the serve). The graph pretty much says that once a player reached about 2/3rd chance of winning the rally, there is pretty much nothing more he has (or need) to do.

Hope you enjoyed this. Wish you all a very Happy New Year!!

Yours Aye,

Wednesday, December 20, 2017

An Iterated Sum

 I recently saw a very nice problem. It's a problem on Continuous distribution but I recognized it's one of those problems where the reasoning can be applied exactly to the discrete case too. So here it is.

Consider the following sum.


One interpretation of the above sum can be as follows: The sum counts the number of $k$-tuples of positive integers such that each integer in the tuple is less than or equal $n$ and the sum of any two consecutive elements in the tuple in less than or equal to $n$ as well.

To find this sum, let's modify the notation a bit. Let


So we are solving for $T_k(0)$ given $n$. Now we take advantage of the iterative structure of the sum to write it as,


with $T_0(m)=1$. This may be a little confusing but a moment of thought clears it up easily. Now this renders itself to the idea of generating functions. We define,

$f(z,m)=\displaystyle\sum_{j=0}^\infty T_j(m)z^j$

Let's do some algebraic manipulations in the above sum.

f(z,m)&=1+z\sum_{j=1}^\infty T_j(m)z^{j-1}\\
&=1+z\sum_{j=1}^\infty z^{j-1} \sum_{i=1}^{n-m}T_{j-1}(i)\\
&=1+z\sum_{i=1}^{n-m}  \sum_{j=1}^\infty T_{j-1}(i)z^{j-1}\\
&=1+z\sum_{i=1}^{n-m} f(z,i)\\

We can infer that

$(1):$    $f(z,h+1)-f(z,h)=-z f(z,n-h)$

$(2):$    $f(z,h+2)-2f(z,h+1)+f(z,h)=-z (f(z,n-h-1)-f(z,n-h))$

$(3):$    $f(z,n)=1$

$(4):$    Using $h=0$ in $(1)$, we have $f(z,1)-f(z,0)=-z$      (using $(3)$)

Using $h=n-m-1$ in $(1)$, we have $f(z,n-m)-f(z,n-m-1)=-z f(z,m+1)$. Substituting this in $(2)$, we finally arrive at,


This is a simple difference equation of the second order. The characteristic equation of this is,

$x^2+(z^2-2)x+1=0$ with roots $z_{+,-}=\displaystyle\frac{1}{2}(2-z^2\pm z\sqrt{z^2-4})$

Therefore, $f(z,m)=c_1 z_+^m+c_2 z_-^m$. Note that our point of interest $f(z,0)=c_1+c_2$

We use $(3)$ and $(4)$ to solve for these constants. Finally,


Honestly, the entire idea so far has been adapted to the discrete case from the original source. Moreover, the given expression looks complicated. It seems we can simplify this and somehow get rid of the radical terms but I was not able to.

But I did all these because, first of all, it was fun and second, I was able to notice something very nice which would be my original contribution to this post.

Let $f_j(z)=z+\displaystyle\frac{1}{f_{j-1}(-z)}$ with $f_0(z)=1$. Then $T_k(0)=[z^k]f_n(z)$. That is,

$\displaystyle\sum_{j_1=1}^n\sum_{j_2=1}^{n-j_1}\sum_{j_3=1}^{n-j_2}\cdots\sum_{j_k=1}^{n-j_{k-1}}1=[z^k] \text{  }[z;-z,z,-z,\cdots,(-1)^{n-1}z,1]$

in continued fraction notation. Note that there are $n$ $z$'s on the RHS and the sign alternates with every iterate.

I really liked how an iterated sums occur on one side, a continued fraction on the other and levels on both sides $k$ and $n$ in a twisted way. Not only does this make the sum compact (also freeing it of any radicals), it also makes it easier to program. I used this idea to make a Python program for different $k$.

Yeah, my original idea was very small compared to the length of the post but it was great learning experience for me going through a problem in both continuous and discrete cases side-by-side, understanding how each steps transpires in both of them. Hope you enjoyed this.

Until then
Yours Aye

Tuesday, October 24, 2017

A (very) short note on the MTZ constraints of Vehicle Routing Problem

This post will be about the Vehicle Routing Problem (VRP), a combinatorial optimization and integer programming problem which generalizes the famous Travelling Salesman Problem (TSP). To be even more specific, I'm making this post to discuss MTZ constraints, a set of  sub-tour elimination constraints, in the Vehicle flow formulation of the VRP.

I tried to understand the ones given in the Wiki article but they are definitely wrong (as of this writing). Very many sources can be found on the internet but I was not able to understand any of them clearly. Turns out, those are not very hard to understand and I was able to make my own formulation. It goes like the following.

$u_i-u_j+Cx_{ij}\leq C-d_j$      $\forall i,j \in V \setminus \{0\}, i\ne j$      $s.t.$ $0<d_i \leq C$
$d_i\leq u_i \leq C$     $\forall i \in V \setminus \{0\}$

where $C$ is the capacity of the vehicle, $d_i$ is the demand of the $i$th city, and $x_{ij}$ is a binary variable that defines whether a tour happens from the $i$th city to the $j$th city or not (To avoid confusion, I've used the notations used in the Wiki article).

It is easy to see why this works. First, if there exists a sub-tour involving $k$ cities, then we can add the inequalities corresponding to those edges. The $u$ variables cancel each other and we will left with,

$0\leq -d_{i_1}-d_{i_2}-d_{i_3}-\cdots-d_{i_k}$

which is not possible because of the strictly positive constraint placed on the $d$ variables, thus proving every feasible tour of this formulation has exactly one sub-tour starting and ending at the 'origin' city.

To prove the existence of $u$ variables for all feasible solutions, first let's consider the case where not trip takes place from the $i$th city to the $j$th city. That is, $x_{ij}=0$. Then, the constraint corresponding to $x_{ij}$ becomes,

$0\leq (C-u_i)+(u_j-d_j)$

Both the bracketed terms are positive because of the constraints place on the $u$'s and thus the constraint becomes non-binding. On the other hand, let's consider the most interesting case of $x_{ij}=1$.

Now the $u$'s can be given a very nice interpretation. A particular $u_i$ can be interpreted as the cumulative demand delivered so far in tour upto (and including) city $i$. With this, we can see that easily see that the inequality becomes,

$-d_j+C\leq C-d_j$

which is trivially true. Hope you enjoyed this.

Until then,
Yours Aye,

Friday, September 22, 2017

Proof for the Expected Value Problem III

Let $ABC$ be a triangle with $AB=c$, $BC=a$, and $CA=b$. Let $D$, $E$ and $F$ denote the feet of the altitudes from $A$, $B$ and $C$ receptively such that $AD=h_a$, $BE=h_b$ and $CF=h_c$.

First off, we define some notations. Let $L$ be the length of the line drawn by selecting a random point $P$ in a triangle $ABC$ and let $X$ be the point where the random line intersects the triangle.

Let $\mathbb{E}_{\triangle ABC}(L|S_k)$ be the expected length of the line drawn as described in setup $S_k$ in the triangle $ABC$.

Setup 1 Select a random point in triangle $ABC$ and draw a line (towards $BC$) parallel to the altitude $AD$.

WLOG, we can chose the vertex $B$ to be the origin with the $x$-axis aligned with the side $BC$. Then we have,

$\displaystyle\mathbb{E}_{\triangle ABC}(L|S_1)=\frac{1}{\triangle ABC}\iint\limits_{(x,y)\in\triangle ABC}|PX|\,dy\,dx=\frac{1}{\triangle ABC}\iint\limits_{(x,y)\in\triangle ABC}y\,dy\,dx=\frac{h_a}{3}$

The last equality follows by recognizing the integral as the $y$-coordinate of the centroid of the triangle $ABC$.

Setup 2 Given a point $M$ between $B$ and $D$, select a random point in triangle $ABC$ and draw a line (towards $BC$) parallel to $AM$ until it meets the triangle. Let $t$ be the angle between $AM$ and $AD$ (measured clockwise from $AD$).

Using the same co-ordinate system as above, this time we have,

\displaystyle\mathbb{E}_{\triangle ABC}(L|S_2,M)&=\frac{1}{\triangle ABC}\iint\limits_{(x,y)\in\triangle ABC}|PX|\,dy\,dx\\
&=\frac{1}{\triangle ABC}\iint\limits_{(x,y)\in\triangle ABC}\frac{y}{\cos t}\,dy\,dx=\frac{h_a}{3\cos t}=\frac{|AM|}{3}

Setup 3 Given a point $M$ between $B$ and $D$, select a random point in triangle $ABC$ and draw a line (away from $BC$) parallel to $AM$ until it meets the triangle. Let $t$ be the angle between $AM$ and $AD$ (measured clockwise from $AD$).

Now we do a case distinction based on whether the random point is in $\triangle ABM$ or $\triangle AMC$.

\displaystyle\mathbb{E}_{\triangle ABC}(L|S_3,M)&=\frac{\triangle MAB}{\triangle ABC}\mathbb{E}_{\triangle MAB}(L|S_2,A)+\frac{\triangle MCA}{\triangle ABC}\mathbb{E}_{\triangle MCA}(L|S_2,A)\\
&=\frac{\triangle ABM}{\triangle ABC}\frac{|MA|}{3}+\frac{\triangle AMC}{\triangle ABC}\frac{|MA|}{3}\\

Surprisingly, it doesn't matter whether we draw the line towards or away from the edge $BC$, the expected value remains the same.

Setup 4 Given a point $M$ between $B$ and $D$, select a random point in triangle $ABC$ and draw a line (either towards or away from $BC$) parallel to $AM$ until it meets the triangle. Let $t$ be the angle between $AM$ and $AD$ (measured clockwise from $AD$).

From the previous discussion, we have,

$\displaystyle\mathbb{E}_{\triangle ABC}(L|S_4,M)=\frac{|AM|}{3}$

This can be equivalently written as,

$\displaystyle\mathbb{E}_{\triangle ABC}(L|S_4,t)=\frac{h_a}{3\cos t}$

Setup 5 Select a random point in triangle $ABC$ and draw a line (either towards or away from $BC$) until it meets the triangle, such that the angle $t$ between $PX$ and $AD$ (measured clockwise from $PX$) satisfies $0 \leq t \leq \tan^{-1}\left(\frac{BD}{AD}\right)$.

Using the law of the total probability,

\mathbb{E}_{\triangle ABC}(L|S_5)&=\mathbb{E}(\mathbb{E}_{\triangle ABC}(L|S_4,t))\\
&=\frac{1}{\tan^{-1}\left(\frac{BD}{AD}\right)}\int\limits_0^{\tan^{-1}\left(\frac{BD}{AD}\right)}\frac{h_a}{3\cos t}\,dt\\

We can use the fact that $\int\limits_0^{\tan^{-1}m}\sec t\,dt=\text{sinh}^{-1}m$, we have

$\displaystyle\mathbb{E}_{\triangle ABC}(L|S_5)=\frac{h_a}{3\tan^{-1}\left(\frac{BD}{AD}\right)}\text{sinh}^{-1}\left(\frac{BD}{AD}\right)$

Setup 6 Select a random point in triangle $ABC$ and draw a line (either towards or away from $BC$) until it meets the triangle, such that the angle $t$ between $PX$ and the side $AB$ (measured clockwise from $PX$) satisfies $0 \leq t \leq \angle A$.

We again make a case distinction based on where the ray $PX$ intersects the side $BC$ - either in $BD$ or in $DC$. We can separate the angle into two cases based on this. Using the previous result, we have,

$\displaystyle\mathbb{E}_{\triangle ABC}(L|S_6)=\frac{h_a}{3\angle A}\left(\text{sinh}^{-1}\left(\frac{BD}{AD}\right)+\text{sinh}^{-1}\left(\frac{DC}{AD}\right)\right)$

This, after some basic simplifications, results in,

$\displaystyle\mathbb{E}_{\triangle ABC}(L|S_6)=\frac{h_a}{3\angle A}\text{csch}^{-1}\left(\frac{p}{a}\frac{s-a}{p-a}\right)$

where $p$ and $s$ are the perimeter and semi-perimeter of $\triangle ABC$ respectively.

Now we get into the final part of the proof.

Setup 7 Select a random point in triangle $ABC$ and draw a line at a random angle from that point until it meets the triangle.

We now again do a case distinction based on the angle and using the previous result, we finally have,

\displaystyle\mathbb{E}_{\triangle ABC}(L)&=\sum_{x \in \{a,b,c\}}\frac{h_x}{3\pi}\text{csch}^{-1}\left(\frac{p}{x}\frac{s-x}{p-x}\right)\\

And hence the result. Hope you enjoyed this. I'll come up with something different the next time.

Until then,
Yours Aye

Thursday, September 21, 2017

An Expected Value Problem III

In An Expected Value Problem, we found the expected length of a line segment drawn by picking a random point in a rectangle and choosing a random angle, and extending it until it meets the edge of a rectangle. We ask the same question here for an arbitrary acute angled triangle in this post.

Let $ABC$ be a triangle with $AB=c$, $BC=a$, and $CA=b$. Let $D$, $E$ and $F$ denote the feet of the altitudes from $A$, $B$ and $C$ receptively such that $AD=h_a$, $BE=h_b$ and $CF=h_c$. I considered this question in the same post above, but I was not able to solve it then.

But recently I collaborated with someone who gave me a very valuable hint which reduced the difficulty of the problem drastically. The idea pretty much solved the problem. In summary, if we denote the length of line segment as $L$, we have,

\displaystyle\mathbb{E}(L)&=\sum_{x \in \{a,b,c\}}\frac{h_x}{3\pi}\text{csch}^{-1}\left(\frac{p}{x}\frac{s-x}{p-x}\right)\\

where $p$ and $s$ denote the perimeter and the semi-perimeter respectively.

Let's calculate the expected values for a few special cases.

$\mathbb{E}(1,1,1)\approx 0.30284835$ and $\mathbb{E}(3,4,5)\approx 1.100147755$

The interest in this question seems very low in the mathematical literature and I was not able to find any material in the internet to verify these values. However, a friend of mine wrote a simulation for the Pythagorean case and the result matched with the value given by the formulae.

Two things are to be mentioned here. First, even though we dealt with the same question for a triangle and a rectangle, note that the formulas are drastically different. Some may find it obvious that they are different but I had an intuition that they'll be pretty 'close' which turned out to be wrong.

Second, the individual terms in the above formulae can themselves are respective expected values for a different question. I'll follow up with the what those questions are and the proof of this formulae in the next few posts.

Until then,
Yours Aye

Saturday, June 10, 2017

An Expected Value problem II

We saw the explicit expression for the expected distance from a random point inside a rectangle to an edge of the rectangle in the previous post. We deal with a similar problem here.

There are a lot of reference in the internet to find the expected distance between two random points in a rectangle (Stack Exchange post). Here we ask 'what is the expected distance between two random points in a right angled triangle?'.

Basically the question is conceptually simple and the expression for the expected value can be readily written as follows:

$\mathbb{E}_T(L)=\displaystyle\int\limits_0^a \int\limits_0^{mx}\int\limits_0^a\int\limits_0^{mu}\sqrt{(x-u)^2+(y-v)^2}\,dv\,du\,dy\,dx$

where the right angled triangle is formed by the points $(0, 0)$, $(a, 0)$, and $(a, b)$. Also, for simplicity we used $m=b/a$.

Now, the difficulty of this integral (and the one that makes it different from the rectangle case) is the variable limits of the integral. In case of a rectangle, all the limits would be constants which makes it a little easier.

I'm not going to go through all the details and trouble I went through to solve this. But it is a combination of both Mathematica and a lot of paper work. Finally, after a few days and hell-a-lot completely messed up ideas later, I ended up with the following explicit expression for the expected value.

$\mathbb{E}_T(L)=\displaystyle\frac{a^3+b^3+2 d^3}{15 d^2}+\frac{a^2 }{15 b}\left(\frac{b^3}{d^3}+1\right)\text{csch}^{-1}\left(\frac{a}{b}\right)+ \frac{b^2}{15 a}\left(\frac{a^3}{d^3}+1\right) \text{csch}^{-1}\left(\frac{b}{a}\right)$

where $d=\sqrt{a^2+b^2}$.

I also wrote a small python program in an Online IDE to verify the correctness of this result.

For example, for $a=2$ and $b=3$, the formula gives

$\mathbb{E}_T(L)=\displaystyle\frac{1}{195} \left(35+26 \sqrt{13}\right)+\frac{4}{45} \left(1+\frac{27}{13 \sqrt{13}}\right) \text{csch}^{-1}\left(\frac{2}{3}\right)+\frac{3}{10} \left(1+\frac{8}{13\sqrt{13}}\right) \text{csch}^{-1}\left(\frac{3}{2}\right) \approx 1.047156999$

whereas the simulation gives $1.047117963$ after $10^8$ trials (takes about 200 secs).

Now for completion sake we also give the expected length of two random points in a rectangle. Here we have,

$\mathbb{E}_R(L)=\displaystyle\frac{a^5+b^5-\left(a^4-3 a^2 b^2+b^4\right)d}{15 a^2 b^2}+\frac{a^2}{6 b}\text{csch}^{-1}\left(\frac{a}{b}\right)+\frac{b^2 }{6 a}\text{csch}^{-1}\left(\frac{b}{a}\right)$

One of the nice things of putting these two together is, now we can also get an expression for the length of two random points lying on opposite sides of diagonal $AC$ of rectangle $ABCD$. Denoting this expected length by $\mathbb{E}_D(L)$ (D denoting diagonal), we have this relation from which we can find the required value.


I tried to get an expression but it doesn't factor nicely but so it better left out. Hope you enjoyed this.

Until then
Yours aye

Wednesday, June 7, 2017

An Expected Value Problem

Hi All... Recently I thought of a problem and tried to solve it. To my surprise, I ended up with a closed form solution for the same which I would like to share with you by this post.

Consider a rectangle $ABCD$ with the longest side $AB=CD=a$ units and shorter side $BC=DC=b$ units. Now pick a random point in this rectangle and a draw a straight line from this point at a random angle until the line meets the edge of the rectangle. What will be the expected value of this line?

We can attack the problem head-on with multiple integrals. Let $L$ be the length of the line. Then we have,

$\mathbb{E}(L)=\displaystyle\frac{1}{ab}\frac{1}{2\pi}\int\limits_{0}^{a}\int\limits_{0}^{b}\int\frac{y}{\sin t}\,dt\,dy\,dx$

The limits of the innermost integral have been left out purposely. We have to decompose the innermost integral. Let's do it this way. Consider the point $(x,y)$. We'll call it $P$.

Now draw a perpendicular from this point to each of the four sides of the rectangle. Let the perpendicular meet the side $X$ at $P_X$. Also join this point to each of the four vertices of the rectangle. This splits the entire rectangle into eight regions. 

Consider the integral $I(a,b)$ (ignoring constant factors for time being) for the 'random lines' that end up in region $PAP_{AB}$

$I(a, b)=\displaystyle\int\limits_{0}^{a}\int\limits_{0}^{b}\int\limits_{\tan^{-1}(y/x)}^{\pi/2}\frac{y}{\sin t}\,dt\,dy\,dx$

But reflecting the 'random lines' about the $y$-axis, this integral also represents the $PP_{AB}B$, reflecting about the $x$-axis, this integral represents region $PP_{CD}D$, reflecting w.r.t both the axes it represents region $PCP_{CD}$. Solving this one integral covers four of the eight regions.

This is integral is pretty simple to solve with standard tables (or atmost with Mathematica). We get,


where $d$ is the length of the diagonal of the rectangle.

Now for the other regions. We don't have to solve anything separately. Just interchange the values of $a$ and $b$. This amounts to rotating the rectangle by $90$ degrees and reasoning as before for the four other regions.

Note the nice thing that in going over the eight regions, we have made the full $2\pi$ radians possible for the 'random line'. So finally we have,

$\mathbb{E}(L)=\displaystyle\frac{4I(a,b)+4I(b,a)}{2\pi ab}$

Simplifying things we finally we end up with,

$\mathbb{E}(L)=\displaystyle\frac{a^3+b^3-d^3}{3\pi ab}+\frac{a}{\pi}\ln{\frac{b+d}{a}} + \frac{b}{\pi}\ln{\frac{a+d}{b}}$


$\mathbb{E}(L)=\displaystyle\frac{a^3+b^3-d^3}{3\pi ab} +\frac{a}{\pi}\text{csch}^{-1}\left(\frac{a}{b}\right)+ \frac{b}{\pi}\text{csch}^{-1}\left(\frac{b}{a}\right)$

The rectangle had a lot of symmetry that we were able to exploit. I'm trying to do the same for a given arbitrary triangle but it seems so very difficult with complicated integrals cropping at all places. I'll update if end up with something.

Until then,
Yours aye

Saturday, May 13, 2017

Division of Dirichlet Series with Python

We saw how to do Dirichlet Series multiplication in the previous post. This post is about dividing two Dirichlet series. Using pretty much the same notation as before, here we are concerned about the following.


We can use the following Python code to get the Summatory function $H(n)$.

Note: We know that Dirichlet inverse exists if and only if $g(1)\ne0$. In fact, for most of the multiplicative functions $g(1)=1$. So I have made this assumption in the code. If it is not, changes have to made accordingly.

from timeit import default_timer

start = default_timer()

n = 1000000000
s = int(n ** 0.5)

Divs = [0] * (1 + s)

for k in range(1, 1 + s):
    temp = [1] if k == 1 else [1, k]
    stemp = int(k ** 0.5)
    for j in range(2, 1 + stemp):
        if k % j == 0:
            if j * j == k:
                temp += [j]
                temp += [j, k // j]
    Divs[k] = sorted(temp)

def DirDiv(Lowsn, Highsn, Lowsd, Highsd, n):
    s = int(n ** 0.5)
    Lows, Highs = [0] * (1 + s), [0] * (1 + s)
    Lows[1] = 1 // (Lowsd[1] - Lowsd[0])
    for k in range(2, 1 + s):
        for j in Divs[k][:-1]:
            Lows[k] -= (Lows[j] - Lows[j - 1]) * (Lowsd[k // j] - Lowsd[k // j - 1])
        Lows[k] += Lowsn[k] - Lowsn[k - 1]
        Lows[k] *= Lows[1]
        Lows[k] += Lows[k - 1]
    for k in range(s, 0, -1):
        tempn = n // k
        res = Highsn[k]
        u = int(tempn ** 0.5)
        v = tempn // (1 + u)
        for i in range(1, 1 + v):
            res -= (Lows[i] - Lows[i - 1]) * (Highsd[k * i] if k * i <= s else Lowsd[tempn // i])
        for i in range(2, 1 + u):
            res -= (Lowsd[i] - Lowsd[i - 1]) * (Highs[k * i] if k * i <= s else Lows[tempn // i])
        i = 1 + u
        res += Lows[tempn // i] * Lowsd[u]
        Highs[k] = res
    return Lows, Highs

Lowsn, Highsn = [0] + [1] * s, [0] + [1] * s
Lowsd, Highsd = [k for k in range(1 + s)], [0] + [n // k for k in range(1, 1 + s)]
Lows, Highs = DirDiv(Lowsn, Highsn, Lowsd, Highsd, n)
print(default_timer() - start)

As in the multiplication case, here again the 'Lows' and 'Highs' mean the same. In the sample code given above, we have $F(s)=1$ and $G(s)=\zeta(s)$. This means $H(s)$ is the Dirichlet series of the Moebius function.

As a result, we get the values of the Merten's function as the output. Here again, the runtime of the algorithm is about $O(n^{\frac{3}{4}})$.

Until then
Yours Aye

Multiplication of Dirichlet Series with Python

It is well known that polynomial multiplication can be much more efficiently using Fast Fourier Transform. I was searching for similar to use in case Dirichlet series. For example, if I have two functions $f(n)$ and $g(n)$, with Dirichlet series $F(s)$ and $G(s)$ respectively, what would be efficient way to get the summartory function of the Dirichlet convolution of the two functions?

Again, we could simply use the Dirichlet's Hyperbola method to get a very generic algorithm to solve this.

Let $f(n)$ and $g(n)$ be two arithmetic functions. Define $h(n)=(f*g)(n)$ as the Dirichlet convolution of $f$ and $g$. Also, let $F(n)$, $G(n)$ and $H(n)$ be the summatory functions for the respective functions.

Now using the formulas we saw in Dirichlet's Hyperbola method, I wrote the following Python code that does the job. The runtime of the following algorithm is about $O(n^{\frac{3}{4}})$

from timeit import default_timer

start = default_timer()

Lim = 1000000000
s = int(Lim ** 0.5)

Divs = [0] * (1 + s)

for k in range(1, 1 + s):
    temp = [1] if k == 1 else [1, k]
    stemp = int(k ** 0.5)
    for j in range(2, 1 + stemp):
        if k % j == 0:
            if j * j == k:
                temp += [j]
                temp += [j, k // j]
    Divs[k] = sorted(temp)

def DirMult(Losf, Highsf, Losg, Highsg, n):
    s = int(n ** 0.5)
    Highs = [0]
    Los = [0] * (1 + s)
    for k in range(1, 1 + s):
        temp = 0
        for j in Divs[k]:
            temp += (Losf[j] - Losf[j - 1]) * (Losg[k // j] - Losg[k // j - 1])
        Los[k] = Los[k - 1] + temp
        temp, tempn = 0, Lim // k
        u = int(tempn ** 0.5)
        v = tempn // (1 + u)
        mini, maxi = min(u, v), max(u, v)
        for j in range(1, 1 + mini):
            temp += (Losf[j] - Losf[j - 1]) * (Highsg[k * j] if k * j <= s else Losg[tempn // j]) + (Losg[j] - Losg[
                j - 1]) * (Highsf[k * j] if k * j <= s else Losf[tempn // j])
        for j in range(1 + mini, 1 + maxi):
            temp += (Losg[j] - Losg[j - 1]) * (Highsf[k * j] if k * j <= s else Losf[tempn // j])
        j = 1 + u
        temp -= Losg[u] * Losf[tempn // j]
        Highs += [temp]
    return Los, Highs

Losf, Highsf = [k for k in range(1 + s)], [0] + [Lim // k for k in range(1, 1 + s)]
Losg, Highsg = [k for k in range(1 + s)], [0] + [Lim // k for k in range(1, 1 + s)]

Lows, Highs = DirMult(Losf, Highsf, Losg, Highsg, Lim)

print(default_timer() - start)

We'll see how to use this code. To use the function $\text{DirMult}$, we need four parameters. These parameters defined as below.

$\text{Losf}[k]=F(k)$, $\text{Highsf}[k]=F\left(\left\lfloor\frac{n}{k}\right\rfloor\right)$, $\text{Losg}[k]=G(k)$, and $\text{Highsg}[k]=G\left(\left\lfloor\frac{n}{k}\right\rfloor\right)$

We get two outputs and they too are interpreted the same way.

$\text{Los}[k]=H(k)$, $\text{Highs}[k]=H\left(\left\lfloor\frac{n}{k}\right\rfloor\right)$

In the example code above, we have $F(s)=G(s)=\zeta(s)$. Therefore, the output we get is the Divisor Summatory function.

Note that as the algorithm is made for an almost no-brainer approach. It doesn't take advantage of the properties of the functions that we are multiplying and hence the increased runtime.

I'll see you in the next post with Division of Dirichlet Series.

Until then
Yours Aye

Monday, April 10, 2017

Divisor functions - Continued..

This post continues the discussion we had in my previous post.

I discussed only $k=0$ case before dismissing the others as uninteresting. They aren't after all. After some further work, I think I've figured out how the first DGF given above generalized to the other cases as well.

Following the same notations given before we have,

$\displaystyle\sum_{n=1}^\infty\frac{\sigma_k(m\cdot n)}{n^s}=\zeta(s)\zeta(s-k) \prod_{p_j|m}\left(1+(p_j^{k}+p_j^{2k}+p_j^{3k}+\cdots+p_j^{m_jk})(1-p_j^{-s})\right)$

Quite simple after all!! But I don't think the second DGF simplifies as nicely. That should be one for another post. But there is something else.

The first formula given in the previous post works even when $n$ is replaced by $n^2$. We are gonna make that replacement because of the closed form DGF we have for $\sigma_k(n^2)$.

Again following the same notations, this time we have,

$\displaystyle\sum_{n=1}^\infty\frac{\sigma_k(m\cdot n^2)}{n^s}=\frac{\zeta(s)\zeta(s-2k)\zeta(s-k)}{\zeta(2s-2k)} \prod_{p_j|m}\left(1+(p_j^{k}+p_j^{2k}+\cdots+p_j^{m_jk})\left(\frac{1-p_j^{-s}}{1+p_j^{-(s-k)}}\right)\right)$

As before, these DGFs can be used in combination with Dirichlet's Hyperbola method to find their summatory functions. Hope you find these interesting and worth your while. See ya soon.

Till then
Yours Aye,

Friday, April 7, 2017

Divisor functions

Oh... I didn't realize I'vnt posted anything this year. Anyway, here's my first post of the year - my thoughts on the Divisor function.

The Divisor functions are well known and have been studied extensively in Mathematics. Their relation to the Partition function is one of best identities I've seen. I was playing around with them as part of my effort to solve a problem and came up with some identities that I can't find anywhere on the Web. So this post will be about them.

The Divisor function is known to be multiplicative, but not completely multiplicative. But I needed an some formulas where I wanted to find the value of the Divisor functions of product of two numbers that are not co-prime. After some extensive calculations, I ended with the following:

Let $m=p_1^{m_1}p_2^{m_2}p_3^{m_3}\cdots$ and $n=p_1^{n_1}p_2^{n_2}p_3^{n_3}\cdots$. Then,

$\displaystyle\frac{\sigma_k(m\cdot n)}{\sigma_k(n)}=\prod_{p_j|m}\left(1+\frac{p_j^k+p_j^{2k}+p_j^{3k}+\cdots+p_j^{m_jk}}{1+p_j^{-k}+p_j^{-2k}+p_j^{-3k}+\cdots+p_j^{-n_jk}}\right)$

We can easily check a couple of easy cases. For example, when $n=1$, all of the $n_j$'s become $0$ and the above formulae reduces to the definition of Divisor function. Similarly, when $m$ and $n$ are co-prime, for any prime that divides $m$, the exponent of that prime in $n$ will be $0$, again reducing to a form that shows that divisor functions are multiplicative.

I found the case $k=0$ especially very interesting. We'll see why in a short while. First of all, when $k=0$, the above formula reduces to,

$\displaystyle\frac{\sigma_0(m\cdot n)}{\sigma_0(n)}=\left(1+\frac{m_1}{n_1+1}\right)\left(1+\frac{m_2}{n_2+1}\right)\left(1+\frac{m_3}{n_3+1}\right)\cdots$

As $\sigma_0(n)=(1+n_1)(1+n_2)(1+n_3)\cdots$, we can write

$\sigma_0(m\cdot n)=\sigma_0(n)+m_1\sigma_{0,p_1}(n)+m_2\sigma_{0,p_2}(n)+\cdots+m_1m_2\sigma_{0,p_1p_2}(n)+\cdots+m_1m_2\cdots m_k\sigma_{0,p_1p_2\cdots p_k}(n)+\cdots$

where $\sigma_{0,B}(n)$ is (defined only for square-free $B$) a function that counts the number of divisors of $n$ that are not divisible by any of the prime factors of $B$. Note that we have eliminated  everything related to prime factorization of $n$.

Actually this is why I said the case $k=0$ is interesting. Had $k$ been greater than $0$, we would have those annoying prime factors of $n$ in the above expression. Now with what we have above we can make an interesting Dirichlet Generating function.

Given $m=p_1^{m_1}p_2^{m_2}p_3^{m_3}\cdots$, we have

$\displaystyle\sum_{n=1}^\infty\frac{\sigma_0(m\cdot n)}{n^s}=\zeta^2(s) \prod_{p_j|m}\left(1+m_j(1-p_j^{-s})\right)$

Even better, for a given $m$, if we define a function $f(n)$ as

$f(n)=\displaystyle\sum_{d|m}\sigma_0(d\cdot n)$

again we end up with a nice DGF. This we have,


All of the above DGFs in combination with Dirichlet's Hyperbola method can be used to find their respective summatory function.

We can see how this generalizes in the next post. See ya soon.

Till then
Yours Aye,