Wednesday, March 18, 2020

A note on Berlekamp Massey Algorithm

Let's say we are given a sequence of $2n$ terms and were told that these are generated by a linear recurrence of order $n$ with the first $n$ terms of the sequence being the initial terms. More specifically, given $n$ terms, we want a recurrence that generates the last $\lfloor \frac{n}{2}\rfloor$ terms with the first $\lceil \frac{n}{2} \rceil$ terms.

The problem of finding the recurrence given the terms can be solved by Berlekamp Massey (BM) Algorithm. Though there are not many sources that I could find for solving recurrences with BM, I found the CodeForces entry that explains much of the details. Yet another source that I found very interesting is Berlekamp Massey algorithm.

To make the algorithm more clear than the quoted page, let's take an example. Let the sequence of terms be $[1,2,4,10,24,50,124]$.

Looking at (only) the first three terms, we make a first guess that

$\text{Guess 1:}\quad a_n=2a_{n-1}$ (or) $a_n-2a_{n-1}=0$

But when we include the fourth term, we run into a problem. We have a discrepancy.


We stare into the (only) first four terms of the sequence and come with another guess.

$\text{Guess 2:}\quad a_n=2a_{n-1}+a_{n-2}$ (or) $a_n-2a_{n-1}-2a_{n-2}=0$

And this correctly generates the fourth term.

(two tiny details at this point for the curious minded - starts here...)
First, someone could've guessed $a_n=2a_{n-1}+2a_{n-3}$. That works too. Continuing with the BM algorithm, we may end up with different answers but both recurrences will correctly generate the last three terms using the first four terms. Try it!

Second, though our second guess correctly predicts $a_4$, it fails for $a_3$. This is not a problem as Flawed guesses, especially for the first terms in the first half, gets 'course-corrected' at later stage of the algorithm. In the final algorithm, we only be making guesses, that too trivial guesses, at the initial stage.
(... and ends here)

We try it for the next term and we see the recurrence correctly generates fifth term. We're good. We try it for the next term and we run into problem. We have a discrepancy of $8(=58-50)$.

But this time we don't have to guess anything more. Note that we re-write our 'Guess 2' as

$\text{Guess 2:}\quad a_n=2a_{n-1}+a_{n-2}+k(a_{n-2}-2_{n-3})$

for any real $k$.

Where did the terms in the bracket come from? It our first guess shifted by two terms.

Why have we shifted it by two terms? Because we are currently finding a discrepancy at the sixth term and our last guess created a discrepancy at fourth term.

Will this not affect the previous terms? That is the key of BM algorithm. The terms in the bracket will become zero when we evaluate it for earlier terms.

For example, when we evaluate Guess 2 for $a_5$, the terms in the bracket will be $a_3-2a_2$ which is actually zero. In fact, only because it was zero, we made it our first guess.

Now that we have an additional factor of $k$ in our 'Guess 2' now, we can use it to our advantage.  Choosing $k=-4$, we remove the discrepancy at the sixth position.

From now on, we can continue using the same idea. Anytime we face a discrepancy, we can choose an right multiple (based on the current and previous deviation) of our previous guess (shifted by the right amount) to correct that discrepancy. And this continues for all elements in the given sequence, thus completing the algorithm.

This type of problem comes every now and then in Competitive programming. For a given problem, we can quickly code a brute force method that can churn out the first few terms and we would be interested in finding a linear recurrence if there is one.

If we have the recurrence, finding the $n$th term (even for $n$ of order $10^{10^7}$) of the recurrence can be found in a matter of seconds. For example, we have this wonderful Codechef Editorial that explains how to do that.

Here is an (over)commented Python code of the BM algorithm based on what I wrote above.

from timeit import default_timer

start = default_timer()

def berlmassey(l0, p0):
     # Input: list of length 2n and a prime p
     # Output: Recurrence of length n that takes the first n terms as initial values and generates the next n (and possibly the further) terms
     l, p = l0[:], p0
     if len(l) <= 1:
          return l
     if l[0] != 1:          # WLOG, first element of the list should be 1
          return []
     old_rec = []          # there is no recurrence to start with
     v = 1          # deviation of the (empty) recurrence that occurs in the first element
     old_pos = 0          # trivially the list was fine until the first element
     curr_rec = [l[1] % p]          # we guess something that's trivially true
     curr_pos = 1          # starting with the second element of the list
     while curr_pos < len(l):
          temp = 0
          for i in range(len(curr_rec)):          # check what the current recurrence gives for the next element
               temp += curr_rec[i] * l[curr_pos - i - 1]
               temp %= p
          temp = (l[curr_pos] - temp) % p          # is there a deviation?
          if temp == 0:          # If no deviation, current recurrence is still good and hence moving on..
               curr_pos += 1
          temp_curr_rec = curr_rec[:]          # just a temporary storage
          v = temp * pow(v, -1, p) % p          # normalizing factor based on current and last deviation
          old_rec = [p - x for x in old_rec]          # re-strucuturing the last successful recurrence
          old_rec = [1] + old_rec
          old_rec = [0] * (curr_pos - old_pos - 1) + old_rec
          if len(old_rec) > len(curr_rec):          # making last successful and current recurrences the same length
               curr_rec = curr_rec + [0] * (len(old_rec) - len(curr_rec))
               old_rec = old_rec + [0] * (len(curr_rec) - len(old_rec))
          curr_rec = [(curr_rec[i] + v * old_rec[i]) % p for i in range(len(curr_rec))]          # updating the current recurrence with the normalizing constant and the (old) last successful recurrence
          old_rec = temp_curr_rec[:]          # the recurrence we started with becomes the last successful recurrence
          v = temp          # the deviation that occured at this position
          old_pos = curr_pos          # this position becomes the last deviated position
          curr_pos += 1          # On to the next iteration..
     return curr_rec

print(berlmassey([1, 2, 4, 10, 24, 50, 124], 1000000007))
print(default_timer() - start)

Finally, a couple of points regarding the code. As there are divisions involved, I have used a prime and made everything with modulo operations. We can also use exact divisions if there is a need. Hope you enjoyed it.

Until then
Yours Aye

Friday, January 24, 2020

A nice result in Geometry

Euler's theorem in Geometry is a great result that connects the distance between the circumcentre and incentre of a triangle with the circumradius and inradius of the triangle. A similar result, known as Fuss's theorem, holds for bi-centric quadrilaterals. It was surprising when I found a similar relation holds for three mutually tangent circular arcs.

Consider the three blue circles $A$, $B$ and $C$ (with radii $r_A$, $r_B$ and $r_C$ respectively) that are mutually tangent to each other. Let the points of tangency be $T_A$, $T_B$ and $T_C$ which becomes the vertices of our 'triangle of circular arcs'.

We can now draw two circles: a circumcirle (red) that passes through all the tangency points and an incircle (green) that is mutually internally tangent to all the blue circles. Let's position the configuration in the Argand plane such that center of the incircle is at the origin.

Let the centers of the blue circles be $z_A$, $z_B$ and $z_C$, and the center of the circumcircle be $z$.

Now if we let $R$ and $r$ be the circumradius and inradius of our triangle, then
$$\text{Eqn (1): }d^2=R^2-4Rr+r^2$$
where $d=\vert z \vert$ is the distance between the centers of the circumcircle and incircle.

To prove this, Consider $\triangle z_A z_B z_C$ whose side lengths are $r_A+r_B$, $r_B+r_C$ and $r_C+r_A$. It is easy to show that,

$\text{Eqn (2): }R^2=\displaystyle\frac{r_Ar_Br_C}{r_A+r_B+r_C}$ and $\text{Eqn (3): }r^{-1}={r_A}^{-1}+{r_B}^{-1}+{r_C}^{-1}+2R^{-1}$

$(2)$ is obtained by noting that the red circle is the incircle of $\triangle z_A z_B z_C$ and $(3)$ is a direct application of Descartes' theorem.

Now given the vertices of a complex triangle, we can express the incenter of the triangle using its vertices (using this). Now we have,

$\displaystyle z=\sum_{\text{cyc}}\frac{r_B+r_C}{2(r_A+r_B+r_C)}z_A \implies \text{Eqn (4): }2(r_A+r_B+r_C)z=\sum_{\text{cyc}}(r_B+r_C)z_A$

Multiplying $(4)$ with its complex conjugate, we have,

$\text{Eqn (5): }4(r_A+r_B+r_C)^2\vert z \vert^2=\displaystyle\sum_{\text{cyc}}(r_B+r_C)^2\vert z_A \vert^2+\sum_{\text{cyc}}(r_B+r_C)(r_C+r_A)(z_A\bar{z_B}+\bar{z_A}z_B)$

From the figure, we see that $\vert z_B-z_A \vert=r_A+r_B$. Using the fact that the product of a complex number and its conjugate is the square of its magnitude, we can see that,

$\text{Eqn (6): }z_A\bar{z_B}+\bar{z_A}z_B=\vert z_A \vert^2+\vert z_B \vert^2-\vert z_B-z_A\vert^2=\vert z_A \vert^2+\vert z_B \vert^2-(r_A+r_B)^2$

Using $(6)$ in $(5)$, and cancelling the common terms, we get,

$\text{Eqn (7): }2(r_A+r_B+r_C)\vert z \vert^2=\displaystyle\sum_{\text{cyc}}\vert z_A \vert^2(r_B+r_c)-\prod_{\text{cyc}}(r_A+r_B)$

From the figure above, we have $\vert z_x \vert=r+r_x$ for $x \in \{A,B,C\}$. Using this we get,

\displaystyle\text{Eqn (8): }\sum_{\text{cyc}}\vert z_A \vert^2(r_B+r_c)&=\sum_{\text{cyc}}(r^2+2rr_A+r_A^2)(r_B+r_C) \\
&= 2r^2(r_A+r_B+r_C)+4r(r_Ar_B+r_Br_C+r_Cr_A)+\sum_{\text{cyc}}r_A^2(r_B+r_C) \\

Using $(8)$ in $(7)$, we get,

$\text{Eqn (9): }\displaystyle2(r_A+r_B+r_C)\vert z \vert^2=2r^2(r_A+r_B+r_C)+4r(r_Ar_B+r_Br_C+r_Cr_A)+\sum_{\text{cyc}}r_A^2(r_B+r_C)-\prod_{\text{cyc}}(r_A+r_B)$

Now the second term of $(9)$ can be simplified as
\text{Eqn (10): }r(r_Ar_B+r_Br_C+r_Cr_A)&=rr_Ar_Br_C(r_A^{-1}+r_B^{-1}+r_C^{-1})\\
where we have used $(2)$ and $(3)$ in the second equality.

The last two terms of $(9)$ can be expanded and most of the terms gets cancelled giving,

$\displaystyle\text{Eqn (11): }\sum_{\text{cyc}}r_A^2(r_B+r_C)-\prod_{\text{cyc}}(r_A+r_B)=-2r_Ar_Br_C=-2R^2(r_A+r_B+r_C)$
where we have used $(2)$ in the last equality.

Using $(10)$ and $(11)$ in $(9)$ and dividing both sides by $2(r_A+r_B+r_C)$ , we finally have $(1)$.

From $(1)$, it follows that, $R\geq (2+\sqrt{3})r$ which holds with equality only when $r_A=r_B=r_C$.

This can probably be proved with geometry but I couldn't get my head around it.

Finally, before we finish this discussion, we also note that we can completely parametrize $(1)$ for non-co-prime integer solutions. For integer $a,b$ such that $\gcd(a,b)=1$, $3 \nmid b$ and $b \leq \sqrt{3}a$

(i) If $a\not\equiv b \pmod{2}$, then $R=3a^2+4ab+b^2$, $r=2ab$ and $d=3a^2-b^2$
(ii) If $a\equiv b \pmod{2}$, then $R=(3a^2+4ab+b^2)/2$, $r=ab$ and $d=(3a^2-b^2)/2$

Hope you found this interesting. See ya later.

Tangencies: Three tangent Circles
Efficiently constructing tangent circles
Inversion and applications to Ptolemy and Euler

Until then
Yours Aye