I write, therefore I am: November 2024

Saturday, November 30, 2024

Five ways of solving an Expected Value problem

Let's say an Independent Aurelian is investigating butterflies on two types of traits, A and B. It is known that the probability of a butterfly exhibiting trait A is $p$ and that of trait B is $q$ (independent of each other). Note that this means the probability of a butterfly exhibiting both these traits is $pq$.

If she wants to study both these traits, how many butterflies, on average, would she need to catch? She can do this either by catching two butterflies each with exactly on of these traits or by catching a butterfly with both these traits.

The problem is simple and that subject of this post is look at five different methods of this problem. A key fact that will be of great use in all these methods is that for an event with success probability $r$, we would need, on average, $1/r$ trials to get the first success.

Simple equation

Let $N$ be the expected value we are looking for. We can condition this based on the nature of the first butterfly that is caught.

$\displaystyle N=(1-p)(1-q)\cdot(1+N)+(1-p)q\cdot \left(1+\frac{1}{p}\right)+p(1-q)\cdot\left(1+\frac{1}{q}\right)+pq\cdot 1$

Transfer Matrix

It is easy to setup the transfer matrix for this problem with the states being $\{\text{None, Only trait }A\text{, Only trait }B\text{, Both}\}$. Then the transfer matrix $\mathbf{T}$ is,

$\displaystyle \mathbf{T}=\begin{pmatrix}1-(1-p)(1-q)&p(1-q)&(1-p)q&pq \\ 0&1-q&0&q \\ 0&0&1-p&p \\ 0&0&0&1\end{pmatrix}$

It is then well known that the expected value is given by $\mathbf{1}(\mathbf{I}-\mathbf{T})^{-1}$ where $\mathbf{I}$ is the identity matrix and $\mathbf{1}$ is a row matrix of 1's.

Infinite Series

Let's say we've caught $n$ butterflies. The probability that none of them have trait A is $(1-p)^n$ which means that atleast one of them have trait A is $1-(1-p)^n$ and the same applies for trait B as well (with $p$ replaced with $q$).

$\mathbb{P}(\text{atleast of one each of both traits in }n\text{ trials})=(1-(1-p)^n)(1-(1-q)^n)$

With this, we see that $\mathbb{P}(N>n)=1-(1-(1-p)^n)(1-(1-q)^n)$

Then, $\displaystyle \mathbb{E}(N)=\sum_{n=0}^\infty \mathbb{P}(N>n)=\sum_{n=0}^\infty \Bigl (1-(1-(1-p)^n)(1-(1-q)^n) \Bigl )$

Story proof

We can answer this more directly. The expected number of butterflies needed can be broken into parts. First, expected number to get atleast one of the traits and second, the remaining value.

The probability of a butterfly exhibiting one of the traits is $1-(1-p)(1-q)=p(1-q)+(1-p)q+pq$. Therefore, we would need $1/(1-(1-p)(1-q))$ trials to get atleast of the traits. Now,

$\displaystyle \mathbb{P}(\text{Only trait A}|\text{atleast one of the traits})=\frac{p(1-q)}{p(1-q)+(1-p)q+pq}=\frac{p(1-q)}{1-(1-p)(1-q)}$

After having got trait A, we would be looking for trait B which we would get, on average, in $1/q$ trails. This approach applies for the other two cases as well. Therefore,

$\displaystyle\mathbb{E}(N)=\frac{1}{1-(1-p)(1-q)}+\frac{p(1-q)}{1-(1-p)(1-q)}\cdot\frac{1}{q}+\frac{(1-p)q}{1-(1-p)(1-q)}\cdot\frac{1}{p}+\frac{pq}{1-(1-p)(1-q)}\cdot 0$

(Shorter story proof using) Min-Max identity

It is easy to see that $x_1+x_2=\max (x_1,x_2)+\min (x_1,x_2)$ for any two real $x_1,x_2$.

Using this and the linearity of expectations, we have,

$\mathbb{E}(N)=\mathbb{E}(\text{trials to get trait A})+\mathbb{E}(\text{trials to get trait B})-\mathbb{E}(\text{trials to get atleast one of the traits})$

All of these are Geometric random variables with probabilities $p$, $q$ and $p+q-pq$. Therefore,

$\displaystyle \mathbb{E}(N)=\frac{1}{p}+\frac{1}{q}-\frac{1}{p+q-pq}$

Where's the fun in solving a problem if getting an answer is the only objective?

Yours Aye

A Geometric Optimisation problem

Consider two circles $A(B)$ and $C(B)$ intersecting at $B$ and $D$, and a line passing through $D$. Let this line intersect the circle $A(B)$ at $F$ and the circle $C(B)$ at $G$.

The question of maximising the arithmetic mean of $DF$ and $DG$ is well known. It is also known that Euclidean construction of the line that maximises the Harmonic mean of $DF$ and $DG$ is impossible. We already saw the construction of the line that maximises the Geometric mean of $DF$ and $DG$ in this blog.

In this post we are interested in constructing the line such that the Quadratic mean of $DF$ and $DG$ is maximised. In fact, we solve a slightly more general problem. For a given $\theta$, we construct the line such that $\sqrt{DF^2+DG^2-2 \cdot DF \cdot DG \cdot \cos \theta}$ is maximised.

We first use some auxiliary points and lines that helps us in the final solution. For a given $\theta$, construct points $D'$ and $D'_1$ such that $\angle DAD'=\angle DCD'_1=\theta$. Construct the circle that passes through $D$, $B$ and $D'_1$.

For any point $E$ on this circle, let the line $ED'$ intersect circle $A(B)$ at $F$. Let $G$ be likewise on circle $C(B)$. Construct line $FG$. We first show that the line $FG$ passes through $D$.

We know that $\angle FD'B+\angle FDB=\pi$ because they are opposite angles in a cyclic quadrilateral. But $\angle FD'B=\angle ED'B=\pi-\angle ED'_1B$ because $\angle ED'B$ and $\angle ED'_1B$ are opposite angles of a cyclic quadrilateral. Now, $\angle FD'B=\angle BD'_1G=\angle BDG$.

Therefore, $\angle BDG + \angle FDB=\pi$ which shows that $F$, $D$ and $G$ are collinear.

Now, by construction, $\theta / 2=\angle D'BD=\angle D'_1BD$ and therefore, $\angle D'BD'_1=\theta$. This shows that $\angle D'ED'_1=\pi-\theta$.

We know that $\angle DGD'_1=\angle DBD'_1=\theta/2$ because they are angles in a same segment of a circle. Because $\angle DAD'=\theta$, we can see that $\angle DFD'=\pi-\theta/2$, and therefore $\angle DFE=\theta/2$.

Thus we have shown that $\triangle FEG$ is isosceles with $FE=EG$. Therefore, we can quickly see that $FG=2EF\cos(\theta/2)$.

Using Stewart's theorem on this triangle for the cevian DE, we have,

$EF^2=DE^2+DF\cdot DG$

Multiplying by $(2\cos(\theta/2))^2$ on both sides and simplifying, we get

$(2\cos(\theta/2))^2DE^2=FG^2-2DF\cdot DG \cdot 2\cos^2(\theta/2)$

Using the fact that $FG=DF+DG$ and $\cos\theta=2\cos^2(\theta/2)-1$,

$(2\cos(\theta/2))^2DE^2=DF^2+DG^2-2\cdot DF \cdot DG \cdot \cos\theta$

As $\theta$ is given, the above expression shows that maximising the RHS amounts to maximising $DE$. But this can be easily achieved. We just draw the line joining $D$ and center of circle $D'BD'_1$. The point at which this line intersects the circle gives us $E$ from which we can trivially construct the line $FG$ using what we've described before.

Hope you enjoyed this. See ya soon.

Until then, Yours Aye