Branching processes: Mean and variance

Generational growth

Consider the following scenario (see the figure below):

  1. A single individual (cell, organism, etc.) produces j(=0,1,2,) descendants with probability pj, independently of other individuals.
  2. The probability of this reproduction, {pj}, is known.
  3.  That individual produces no further descendants after the first (if any) reproduction.
  4. These descendants each produce further descendants at the next subsequent time with the same probabilities. 
  5. This process carries on, creating successive generations.

Figure 1. An example of the branching process.

Let Xn be the random variable representing the population size (number of individuals) of generation n. In the above figure, we have X0=1, X1=4, X2=7, X3=12, X4=9. We shall assume X0=1 as the initial condition. Ideally, our goal would be to find how the population size grows through generations, that is, to find the probability Pr(Xn=k) for each n=1,2,. However, that's not easy, if possible. Therefore, we content ourselves by finding average properties such as the mean population size E(Xn) or the variance V(Xn), or some characteristic probabilities such as the probability of extinction.

Let's consider the probability generating function for pj:

G(s)=j=0pjsj.

Since we are assuming X0=1, G(s) is the generating function for the random variable X1 (this means Pr(X1=j)=pj). 

Let G2(s) be the generating function of the random variable X2 (the population size of the second generation). Since there are X1 individuals, assuming each produce Yk(k=1,2,,X1) offsprings, we have

X2=Y1+Y2++YX1.

In the case of the above Figure 1, we have a realized value of X2 as

x2=y1+y2+y3+y4=4+1+0+2=7.

In general,

Pr(Yk=j)=pj,  Pr(X1=m)=Pr(X1=m|X0=1)=pm.

(Recall that X0=1 is the assumed initial condition.)

Let hn=Pr(X2=n).

Using the partition theorem (or the law of total probability), we have

hn=Pr(X2=n)=r=0Pr(X2=n|X1=r)Pr(X1=r)=r=0prPr(X2=n|X1=r).

Multiply both sides by sn and sum over n,

G2(s)=n=0hnsn=r=0prn=0P(X2=n|X1=r)sn.

The last factor (the second sum) can be calculated as follows.

n=0P(X2=n|X1=r)sn=E(sX2|X1=r)   (conditional expectation)=E(sY1+Y2++Yr)=E(sY1)E(sY2)E(sYr)   (by independence)=G(s)G(s)G(s)   (identically distributed)=[G(s)]r

since E(sY1)=E(sY2)==G(s). Each Yk is (the random variable representing) the number of offsprings produced by an individual, and they are i.i.d. Therefore,

G2(s)=r=0pr[G(s)]r=G(G(s)).

Since G(1)=1, we have G(G(1))=G(1)=1

The same result holds for any successive generations. Let Gm(s) be the generating function of Xm (the population size of generation m), then

Gm(s)=Gm1(G(s))=G(G((G(s))))

where G(s) is nested m times. Note that Gm(1)=1 follows from $G(1) = 1$.

Mean and variance

Let μn and σn2 be the mean and variance of the population size of the n-th generation. Using the pgf Gn(s), we have

μn=E(Xn)=Gn(1),σn2=V(Xn)=Gn(1)+μnμn2.

It is understood that G1(s)=G(s). For n=2, we have

μ2=ddsG2(s)|s=1=dds[G(G(s))]s=1=G(1)G(1)=μ12.

In general,

μn=dGn(s)ds|s=1=dG(Gn1(s))ds|s=1=G(1)Gn1(1)=μ1μn1=μ12μn2==μ1n.

Exercise. Prove this by mathematical induction. □

To obtain the variance σn2, we need second derivatives.

Gn(s)=G(Gn1(s))Gn1(s),Gn(s)=G(Gn1(s))[Gn1(s)]2+G(Gn1(s))Gn1(s).

With s=1 and recalling Gn(1)=1 for any n, we have

Gn(1)=G(Gn1(1))[Gn1(1)]2+G(Gn1(1))Gn1(1)=G(1)[Gn1(1)]2+G(1)Gn1(1)=(σ12μ1+μ12)μn12+μ1Gn1(1).

Let un=Gn(1) and we have

unμ1un1=(σ12μ1+μ12)μ12n2

since μn1=μ1n1. This is a first-order linear inhomogeneous difference equation. We need to distinguish between two cases depending on μ1=1 or not.

Case 1: μ11

The corresponding homogeneous difference equation is

unμ1un1=0

which is just a geometric sequence. So the general solution is

un=Bμ1n

with B being a constant. For the particular solution, try un=Cμ12n with some constant C. This tentative solution satisfies the difference equation if

C=σ12μ1+μ12μ1(μ11).

Exercise. Verify this result. □

Combining the two solutions (the general and particular), we have

un=Bμ1n+σ12μ1+μ12μ1(μ11)μ12n.

The constant B can be determined from the case with n=1 where we have

u1=σ12μ1+μ12

so that

B=σ12μ1+μ12μ1(μ11).

After all, we have

un=σ12μ1+μ12μ1(μ11)μ1n(μ1n1).

I hope you still remember that un=Gn(1). Therefore,

σn2=un+μ1nμ12n=σ12μ1n1(μ1n1)μ11.

Case 2: μ1=1

In this case, the original difference equation becomes

unun1=σ12

which is just an arithmetic sequence, so the general solution is

un=nσ12+A,,

where A is a constant. From u1=σ12, we have A=0. Hence,

σn2=Gn(1)=nσ12.

Comments

Popular posts from this blog

Birth process

Informal introduction to formal logic