Hardy-Weinberg Model for Gene Frequencies*

An Application of the Square of a Binomial

 

In learning algebraic manipulations, one of the early techniques practiced is multiplication of binomials

(a+b)(c+d)

by successive application of the distributive property of multiplication over addition, known fondly as FOIL (First, Outside, Inside, Last), often used as a verb.  The special case of a square of a binomial

$(a+b)^2 = a^2 + 2ab + b^2$

is a pattern that is frequently stressed in early instruction, and gets special recognition later in completing the square. 

 

Here is an application in genetics.  We look at the frequency of a mutant allele and the related frequencies of the possible allele pairings: homozygotes with the normal allele only; homozygotes with the mutant allele only, which display the trait; and heterozygotes (mixed allele pairs), which are carriers of the trait if it is recessive (as with the sickle cell gene) and display the trait if it is dominant (as with blood antibody types).  The square of the binomial appears naturally in this description.

 

The normal allele is denoted A; the fraction of alleles that are normal in a population is denoted p.

The mutant allele is denoted a; the fraction of alleles that are mutant in a population is denoted q.

The entirety of the population of alleles is the sum of both fractions: p+q=1.

 

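Each individual’s genotype is composed of a maternal allele and a paternal allele.  If the two alleles are paired independently of one another (random mating), the fraction for each pairing is the product of the two allele fractions.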

                                                     

The population fraction of individuals that are homozygous AA is $p^2$;

the population fraction of individuals that are heterozygous Aa is $2pq$ (the mutant allele can come from either parent, $pq + qp$);

the population fraction of individuals that are homozygous aa is $q^2$.

 

The sum of all these fractions, the entirety of the population, is

$1 = p^2 + 2pq + q^2 = (p+q)^2 = 1^2 = 1$
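As a quick numerical check of this identity, here is a minimal Python sketch.  The function name genotype_fractions and the value q = 0.1 are illustrative assumptions, not part of the original text.

```python
def genotype_fractions(q):
    """Return the model fractions (AA, Aa, aa) for a mutant-allele frequency q."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie between 0 and 1")
    p = 1.0 - q                      # normal-allele frequency, since p + q = 1
    return p * p, 2.0 * p * q, q * q


# q = 0.1 is an arbitrary illustrative value, not data from the text.
frac_AA, frac_Aa, frac_aa = genotype_fractions(0.1)
total = frac_AA + frac_Aa + frac_aa  # should equal (p + q)^2 = 1 up to rounding
print(frac_AA, frac_Aa, frac_aa, total)
```

Whatever value of q is chosen between 0 and 1, the three fractions should add up to 1, apart from floating-point rounding.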

Let us see this applied.

 

A screening of school age children in Ghana for sickle cell anemia produced the results in the following table.

Genotype      AA       Aa       aa
Frequency    .834     .161     .005

What is the frequency, q, of the sickle cell allele in this population?

 

Well, these are REAL numbers, so the “answers” don’t come out looking perfect.  Seeing from the above that $q^2=.005$, it would be natural to conclude that $q=\sqrt{.005}\approx .071$.  This would give a value for $p=1-q=.929$.  Unfortunately, this is inconsistent with our $p^2$ value of .834, because $(.929)^2=.863$.  And the value of $2pq$ from our calculated values would be .132, not the .161 value given.  Part of what has happened here is the dearth of significant digits in our starting figure of .005.
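The square-root estimate can be reproduced with a short Python sketch.  The dictionary names observed and predicted are illustrative assumptions.  The code carries full precision rather than rounding p and q to three places first, so the third decimal can differ slightly from the hand-rounded figures.

```python
import math

# Genotype frequencies from the Ghana school-age screening table.
observed = {"AA": 0.834, "Aa": 0.161, "aa": 0.005}

q = math.sqrt(observed["aa"])    # Frequency(aa) = q^2, so q is its square root
p = 1.0 - q                      # p + q = 1

# Genotype frequencies the model predicts from this p and q.
predicted = {"AA": p * p, "Aa": 2.0 * p * q, "aa": q * q}

for genotype in observed:
    print(genotype, observed[genotype], round(predicted[genotype], 3))
```

The aa row matches by construction; the gap in the AA and Aa rows is the inconsistency described above.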

Let us try using the other two values; the frequencies for genotypes AA and Aa have three significant digits.  Using $p^2=.834$, we could calculate $p=\sqrt{.834}\approx .913$.  This would yield $q=1-p=.087$.  But these p and q values are not consistent with our table values either:  $2pq=.158$ and $q^2=.008$.  Alternatively, we can note that

$q = q(1) = q(p+q) = pq + q^2 = \tfrac{1}{2}\,\mathrm{Frequency}(Aa) + \mathrm{Frequency}(aa)$

so we can compute the frequency, q, as

$q = \tfrac{1}{2}(.161) + .005 = .0805 + .005 \approx .086$.

The $q^2$ here would be .007, and we would have $p=1-q=1-.086=.914$; the associated $p^2=.835$.

If we instead use both of the three-significant-figure frequencies, note that

$p = p(1) = p(p+q) = p^2 + pq = \mathrm{Frequency}(AA) + \tfrac{1}{2}\,\mathrm{Frequency}(Aa)$.

Thus we can compute the frequency, p, of the normal allele in the population:

$p = .834 + \tfrac{1}{2}(.161) = .834 + .0805 \approx .915$.

The $p^2$ here comes to .837.  The frequency, q, is then $q=1-p=1-.915=.085$.  This gives $q^2=.007$.
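The two allele-counting formulas (each heterozygote contributes one allele of each kind) can be sketched in Python as follows.  The function name allele_frequencies is an illustrative assumption.

```python
def allele_frequencies(freq_AA, freq_Aa, freq_aa):
    """Estimate (p, q) by counting alleles in the three genotype groups."""
    p = freq_AA + 0.5 * freq_Aa   # p = Frequency(AA) + 1/2 Frequency(Aa)
    q = freq_aa + 0.5 * freq_Aa   # q = Frequency(aa) + 1/2 Frequency(Aa)
    return p, q


# Ghana school-age screening frequencies from the table.
p, q = allele_frequencies(0.834, 0.161, 0.005)
print(round(p, 4), round(q, 4))

# Genotype frequencies implied by these estimates.
print(round(p * p, 3), round(2.0 * p * q, 3), round(q * q, 3))
```

Before rounding, the estimates are p = .9145 and q = .0855, and they sum to 1 because the three genotype frequencies do.  The two hand calculations differ only in which of the two values is rounded and which is obtained by subtraction from 1.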

Let’s summarize all this in a table.

                                    AA       Aa       aa
Actual genotype frequencies →      .834     .161     .005

  p        q                       $p^2$    $2pq$    $q^2$
 .929     .071                     .863     .132     .005
 .913     .087                     .834     .158     .008
 .914     .086                     .835     .157     .007
 .915     .085                     .837     .156     .007

 

Well, you pick.  I’d go with p=.92 and q=.08.  (The text I swiped this from uses the last row.)

 

Now let’s compare this with a second sample.

 

A screening of young adults in Ghana for sickle cell anemia showed that 75.6% were of genotype AA, while the remaining 24.4% were of genotype Aa.  Treatment for sickle cell disease is expensive and of recent development, so the aa genotypes of this generation did not survive.  But the presence of allele a provides individuals with some protection against malaria.  The frequency of the a allele among these young adults is $q=\tfrac{1}{2}(.244)=.122$ (half the alleles of the heterozygous group).  This is well above the q values of .071 to .087 obtained from the preceding sample.
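The comparison can be checked with a few lines of Python.  The variable names are illustrative assumptions, and the school-age value q = .085 is taken from the last of the estimates computed earlier.

```python
# Young-adult sample: 75.6% AA, 24.4% Aa, and no aa individuals.
freq_AA, freq_Aa, freq_aa = 0.756, 0.244, 0.0

q_adults = freq_aa + 0.5 * freq_Aa   # allele counting: half of the Aa alleles are a
q_children = 0.085                   # one of the earlier school-age estimates

ratio = q_adults / q_children        # how much larger q is among the young adults
expected_aa = q_adults ** 2          # aa fraction the model would predict from q_adults

print(round(q_adults, 3), round(ratio, 2), round(expected_aa, 3))
```

With q = .122 the model would predict an aa fraction of about .015, whereas the sample contains none, in line with the remark that this generation of aa genotypes did not survive.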

 

Other genetic disorders that are the result of the presence of two recessive alleles can be analyzed in a similar fashion.  Estimate the proportion of carriers (heterozygotes) in each population:

·        Cystic fibrosis, which occurs in approximately one out of every 1600 births in the United States.

·        In the Jewish populations with origins in northern Europe (Ashkenazim), Tay-Sachs disease has an incidence rate of 1 in 6000.

·        Albinism among the Indians on the San Blas Islands of Panama occurs in approximately one out of every 30 births.
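One way to set up these estimates is sketched below in Python.  It assumes the incidence at birth equals the model’s homozygous recessive fraction $q^2$, so that $q=\sqrt{\text{incidence}}$ and the carrier fraction is $2pq$.  The function name carrier_fraction is an illustrative assumption.

```python
import math


def carrier_fraction(incidence):
    """Estimate the heterozygote (carrier) fraction 2pq from the aa incidence."""
    q = math.sqrt(incidence)   # incidence is taken to be q^2
    p = 1.0 - q                # p + q = 1
    return 2.0 * p * q


# Incidence figures quoted in the list above.
disorders = [
    ("cystic fibrosis (United States)", 1 / 1600),
    ("Tay-Sachs (Ashkenazim)", 1 / 6000),
    ("albinism (San Blas Islands)", 1 / 30),
]

for name, incidence in disorders:
    print(name, round(carrier_fraction(incidence), 4))
```

For cystic fibrosis, for instance, q = 1/40 = .025, so the carrier fraction is 2(.975)(.025) = .04875, or roughly one person in twenty.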

 

In the ABO blood type classifications, let p, q, and r denote the frequencies of the A, B, and O alleles respectively, so that $p+q+r=1$.  An OO genotype will present with blood type O; an AA or AO genotype will present with blood type A; a BB or BO genotype will present with blood type B; and an AB genotype will present with blood type AB.

Use a tree diagram to show that the fractions in the population that present with various blood types are

Blood Type     O          A                B                AB
Frequency      $r^2$      $p^2 + 2pr$      $q^2 + 2qr$      $2pq$

and admire how the sum of all these frequencies is the square of the trinomial (p+q+r)!
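The tree diagram can be mimicked in Python by enumerating every maternal/paternal allele pair, multiplying the two allele frequencies along each branch, and adding the product to the blood type that the pair presents.  The allele frequencies used here are arbitrary illustrative values, and the names allele_freq and blood_type are assumptions.

```python
from itertools import product

# Arbitrary illustrative allele frequencies with p + q + r = 1 (not data from the text).
allele_freq = {"A": 0.3, "B": 0.1, "O": 0.6}


def blood_type(maternal, paternal):
    """Blood type presented by a genotype, following the rules stated above."""
    pair = {maternal, paternal}
    if pair == {"O"}:
        return "O"
    if pair == {"A", "B"}:
        return "AB"
    return "A" if "A" in pair else "B"


totals = {"O": 0.0, "A": 0.0, "B": 0.0, "AB": 0.0}
for maternal, paternal in product(allele_freq, repeat=2):   # the nine branches of the tree
    totals[blood_type(maternal, paternal)] += allele_freq[maternal] * allele_freq[paternal]

print(totals, sum(totals.values()))
```

The nine branches correspond to the nine terms obtained by multiplying out (p+q+r)(p+q+r), which is why the four totals add up to the square of the trinomial.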

 

If a study of the blood types in France produced the information in the table below, determine the frequencies of the alleles A, B, and O.  What fraction of type A individuals are AO heterozygotes?

 

Blood Type     O        A        B        AB
Frequency     .441     .435     .090     .034
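One possible approach, sketched here in Python, uses the square-root idea again.  The text does not prescribe a method, so this is an assumption.  From the frequency formulas, type O gives $r^2$, types A and O together give $p^2+2pr+r^2=(p+r)^2$, and types B and O together give $(q+r)^2$.  The variable names are illustrative.

```python
import math

# Blood type frequencies from the France table.
observed = {"O": 0.441, "A": 0.435, "B": 0.090, "AB": 0.034}

r = math.sqrt(observed["O"])                       # O = r^2
p = math.sqrt(observed["A"] + observed["O"]) - r   # A + O = (p + r)^2
q = math.sqrt(observed["B"] + observed["O"]) - r   # B + O = (q + r)^2

# Fraction of type A individuals who are AO heterozygotes: 2pr / (p^2 + 2pr).
ao_share = 2.0 * p * r / (p * p + 2.0 * p * r)

print(round(p, 3), round(q, 3), round(r, 3), round(p + q + r, 3))
print(round(ao_share, 3))
```

As with the sickle cell data, these are real numbers, so the three estimates need not sum to exactly 1 and the implied $2pq$ need not match the observed AB frequency exactly.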

 

 

 

 

 



* Adapted from Chapter 61 of “Mathematics for the Biosciences” by Michael Cullen, PWS Publishers.