Thursday, 6 December 2012

Mathematics of animal breeding

As I mentioned in the post on Basics on animal breeding, this area of animal science is full of equations and mathematical models. This text will briefly list and explain some of these equations, but it will not cover them thoroughly. For a deeper understanding of breeding math, try to get your hands on Richard Bourdon's book Understanding Animal Breeding.

Hardy-Weinberg law

The Hardy-Weinberg law (H-W law for short) was formulated independently by two scientists at about the same time, hence the name. It predicts the genotype frequencies in a generation based on the allele frequencies. The formula is simply

$p^2 : 2pq : q^2$

where p is the frequency of the dominant allele and q is the frequency of the recessive allele (in other words, the ratio is $f_{AA} : f_{Aa} : f_{aa}$). H-W equilibrium is a state in which both the allele frequencies and the genotype frequencies stay unchanged from one generation to the next. This can be achieved only in an "ideal population", where there's no random genetic drift, no mutation, no selection, no migration, and any male can mate with any female in the population (random mating).

Example: In a population of 1000 cows, 175 are white, 600 are spotted and the rest (225) are black. Let's pretend that the genotype for white is bb, for spotted Bb and for black BB. The genotypes hold the following alleles:
white: 175 * 2 = 350 b alleles
spotted: 600 b alleles and 600 B alleles
black: 225 * 2 = 450 B alleles
Total: 950 b alleles and 1050 B alleles. Relative frequencies are 950/2000 = 0.475 for b and 1050/2000 = 0.525 for B. Relative genotype frequencies are bb = 0.175, Bb = 0.6 and BB = 0.225.

But does this imaginary population follow the H-W equilibrium? To check that, we calculate what the genotype frequencies should be according to the law. Now p = 0.525 and q = 0.475, so we should have $p^2 : 2pq : q^2 = 0.525^2 : 2 \cdot 0.525 \cdot 0.475 : 0.475^2 \approx 0.276 : 0.499 : 0.226$ (in the order BB : Bb : bb). Since these are NOT the same as the observed genotype frequencies in the same order (0.225 : 0.6 : 0.175), the population is NOT in H-W equilibrium.
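If you want to check the arithmetic yourself, here is a minimal Python sketch of the same calculation. The function name and argument names are made up for this illustration; the only inputs are the genotype counts from the cow example.

```python
def hardy_weinberg_check(n_BB, n_Bb, n_bb):
    """Return allele frequencies plus observed and expected genotype frequencies."""
    n = n_BB + n_Bb + n_bb
    # Every animal carries two alleles, so there are 2n alleles in total.
    p = (2 * n_BB + n_Bb) / (2 * n)  # frequency of B
    q = (2 * n_bb + n_Bb) / (2 * n)  # frequency of b
    observed = (n_BB / n, n_Bb / n, n_bb / n)
    expected = (p * p, 2 * p * q, q * q)  # order: BB, Bb, bb
    return p, q, observed, expected


p, q, observed, expected = hardy_weinberg_check(n_BB=225, n_Bb=600, n_bb=175)
print(f"p = {p:.3f}, q = {q:.3f}")
print("observed BB : Bb : bb =", [round(x, 3) for x in observed])
print("expected BB : Bb : bb =", [round(x, 3) for x in expected])
```

Comparing the two printed lines by eye is enough here. Deciding whether a smaller difference is real or just chance would need a statistical test, which this sketch does not attempt.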

Heritability ($h^2$)

Heritability shows how much of the difference between animals is caused by genes. It has two forms: a narrow-sense definition ($h^2$) and a broad-sense definition ($H^2$). Of these two, animal breeding uses $h^2$, because it excludes dominance and epistasis effects, which are not passed on to offspring in a predictable way. The definition of heritability is

$h^2 = \sigma^2_A / \sigma^2_P$

where $\sigma^2_A$ denotes the additive genetic variance and $\sigma^2_P$ the phenotypic variance. So heritability in its narrow sense shows how much of the difference between animals is caused by differences in their breeding values. It can also be thought of as the potential for genetic progress through selection. The third interpretation of $h^2$ is that it is the regression coefficient of the breeding value on the phenotype. So the heritability value is needed when calculating the EBV.
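To make the definition and the regression interpretation a bit more concrete, here is a small Python sketch. The variance components and the 800 kg deviation are invented numbers for illustration only, and the function name is hypothetical.

```python
def narrow_sense_heritability(var_additive, var_phenotypic):
    """h^2 = additive genetic variance / phenotypic variance."""
    if var_phenotypic <= 0:
        raise ValueError("phenotypic variance must be positive")
    if not 0 <= var_additive <= var_phenotypic:
        raise ValueError("additive variance must lie between 0 and the phenotypic variance")
    return var_additive / var_phenotypic


# Hypothetical variance components for milk yield (kg^2), invented for illustration.
h2 = narrow_sense_heritability(var_additive=250_000, var_phenotypic=1_000_000)
print(h2)  # 0.25

# Regression interpretation: an animal whose result is 800 kg above the
# population mean is expected to have a breeding value h2 * 800 kg above the mean.
print(h2 * 800)  # 200.0
```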

Repeatability (r)

When calculating the EBV, we should ideally have several measurements of the same trait from one animal, for example the milk yields for several lactation periods, or the number of piglets in all of the litters of one sow. Repeatability is the correlation between these repeated results, showing how consistent the results for that trait are. Using repeatability, several results can be combined into one when calculating the EBV. The formula for repeatability is

$r = (\sigma^2_G + \sigma^2_{Ep}) / \sigma^2_P$

where the sum of genetic variance and permanent environmental variance is divided by the variance in phenotype.
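As a quick illustration, the sketch below computes repeatability from variance components. It assumes, and this is an assumption of the sketch rather than something stated above, that the phenotypic variance is the sum of genetic, permanent environmental and temporary environmental variance. The numbers and the function name are invented.

```python
def repeatability(var_genetic, var_perm_env, var_temp_env):
    """r = (genetic + permanent environmental variance) / phenotypic variance."""
    # Assumption: phenotypic variance is the sum of the three components.
    var_phenotypic = var_genetic + var_perm_env + var_temp_env
    if var_phenotypic <= 0:
        raise ValueError("phenotypic variance must be positive")
    return (var_genetic + var_perm_env) / var_phenotypic


# Hypothetical variance components, invented for illustration.
r = repeatability(var_genetic=30, var_perm_env=20, var_temp_env=50)
print(r)  # 0.5
```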

Kinship factor $f_{x,y}$

As its name suggests, the kinship factor has to do with the relationship between two animals. It is the probability that a random allele of one animal is identical by descent (IBD) with a random allele at the same locus of another animal, that is, that both are copies of the same allele of a common ancestor. It's important to separate IBD alleles from alleles that are only identical in state (IIS), where two animals have chemically identical alleles that were not inherited from a common ancestor. IBD is always IIS, but not vice versa!

The kinship factor between a parent and its offspring is 1/4 (as long as the animals are not inbred). The kinship factor is also 1/4 between full sibs. The calculation formula is explained below; the example calculates the kinship between a parent and its offspring.
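One way to carry out such a calculation is the usual textbook recursion: the kinship of an animal with itself is 0.5 * (1 + kinship of its parents), and the kinship of two different animals is the average of the kinships between the parents of the younger animal and the other animal. The Python sketch below applies that recursion to a tiny made-up pedigree. The pedigree, the animal names and the function name are all hypothetical, and unknown parents are assumed to be unrelated and not inbred.

```python
from functools import lru_cache

# Hypothetical pedigree: animal -> (sire, dam); None marks an unknown parent.
# Parents must be listed before their offspring.
PEDIGREE = {
    "sire": (None, None),
    "dam": (None, None),
    "calf1": ("sire", "dam"),
    "calf2": ("sire", "dam"),
}
# Position in the pedigree tells us which of two animals is the younger one.
ORDER = {name: i for i, name in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def kinship(x, y):
    """Probability that random alleles drawn from x and y are IBD."""
    if x is None or y is None:
        return 0.0  # assumption: unknown parents are unrelated to everyone
    if x == y:
        sire, dam = PEDIGREE[x]
        return 0.5 * (1.0 + kinship(sire, dam))
    # Always trace back through the parents of the younger animal.
    if ORDER[x] < ORDER[y]:
        x, y = y, x
    sire, dam = PEDIGREE[x]
    return 0.5 * (kinship(sire, y) + kinship(dam, y))


print(kinship("sire", "calf1"))   # parent and offspring: 0.25
print(kinship("calf1", "calf2"))  # full sibs: 0.25
```

For the parent-offspring pair the recursion unfolds as 0.5 * (kinship of the sire with itself + kinship of the dam with the sire) = 0.5 * (0.5 + 0) = 0.25.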

Additive genetic relationship $a_{x,y}$

Additive genetic relationship is simply

$a_{x,y} = 2 f_{x,y}$        or formally        $a_{x,y} = \mathrm{Cov}(A_x, A_y) / \sigma^2_A$

and it shows the similarity between the breeding values (BV) of two animals (x and y). It depends on the probability of sharing IBD alleles, and considers only one allele. The probability that a given allele of an offspring was inherited from a particular parent is always 50 %. The probability that full sibs have inherited the same allele (their kinship factor) is 1/4, so their additive genetic relationship is 2 * 1/4 = 1/2 or 50 %.

If the animals considered are inbred, the formula doesn't apply.

Inbreeding coefficient $F_x$

Inbreeding means mating animals that are related to one another, for example mating full siblings or a parent and its offspring. The inbreeding coefficient indicates the probability (often given as a percentage) that both of an animal's alleles at a locus descend from the same allele of a common ancestor (both alleles are IBD). The inbreeding coefficient of an offspring equals the kinship factor of its parents. Since $a_{x,y} = 2 f_{x,y}$, $f_{x,y}$ must be $0.5 \cdot a_{x,y}$. Thus, the inbreeding coefficient of an offspring is also 0.5 times the additive genetic relationship between its parents.

Since inbreeding increases the risk of recessive traits and illnesses and narrows the gene pool, it is not recommended to make matings whose offspring would have an inbreeding coefficient over 10 %.
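The rule "inbreeding coefficient of the offspring = 0.5 * additive genetic relationship of the parents" is easy to try out in a few lines of Python. The relationship values below are the standard ones for non-inbred animals with no other common ancestors, which is an assumption of this sketch; the function and variable names are made up for illustration.

```python
def offspring_inbreeding(additive_relationship_of_parents):
    """F of the offspring = kinship of its parents = 0.5 * a of its parents."""
    return 0.5 * additive_relationship_of_parents


LIMIT = 0.10  # the 10 % guideline mentioned above

# Additive relationships between the two parents (assumed non-inbred).
matings = {"full sibs": 0.5, "half sibs": 0.25, "first cousins": 0.125}

for name, a in matings.items():
    F = offspring_inbreeding(a)
    verdict = "over" if F > LIMIT else "within"
    print(f"{name}: F = {F:.4f} ({verdict} the 10 % guideline)")
```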

Estimated breeding value

So let's get to the point already! Right, let's do that, and see how to calculate that magical EBV and a breeding value index.

EBV is an estimated breeding value, and it concerns only one trait. It is calculated differently depending on what information we have available: one result from the animal, several results from the animal, or results from the animal and its relatives. The same goes for the b-factor and for the accuracy ($r_{TI}$). The best EBV is calculated using regression:

$(\hat{A}_i - \bar{A}) = b\,(P_i - \bar{P})$

where $(\hat{A}_i - \bar{A})$ is the index value, $b$ is the regression factor, $P_i$ is the mean of the animal's results and $\bar{P}$ is the population mean of those results. If we have only one result from the animal itself, $b$ is simply $h^2$.
A common form of the formula, when using results of only one animal, is just $I = b\,(P_i - \bar{P})$, where $I$ denotes the breeding value index. In this case, $b$ is calculated as
$b = (n \cdot h^2) / (1 + (n - 1) \cdot r)$
where $n$ is the number of results and $r$ is the repeatability factor. The accuracy is then the square root of $b$; with a single result ($n = 1$) this is simply $h$, the square root of $h^2$.
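Here is a short Python sketch of the index for an animal's own repeated results, using the b formula above. The heritability, repeatability and yield figures are invented for illustration, and the function names are hypothetical.

```python
def regression_factor_own_records(n, h2, r):
    """b = (n * h2) / (1 + (n - 1) * r) for n repeated results of one animal."""
    if n < 1:
        raise ValueError("at least one result is needed")
    return (n * h2) / (1 + (n - 1) * r)


def index_own_records(animal_mean, population_mean, n, h2, r):
    """I = b * (animal's mean result - population mean)."""
    b = regression_factor_own_records(n, h2, r)
    return b * (animal_mean - population_mean)


# Hypothetical cow: 3 lactations averaging 9000 kg, population mean 8000 kg,
# h2 = 0.25 and r = 0.5 (all numbers invented for illustration).
print(regression_factor_own_records(n=3, h2=0.25, r=0.5))  # 0.375
print(index_own_records(9000, 8000, n=3, h2=0.25, r=0.5))  # 375.0

# With a single result, b falls back to the heritability itself.
print(regression_factor_own_records(n=1, h2=0.25, r=0.5))  # 0.25
```

Note how three results raise b from 0.25 to 0.375: more results from the same animal make its mean a more reliable guide, but the gain is limited by the repeatability.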

Estimating the breeding value based on the animal's offspring looks like this:

$I = 2 \cdot b\,(P_i - \bar{P})$ where $b = (p \cdot h^2) / ((p - 1) \cdot h^2 + 4)$

Here p is the number of offspring and $P_i$ is the mean of their results. The formula is based on the assumption that we have one result from each offspring, and that all offspring are half-sibs. In this case the accuracy depends on $h^2$ and the number of offspring.
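The offspring-based index can be sketched in Python the same way. The sketch reads the denominator of b as ((p - 1) * h2 + 4), which is an assumption about the intended grouping of the formula. The numbers and function names are invented for illustration.

```python
def regression_factor_progeny(p, h2):
    """b = (p * h2) / ((p - 1) * h2 + 4) for p half-sib offspring, one result each."""
    if p < 1:
        raise ValueError("at least one offspring is needed")
    return (p * h2) / ((p - 1) * h2 + 4)


def index_from_progeny(offspring_mean, population_mean, p, h2):
    """I = 2 * b * (mean of the offspring results - population mean)."""
    b = regression_factor_progeny(p, h2)
    return 2 * b * (offspring_mean - population_mean)


# Hypothetical bull: 17 daughters averaging 8400 kg, population mean 8000 kg,
# h2 = 0.25 (all numbers invented for illustration).
print(regression_factor_progeny(p=17, h2=0.25))          # 0.53125
print(index_from_progeny(8400, 8000, p=17, h2=0.25))     # 425.0

# b grows with the number of offspring and approaches 1 for very large p.
for p in (1, 10, 100, 1000):
    print(p, round(regression_factor_progeny(p, 0.25), 3))
```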

