Saturday, January 20, 2018

Measuring Genetic Fitness

Yes, it is a thing, and is quantitative. But it can be hard to measure, especially when genetic epistasis enters the picture.

Biological fitness is often thought about in nebulous terms, as unquantifiable, even taboo. But in genetics, it has a specific and quantifiable meaning: the likelihood that a gene, organism, or other unit of study makes it into the next generation. For example, an antibiotic resistance gene may have very high fitness in a bacterial population under antibiotic pressure, and also confer high fitness on its host organism as a whole. In those terms, fitness is measurable and quantifiable, especially for tiny laboratory organisms which reproduce quickly, have relatively few genes, and are easy to measure and manipulate. But it can be very difficult to evaluate elsewhere, and not just because organisms like us reproduce slowly and have lots of genes.

Fitness as a concept typically applies to deviations such as mutations, or sub-populations or sub-species, against the background of a normal (i.e. wild-type) gene or whole population. It is a measure of difference from the norm. The wild-type organism would have a fitness of 1, and a mutant with, say, a fatal developmental defect, has a fitness of 0. For a population, the collection of all fitness values is called the fitness landscape, the average being 1, but the deviations being very significant, as we know from our own populations. A population with some complex landscape of more and less fit individuals evolves through time, with more fit genotypes gaining population share, and less fit genotypes (by definition) declining. One of the major functions of sexual reproduction is to continually re-arrange the fitness landscape to a more diverse and dispersed state by mixing and recombining genotypes, so that much of the deleterious mutational load of the population ends up segregated to low-fitness craters in the landscape, and is disposed of efficiently. By the same token, beneficial alleles can be more rapidly combined to reach fitness peaks.

A notional fitness landscape, from low (blue) to high (red). Populations with sex/recombination (a, right) have a much faster path to fitness peaks within this landscape. Note also that this is a very additive landscape, without epistatic or other complex genetic effects.
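The idea that more fit genotypes gain population share can be made concrete with a small sketch. It assumes the simplest textbook setting, which is not taken from the paper discussed below: an asexual, haploid population with no mutation or drift, where each genotype's frequency is multiplied by its relative fitness and then renormalized by the mean fitness. The function name, genotype labels, and numbers are made up for illustration.

```python
def select_one_generation(freqs, fitness):
    """Apply one round of selection to genotype frequencies.

    freqs   -- dict mapping genotype label to its frequency (sums to 1)
    fitness -- dict mapping genotype label to its relative fitness (>= 0)
    """
    # Mean fitness of the population, weighted by genotype frequency.
    mean_w = sum(freqs[g] * fitness[g] for g in freqs)
    if mean_w <= 0:
        raise ValueError("mean fitness must be positive")
    # Each genotype's share grows or shrinks in proportion to its fitness.
    return {g: freqs[g] * fitness[g] / mean_w for g in freqs}


if __name__ == "__main__":
    # Hypothetical numbers: a rare mutant with a 50% fitness advantage.
    freqs = {"wild_type": 0.9, "mutant": 0.1}
    fitness = {"wild_type": 1.0, "mutant": 1.5}
    for generation in range(1, 11):
        freqs = select_one_generation(freqs, fitness)
        print(generation, round(freqs["mutant"], 3))
```

Under these assumptions a genotype with fitness 0 disappears after a single generation, and any genotype fitter than the population average increases in share every round.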

When considering the fitness value of individual genes and their combinations, the starting assumption is that they will be additive. That is, the quantitative fitness value for, say, an altered hemoglobin that allows survival at higher altitude will be independent of, and additive with, the fitness value for being able to consume yak milk in adulthood, via changes in lactase enzyme expression. But the fitness of one gene often affects that of others, an interaction that geneticists call epistasis. If I have a genetic propensity for alcoholism, and also a genetic propensity to liver failure, the combination, while conceptually and mechanistically independent, may end up far more lethal than either one would be alone. One extreme form of epistasis is synthetic lethality, where two mutations that are individually tolerated turn lethal when they occur together. Since everything in organismal biology is to some degree connected with everything else, epistasis is very common. As a result, the fitness values, or similarly the disease prognostic effects, that we might want to determine for various allele variants (i.e. mutations), as increasingly seen in the clinical diagnostic setting for humans, can be very tricky to estimate.

A recent paper advanced the measurement of fitness in complex genetic situations and from practical forms of data. It is naturally difficult to measure fitness across whole genomes and populations. Typically, an observation is made of relative fitness for an isolated gene/allele, where one mutation outcompetes another, in a shared genetic background (using model organisms). This ranking or rank-order style of data is far more accessible and thus more common than are detailed calculations of absolute fitness. This is true for single genes, and even more true for complex combinations that may exhibit epistasis. Yet the rank-ordering of fitness for alleles, genotypes, organisms, or other entities remains a very rich source of information, which the authors exploit to find novel epistatic effects.
Epistasis = (w00 + w11) - (w01 + w10), where fitness is "w", and the genotypes are 00 (wild type), 11 (double mutant), and 01 and 10 (each single mutant). Epistasis, i.e. the deviation from additivity, can be positive or negative, and is quantitative to the extent that the fitness values it is based on are quantitative. The figure below gives a very simple example that indicates epistasis between two loci on the basis of very minimal fitness data.
Depiction of two known rank-order fitness relationships, and a conclusion of epistasis from such minimal / partial data. Arrows point towards higher fitness, and show ranks, in a rock-paper-scissors kind of sequence. Highest fitness is at the top, and least is at the bottom. We can already see that something is not additive, as one of the single mutants is more fit than the double mutant, while the other single mutant is more fit than the wild-type. If all effects were additive, the advantageous 01 mutation would not make the 11 double mutant less fit than the 10 single mutant.
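As a rough sketch of the two-locus case, the snippet below computes epistasis from the formula given above when all four fitness values are known, and also tries to infer only its sign from a handful of "less fit than" comparisons. The rank-based check is a simple sufficient condition written for illustration, not the authors' general method: if each of 00 and 11 is known to be less fit than a different one of 01 and 10, the sum w00 + w11 must be smaller than w01 + w10, so epistasis is negative (and the reverse for positive). All function names and numbers are hypothetical.

```python
from itertools import permutations


def epistasis(w00, w01, w10, w11):
    """Deviation from additivity for two loci: (w00 + w11) - (w01 + w10)."""
    return (w00 + w11) - (w01 + w10)


def transitive_closure(pairs):
    """Extend (lower, higher) pairs with everything implied by transitivity."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def epistasis_sign_from_ranks(less_fit_than):
    """Infer the sign of two-locus epistasis from partial rank-order data.

    less_fit_than -- iterable of (lower, higher) genotype pairs, using the
                     labels "00", "01", "10", "11".
    Returns "negative", "positive", or "undetermined".
    """
    known = transitive_closure(less_fit_than)
    plus_terms = ("00", "11")   # genotypes added in the formula
    minus_terms = ("01", "10")  # genotypes subtracted in the formula
    for matched in permutations(minus_terms):
        pairs = list(zip(plus_terms, matched))
        if all((p, m) in known for p, m in pairs):
            return "negative"
        if all((m, p) in known for p, m in pairs):
            return "positive"
    return "undetermined"


if __name__ == "__main__":
    # The situation in the figure: 01 beats 00, and 10 beats 11.
    print(epistasis_sign_from_ranks([("00", "01"), ("11", "10")]))
    # Made-up fitness values consistent with those two comparisons.
    print(epistasis(w00=1.0, w01=1.5, w10=1.25, w11=1.0))
```

With only one of the two comparisons, the function returns "undetermined", which matches the intuition that a single ranked pair says nothing about additivity on its own.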

Taking this to more loci and complex relations, the authors venture into Dyck numbers, graph isomorphism, Walsh coefficients, and other obscure methods to organize these rank analyses, and come out with computationally easy ways to analyze all this for any number of loci, to help geneticists make sense of what patchy / partial data they may have at hand.

Antibiotic resistance shows clear epistasis. Among four mutations relevant to antibiotic resistance in the TEM class of bacterial antibiotic resistance genes (a beta-lactamase common in hospitals, found in E. coli, H. influenzae, and N. gonorrhoeae), having all four mutations (TEM-50) is most resistant and thus most fit, but surprisingly, no single mutation (TEM-84, -19, -17, -33) confers better resistance than having none at all. But honestly, one didn't need a computer to figure this out.
"Although we have applied our method here only to fitness, any other continuous phenotype of interest can be analyzed in exactly the same manner. The fitness landscape w is then replaced by a more general genotype-to-phenotype map. For example, rather than using it as a fitness proxy, one may be concerned about the drug resistance phenotype itself and its genetic architecture."
We will soon have our complete genomes at hand as a normal part of our medical record. This will uncover an ever-growing list of deviations and oddities, whose significance will be the subject of many decades of study. The work above is just one of many kinds of methods that will be attempting to make sense of our genetic variations, with the goal of peering into our medical and inter-generational futures.
