Saturday, April 2, 2011

Human evolution- drifting, not sweeping

Our genomes indicate relatively few genetic sweeps in the last 100,000 years.

Despite the lack of a medical revolution from our knowledge of the human genome, as yet, other fields are being revolutionized, including the study of human evolution. The ability to compare complete genomes from people of different lineages generates vast amounts of digital comparative data that can transform pre-history into history of a sort.

Humans are virtually clones, compared to most other organisms, which have far greater genetic variation. Our genetic variation is small due to the very small population sizes we seem to have had for much of our evolutionary history (perhaps even due to severe bottlenecks), leading to the genetic Eve/Adam reconstruction of single ancestors dating to maybe 250,000 years ago.

Since that time our genetic variation has been increasing apace, with most modern-day variation residing in Africa and much less outside, due to the later migration of small populations out of Africa, perhaps 50,000 to 90,000 years ago, combined with very limited interbreeding with resident Homo cousins like the Neanderthals.

This variation (caused by mutation, then reduced by selection and drift) is the mother lode- it tells us which populations are more related to others, it generates the traits that differ between populations and between individuals, and it carries traces of the selective process itself, which is the focus of the current paper.

Imagine a strongly beneficial mutation arising that, say, provides complete protection against some disease like the plague. This mutation resides on, say, chromosome 8. After an epidemic of plague, all members of the (now reduced) population have this new mutation- the others have been killed off. (In the parlance, this mutation has now been "fixed" because there are no competing alleles at its locus.) This means also that all members of the population share the same chromosome 8 in its entirety. This has been a selective sweep, carrying along whatever other variation resided on that copy of chromosome 8 that had the disease-protecting variant, good or bad. This kind of thing can bring along and expose recessive traits like the royal haemophilia, and it dramatically reduces genetic variation of this chromosome in the population, which can be tracked for tens of thousands of years.

On the other hand, imagine another trait with less dramatic effects, perhaps one which slightly reduces the severity of psoriasis in those who have that condition due to other factors in the genetic background. This trait may be beneficial in a few situations, but will never sweep itself or its neighbors to predominance or completion in the population. It will drift along, gradually rising in percentage within the population (if it has no untoward side effects, which would actually be rare, given the networked nature of the genome). The key point is that the slowness of this process allows recombination to have its say.
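
To put rough numbers on "fast" versus "slow", here is a minimal sketch of how long a variant takes to rise from 1% to 99% of a population under a strong versus a weak advantage. It assumes a deterministic haploid selection model with no drift and no recombination, and the selection coefficients (10% and 0.1%) are made up for illustration; none of this comes from the paper.

```python
def generations_to_reach(p0, target, s):
    """Generations for a variant with selective advantage s to rise from
    frequency p0 to target. Deterministic haploid model: no drift, no
    recombination, constant s (all simplifying assumptions)."""
    if s <= 0 or not (0 < p0 < target < 1):
        raise ValueError("need s > 0 and 0 < p0 < target < 1")
    p, generations = p0, 0
    while p < target:
        # Carriers leave (1 + s) offspring for every 1 left by non-carriers.
        p = p * (1 + s) / (1 + p * s)
        generations += 1
    return generations


# Hypothetical selection coefficients, chosen only to contrast the two cases.
strong = generations_to_reach(0.01, 0.99, s=0.10)   # plague-style advantage
weak = generations_to_reach(0.01, 0.99, s=0.001)    # psoriasis-style advantage

print(f"strong selection: {strong} generations")
print(f"weak selection:   {weak} generations")
```

Under these assumed numbers the strong variant takes roughly a hundred generations, while the weak one needs several thousand, which is plenty of time for recombination to separate it from its original neighbors.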

Recombination happens on every chromosome in every generation, swapping parts of its arms between those inherited from each parent. Roughly speaking, the location of such swaps is random and at least one happens on every chromosome per generation, so very gradually over time, each gene variant in a population becomes neighbors with variants of nearby genes other than those it was born with. After roughly 200,000 years for humans, this mixing should be essentially complete, and the variant tends to be no more associated with the neighboring variants it was born with than one would expect by chance.
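
The gradual mixing can be sketched with the standard textbook relation for the decay of association between two sites: after t generations, a fraction (1 - r)^t of the original association remains, where r is the chance of a crossover between the two sites per generation. The 25-year generation time and the example values of r below are my assumptions for illustration, not figures from the paper.

```python
def ld_remaining(r, generations):
    """Fraction of the initial association (linkage disequilibrium) that
    survives after the given number of generations, for a per-generation
    recombination fraction r between the two sites."""
    return (1.0 - r) ** generations


YEARS = 200_000
GENERATION_YEARS = 25                      # assumed human generation time
generations = YEARS // GENERATION_YEARS    # 8,000 generations

# Hypothetical recombination fractions, from very tightly to loosely linked sites.
for r in (1e-5, 1e-4, 1e-3):
    left = ld_remaining(r, generations)
    print(f"r = {r:g}: {left:.4f} of the original association remains")
```

With these assumed values, loosely linked sites keep almost none of their original association, while very tightly linked sites still keep a good share of it, so "essentially complete" is best read as applying beyond some short distance along the chromosome.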

(Figure: crossover recombination, which happens on each chromosome at least once per generation, mixing up the genetic variants on chromosomes throughout the population.)

That process is what this paper was interested in- figuring out whether there were any islands of unusually low variation that bespeak selective sweeps by highly beneficial gene variants over the last few hundred thousand years in humans. The authors hail from something called the "1000 Genomes Project", though they only have 179 genomes to their name in this paper, claiming that is enough to start sifting through the available variation. They consider three sub-populations: Yoruba of Africa (YRI), European (CEU), and Asian (CHB+JPT). The goal was to find whether, over the time since these populations split (roughly 100,000 years for YRI vs the other two, and 23,000 years for CEU vs CHB+JPT), any gene variants went from low percentage to become fixed (i.e., 100%) in only one lineage.

Unfortunately, most of the more statistical data presented in this paper is stunningly hard to interpret and present. I also doubt that it means as much as they make it out to mean. But I will take a stab with one emblematic graph. Skip the next paragraph if you are in a hurry!


For this graph, the authors have isolated all exons of the human genome and conceptually centered them at the X-axis midline. Then across the local region (X-axis in centimorgans, a unit of genetic distance), they graph on the Y-axis the diversity of the populations they have sampled. This tends to be lower in highly conserved areas like exons and higher in outlying areas. This diversity is normalized to (divided by) the difference between the canonical human genome and the sequence of rhesus monkey, which should in principle cancel out the variations due to conservation of protein-coding genes and other typically conserved elements. Note that the diversity of the African population samples (green) is substantially higher than that of the European (orange) or Asian (purple) population samples. The authors argue that the central troughs are signs of specific directional selection that has affected the human lineage differentially from the normal maintenance or purifying selection that would have been in common between the human and rhesus genomes. The main result is that the troughs they see are very narrow- signs that this directional selection dragged along very little of the surrounding genomes, where diversity remained high. Which is to say- genetic sweeps were very uncommon, on average.

The main result, aside from the sorts of somewhat dubious graphs shown above, was that there are very few fixed differences between their sampled populations- only four fixed amino acids in the entire genome that differ between the African and European population samples. For comparison, there are about 40,000 human-specific fixed amino acid changes between chimpanzees and humans (split for 5 million years), of which 10 to 20% are estimated to be selectively significant. So, scaling by time alone, one would have expected to see roughly 400 to 800 fixed changes in the 50,000 to 100,000 years since the African/European split.
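
The expectation is simple proportional scaling, which can be written out as a quick check. The sketch assumes that fixed changes accumulate at a constant rate over time, which is my simplification; the input numbers are the ones quoted above.

```python
HUMAN_SPECIFIC_FIXED = 40_000   # fixed amino acid changes since the chimpanzee split
SPLIT_YEARS = 5_000_000         # time since the human/chimpanzee split


def expected_fixed(years, fraction=1.0):
    """Expected fixed changes in `years`, assuming a constant rate.
    `fraction` optionally keeps only the selectively significant share."""
    return HUMAN_SPECIFIC_FIXED * years / SPLIT_YEARS * fraction


for years in (50_000, 100_000):
    total = expected_fixed(years)
    low = expected_fixed(years, 0.10)    # 10% selectively significant
    high = expected_fixed(years, 0.20)   # 20% selectively significant
    print(f"{years:>7} years: {total:.0f} fixed changes, "
          f"of which {low:.0f} to {high:.0f} selectively significant")
```

The figure of 800 corresponds to the 100,000-year end of the range; either way, the expectation sits in the hundreds against the four fixed differences observed.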

Unfortunately, the authors focus on amino acid changes (apparently out of convenience), totally missing the more important and frequent loci of evolutionary change in regulatory regions. They are only seeing the tip of the iceberg, really, and don't have or offer a good idea how big the whole iceberg is. Additionally, a small amount of genetic flow between populations, as might have been transmitted through the bordering regions of North Africa and Arabia, could have severely reduced the fixation of variants that had otherwise become established in their respective geographic regions.

Nevertheless, this illustrates the overall tiny level of genetic difference among contemporary human populations. It also implies that whatever evolution and differentiation was going on in the protein-coding regions they focus on, it was almost entirely confined to slowly jiggering the frequencies of alleles present in a population rather than rapid revolutionary replacements of an old gene variant with a shiny new gene variant. Even with the limited genetic variation that humans possess, significant phenotypic variation is evidently possible with virtually no population-level 100% differences. This undoubtedly reflects complex traits influenced by many genes whose interrelations make change both slow and genetically hard to track.

"An important implication is that in the search for targets of human adaptation, a change in focus is warranted. To date, selection scans have relied almost entirely on the sweep model, either explicitly (by considering strict neutrality as the null hypothesis and a classic sweep as the alternative) or implicitly (by ranking regions by a statistic thought to be sensitive to classic sweeps and focusing the tails of the empirical distribution). It appears that few adaptations in humans took the form that these approaches are designed to detect, such that low-hanging fruits accessible by existing approaches may be largely depleted."

So, human evolution seems to have slowed down in recent times, at least with regard to sweeps by strongly beneficial variants. I would guess that this is due to our rapidly increasing population sizes over this time, which tend to preserve variation and forestall fixation, at least on a short-term basis. It may also be a testament to our frolicsome tendency to interbreed widely, preserving variation in the face of wars, famines, diseases, genocides, and calamities generally.

  • Republicans are seeking even less regulation over the financial industry, in service to crony capitalism.
  • Ditto from Krugman ... on mortgage fraud and other criminal activities. 
  • Victimization narratives know no bounds, really.
  • Fascinating analysis of self control, anarchy, religion, and conservatism.
  • Guess who is on the wrong side in Africa?
  • More on financial fraud:
"Indeed, accounting control fraud is finance's “weapon of choice” in much of the developed world because it is the superior solution to the tradeoff between the risk of being sanctioned for looting and the rewards from looting. Even the most powerful bank CEO faces a grave risk of being imprisoned if he sticks his hand in the till and steals $10,000. If, instead, he uses accounting control fraud to loot the bank of $50 million he has an excellent chance of never even being prosecuted."
"It doesn’t take too much to know that if a nation sacrifices millions of dollars of potential income per day because it keeps millions of its citizens unemployed that it is not using its resources optimally. When you do the sums there is no greater inefficiency than mass unemployment."