Saturday, April 2, 2011

Human evolution- drifting, not sweeping

Our genomes indicate relatively few genetic sweeps in the last 100,000 years.

Despite the lack of a medical revolution from our knowledge of the human genome, as yet, other fields are being revolutionized, including the study of human evolution. The ability to compare complete genomes from people of different lineages generates vast amounts of digital comparative data that can transform pre-history into history of a sort.

Humans are virtually clones, compared to most other organisms, which have far greater genetic variation. Our genetic variation is small due to the very small population sizes we seem to have had for much of our evolutionary history (perhaps even due to severe bottlenecks), leading to the genetic Eve/Adam reconstruction of single ancestors dating to maybe 250,000 years ago.

Since that time our genetic variation has been increasing apace, with most modern-day variation residing in Africa and much less outside, due to the later migration of small populations out of Africa, perhaps 50,000 to 90,000 years ago, combined with very limited interbreeding with resident Homo cousins like the Neanderthals.

This variation (caused by mutation, then reduced by selection and drift) is the mother lode- it tells us which populations are more related to others, it generates the traits that differ between populations and between individuals, and it carries traces of the selective process itself, which is the focus of the current paper.

Imagine a strongly beneficial mutation arising that, say, provides complete protection against some disease like the plague. This mutation resides on, say, chromosome 8. After an epidemic of plague, all members of the (now reduced) population have this new mutation- the others have been killed off. (In the parlance, this mutation has now been "fixed" because there are no competing alleles at its locus.) This means also that all members of the population share the same chromosome 8 in its entirety. This has been a selective sweep, carrying along whatever other variation resided on that copy of chromosome 8 that had the disease-protecting variant, good or bad. This kind of thing can bring along and expose recessive traits like the royal haemophilia, and it dramatically reduces genetic variation of this chromosome in the population, which can be tracked for tens of thousands of years.

On the other hand, imagine another trait with less dramatic effects, perhaps one which slightly reduces the severity of psoriasis in those who have that condition due to other factors in the genetic background. This trait may be beneficial in a few situations, but will never sweep itself or its neighbors to predominance or completion in the population. It will drift along, gradually rising in percentage within the population (if it has no untoward side effects, which would be rare, given the networked nature of the genome, actually). The key point is that the slowness of this process allows recombination to have its say.
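
To put rough numbers on that slowness, here is a minimal sketch under a purely deterministic model of selection (no drift, no dominance), in which the odds p/(1-p) of a variant grow by a factor of (1+s) each generation. The function name, the population size of 10,000, and the two selection coefficients are illustrative assumptions of mine, not values from the paper.

```python
import math


def generations_to_sweep(s, n):
    """Generations for a variant to rise from one copy to all but one copy.

    Assumes a deterministic model in which the odds p/(1-p) are multiplied
    by (1 + s) every generation; drift and dominance are ignored.
    n is the number of diploid individuals, so there are 2n chromosomes.
    """
    start_odds = 1 / (2 * n - 1)  # one copy among 2n chromosomes
    end_odds = 2 * n - 1          # all but one copy
    return math.log(end_odds / start_odds) / math.log(1 + s)


# Hypothetical values: a plague-like advantage vs. a psoriasis-like one.
strong = generations_to_sweep(s=0.1, n=10_000)
weak = generations_to_sweep(s=0.001, n=10_000)
print(f"strong advantage: ~{strong:.0f} generations")
print(f"weak advantage:   ~{weak:.0f} generations")
```

Under these assumptions the weak variant takes on the order of a hundred times longer to spread, which leaves recombination far more generations in which to separate it from its original neighbors.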

Recombination happens on every chromosome in every generation, swapping parts of its arms between those inherited from each parent. Roughly speaking, the location of such swaps is random and at least one happens on every chromosome per generation, so very gradually over time, each gene variant in a population becomes neighbors with variants of nearby genes other than those it was born with. After roughly 200,000 years for humans, this mixing should be essentially complete, and the variant tends to be no more associated with the neighboring variants it was born with than one would expect by chance.
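
The decay of these associations is commonly modeled as geometric: if two sites recombine with probability r per generation, the association (linkage disequilibrium) left after t generations is (1 - r)^t of its starting value. A minimal sketch, assuming a 25-year generation time and the rough rule that 1 centimorgan corresponds to r = 0.01 (both are my assumptions for illustration, as are the function names):

```python
def years_to_generations(years, generation_time=25):
    """Convert years to generations; the 25-year default is an assumption."""
    return years / generation_time


def remaining_association(r, generations):
    """Fraction of the original association left after recombination.

    r is the per-generation recombination probability between two sites.
    """
    return (1 - r) ** generations


generations = years_to_generations(200_000)
for centimorgans in (1.0, 0.1, 0.01):
    r = centimorgans / 100  # rough conversion, valid for small distances
    left = remaining_association(r, generations)
    print(f"{centimorgans:>5} cM apart: {left:.3g} of the association remains")
```

In this sketch the association is effectively gone for sites a centimorgan apart, while very close neighbors still retain a good part of it, so any surviving islands of low variation would be expected to be narrow.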

Crossover recombination, which happens on each chromosome at least once per generation, mixing up the genetic variants on chromosomes throughout the population.

That process is what this paper was interested in- figuring out whether there were any islands of unusually low variation that bespeak selective sweeps by highly beneficial gene variants over the last few hundred thousand years in humans. The authors hail from something called the "1000 genomes project", though they only have 179 genomes to their name in this paper, claiming that is enough to start sifting through the available variation. They consider three sub-populations: Yoruba of Africa (YRI), European (CEU), and Asian (CHB+JPT). The goal was to find whether, over the time since these populations split (roughly 100,000 years for YRI vs the other two, and 23,000 years for CEU vs CHB+JPT), any gene variants went from low percentage to become fixed (i.e., 100%) in only one lineage.

Unfortunately, most of the more statistical data presented in this paper is stunningly hard to interpret and present. I also doubt that it means as much as they make it out to mean. But I will take a stab with one emblematic graph. Skip the next paragraph if you are in a hurry!

For this graph, the authors have isolated all exons of the human genome and conceptually centered them at the X-axis midline. Then across the local region (X-axis in centimorgans, a unit of genetic distance), they graph on the Y-axis the diversity of the populations they have sampled. This tends to be lower in highly conserved areas like exons and higher in outlying areas. This diversity is normalized to (divided by) the difference between the canonical human genome and the sequence of rhesus monkey, which should in principle cancel out the variations due to conservation of protein-coding genes and other typically conserved elements. Note that the diversity of the African population samples (green) is substantially higher than that of the European (orange) or Asian (purple) population samples. The authors argue that the central troughs are signs of specific directional selection that has affected the human lineage differentially from the normal maintenance or purifying selection that would have been in common between the human and rhesus genomes. The main result is that the troughs they see are very narrow- signs that this directional selection dragged along very little of the surrounding genomes, where diversity remained high. Which is to say- genetic sweeps were very uncommon, on average.

The main result, aside from the sorts of somewhat dubious graphs shown above, was that there are very few fixed differences between their sampled populations- only four fixed amino acids in the entire genome that differ between the African and European population samples. For comparison, there are about 40,000 human-specific fixed amino acid changes between chimpanzees and humans (split for 5 million years), of which 10 to 20% are estimated to be selectively significant. So, scaling in simple proportion to time, one would have expected to see on the order of 400 to 800 fixed changes in the 50,000 to 100,000 years since the African/European split.
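
The expectation comes from simple proportional scaling. A back-of-envelope sketch, assuming fixed changes accumulate at a constant rate over time (a simplification of mine; the function name is hypothetical):

```python
def expected_fixed_changes(total_changes, total_years, interval_years):
    """Naive linear scaling of fixed changes from one time span to another."""
    return total_changes * interval_years / total_years


# Figures quoted above: 40,000 changes over 5 million years,
# 10 to 20% of them estimated to be selectively significant.
for years in (50_000, 100_000):
    total = expected_fixed_changes(40_000, 5_000_000, years)
    low, high = 0.10 * total, 0.20 * total
    print(f"{years:>7} years: ~{total:.0f} fixed changes, "
          f"~{low:.0f} to {high:.0f} of them selectively significant")
```

Either way, the four fixed differences actually observed fall far short of what constant-rate fixation would predict.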

Unfortunately, the authors focus on amino acid changes, (apparently out of convenience), totally missing the more important and frequent loci of evolutionary change in regulatory regions. They are only seeing the tip of the iceberg, really, and don't have or offer a good idea how big the whole iceberg is. Additionally, a small amount of genetic flow between populations, as might have been transmitted through the bordering regions of North Africa and Arabia, could have severely reduced the fixation of variants that had otherwise become established in their respective geographic regions.

Nevertheless, this illustrates the overall tiny level of genetic difference among contemporary human populations. It also implies that whatever evolution and differentiation was going on in the protein-coding regions they focus on, it was almost entirely confined to slowly jiggering the frequencies of alleles present in a population rather than rapid revolutionary replacements of an old gene variant with a shiny new gene variant. Even with the limited genetic variation that humans possess, significant phenotypic variation is evidently possible with virtually no population-level 100% differences. This undoubtedly reflects complex traits influenced by many genes whose interrelations make change both slow and genetically hard to track.
"An important implication is that in the search for targets of human adaptation, a change in focus is warranted. To date, selection scans have relied almost entirely on the sweep model, either explicitly (by considering strict neutrality as the null hypothesis and a classic sweep as the alternative) or implicitly (by ranking regions by a statistic thought to be sensitive to classic sweeps and focusing the tails of the empirical distribution). It appears that few adaptations in humans took the form that these approaches are designed to detect, such that low-hanging fruits accessible by existing approaches may be largely depleted."
So, human evolution seems to have slowed down in recent times, at least with regard to sweeps by strongly beneficial variants. I would guess that this is due to our rapidly increasing population sizes over this time, which tend to preserve variation and forestall fixation, at least on a short-term basis. It may also be a testament to our frolicsome tendency to interbreed widely, preserving variation in the face of wars, famines, diseases, genocides, and calamities generally.

  • Republicans are seeking even less regulation over the financial industry, in service to crony capitalism.
  • Ditto from Krugman ... on mortgage fraud and other criminal activities. 
  • Victimization narratives know no bounds, really.
  • Fascinating analysis of self control, anarchy, religion, and conservatism.
  • Guess who is on the wrong side in Africa?
  • More on financial fraud:
"Indeed, accounting control fraud is finance's “weapon of choice” in much of the developed world because it is the superior solution to the tradeoff between the risk of being sanctioned for looting and the rewards from looting. Even the most powerful bank CEO faces a grave risk of being imprisoned if he sticks his hand in the till and steals $10,000. If, instead, he uses accounting control fraud to loot the bank of $50 million he has an excellent chance of never even being prosecuted."
"It doesn’t take too much to know that if a nation sacrifices millions of dollars of potential income per day because it keeps millions of its citizens unemployed that it is not using its resources optimally. When you do the sums there is no greater inefficiency than mass unemployment."


  1. Interesting post! I look forward to exploring the rest of your blog.

    What are your thoughts on this paper, that suggests human genetic evolution is in fact accelerating?

    Even if these changes are limited mainly to adaptations to neolithic foods (dairy, grains, etc.), couldn't they be considered "strongly beneficial variants"?

    And what about the widespread use of birth control, which more heavily weights mate selection as an evolutionary pressure? I've written about the subject here if you're interested:

    Interesting point about bottlenecks in human evolution. Have you seen this graphic?

    Once again, thoroughly enjoyed your post!

  2. Hi, Mr. Moyer-

    Thanks for your comments. I think the Toba bottleneck has been substantially diminished by recent work, but otherwise, the graphic you cite is very detailed and informative.

    Thanks especially for the interesting paper you cite on recent evolution. As you can tell, I had serious problems with the paper posted on above, (Hernandez), to the point of not taking seriously some of their main contentions. Hawkes, et al. seems more interesting and solid on the whole. I would reconcile them by saying that the Hawkes paper deals with snps, which have been accumulating apace and changing frequencies by selection, drift, etc. The Hernandez paper only deals with fixation, which is a high bar, and only fixation in protein coding regions- an even higher bar. Fixation means that a variant is no longer a snp, but has gone to 100% in the population at issue.

    So I think what they are looking for is significantly different. As I speculated, perhaps the rise in population has reduced the ability of any snp to fix throughout the human range. Ditto for the geographical radiation, which prevents variants that are geographically segregated from fixing completely through the kind of very large populations being considered here (all of Africa). Even within Africa, there has been a rise in population and I would guess a good deal of geographic sedentarism, which promotes the preservation of variants which are rare considering the whole population.

    For the Hawkes paper, they dealt with moderate frequency snps, neither near fixation nor extremely rare. The main result seems to be their Figure 1, which shows a curve of variants strongly biased toward recent time & peaking at 8k years ago. I would guess that this is consistent with a very mundane steady state model, where most snps are deleterious, arise in recent time, and then are swept away over many thousands of years (or fixed by drift), leaving only a few snps that are subject to some kind of balancing selection, neither fixing nor being selected against, but being beneficial in low frequency only. It is a fascinating question, and the authors clearly think they have something more significant going on.

    However, in principle, I agree with Hawkes over Hernandez- that there is no reason to think that selection has slowed on humans overall in recent times. So both overplay their results, especially Hernandez, since the putative slowdown they point to makes no sense at all. All I can understand of it is that the rate of fixation has slowed over the interval for reasons that are perhaps speculative, but have nothing to do with overall rates of evolution.

  3. Thanks for your response. Along the same lines, it's worth noting that reduced genetic diversity doesn't necessarily correspond with reduced phenotypic/functional diversity. With each mass extinction, we've lost a huge chunk of genetic diversity, but each time life re-expands to fill each environmental niche (a point from Dawkins, I think, but it might have been Gould). Genetically dolphins aren't that far away from camels, pigs, or other even-toed ungulates, but in appearance and function they're radically different. So maybe raw genetic diversity isn't the only vector to consider when asking the question "Has human evolution slowed down?"

    My technical understanding of genetics is superficial -- doing my best to keep up. Thanks again for the detailed response.

  4. Let me also add a link for the precursor to the Hawkes paper, where the same group explains their screen for signs of positive selection in the human genome, based on troughs in linkage disequilibrium of snps. This would be essential background for evaluating their later works, and for the whole field of recent human population genetics, really.

    Here they also mention that "It is intriguing that a significant fraction of inferred selected alleles are found in most of the examined populations." .. which is in essence the same observation as the original paper of this post, that few alleles have become differentially fixed among different large-scale human populations. And they add that balanced selection is a viable explanation for the persistence of variants that are selected but not fixed.