Saturday, July 31, 2010

Watching the evolution knobs spin

Evolution really happens at the dials controlling genes, more than in the protein sequences those genes encode.

The shock of humans having only ~23,000 genes has yet to fully sink in. Fewer genes than soybeans? Than the potato? The depth to which some of these genes are conserved is just as astonishing, with a gene that promotes eye development working quite well when transplanted into fruit flies. What, then, makes us different? What has evolution been doing all this time?

A recent paper in Science adds evidence that far more variation goes on in the promoters of genes than in their coding sequences. The authors tracked the sites of action (i.e. DNA binding) of two liver-specific transcription regulatory proteins in chickens, opossums, mice, dogs, and humans, and found that few of the sites were recognizably conserved. Most sites disappeared, reappeared, altered, and mutated with considerable abandon.

The regulators themselves (CEBPA and HNF4A) were very well conserved, meaning that as proteins, they had virtually the same sequence in each organism. And more critically, their preferred binding site on DNA stayed the same as well. That preference tends to be hard to change when binding to thousands of different sites (~20,000 is the estimate given for each protein) is important for an organism's liver and other organs. Putting it in technical terms, such binding specificities tend to be subject to strong purifying selection.

On the other hand, the individual sites are much less constrained by evolution, since changes affect only that individual target gene. Some genes that have been studied as targets of CEBPA include metabolic enzymes, detoxifying enzymes like cytochromes P450, EPHX1, and SULT2A1, several insulin-regulated genes, growth factors, the gene for albumin, coagulation factor VIII, and other transcriptional regulators in liver development and function.

The current authors use some high-tech wizardry to isolate all the DNA bound to these regulatory proteins from each species of interest, and sequence around each site to see where it maps in the respective species' genome. This gives them the dataset of sites that they then mine to ask whether the sites have stayed consistent over evolutionary time. The answer is no: "For these two liver-specific TFs, binding events appear to be shared 10 to 22% of the time between mammals from any two of the three placental lineages we profiled, separated by approximately 80 million years of evolution (figs. S6 and S7). This result reveals a rapid rate of evolution in transcriptional regulation among closely related vertebrates."
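To make the "shared X% of the time" idea concrete, here is a minimal sketch of how such a pairwise comparison could be tallied. It assumes that bound sites have already been mapped to aligned regions with a common ID across species; the region IDs, the toy values, and the function name `shared_fraction` are all made up for illustration and are not the paper's data or its exact definition of sharing.

```python
from itertools import combinations


def shared_fraction(sites_a, sites_b):
    """Fraction of regions bound in species A that are also bound in species B."""
    if not sites_a:
        return 0.0
    return len(sites_a & sites_b) / len(sites_a)


# Toy data (invented): IDs of aligned regions where the regulator is bound.
bound_regions = {
    "Hsap": {"r1", "r2", "r3", "r4", "r5"},
    "Mmus": {"r1", "r4", "r6", "r7"},
    "Cfam": {"r1", "r2", "r8", "r9", "r10"},
}

# Compare every pair of species, taking the first species as the reference.
for a, b in combinations(sorted(bound_regions), 2):
    frac = shared_fraction(bound_regions[a], bound_regions[b])
    print(f"{a} sites also bound in {b}: {frac:.0%}")
```

Note that this measure is asymmetric (it depends on which species is taken as the reference), which is one reason a range rather than a single number is a natural way to report such comparisons.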

For example, they show the binding of CEBPA to one region around the PCK1 gene in liver. PCK1 encodes phosphoenolpyruvate carboxykinase, a metabolic enzyme which helps synthesize glucose.


The coding exons of the PCK1 gene are shown at the lower right. kb = kilo basepairs. Hsap = human, Mmus = mouse, Cfam = dog, Mdom = short-tailed opossum, and Ggal = chicken.

The pattern in chicken is quite simple. More sites appear in the mammals, with novel and significant sites appearing in dogs and humans. The scoring of these sites is somewhat unclear: it is not obvious how minor a site could be and still score, and the authors had no functional tests of which sites actually affected local gene transcription.

A key and well-occupied site right at the start of the PCK1 gene is well-conserved, however, and probably has a dominant regulatory role. What role the other sites might have is not clear, and might be minimal. So their conclusion needs to be taken with a grain of salt, as they indicate that most of the highly conserved DNA binding sites are at this kind of most-influential position near genes that rely heavily on regulation by the bound regulator.

Nevertheless, the reason for flexibility in regulator binding is not hard to find, since binding sites are often composed of only six or eight nucleotides, with sloppy allowances for binding to sites with some mutations as well. New sites can appear easily, and old sites can be destroyed just as easily. So these regulatory proteins bind all over the genome and these sites change frequently, allowing regulatory variation to happen easily by mutation. The authors conclude: "Taken together, the steady accumulation of small changes in the genetic sequence appears to rapidly remodel thousands of TF binding sites in mammals." [TF refers to transcription factor, another word for DNA-binding regulator.]
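A bit of back-of-the-envelope arithmetic shows why short sites come and go so easily. The sketch below is only an illustration under simplifying assumptions: a random sequence with equal base frequencies, one strand, exact matches only, and an arbitrary six-letter motif ("ACGTAC") that is not the real binding site of either regulator. The function names are hypothetical.

```python
def expected_chance_sites(sequence_length, motif_length):
    """Expected exact matches to one fixed motif on one strand of a random
    sequence with equal A/C/G/T frequencies (a simplifying assumption)."""
    positions = sequence_length - motif_length + 1
    return positions / 4 ** motif_length


def one_mutation_neighbors(motif):
    """All sequences that differ from the motif by exactly one substitution."""
    neighbors = set()
    for i, base in enumerate(motif):
        for alt in "ACGT":
            if alt != base:
                neighbors.add(motif[:i] + alt + motif[i + 1:])
    return neighbors


motif = "ACGTAC"  # arbitrary example motif, not a real CEBPA or HNF4A site

# A specific 6-mer is expected about once per 4**6 = 4096 positions by chance.
print(4 ** len(motif))

# Every one of these near-misses is a single point mutation away from
# becoming a site, and the site is one mutation away from becoming each of them.
print(len(one_mutation_neighbors(motif)))

# Chance matches expected on one strand of a hypothetical 1,000,000-base sequence.
print(round(expected_chance_sites(1_000_000, len(motif))))
```

Real genomes are not random sequence and real regulators tolerate some mismatches, so these numbers are only a rough guide, but they convey how little mutation it takes to create or destroy a six-nucleotide site.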

Given the complexity of biology, the network is the real locus of evolution, with the pieces (proteins encoded by genes) being shuffled around by regulatory experiments over time. Indeed, another recent paper compared the multicellular organism Volvox with its single-celled relative Chlamydomonas, and found that they had almost exactly the same number of genes, and few gene differences overall. They conclude: "This is consistent with previous observations indicating co-option of ancestral genes into new developmental processes without changes in copy number or function." And one of the most important mechanisms of such co-option is placing the given gene under novel regulation. This process is slightly reminiscent of the human economy, which is being driven increasingly as a "knowledge economy", shuffling around financing, software, and organization while the basic commodities of existence remain far more constant.

  • Free will, explained. (Only in part, however; it leaves out our moral responsiveness to others.)
  • A judicious analysis of the WikiLeaks doc dump.
  • The Taliban is getting desperate and may be in decline.
  • MMT economics in a nutshell.
  • Walter Mead pens an uncharacteristically idiotic screed against the greens. As if "prohibition" were being proposed by anyone, anywhere. Note the mash note to fellow anti-green Andrew Revkin.
  • Meanwhile, yet another CO2 related apocalypse rears its head.
  • Bill Mitchell quote of the week, speaking of Minsky's model of financial cycles, requiring broad anti-cyclical policy as well as (right now) government stimulus:
"Through phases of recession, recovery, tranquility, and euphoria, the economy endogenously moves from robust to fragile financial structures. The fragile structure characterised by high levels of speculative and Ponzi finance becomes vulnerable to a multitude of shocks, any of which, in isolation or concert, can alter perceptions of future income flows needed to validate the debt structure and drive the economy into crisis."