Saturday, October 22, 2011

Evolutionary knob-twiddling and networking

Butterfly wings and waves of ancient innovation highlight the dynamism of transcriptional control in evolution.

When you hear about small changes during evolution.. the retreat of brow ridges from Neanderthals to Homo sapiens, or the growth of our brains over longer spans, you are typically hearing about changes in regulation of proteins that are themselves unchanged. A little more expression here, a little less there, and perhaps a few more years of expression during childhood.. it all adds up to just the kind of gradual change that Darwin had in mind when he saw that variation is pervasive in biology, and the raw material of evolution.

In our genomes, the locations most conserved over geological time are the protein-coding sections of genes. They are the raisins in the pudding, around which much less stable DNA swirls- the introns, promoters, enhancers, centromeres, telomeres, transposons, repetitive elements, and other material / junk, each of which has its own, typically faster, rate of change. When a protein sequence changes, its action everywhere changes immediately. In contrast, changing where and when it is expressed can have smaller and more subtle effects. More gradual effects, in an evolutionary sense.

Regulatory regions (promoters and enhancers) that control protein expression are dispersed, often spread over many times the length of the protein-coding DNA. They are also modular, composed of separable cassettes with specific control effects/patterns, in contrast to the mostly linear one-damn-amino-acid-after-the-next nature of protein-coding DNA (putting aside protein coding flexibility based on intron/exon dispersion). This pattern of dispersed controls is an inheritance from our archaeal ancestors, and is part of what made eukaryotes such revolutionary life forms, able to evolve rapidly relative to bacteria.

Example of a human gene (DACH1) of 2,274 protein-coding base pairs, split into a set of exons (B, in center; vertical black lines are exons) dispersed over 430,000 base pairs that are mostly introns. It is set within a 2,000,000 base pair region (A and B) with modules (B; red marks similarity to human) conserved in several species, and individually capable of driving gene expression in mouse embryos as shown in C. (H-human, M-mouse, F-frog, P-pufferfish, Z-zebrafish)

Two recent papers highlight this property in different ways- one about the patterning of butterfly wings among species that convergently evolve to mimic each other's designs, and the other tracing ancient innovation in the vertebrate lineage by bursts of regulatory change followed by conservation/stasis.

Taking the second article first, the regulatory regions of our genome are peppered with small modules (typically much smaller than protein-coding segments) that are somewhat conserved, for their regulatory role. Each module typically drives expression of its associated gene in response to an environmental event, or at a specific time and place in development, as in the above example.

The authors dredged up such modules en masse from several vertebrate genomes (about 10% of all gene regulatory elements, from two fish, cow, mouse, and human) and asked when they first became conserved in evolution, and what genes they associate with. Not being as well conserved as proteins, virtually none are traceable beyond 600 million years ago. The paper is mostly devoted to methods, since these sites are small, difficult to align across different species, and their times of origin are difficult to estimate. But I will leave the methods issues aside.

The interesting finding is that over this span of time, there were four distinct patterns of regulatory innovation (i.e. origination and conservation of regulatory modules) tied to different kinds of genes in vertebrates. The first wave, peaking at the very start of available data at 500+ million years ago, was of transcription regulators themselves, which bind to regulatory DNA sites. This indicates variation and evolution in the most basic programs driving animal function and development.

The second epoch ranged from 500 to 200 million years ago and is associated with developmental genes, reinforcing the finding above, but indicating that the deployment of transcriptional regulators was solidified well before the full palette of developmental possibilities was explored. Development is mostly regulated by the coordinated expression of genes, including transcription regulators, that go on to regulate other genes and proteins in a cascade or network.

A third epoch peaked sharply about 250 million years ago, and consisted of the fixation of regulatory sites near receptor genes (Figure below). Receptors play central roles in the nervous system, in smell and taste, and in hormonal control systems like the sex hormones. All these were important areas of innovation through the vertebrate lineage, but that innovation was apparently concentrated in this era, just as the age of dinosaurs began.

Lastly, the most recently fixed set of regulatory sites lies near genes involved in post-translational protein modification. This is the attachment to an active protein of molecules ranging from tiny methyl and acetyl groups to fatty acids like palmitate up through whole mini-proteins like ubiquitin and its many relatives.

Period during evolution (x-axis) when regulatory modules near specified classes of genes (noted at top) arose and became conserved (y-axis), indicating a function of increased importance.

These modifications exemplify evolutionary tinkering and jury-rigging. They originate in other processes (markers for protein degradation in the case of ubiquitin), whose components (or copies thereof) were dragged into new regulatory roles, at first perhaps as just a tentative little tweak on top of the existing complexity, then over time used in more roles when other sources of variation and adaptation had already been so networked that they couldn't change. The evolutionary story suggests that in our lineage, the deeper and more central the system of regulation, the earlier it settled into a more-or-less stable, conserved, and unchangeable state.



Turning from deep time to more recent events, butterflies in northern South America are (for the time being) highly diverse. Yet some have converged from different lineages towards similar wing patterns in violation of the general rule of species divergence, and particularly the rule that different species need distinct markings to promote correct mate selection and ecological niche maintenance. This mimicry comes in two versions. Either a distasteful species is mimicked by another that is not distasteful in order to steal its advertising.. i.e. its protection from predators (Batesian mimicry), or two distasteful species converge together in order to raise the level of advertising they share (Müllerian mimicry).

They would make car dealers proud! But what causes all this plasticity of wing patterning? How do butterflies find it so easy to create and then alter their beautiful patterns? Authors of this paper find one gene that seems to be in charge- a transcription regulator whose own regulation holds the key to variation in the Heliconius genus of butterflies, a distasteful genus (to predators that is, not to us).

Geographic distribution (B), lineages (A, horizontal), and mimicry (A, vertical) of selected Heliconius butterflies.

In this figure, the horizontal rows contain close genetic relatives, while the vertical columns show geographic co-occurrence. The archetypal Müllerian mimics are H. melpomene and H. erato, which look very similar despite coming from distant lineages. The authors note that the Heliconius genus has hundreds of different wing pattern races and species. So they have helpfully lined up the various mimics that co-habit but arise from different lineages, and thus presumably have converged on similar wing patterns from originally different ones. These are the butterflies they use to ask the question: what are the gene(s) responsible for this variation and convergence/mimicry?

A great deal of past genetics had already pointed to one large genomic region responsible for red wing variants in this genus. The authors drilled down further by using high-tech methods to measure the RNA expression from regularly-spaced 60 base pair segments throughout that ~500,000 base pair suspect genome region. The RNA was prepared from dissected pieces of wing, comparing gene expression in red-colored pieces to that in green or black pieces. Only one location spanning about 15,000 base pairs correlated in its expression closely with the observed color variation, surrounding a gene called "optix".
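For readers who like to see the logic spelled out, here is a minimal sketch of that kind of scan, not the authors' actual pipeline. It assumes each tissue sample has already been reduced to one expression number per probe, in genome order. The function names, the toy numbers, and the cutoff `min_diff` are all made up for illustration.

```python
from statistics import mean

def flag_probes(red_samples, other_samples, min_diff=2.0):
    """Return indices of probes whose mean signal in red tissue exceeds
    the mean signal in non-red tissue by at least min_diff.
    Units and cutoff are arbitrary (hypothetical)."""
    n_probes = len(red_samples[0])
    flagged = []
    for i in range(n_probes):
        red_mean = mean(sample[i] for sample in red_samples)
        other_mean = mean(sample[i] for sample in other_samples)
        if red_mean - other_mean >= min_diff:
            flagged.append(i)
    return flagged

def longest_run(indices):
    """Return (first, last) of the longest run of consecutive probe
    indices, or None if the (sorted) list is empty."""
    best = None
    start = prev = None
    for i in indices:
        if prev is not None and i == prev + 1:
            prev = i              # extend the current run
        else:
            start = prev = i      # begin a new run
        if best is None or prev - start > best[1] - best[0]:
            best = (start, prev)
    return best

# Toy data: ten probes along the region, two red and two non-red samples.
red = [[1, 1, 1, 5, 6, 5, 1, 1, 1, 1],
       [1, 2, 1, 6, 5, 6, 1, 1, 2, 1]]
other = [[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
         [1, 1, 2, 1, 1, 1, 1, 1, 1, 1]]

hits = flag_probes(red, other)
print(hits, longest_run(hits))
```

In the toy data only the probes in the middle of the region stand out, so the scan narrows a long stretch of DNA down to one short contiguous window, which is the same shape of result the authors report around optix.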

As one can tell from the name, this gene was already known for its role in eye development in flies. Indeed, earlier researchers found that "Ectopic expression of optix leads to the formation of ectopic eyes suggesting that optix has important functions in eye development." No kidding! "Ectopic" meaning that they engineered expression of the gene in novel places, and - holy moly! - saw eyes develop in those places.

Back to wings.. the authors then looked at full-wing patterns of expression of the optix gene, and indeed it seems to closely presage the appearance of red color, such as in these images:

Expression of optix in 72 hour pupal wings (blue patterns), compared with adult wings of the same species.

The authors also looked at the genetics of optix in more detail, using hybrids of various races, and found not just high correlation but complete correlation between the alleles of optix and the resulting wing patterns. This indicates that optix is not just a downstream reflection of some other patterning component, but that it drives the red patterning by its location of expression.

Now the interesting part was that they sequenced the optix gene from seven of their Heliconius species, and the protein code was identical in each of them. Twenty million years of evolutionary divergence hadn't made any difference. The authors thus deduce that the genetic variation at this optix locus all happens in the surrounding regulatory regions of the DNA, not the protein coding areas. And this makes sense if the function of the protein has remained the same - make red pigmented areas on the wing (or make eyes, in other settings) - while its deployment in space across the wing has varied with the evolutionary needs of the moment.

They state "optix provides a compelling example of a gene that drives adaptation because its various alleles are regulatory variants that have pronounced effects on complex large-scale patterns." Unfortunately, they have not yet found those regulatory regions. Someone's grant and future work surely hangs in the balance. But as noted above, these regions and their variants are sure to be small, modular, dispersed, and hard to detect, since they exist at the edge of efficacy, bordering on random noise in a DNA sequence sense.

Control is the key. Just as electronics and computer science quickly gave rise to information theory and cybernetics, and our financial and political worlds depend on people knowing what they are doing and having effective management processes in place, (ahem!), biology too is drenched with management issues. The human genome has half the number of genes that soybean does.. so to paraphrase, it isn't how many you have, but how you use them.


"So the graph highlighted in the early stages of the crisis the importance of very large fiscal interventions. My Chinese contacts informed me that at the time there was no discussion over there about the country drowning in debt or that the government was going to “run out of money”. These ideas that crippled the recovery in the West were not allowed to germinate in China."
  • Occupy Marin at 12 noon- be there .. at the square.
