Showing posts with label article review. Show all posts

Saturday, December 7, 2024

Cranking Up DNA, One Gyration at a Time

The mechanism of DNA gyrase, which supercoils bacterial DNA.

Imagine that you have a garden hose that is thirty miles long. How would you keep it from getting tangled? That is unlikely to be easy. Now add randomly placed heavy machinery that actively twists that hose as it travels / pulls along, causing it to wind up ahead, and unwind behind. And that machinery can be placed in either direction, often getting into head-on conflicts, not to mention going at quite different speeds. That is the problem our cells have, managing their DNA. 

Cells use a set of topoisomerases to manage the topology of DNA, that is, its twist-i-ness. One easy method is to nick the DNA on one of its two strands, allowing it to relax by spinning around the backbone of the remaining intact strand, before resealing it back to a double strand and sending it on its way. But what if you encounter coils or knots that can't be resolved that way? The next level is to cut one entire DNA molecule, not just one side/strand of it, and pass the conflicting one through it. All organisms contain topoisomerases of both kinds, and they are essential.

How DNA gets twisted. While most topoisomerases relax DNA (top) to resolve the many twisty problems posed by transcription and replication, gyrase increases twist by grabbing and holding a quasi-positive twist, then cutting and resolving it, as shown at bottom.

Bacteria have an additional enzyme that we do not have, called gyrase, to crank up the supercoiling of their DNA, to make it easier to open for transcription. Gyrase works just like a type II topoisomerase that cuts a double-stranded DNA and lets another DNA through, but it does so in a special way that puts a twist on the DNA first, so instead of relaxing the DNA, it increases the stress. How exactly that works has been a bit mysterious, though gyrases and the general principles they operate under have been clear for decades. Gyrase uses ATP, and grabs onto two parts of a DNA molecule, one of which is pre-twisted into a coil, after which one is cut and the other passed through to create a change (-2) in the linking number, the overall twist count, of that DNA.
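The bookkeeping behind that (-2) step can be sketched in a few lines. This is only an illustration, not anything from the paper: it assumes the usual textbook figure of about 10.5 base pairs per helical turn for relaxed DNA, a hypothetical 4,200 bp closed circle, and the standard definition of superhelical density as the fractional change in linking number. The function names are made up for the example.

```python
# Assumption: relaxed B-form DNA has about 10.5 base pairs per helical turn.
BP_PER_TURN = 10.5
# From the text: each gyrase cycle changes the count by -2.
DELTA_LK_PER_CYCLE = -2


def relaxed_linking_number(n_bp: int) -> float:
    """Linking number (Lk0) of a relaxed closed circle of n_bp base pairs."""
    return n_bp / BP_PER_TURN


def superhelical_density(n_bp: int, cycles: int) -> float:
    """Superhelical density sigma = (Lk - Lk0) / Lk0 after `cycles`
    gyrase cycles, starting from a fully relaxed circle."""
    lk0 = relaxed_linking_number(n_bp)
    lk = lk0 + DELTA_LK_PER_CYCLE * cycles
    return (lk - lk0) / lk0


if __name__ == "__main__":
    plasmid_bp = 4200  # hypothetical size, chosen so that Lk0 is a round 400
    for cycles in (0, 5, 10, 12):
        sigma = superhelical_density(plasmid_bp, cycles)
        print(f"{cycles:2d} cycles -> sigma = {sigma:+.3f}")
```

On a circle this small, a dozen cycles is already enough to shift the linking number by several percent, which gives a feel for why the opposing, relaxing enzymes have to be kept in balance with gyrase.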

A general model of gyrase action. The G-segment of DNA is firmly held by the gyrase dimer in the center. The same DNA is forcibly twisted about, around the pinwheel structures, and bent back around to enter through the N-gate (as the T-segment). Then, the N-gate closes, paving the way for the G-segment to be cut and separated (step 3). ATP is the energy source behind all this structural drama. The T-segment then passes through the cut, enters the C-gate, and the cycle is complete.

A recent paper determined the structure of active gyrase complexes, and was able to trace the pre-twisted conformation. This, combined with a lot of past work on the ATPase and cleavage functions of gyrase, allows a reasonably full picture of how this enzyme works. It is a symmetric dimer of a two-subunit protein, so there are four protein chains in all. There are three major regions of the full structure: the N-gate at top, where one segment (the T-segment) of DNA binds; the central DNA gate, where the other (G-segment) DNA binds and is later cut to let the T-segment through; and the C-gate, where the T-segment ends up and is released at the end of the cycle.

Focus on the pinwheel structure that dramatically pre-twists the DNA around between the G and T segments, pre-positioning the complex for strand passage and increased supercoiling.

The magic is that the T-segment and the G-segment of DNA are parts of the same DNA molecule, by being wrapped around the ears of the protein, which are also called pinwheels. That is what the newest structure solves in greatest detail. These pinwheels essentially allow the enzyme to yank an otherwise normal DNA strand into a pre-knotted (positive supercoil) form that, when cut and resolved as shown, results in a negative increment of supercoiling or twist. If they mutated the pinwheels away, the enzyme could still hold, cut, and relax DNA, but it could not increase its supercoiling. It is the ability of the pinwheel structures to set up a pre-twisted structure onto the DNA that makes this enzyme a machine to increase negative supercoiling, and thus ease other DNA transactions. 

Topoisomerase enzymes through evolution, from gyrase (left) to human topoII on the right. Note how the details of the protein structure are virtually unrecognizable, while the overall shape and DNA-binding stays the same.

Bacteria also have more normal type II topoisomerases that cut DNA merely to relax it, so one might wonder how these two enzymes get along. Well, gyrase is responsible for the overall negative supercoiling of the bacterial genome, while the other topoisomerases have more localized roles to relieve transient knots and over-twisting. Indeed, if you negatively twist DNA enough, you can separate its strands entirely, which is not usually desirable. Further research shows that too much of either topoisomerase is lethal, and that they are kept in balance by transcriptional controls over the amount of each topoisomerase. This suggests a futile cycle of DNA winding and unwinding, as the optimal condition in bacterial cells when both are present in just the right amounts. 


Saturday, November 9, 2024

Rings of Death

We make pore-forming proteins that poke holes in cells and kill them. Why?

Gasdermin proteins are parts of the immune system, and exist in bacteria as well. It was only in 2016 that their mechanism of action was discovered, as forming unusual pores. The function of these pores was originally assumed to be offensive, killing enemy cells. But it quickly became apparent that they more often kill the cells that make them, as the culmination of a process called pyroptosis, a form of (inflammatory) cell suicide. Further work has only deepened the complexity of this system, showing that gasdermin pores are more dynamic and tunable in their action than originally suspected.

The structure is quite striking. The protein starts as an auto-inhibited storage form, sitting around in the cell. When the cell comes under attack, a cascade of detection and signaling occurs that winds up expressing a family of proteases called caspases. Some of these caspases can cut the gasdermin proteins, removing their inhibitory domain and freeing them to assemble into multimers. About 26 to 32 of these activated proteins can form a ring on top of a membrane (let's say the plasma membrane), which then cooperatively jut down their tails into the membrane and make a massive hole in it.

Overall structure of assembled gasdermin protein pores.


Simulations of pore assembly, showing how the trapped membrane lipids would pop out of the center, once pore assembly is complete.


These holes, or pores, are big enough to allow small proteins through, and certainly all sorts of chemicals. So one can understand that researchers thought that these were lethal events. And gasdermins are known to directly attack bacterial cells, being responsible in part for defense against Shigella bacteria, among others. But then it was found that gasdermins are the main way that important cytokines like the highly pro-inflammatory IL-1β get out of the cell. This was certainly an unusual mode of secretion, and the gasdermin D pore seems specifically tailored, in terms of shape and charge, to conduct the mature form of IL-1β out of the cell. 

It also turned out that gasdermins don't always kill their host cells. Indeed, they are far more widely used for temporary secretion purposes than for cell killing. And this secretion can apparently be regulated, though the details of that remain unclear. In structural terms, gasdermins can apparently form partial and mini-pores that are far less lethal to their hosts, allowing, by way of their own expression levels, a sensitive titration of the level of response to whatever danger the cell is facing.

Schematic of how lower concentrations of gasdermin D (lower path, blue) allow smaller pores to form with less lethality.

Equally interesting, the bacterial forms of gasdermin have just begun to be studied. While they may have other functions, they certainly can kill their host cell in a suicide event, and researchers have shown that they can shut down phage infection of a colony or lawn of bacterial cells. That is, if a phage-infected cell can signal and activate its gasdermin proteins fast enough, it can commit suicide before the phage has time to fully replicate, beating the phage at its own race of infection and propagation. 

Bacteria committing suicide for the good of the colony or larger group? That introduces the theme of group selection, since committing suicide certainly doesn't do the individual bacterium any good. It is only in a family group, clonal colony, or similar community that suicide for the sake of the (genetically related) group makes sense. We, as multicellular organisms, are way past that point. Our cells are fully devoted to the good of the organism, not themselves. But to see this kind of heroism among bacteria is, frankly, remarkable.

Bacteria have even turned around to attack the attacker. The Shigella bacteria mentioned above, which are directly killed by gasdermins, have evolved an enzymatic activity that tags gasdermin with ubiquitin, sending it to the cellular garbage disposal and saving themselves from destruction. It is an interesting validation of the importance of gasdermins and the arms race that is afoot, within our bodies.


  • A tortured ballot.
  • Great again? Corruption and degradation is our lot.
  • We may be in a (lesser) Jacksonian age. Populism, bad taste, big hair, and mass deportation.
  • Beautiful Jupiter.
  • Bill Mitchell on our Depression job guarantee: "So for every $1 outlaid the total societal benefits were around $6 over the lifetime of the participant."
  • US sanctions are scrambling our alliances and the financial system.
  • Solar works for everyone.


Saturday, October 26, 2024

A Hunt for Causes of Atherosclerosis

Using the most advanced tools of molecular biology to sift through the sands of the genome for a little gold.

Blood vessels have a hard life. Every time you put on shoes, the vessels in your feet get smashed and smooshed, for hours on end. And do they complain? Generally, not much. They bounce back and make do with the room you give them. All through the body, vessels are subject to the pumping of the heart, and variations in blood volume brought on by our salt balance. They have to move when we do, and deal with it whenever we sit or lie on them. Curiously, it is the veins in our legs and calves, that are least likely to be crushed in daily life, that accumulate valve problems and go varicose. Atherosclerosis is another, much more serious problem in larger vessels, also brought on by age and injury, where injury and inflammation of the lining endothelial cells can lead to thickening, lipid/cholesterol accumulation, necrosis, calcification, and then flow restriction and fragmentation risk. 

Cross-section of a sclerotic blood vessel. LP stands for lipid pool, while the box shows necrotic and calcified bits of tissue.

The best-known risk factors for atherosclerosis are lipid-related, such as lack of liver re-capture of blood lipids, or lack of uptake around the body, keeping cholesterol and other lipid levels high in the blood. But genetic studies have found hundreds of areas of the genome with risk-conferring (or risk-reducing) variants, most of which are not related to lipid management. These genome-wide association studies (or GWAS) look for correlations between genetic markers and disease in large populations. So they pick up a lot of low-impact genetic variations that are difficult to study, due to their large number and low impact, which can often imply peripheral / indirect function. High-impact variations (mutations) tend to not survive in the population very long, but when found tend to be far more directly involved and informative.

A recent paper harnessed a variety of modern tools and methods to extract more from the poor information provided by GWAS. They came up with a fascinating tradeoff / link between atherosclerosis and cerebral cavernous malformation (CCM), which is a distinct blood vessel syndrome that can also lead to rupture and death. The authors set up a program of analysis that was prodigious, and only possible with the latest tools.

The first step was to select a cell line that could model the endothelial cells at issue. Then they loaded these cells with custom expression-reducing RNA regulators against each one of the ~1600 genes found in the neighborhood of the mutations uncovered by the GWAS analyses above, plus 600 control genes. Then they sequenced all the RNA messages from these single cells, each of which had received one of these "knock-down" RNA regulators. This involved a couple hundred thousand cells and billions of sequencing reads- no simple task! The point was to gather comprehensive data on what other genes were being affected by the genetic lesion found in the GWAS population, and then to (algorithmically) assemble them into coherent functional groups and pathways which could both identify which genes were actually being affected by the original mutations, and also connect them to the problems resulting in atherosclerosis.

Not to be outdone, they went on to harness the AlphaFold program to hunt for interactions among the proteins participating in some of the pathways they resolved through this vast pipeline, to confirm that the connections they found make sense.

They came up with about fifty different regulated molecular programs (or pathways), of which thirteen were endothelial cell specific. Things like angiogenesis, wound healing, flow response, cell migration, and osmoregulation came up, and are naturally of great relevance. Five of these latter programs were particularly strongly connected to coronary artery disease risk, and mostly concerned endothelial-specific programs of cell adhesion. Which makes sense, as the lack of strong adhesion contributes to injury and invasion by macrophages and other detritus from the blood, and adhesion among the endothelial cells plays a central role in their ability / desire to recover from injury, adjust to outside circumstances, reshape the vessel they are in, etc.

Genes near GWAS variations and found as regulators of other endothelial-related genes are mapped into a known pathway (a) of molecular signaling. The color code of changed expression refers to the effect that the marked gene had on other genes within the five most heavily disease-linked programs/pathways. The numbers refer to those programs, (8=angiogenesis and osmoregulation, 48=cell adhesion, 35=focal adhesion, related to cell adhesion, 39=basement membrane, related to cell polarity and adhesion, 47=angiogenesis, or growth of blood vessels). At bottom (c) is a layout of 41 regulated genes within the five disease-related programs, and how they are regulated by knockdown of the indicated genes on the X axis. Lastly, in d, some of these target genes have known effects on atherosclerosis or vascular barrier syndromes when mutated. And this appears to generally correlate with the regulatory effects of the highlighted pathway genes.

"Two regulators of this (CCM) pathway, CCM2 and TLNRD1, are each linked to a CAD (coronary artery disease) risk variant, regulate other CAD risk genes and affect atheroprotective processes in endothelial cells. ... Specifically, we show that knockdown of TLNRD1 or CCM2 mimics the effects of atheroprotective laminar blood flow, and that the poorly characterized gene TLNRD1 is a newly identified regulator in the CCM pathway."

On the other hand, excessive adhesiveness and angiogenesis can be a problem as well, as revealed by the reverse correlation they found with CCM syndrome. The interesting thing was that the gene CCM2 came up as one of the strongest regulators of the five core programs associated with atherosclerosis risk mutations. As can be guessed from its name, it can harbor mutations that lead to CCM. CCM is a relatively rare syndrome (at least compared with coronary artery disease) of localized patches of malformed vessels in the brain, which are prone to rupture, which can be lethal. CCM2 is part of a protein complex, with KRIT1 and PDCD10, and part of a known pathway from fluid flow sensing receptors to transcription regulators (TFs) that turn on genes relevant to the endothelial cells. As shown in the diagram above, this pathway is full of genes that came up in this pathway analysis, from the atherosclerosis GWAS mutations. Note that there is a repression effect in the diagram above (a) between the CCM complex and the MAP kinase cascade that sends signals downstream, accounting for the color reversal at this stage of the diagram.

Not only did they find that this known set of three CCM genes is implicated in the atherosclerosis mutation results, but one of the genes they dug up through their pipeline, TLNRD1, turned out to be a fourth, hitherto unknown, member of the CCM complex, shown via the AlphaFold program to dock very neatly with the others. It is loss of function mutations of genes encoding this complex, which inhibits the expression of endothelial cell pro-cell adhesion and pro-angiogenesis sets of genes, that cause CCM, unleashing these angiogenesis genes to do too much.

The logic of this pathway overall is that proper fluid flow at the cell surface, as expected in well-formed blood vessels, activates the pathway to the CCM complex, which then represses programs of new or corrective angiogenesis and cell adhesion- the tissue is OK as it is. Conversely, when turbulent flow is sensed, the CCM complex is turned down, and its target genes are turned up, activating repair, revision, and angiogenesis pathways that can presumably adjust the vessel shape to reduce turbulence, or simply strengthen it.

Under this model, malformations may occur during brain development when/where turbulent flow occurs, reducing CCM activation, which is abetted by mutations that help the CCM complex to fall apart, resulting (rarely) in run-away angiogenesis. The common variants dealt with in this paper, that decrease risk of cardiovascular disease / atherosclerosis, appear to have similar, but much weaker effects, promoting angiogenesis, including recovery from injury and adhesion between endothelial cells. In this way, they keep the endothelium tighter and more resistant to injury, invasion by macrophages, and all the downstream sequelae that result in atherosclerosis. Thus strong reduction of CCM gene function is dangerous in CCM syndrome, but more modest reductions are protective in atherosclerosis, setting up a sensitive evolutionary tradeoff that we are clearly still on the knife's edge of. I won't get into the nature of the causal mutations themselves, but they are likely to be diffuse and regulatory in the latter case.

Image of the CCM complex, which regulates response to blood flow, and whose mutations are relevant both to CCM and to atherosclerosis. The structures of TLNRD1 and the docking complex are provided by AlphaFold. 


This method is particularly powerful by being unbiased in its downstream gene and pattern finding, because it samples every expressed gene in the cell and automatically creates related pathways from this expression data, given the perturbations (knockdown of expression) of single target genes. It does not depend on using existing curated pathways and literature that would make it difficult to find new components of pathways. (Though in this case the "programs" it found align pretty closely with known pathways.) On the other hand, while these authors claim that this method is widely applicable, it is extremely arduous and costly, as evidenced by the contribution of 27 authors at top-flight institutions, an unusually large number in this field. So, for diseases and GWAS data sets that are highly significant, with plenty of funding, this may be a viable method of deeper analysis. Otherwise, it is beyond the means of a regular lab.

  • A backgrounder on sedition, treason, and insurrection.
  • And why it matters.
  • Jan 6 was an attempted putsch.
  • Trumpies for Putin.
  • Solar is a no-brainer.
  • NDAs are blatantly illegal and immoral. One would think we would value truth over lies.

Saturday, October 12, 2024

Pumping DNA

Arnold has nothing on the DNA pumps that load phages.

DNA is a very unwieldy molecule. Elegant in concept, but as organisms accumulated more features and genes, it got extremely long and twisty. So a series of management proteins arose, such as helicases and gyrases to relieve the torsional tension, and topoisomerases to cut and pass strands through each other to resolve knots. Another class is DNA pumps, which can forcefully travel along DNA to thread it into useful spaces, like the head of a phage, or a domain in our nucleus, to facilitate transcriptional isolation or organized recombination and synapsis. While other motors, acting on actin and microtubules, manage DNA segregation during mitosis, cell division, and cell movement, true DNA motors deal directly with DNA.

An iconic electron micrograph of a phage with its head blown open. The previously enclosed DNA is splayed about, suggesting the capsid's great capacity for DNA, and great pressure it was under. Inset shows an intact phage. Note the landing tentacles, which attach to the target bacterium.

There are several types of DNA pump, the lower-powered of which I have reviewed previously. The champions in terms of force, however, are the pumps that fill phage heads. Phages are viruses that infect bacteria, and they operate under a variety of limitations. Size is one- they have to be small and have small genomes, due to the small size of their targets, the brevity of their life cycle, and the mathematics of scattered propagation. Bacterial cells are under turgor pressure, of about three atmospheres, and have strong cell walls to hold everything in. So their infecting phages have several barriers to overcome. One solution is to be under even higher pressure themselves, up to about sixty atmospheres. That way, once the injection system has cut through the cell wall and inner membrane, the phage genome, which is pretty much the only thing in the phage head (or capsid), can shoot out rapidly and take over the cell. 

Schematic of late phage development, where the motor (blue) docks to the phage head and fills it with DNA, after which the tail assembly is attached.

How does the DNA loading pump work? It is closely docked into the phage head structure, has a pentagonal structure attached to the phage head, and a loosely attached, 12-sided inner rosette that they describe as a sort of bearing or ball-race. The outer pentagon has an ATPase at each vertex, and these fire sequentially during the pumping mechanism. Each ATP advances the DNA by about two base pairs. Presumably the head has a structure that guides the DNA into regular loops around its inside walls. 

Structure of the dodecameric portion of the phage DNA pump, without the ATPase pentameric portion. Obviously, the DNA threads through the center.

In the diagram below (reference), three steps are shown. First, (a, top), the "I" ATPase node (red) is linked to the "J" and "A" rosette nodes. "A" is where the rosette hooks into the DNA (red). Next, the rosette is expanded a bit, bringing "A" out of register from "I" and "C" into register with "II". At the same time, "C" links to the DNA two base pairs down from where "A" latched into it. In the third step, the rosette squashes again, the DNA ends up raised by two base pairs, and the process can start all over. It is a bit of a sleeve/ratchet mechanism. They do not speculate at this point which of these steps is the power stroke, where the ATP is hydrolyzed. Getting only two base pairs into the head per ATP doesn't seem very efficient, but it is evidently at the end of packaging, when the pressure rises to extreme levels, where this pump shines. And it can get a 19,000 bp genome into a phage head in three minutes (~100 bp per second), so it isn't a slouch when it comes to speed, either.
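For the curious, the arithmetic behind those figures is easy to check. The sketch below uses only the numbers quoted in the text (19,000 bp, three minutes, two base pairs per ATP, five ATPase vertices). It adds one simplifying assumption of its own: that packaging runs at a constant rate with the load shared evenly among the five ATPases, which the real motor almost certainly does not do as pressure builds. The function and dictionary keys are just names for the example.

```python
GENOME_BP = 19_000           # genome length, from the text
PACKAGING_SECONDS = 3 * 60   # "three minutes", from the text
BP_PER_ATP = 2               # base pairs advanced per ATP, from the text
N_ATPASES = 5                # one ATPase per vertex of the pentagon


def packaging_stats(genome_bp: int = GENOME_BP,
                    seconds: float = PACKAGING_SECONDS,
                    bp_per_atp: int = BP_PER_ATP,
                    n_atpases: int = N_ATPASES) -> dict:
    """Average packaging rates, assuming a constant speed throughout
    (an idealization) and equal sharing among the ATPase subunits."""
    atp_total = genome_bp / bp_per_atp
    return {
        "bp_per_second": genome_bp / seconds,
        "atp_total": atp_total,
        "atp_per_second": atp_total / seconds,
        "atp_per_second_per_atpase": atp_total / seconds / n_atpases,
    }


if __name__ == "__main__":
    for name, value in packaging_stats().items():
        print(f"{name}: {value:.1f}")
```

So one filled head costs roughly 9,500 ATP, and on average each of the five ATPases would be firing about ten times per second.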

Model of how this pump works. See text above for details.


Not only is this pump an amazing and powerful bit of biotechnology, able to compress DNA to sixty atmospheres, but it is a fourth fundamental type of motor, in addition to the rotary motors as found in flagella, the linear motors found along actin and microtubules, and the DNA threading/looping motors of condensin/cohesin.


  • The 2024 Nobel prizes show the close nexus between computers and molecular biology. The original finding of miRNA complementarity could not have been made without a computerized sequence search.
  • When truth is a gaffe, and lies are routine.
  • Could crypto be any worse or more corrupting?

Saturday, September 28, 2024

Dangerous Memories

Some memory formation involves extracellular structures, DNA damage, and immune component activation / inflammation.

The physical nature of memories in the brain is under intensive scrutiny. The leading general theory is that of positive reinforcement, where neurons that are co-activated strengthen their connections, enhancing their ability to co-fire and thus to express the same pattern again in the future. The nature of these connections has been somewhat nebulous, assumed to just be the size and stability of their synaptic touch-points. But it turns out that there is a great deal more going on.

A recent paper started with a fishing expedition, looking at changes in gene expression in neurons at various time points after mice were subjected to a fear learning regimen. They took this out to much longer time points (up to a month) than had been contemplated previously. At short times, a bunch of well-known signals and growth-oriented gene expression happened. At the longest time points, organization of a structure called the perineuronal net (PNN) was read out of the gene expression signals. This is an extracellular matrix sheath that appears to stabilize neuronal connections and play a role in long-term memory and learning.

But the real shocker came at the intermediate time point of about four days. Here, there was overexpression of TLR9, which is an immune system detector of broken / bacterial DNA, and inducer in turn of inflammatory responses. This led the authors down a long rabbit hole of investigating what kind of DNA fragmentation is activating this signal, how common this is, how influential it is for learning, and what the downstream pathways are. Apparently, neuronal excitation, particularly over-excitation that might be experienced under intense fear conditions, isn't just stressful in a semiotic sense, but is highly stressful to the participating neurons. There are signs of mitochondrial over-activity and oxidative stress, which lead to DNA breakage in the nucleus, and even nuclear perforation. It is a shocking situation for cells that need to survive for the lifetime of the animal. Granted, these are not germ cells that prioritize genomic stability above all else, but getting your DNA broken just for the purpose of signaling a stress response that feeds into memory formation? That is weird.

Some neuronal cell bodies after fear learning. The red dye is against a marker of DNA repair proteins, which form tight dots around broken DNA. The blue is a general DNA stain, and the green is against a component of the nuclear envelope, showing here that nuclear envelopes have broken in many of these cells.

The researchers found that there are classic signs of DNA breakage, which are what is turning on the TLR9 protein, such as seeing concentrated double-strand DNA repair complexes. All this stress also turned on proteases called caspases, though not the cell suicide program that these caspases typically initiate. Many of the DNA break and repair complexes were, thanks to nuclear perforation, located diffusely at the centrosome, not in the nucleus. TLR9 turns on an inflammatory response via NFKB / RELA. This is clearly a huge event for these cells, not sending them into suicide, but all the alarms short of that are going off.

The interesting part was when the researchers asked whether, by deleting the TLR9 or related genes in the pathway, they could affect learning. Yes, indeed- the fear memory was dependent on the expression of this gene in neurons, and on this cell stress pathway, which appears to be the precondition of setting up the perineuronal net structures and overall stabilization. Additionally, the DNA damage still happened, but was not properly recognized and repaired in the absence of TLR9, creating an even more dangerous situation for the affected neurons- of genomic instability amidst unrepaired DNA.

When TLR9 is knocked out, DNA repair is cancelled. At bottom are wild-type cells, and at top are mouse neurons after fear learning that have had the gene TLR9 deleted. The red dye is against DNA repair proteins, as is the blue dye in the right-most frames. The top row is devoid of these repair activities.

This paper and its antecedent literature are making the case that memory formation (at least under these somewhat traumatic conditions- whether this is true for all kinds of memory formation remains to be seen) has commandeered ancient, diverse, and quite dangerous forms of cell stress response. It is no picnic in the park with madeleines. It is an all-hands-on-deck disaster scene that puts the cell into a permanently altered trajectory, and carries a variety of long-term risks, such as cancer formation from all the DNA breakage and end-joining repair, which is not very accurate. They mention in passing that some drugs have been recently developed against TLR9, which are being used to dampen inflammatory activities in the brain. But this new work indicates that such drugs are likely double-edged swords, that could impair both learning and the long-term health of treated neurons and brains.

Saturday, September 21, 2024

Cooperation is Very Hard

Few animals engage in productive cooperation outside family kin groups.

It might be hard to imagine in our current political climate, but cooperation is the core trait of modern humans- the ability to form groups, make rules, exchange favors, and get big things done. We even cooperate for fleeting efforts with people we will never meet again. But theories explaining cooperation have been hard-won and so far quite limited in evolutionary biology. Kin selection is perhaps the only serious theory of this type, making cooperation a strict function of shared genes, which in turn sees its role rapidly diminishing in larger groups of organisms, with the exception of peculiar families like the social insects. 

Two other explanations for cooperative behavior have been developed- repeated interaction, and group selection. In the first, assuming that humans evolved in small groups, everyone knew everything about everyone, so reputation is everything, and thus cooperation within a group is the default state. In the second, different groups with different levels of in-group cooperation would fight it out in some form, perhaps militarily, leading to the success of better-cooperating groups. A recent paper (with review) used improved modeling to suggest that neither of these two explanations holds much water on its own, but in particular ways could be combined into something they call "super-additive cooperation". I.e. human society.

The key modeling advance here was to use graded rather than binary functions for interaction rewards. Likewise, they also allowed other forms of cooperation to compete with reciprocal cooperation. This allowed subjects and modeled entities to do what people always do- get away with giving a little less in return, which sends the whole game sliding into oblivion. That is, unless you are known to others in your own group, in which case, getting and giving positive rewards becomes a virtuous cycle with ever-increasing payoffs, thus the term super-additive. The combination of in-group membership and repeated interactions provides the magic. 
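To make the graded-reward idea concrete, here is a deliberately tiny toy, not the paper's model. Two partners pass a transfer back and forth, and each round the giver returns a fixed multiple of what it just received. The `return_ratio`, the starting gift, and the cap are all hypothetical numbers chosen for illustration. A ratio a little under one stands in for "giving a little less in return", and a ratio a little over one stands in for the escalating, in-group case.

```python
from typing import List


def simulate_transfers(first_gift: float,
                       return_ratio: float,
                       rounds: int,
                       cap: float = 100.0) -> List[float]:
    """Toy graded reciprocity: each round's transfer is the previous
    transfer times return_ratio, limited by a maximum endowment (cap)."""
    transfers = [first_gift]
    for _ in range(rounds - 1):
        next_transfer = min(cap, transfers[-1] * return_ratio)
        transfers.append(next_transfer)
    return transfers


if __name__ == "__main__":
    # Hypothetical parameters: shave 10% each round vs. add 10% each round.
    shaved = simulate_transfers(first_gift=10.0, return_ratio=0.9, rounds=20)
    escalated = simulate_transfers(first_gift=10.0, return_ratio=1.1, rounds=20)
    print("shaving a little each round:", [round(t, 1) for t in shaved])
    print("adding a little each round: ", [round(t, 1) for t in escalated])
```

With binary choices (cooperate or defect), this slow slide cannot be expressed at all. That is the point of moving to graded functions: small amounts of cheating and small amounts of generosity both compound.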

Detailed modeling of cooperation (termed "escalation" of cooperation) under some key conditions. Top is repeated interactions without group selection. Next is group competition without repeated in-group interactions. Third is the joint combination. The legend at top right ranges from generous cooperation at top to selfishness at the bottom. This is modeled as money transfers between participants, which are tallied in the leftmost graphs, and fall to minimal levels over time in the top two scenarios.

But does this amount to group selection? These authors suggest that, as typically understood, group selection is not very strong and not strong enough to support the evolution of cooperation. Among humans, conflicting groups are genetically different to only infinitesimal degrees. Migration and intermarriage (forced and otherwise) are so frequent that it would be practically impossible to build selectable differences over the needed time scales. On the other hand, human societies exhibit cultural variation as well, and this kind of variation is more extensive and much more rapidly developing than genetic variation, creating differences between groups that can withstand moderate levels of migration and remain distinctive and selective. As cultural group selection, this is not the same as group selection in classic evolutionary theory, and indeed, it may be hard to relate this to evolutionary theory at all. But it certainly leads to differential survival and reproduction, whatever the genetic background to the cooperative, group feeling, and other traits that feed into the culture.

"We also show that combining the two mechanisms generates strong positive interactions. Positive interactions occur because intergroup competitions can stabilize ingroup cooperation against ambiguous reciprocity, and intergroup competitions often do this even when they do not support cooperation on their own. When the mechanisms interact, the result is the evolution of cooperative reciprocity with ingroup members, which amplifies cooperation within groups, and uncooperative reciprocity with outgroup members, which erodes cooperation between groups."

...

"Group competition can change the balance of forces by adding a mechanism that favours relatively cooperative groups. The higher payoffs associated with escalation can now dominate the fragility of escalation, with the final outcome a cooperative escalating equilibrium. When group competition shifts the balance in this way, the cooperative outcome does not require large differences between groups."


Humans and their culture are extremely complex, and this is hardly the last word on mechanisms of cooperation, which include surveillance, punishment, and much else. But at least this study can dispose of the simpler evolutionary explanations that are accessible to uncultured organisms, and explain why free cooperation among unrelated individuals is limited, out in nature, to behaviors with immediate paybacks like schooling, herding, flocking, and nutrient exchanges.


  • Market-origin theory for Covid gets more support.

Sunday, September 15, 2024

Road Rage Among the Polymerases

DNA polymerase is faster than RNA polymerase. RNA polymerase also leaves detritus in its wake. What happens when they collide?

DNA is a country road- one lane, two directions. Yet in our cells it can be extremely busy, with transcription (RNA synthesis) happening all the time, and innumerable proteins hanging on as signposts, chemical modifications, and even RNA hybridized into sections, creating separated DNA structures called R-loops. When it is time for DNA replication, what happens when all these things collide? One might think that biology had worked all this out by now, but these collisions can be quite dangerous, sending the RNA polymerase careering into the other (new) DNA strand, causing the DNA polymerase to stall or miss sections, and causing DNA breaks, which activate loud cellular alarm bells and mutations.

Despite decades of work, this area of biology is still not very well understood, since the conditions are difficult to reproduce and study. So I can only give a few hints of what is going on from current work in the field. A couple of decades ago, a classic experiment showed that in bacteria, DNA polymerases can be stopped cold by a collision with an RNA polymerase going in the opposite direction. However, this stall is alleviated by a DNA helicase enzyme, which can pry apart the DNA strands and anything attached, and the DNA replication complex sails through, after a pause of a couple of seconds. The RNA polymerase, meanwhile, is not thrown off completely, but switches its template from the complementary strand it was using previously to the newly synthesized DNA strand just made by the passing DNA polymerase. This was an amazing result, since the elongating RNA polymerase is a rather tightly attached complex. But here, it jumps ship to the new DNA strand, even though the old DNA strand remains present, and will shortly be replicated by the lagging strand DNA polymerase complex.

General schematic of encounters between replication forks and RNA polymerases (pink, RNAP). Only co-directional, not head-on, collisions are shown here. Ribosomes (yellow) in bacteria operate directly on the nascent mRNA, and can helpfully nudge the RNA polymerase along. In this scheme, DNA damage happens after the nascent RNA is used as a primer by a new DNA polymerase (bottom), which will require special repair. 

The ability of the RNA polymerase to switch template strands, along with the nascent RNA it was making, suggests very intriguing flexibility in the system. Indeed, DNA polymerases that come up from behind the RNA polymerase (using the same strand as their template) have a much easier time of it, passing with hardly a pause, and only temporarily displacing the RNA polymerase. But things are different when the RNA polymerase has just found an error and has back-tracked to fix it. Then, the DNA polymerase complex is seriously impeded. It may even use the nascent RNA hanging off the polymerase and hybridized to the local DNA as a primer to continue synthesis, after it has bumped off the RNA polymerase that made it. This leads in turn to difficulties in repair and double strand breaks in that DNA, which is the worst kind of mutation. 

The presence of RNA in the mix, in the form of single strands of RNA hybridized to one of the DNA strands, (that is, R-loops), turns out to be a serious problem. These can arise either from nascent transcription, as above, or from hybridization of non-coding RNAs that are increasingly recognized as significant gene regulators. RNA forms a slightly stronger hybrid with DNA than DNA itself does, in fact. Such R-loops (displacing one DNA strand) are quite common over active genomes, and apparently present a block to replication complexes. One would think that such fork complexes would be supplied with the kinds of helicases that could easily plow through such structures, but that is not quite the case. R-loops cause replication complex stalling, and can invoke DNA damage responses, for reasons that are not entirely clear yet. 

A recent paper that piqued my interest in all this studied an ATPase motor protein that occurs at stalled replication forks and helps them restart, presumably by acting as a DNA or RNA pump of some kind, and forcing the replication complex through obstructions. It is named WRNIP1, for WRN interacting protein, for it also interacts with Werner syndrome protein (WRN), another interesting protein at the replication fork. WRN is another ATPase that is a helicase and also a backwards 3' -> 5' exonuclease that cleans up DNA ends around DNA repair sites, helping to remove mismatched and damaged DNA so the repair can be as accurate as possible. As one can guess, mutations in the WRN gene cause Werner Syndrome, a striking progeria syndrome of early aging and susceptibility to cancer.

While the details of R-loop toxicity and repair are still being worked out, it is fascinating that such conflicts still exist after several billion years to figure them out. It is apparent that the design of DNA, while exceedingly elegant, results in intrinsic conflicts between expression and replication that are resolved amicably most of the time. But when either process gets overly congested, or encounters unexpected roadblocks, then tempers can flare, and an enormous apparatus of DNA damage signaling and repair is called in, sirens blaring, to do what it can to cut through the mess.


  • Who really believes in climate change?
  • The very strong people of the GOP. 
  • The ancient Easter Islanders mixed with South Americans.

Saturday, August 24, 2024

Aging and Death

Our fate was sealed a very long time ago.

Why do we die? It seems like a cruel and wasteful way to run a biosphere, not to mention a human life. After we have accumulated a lifetime of experience and knowledge, we age, decline, and sign off, whether to go to our just reward, or into oblivion. What is the biological rationale and defense for all this, which the biblical writers assigned to the fairy tale of the snake and the apple?

A recent paper ("A unified framework for evolutionary genetic and physiological theories of aging") discusses evolutionary theories of aging, but in typical French fashion, is both turgid and uninteresting. Aging is widely recognized as the consequence of natural selection, or more precisely, the lack thereof after organisms have finished reproducing. Thus we are at our prime in early adulthood, when we seek mates and raise young. Evolutionarily, it is all downhill from there. In professional sports, athletes are generally over the hill at 30, retiring around 35. Natural selection is increasingly irrelevant after we have done the essential tasks of life- surviving to mate and reproduce. We may participate in our communities, and do useful things, but from an evolutionary perspective, genetic problems at this phase of life have much less impact on reproductive success than those that hit earlier. 

All this is embodied in the "disposable soma" theory of aging, which is that our germ cells are the protected jewels of reproduction, while the rest of our bodies are, well, disposable, and thus experience all the indignities of age once their job of passing on the germ cells is done. The current authors try to push another "developmental" theory of aging, which posits that the tradeoffs between youth and age are not so much the resources or selective constraints focused on germ cell propagation vs the soma, but that developmental pathways are, by selection, optimized for the reproductive phase of life, and thus may be out of tune for later phases. Some pathways are over-functional, some under-functional for the aged body, and that imbalance is sadly uncorrected by evolution. Maybe I am not doing justice to these ideas, which maybe feed into therapeutic options against aging, but I find this distinction uncompelling, and won't discuss it further.

A series of unimpressive distinctions in the academic field studying aging from an evolutionary perspective.

Where did the soma arise? Single cell organisms are naturally unitary- the same cell that survives also mates and is the germ cell for the next generation. There are signs of aging in single cell organisms as well, however. In yeast, "mother" cells have a limited lifespan and ability to put out daughter buds. Even bacteria have "new" and "old" poles, the latter of which accumulate inclusion bodies of proteinaceous junk, which apparently doom the older cell to senescence and death. So all cells are faced with processes that fail over time, and the only sure bet is to start as a "fresh" cell, in some sense. Plants have taken a distinct path from animals, by having bodies and death, yes, but being able to generate germ cells from mature tissues instead of segregating them very early in development into stable and distinct gonads.

Multicellularity began innocently enough. Take slime molds, for example. They live as independent amoebae most of the time, but come together to put out spores, when they have used up the local food. They form a small slug-like body, which then grows a spore-bearing head. Some cells form the spores and get to reproduce, but most don't, being part of the body. The same thing happens with mushrooms, which leave a decaying mushroom body behind after releasing their spores. 

We don't shed a lot of tears for the mushrooms of the world, which represent the death-throes of their once-youthful mycelia. But that was the pattern set at the beginning- that bodies are cells differentiated from the germ cells, that provide some useful, competitive function, at the cost of being terminal, and not reproducing. Bodies are forms of both lost energy and material, and lost reproductive potential from all those extra cells. Who could have imagined that they would become so ornate as to totally overwhelm, in mass and complexity, the germ cells that are the point of the whole exercise? Who could have imagined that they would gain feelings, purposes, and memories, and rage against the fate that evolution had in store for them?

On a more mechanistic level, aging appears to arise from many defects. One is the accumulation of mutations, which in soma cells lead to defective proteins being made and defective regulation of cell processes. An extreme form is cancer, as is progeria. Bad proteins and other junk like odd chemicals and chemically modified cell components can accumulate, which is another cause of aging. Cataracts are one example, where the proteins in our lenses wear out from UV exposure. We have quite intricate trash disposal processes, but they can't keep up with everything, as we have learned from the advent of modern chemistry and its many toxins. Another cause is more programmatic: senescent cells, which are aged-out and have the virtue that they are blocked from dividing, but have the defect that they put out harmful signals to the immune system that promote inflammation, another general cause of aging.

Aging research has not found a single magic bullet, which makes sense from the evolutionary theory behind it. A few things may be fixable, but mostly the breakdowns were never meant to be remedied or fixed, nor can they be. In fact, our germ cells are not completely immune from aging either, as we learn from older fathers whose children have higher rates of autism. We as somatic bodies are as disposable as any form of packaging, getting those germ cells through a complicated, competitive world, and on to their destination.


Sunday, August 11, 2024

Modeling Cell Division

Is molecular biology ready to use modeling to inform experimental work?

The cell cycle is a holy grail of biology. The first mutants that dissected some of its regulatory apparatus, the CDC mutants of Saccharomyces cerevisiae (yeast), electrified the field and led to a Nobel prize. These were temperature sensitive mutants, making only small changes to the protein sequence that rendered that protein inactive at high temperature (thus inducing a cell cycle arrest phenotype), while allowing wild-type growth at normal temperatures. In the fifty years since, a great deal of the circuitry has been worked out, with the result that it is now possible, as a recent paper describes, to make a detailed mathematical model of the process that claims to be useful in the sense of explaining existing findings in a unified model and making predictions of places to look for additional actors.

At the center of this regulatory scheme are transcription activators, SBF/MBF, that are partly controlled by, and in turn control the synthesis of, a series of cyclins. Cyclins are proteins that were observed (another Nobel prize) to have striking variations in abundance during the cell cycle. There are characteristic cyclins for each phase of the cell cycle, which goes from G1, a resting phase, to S, which is DNA replication, to G2, a second resting phase, and then M, which is mitosis, which brings us back to G1. Cyclins work by binding to a central protein kinase, Cdc28, which, as regulated by each distinct cyclin, phosphorylates and thus regulates distinct sets of target proteins. The key decision a cell has to make is whether to commit to DNA replication, i.e. S phase. No cell wants to run out of energy during this process, so its size and metabolic state need to be carefully monitored. That is done by Cyclin 3 (Cln3), Whi5, and Bck2, which each influence whether the SBF/MBF regulators are active.

Some highly simplified elements of the yeast cell cycle. Cyclins (Cln and Clb) are regulators of a central protein kinase, Cdc28, that direct it to regulate appropriate targets at each stage of the cell cycle. Cyclins themselves are regulated by transcriptional control (here, the activators SBF and MBF), and then destroyed at appropriate times by proteolysis, rendering them abundant only at specific times during the cell cycle. Focusing on the "START" process that starts the process from rest (G1 phase) to new bud formation and DNA replication (S phase), Cln3 and Bck2 respond to upstream nutritional and size cues, and each activate the SBF/MBF transcription activator.

As outlined in the figure above, Cyclin 3 is the G1 cyclin, which, in complex with Cdc28, phosphorylates Whi5, turning it off. Whi5 is an inhibitor that binds to SBF/MBF, so the Cyclin 3 activation turns these regulators on, and thus starts off the cell cycle under the proper conditions. Incidentally, the mammalian version of Whi5, Rb (for retinoblastoma), is a notorious tumor suppressor, that, when mutated, releases cells from regulatory control over cell division. SBF and MBF bind to genes for the next series of cyclins, Cln1, Cln2, Clb5, Clb6. The first two are further G1 cyclins that orchestrate the end of G1. They induce phosphorylation and inactivation of Sic1 and Cdc6, which are inhibitors of Clb5 and Clb6. These latter two are then the initiators of S phase and DNA replication. Meanwhile, Cln3 stays around till M phase, but is then degraded in definitive fashion by the proteases that end M phase. Starvation conditions lead to rapid degradation of Cln3 at all times, and thus to no chance of starting a new cell cycle.

Charts of the abundance of some cyclins through the cell cycle. Each one has its time to shine, after which it is ubiquitinated and sent off to the recycling center / proteasome.

Bck2 is another activator of SBF/MBF that is unrelated to the Cln3/Whi5 system, but also integrates cell size and metabolic status information. Null mutants of Cln3 (or Bck2) are viable, if altered in cell cycle, while double null mutants of Cln3 and Bck2 are dead, indicating that these regulators are each important, in a complementary way, in cell cycle control. Given that little is known about Bck2, the modelers in this paper assume various properties and hope for the best down the line, predicting that cell size (at the key transition to S phase) is more affected in the Cln3 null mutant than in the Bck2 null mutant, since in the former, excess active Whi5 soaks up most of the available SBF/MBF, requiring extra-high and active levels of Bck2 to overcome this barrier and activate the G1 cyclins and other genes.
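The logic of those mutants can be put into a few lines of toy code. This is only a sketch under loud assumptions, not the paper's model. It supposes that each activator's push on SBF/MBF grows in proportion to cell size, that the two pushes simply add, and that START fires once a threshold is crossed. The numbers in `GAIN` and `START_THRESHOLD` are invented, with Cln3 made the stronger of the two so as to mirror the prediction just described.

```python
import math

# Hypothetical strengths (arbitrary units): how strongly each activator,
# per unit of cell size, pushes SBF/MBF toward the "on" state.
GAIN = {"Cln3": 0.6, "Bck2": 0.4}
START_THRESHOLD = 1.0  # invented level of SBF/MBF activity needed to commit to S phase

def size_at_start(deleted=()):
    """Cell size at which the summed activator drive reaches the threshold.
    Returns math.inf when no activator is left (the cell never passes START)."""
    drive_per_size = sum(gain for name, gain in GAIN.items() if name not in deleted)
    if drive_per_size == 0:
        return math.inf
    return START_THRESHOLD / drive_per_size

for label, deleted in [("wild type", ()),
                       ("cln3 null", ("Cln3",)),
                       ("bck2 null", ("Bck2",)),
                       ("double null", ("Cln3", "Bck2"))]:
    print(f"{label:12s} size at START = {size_at_start(deleted):.2f}")
```

Either single mutant still reaches START, just at a larger size, while the double mutant never does. Note that the ranking of the two single mutants is put in by hand through the choice of gains, which is rather the point made below about what such models can and cannot tell us.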

The modelers are working from the accumulated, mostly genetic data, and in turn validate their models against the same genetic data, plus a few extra mutants they or others have made. The models are mathematical representations of how each node (i.e. protein, or gene) in the system responds to the others, but since there are a multitude of unknowns (such as what really regulates Bck2 from upstream, to cite just one example), the system is not really able to make predictions, but rather fine-tunes/reconciles what knowledge there is, and, at best, points to gaps in knowledge. It is a bit like AI, which magically recombines and regurgitates material from a vast corpus based on piece-wise cues, but is not going to find new data, other than through its notorious hallucinations.

For example, a new paper came out after this modeling, which finds that Cln3 affects Cln2 abundance by mechanisms quite apart from its SBF/MBF transcriptional control, and that it regulates cell size in large part at M phase, not through its G1/S gating. All this comes from new experimental work, unanticipated by the modeling. So, in the end, experimental work always trumps modeling, which is a bit different than how things are in, say, physics, where sometimes the modeling can be so strong that it predicts new particles, forces, and other phenomena, to be validated later experimentally. Biology may have its master predictive model in the theory of evolution, but genetics and molecular biology remain much more of an empirical slog through the resulting glorious mess.


  • Bitcoin isn't a currency, but rather just another asset class, one without any fundamental or socially positive value. A little like gold, actually, except without gold's resilience against social / technological disruption.
  • The disastrous post-Soviet economic transition, on our advice.
  • The enormous labor drain, and resource drain, from global South to North.

Saturday, July 27, 2024

Putting Body Parts in Their Places

How HOX genes run development, on butterfly wings.

I have written about the HOX complex of genes several times, because they constitute a grail of developmental genetics- genes that specify the identity of body parts. They occupy the middle of a body plan cascade of gene regulation, downstream from broader specifiers for anterior/posterior orientation, regional and segment specification, and in turn upstream of many more genes that specify the details of organ and tissue construction. Each of the HOX genes encodes a transcriptional regulator, and the name of one says it all- antennapedia. In fruit flies, where all this was first discovered, loss of antennapedia converts some legs into antennae, and extra expression of antennapedia converts antennae on the head into legs.

The HOX complex (named for the homeobox DNA binding motif of the proteins they encode) is linear, arranged from head-affecting genes (labial, proboscipedia) to abdomen-affecting genes (abdominal A, abdominal B; evidently the geneticist's flair for naming ran out by this point). This arrangement is almost universally conserved, and turns out to reflect molecular mechanisms operating on the complex. That is, it "opens" in a progressive manner during development, on the chromosome. Repression of chromatin is a very common and sturdy way to turn genes off, and tends to affect nearby genes, in a spreading effect. So it turns out to be easy, in some sense, to set up the HOX complex to have this chromatin repression lifted in a segmental fashion, by upstream regulators, whereby only the head sections are allowed to be expressed in head tissues, but all the genes are allowed to be expressed in the final abdominal segment. That is why the unexpected expression of antennapedia, which is the fifth of eight HOX genes, in the head, leads to a thoracic tissue (legs) forming on the head.
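The progressive opening can be sketched in a few lines. This is an illustration under stated assumptions, not a model from any paper. The gene list is the standard fruit fly order, but the choice of which gene marks the end of the open region in head tissue is hypothetical, and the rule that the most posterior expressed gene sets the identity of the tissue is a simplifying assumption.

```python
# The eight fruit fly HOX genes in chromosomal order, head end first.
HOX_ORDER = ["labial", "proboscipedia", "Deformed", "Sex combs reduced",
             "Antennapedia", "Ultrabithorax", "abdominal A", "abdominal B"]

def open_genes(last_open):
    """Genes whose chromatin repression has been lifted, assuming the complex
    opens progressively from the head end up to and including `last_open`."""
    return HOX_ORDER[:HOX_ORDER.index(last_open) + 1]

def identity(expressed):
    """Assumed rule: the most posterior expressed gene sets the identity."""
    return max(expressed, key=HOX_ORDER.index)

head = open_genes("Deformed")         # hypothetical extent of opening in a head tissue
abdomen = open_genes("abdominal B")   # final abdominal segment: everything open

print(identity(head))                     # normal head tissue
print(identity(head + ["Antennapedia"]))  # head tissue with unexpected Antennapedia
print(len(abdomen))                       # all eight genes available
```

Under these assumptions, adding Antennapedia to a head tissue shifts its identity to a thoracic one, which is the leg-on-the-head transformation in miniature.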

A recent paper delved a little more deeply into this story, using butterflies, which have a normal linearly conserved HOX cluster and are easy to diagnose for certain body part transformations (called homeotic) on their beautiful wings. The main thing these researchers were interested in is the genetic elements that separate one part of the HOX cluster from other parts. These are boundary or "insulator" elements that separate topologically associated domains (called TADs). Each HOX gene is surrounded by various regulatory enhancer and inhibitor sites in the DNA that are bound by regulatory proteins. And it is imperative that these sites be directed only to the intended gene, not neighboring genes. That is why such TADs exist, to isolate the regulation of genes from others nearby. There are now a variety of methods to map such TADs, by looking at where chromatin (histones) is open or closed, or where DNA can be cut by enzymes in the native chromatin, or where crosslinks can be formed between DNA molecules, and others.

The question posed here was whether a boundary element, if deleted, would cause a homeotic transformation in the butterflies they were studying. They found, unfortunately, that it was impossible to generate whole animals with the deletions and other mutations they were engineering, so they settled for injecting the CRISPR mutational molecules into larval tissues and watching how they affected the adults in mosaic form, with some mutant tissues, some wild-type. The boundary they focused on was between antennapedia (Antp) and ultrabithorax (Ubx), and the tissues were the forewings, where Ubx is normally off, and hindwings, where Ubx is normally on. Using methods to look at the open state of chromatin, they found that the Ubx gene is dramatically opened in hindwings, relative to forewings. Nevertheless, the boundary remains in place throughout, showing that there is a pretty strong isolation from Antp to Ubx, though they are next door and a couple hundred thousand basepairs apart. Which in genomic terms is not terribly far, while it leaves plenty of space for enhancers, promoters, introns, boundary elements, and other regulatory paraphernalia.

Analysis of the site-to-site chromosomal closeness and accessibility across the HOX locus of the butterfly Junonia coenia. The genetic loci are noted at the bottom, and the site-to-site hit rates are noted in the top panels, with blue for low rates of contact, and orange/red for high rates of contact. At top is the forewing, and at bottom is the hindwing, where Ubx is expressed, thus the high open-ness and intra-site contact within its topological domain (TAD). Yet the boundary between Ubx and Antp to its left (dotted lines at bottom) remains very strong in both tissues. In green is a measure of transcription from this DNA, in differential terms hindwing minus forewing, showing the strong repression of Ubx in the forewing, top panel.

The researchers naturally wanted to mutate the boundary element, (Antp-Ubx_BE), which they deduced lay at a set of binding sites (featuring CCCTC) for the protein CTCF, a well-known insulating boundary regulator. Note, interestingly, that in the image above, the last exon (blue) of Ubx (transcription goes right to left) lies across the boundary element, and in the topological domain of the Antp gene. This means that while all the regulatory apparatus of Ubx is located in its own domain, on the right side, it is OK for transcription to leak across- that has no regulatory implications. 

Effects of removing the boundary element between Ubx and Antp. Detailed description is in the text below. 

Removal of this boundary element, using CRISPR technology in portions of the larval tissues, had the expected partial effects on the larval, and later adult, wings of this butterfly. First, note that in panel D insets, the wild type larval forewing shows no expression of Ubx (green), while the wild type hindwing shows wide-spread expression. This is the core role of the HOX locus and the Ubx gene- locate its expression in the correct body parts to then induce the correct tissues to develop. The larval wing tissue of the mosaic mutant, also in D, shows, in the forewing, extensive patchy expression of Ubx. This is then reflected in the adult (different animals) in the upper panels, in the mangled eyespot of the fully formed wing (center panel, compared to wild-type forewing and hindwing to each side). It is a small effect, but then these are small mutations, done in only a fraction of the larval cells, as well.

So here we are, getting into the nuts and bolts of how body parts are positioned and encoded. There are large regions around these genes devoted to regulatory affairs, including the management of chromatin repression, the insulation of one region from another, the enhancer and repressor sites that integrate myriad upstream signals (i.e. other DNA binding proteins) to come up with the detailed pattern of expression of these HOX genes. Which in turn control hundreds of other genes to execute the genetic program. This program can hardly be thought of as a blueprint, nor a "design" in anyone's eye, divine or otherwise. It resembles much more a vast pile of computer code that has accreted over time with occasional additions of subroutines, hacks, duplicated bits, and accidental losses, adding up to a method for making a body that is robust in some respects to the slings and arrows of fortune, but naturally not to mutations in its own code.


Saturday, July 13, 2024

The Long Tail of Genome Duplication

A new genomic sequence of hagfish tells us a little about our origins.

Hagfish- not a fish, and not very pretty, but it occupies a special place in evolution, as a vertebrate that diverged very early (along with lampreys, forming the cyclostome branch) from the rest of the jawed vertebrates (the gnathostome branch). The lamprey has been central to studies of the blood clotting system, which is a classic story of gradual elaboration over time, with more steps added to the cascade, enabling faster clotting and finer regulation.

A highly schematic portrayal (not to scale!) of the evolutionary history of animal life on earth.

A recent paper reported a full genome sequence of hagfish, and came up with some interesting observations about the history of vertebrate genomes. At about three billion nucleotides, this genome is about as large as ours. (Yet again, size doesn't seem to matter much, when it comes to genomes.) They confirm that lampreys and hagfish make up a single lineage, separate from all other animals and especially from the jawed vertebrates. For example, though lampreys have 84 chromosomes to the hagfish's 17, this resulted from repeated splitting of chromosomes, and each lamprey chromosome can be mostly mapped to one hagfish chromosome, accepting that a lot of other gene movement and change has taken place in the roughly 460 million years since these lineages diverged.

Hagfish (bottom) and lamprey (top) chromosomes pretty much line up, indicating that despite the splitting of the lamprey genome, there hasn't been a great deal of shuffling over the intervening 460 million years.

The most important parts of this paper are on the history of genome duplications that happened during this early phase of vertebrate evolution. Whole genome duplications are an extremely powerful engine of change, supplying the organism with huge amounts of new genetic material. Over time, most of the duplicated genes are discarded again (in a process they call re-diploidization). But many are not, if they have gained some foothold in providing more of an important product, or differentiated themselves from each other in some other way. Our genomes are full of families, some extremely large, of related genes that have finely differentiated functions. Many of these copies originated in long-ago genome duplications, while others originated in smaller duplication accidents. It is startling to hear from self-labeled scientists in the so-called intelligent design movement that there is some rule or law against such copying of information, by their ridiculous theories of specified information. Hagfish certainly never heard of such a thing.

At any rate, these researchers confirm that the earliest vertebrate lineage, around 530 million years ago, experienced two genome duplications which led to a large increment of new genes and evolutionary innovation. What they find now is that the cyclostome lineage experienced another, three-fold genome duplication near its origin, about 460 million years ago, leading to another round of copies and innovation. And lastly, the gnathostome lineage separately experienced its own four-fold genome duplication around the same time, after it had diverged from the cyclostome lineage. One might say that the gnathostomes made better use of their genomic manna, generating jaws, teeth, ears, thymus, better immune systems, and the other features that led them to win the race of the animal kingdom. But hagfish are still around, showing that primitive forms can find a place in the scheme of things, as the biosphere gets larger and more diverse over time.

A classic example of gene duplication is the Hox cluster, a set of genes that have the power of dictating what body part occurs where. They are gene regulators that function in the middle of the developmental sequence, after determination of the overall body axis and segmentation, and they in turn regulate downstream genes governing features as they occur in different segments, such as limbs, parts of the head, fingers, etc. Flies have one Hox cluster, split into two parts. The extremely primitive chordate amphioxus, which far predates the cyclostomes, also has one complete Hox cluster, as diagrammed below. Most other vertebrates, including us, have four Hox clusters, amounting to over thirty of these transcription regulators. These four clusters arose from the inferred genome duplications very early in the vertebrate lineage, prior to the advent of the cyclostomes.

Hox clusters and their origins, as inferred by the current authors. The red/blue points at the left mark whole genome duplications (or more) that have been inferred by these or other authors. More description is in the main text below.

The inferred genome duplications during early chordate evolution, noted on the far left of the diagram above, led to duplicated clusters of Hox genes. Amphioxus (top) is the earliest branching chordate, and has only one full Hox cluster of transcription regulators. In general terms, these regulators control the expression of body parts along the body axis during development, with the order of genes in the cluster paralleling their expression and action along that axis. Chicken, as a gnathostome, has four copies of the cluster, with a few of the component genes lost over time. Hagfish have six copies of this Hox cluster, some rather skeletal, stemming from their genome duplication events. Clearly several whole clusters have also been lost, as in some cases the genome duplications experienced by the cyclostomes resolved back to diploidy without leaving an extra copy of this cluster. The net effect is to allow all these organisms greater options for controlling the identity and form of different parts of the body, particularly, in the case of gnathostomes, the head.

Genome duplications are one of those fast events in evolution that are highly influential, unlike the usual slow and steady selection and optimization that is the rule in the Darwinian theory. Unlike mass extinction, another kind of fast event in evolution, genome duplications are highly constructive, providing fodder on a mass (if microscopic) scale for new functions and specializations that help account for some of the more rapid events in the history of life, such as the rise of chordates and then vertebrates in the wake of the Cambrian explosion.