Showing posts with label evolution. Show all posts

Saturday, April 27, 2024

Ruffling the Feathers of Dinosaurs

The origin of birds remains uncertain, as does the status of feathers on dinosaurs. Review of "Riddle of the Feathered Dragons", by Alan Feduccia. 

As regular readers can surmise, I was raised (scientifically) in an empirical, experimental tradition- that of molecular biology. In that field there is little drama, since any dispute can be taken back to the lab for adjudication. No titanic battles of conflicting interpretations happen, and extremely high standards pervade the field, since any lapse is easily discovered and replicated. Despite the dominant position of molecular biology in the major journals, due to its high productivity, it is thus rarely in the public spotlight. It has been a bit of a culture shock to realize that other areas of science have significantly different standards and epistemology. Many fields (such as astronomy or paleontology) are at heart observational rather than experimental, or have other restrictions or conflicts (medical science, nutrition studies) that impair their ability to find the truth, leading to a great deal more interpretation, drama, and sometimes, rampant speculation.

Paleontology, and the study of the past in general, has an intrinsic lack of data. If the fossils are missing, what can one do but wonder and speculate about what could have happened during that gap? And when fossils do turn up, they still lack a lot of information about their unfortunate contributor- they are only the bare bones, after all. They may be in bad condition and particularly hard to interpret. Whole genera or above may be represented by a tooth or single bone. Millions of years may go by with nothing to show for it. No wonder speculation fills the gap- but is that science? Incidentally, I have to thank the Discovery Institute, with its keen nose for scientific controversy, for pointing me to today's author, who disputes the now-conventional view that birds arose from dinosaurs. While Alan Feduccia has nothing to do with Creationism and its offshoots, and is a perfectly respectable paleontologist, he is, in the course of at least four books on the subject (of which this is the third), clearly frustrated with the reigning interpretations of his field, which has jumped to what he regards as unwarranted conclusions that have led to a flurry of portrayals of feathered dinosaurs.

The Archaeopteryx fossil, Berlin specimen, which dates to roughly 150 million years ago.


The first bird, more or less, was Archaeopteryx. Its Berlin fossil, found about 1875, and dating to roughly 150 million years ago (mya) is perhaps the most beautiful, and informative, fossil ever found- a complete bird, with fully spread flight feathers on its arms and legs, a long tail, claws on its hands, and teeth in its mouth. It had precisely the in-between characteristics of both reptiles and birds that gave immediate validation to Charles Darwin's theory of evolution by gradual change and natural selection. But where did it come from? That is the big question. While a great many other fossil birds have been found, none substantially predate Archaeopteryx (other than perhaps Anchiornis, very similar to Archaeopteryx, and dating to roughly 160 mya), and thus we really do not know (as yet) from fossils how birds originated.

Feathers are one diagnostic feature in this lineage. Archaeopteryx and many later birds found in China and Mongolia from the Cretaceous have feathers, clearly marking them as birds and as lineally related. But other allied fossils have been found, nominally classified as dinosaurs, which are described as having feathers as well. One is shown below.

Fossil of Sinosauropteryx, which dates roughly to 124 million years ago.

Closeup of the hair-like impressions on the tail of Sinosauropteryx.

Whether these structures are feathers, or related to them, is quite debatable. They look more like hairs, and Feduccia claims that they do not even occur outside the body wall. Some experimentation by others has shown that sub-dermal collagen can form this kind of hair-like fuzz during some forms of decay and fossilization, given proper squashing. Current conventional wisdom, however, describes them as filament-like feathers used for insulation or display. My take, looking closely at these pictures, is that they are not feathers, but are outside the body wall, which, on the tail certainly, would have been very close to the bone. Additionally, as these specimens are all rather late, they could easily be descended from birds, while being large and flightless. Feduccia points out that while land-based animals have never gained/regained flight, flightlessness has evolved many times through the bird lineages. Similarly, extensive lineages of secondarily flightless birds may have developed in the Mesozoic, that conventional paleontologists call dinosaurs, (often with feathers), and posit as evidence that the reverse happened- that birds evolved from dinosaurs. For example, the conventional view of dinosaurs and birds draws on many later fossils from the Cretaceous (such as Deinonychus), which had both bird-like and dinosaur-like features: the "raptors". 

Another character at issue is flight itself. If birds are basal- that is, they arose prior to or separately from the other dinosaurs- then they could easily have developed from small arboreal lizards that learned to glide from place to place. On the other hand, dinosaurs are all relatively large and bipedal. So conventional paleontologists have labored to come up with ways that flight could have developed "from the ground up". Such theories as insect trapping by nascent small wings, or occasional tree climbing with tiny wings, to escape predators, have been invoked as rationales for feathers and wings to develop in terrestrial bipedal dinosaurs. Feduccia counters that in the whole history of flight, all animals (birds, bats, squirrels, others) have developed flight from gliding, not from the ground up. Indeed, there are countless flightless birds, and none of them have resumed flight, despite presumably having much of the genetic wherewithal to do so.

Given patchy data, the leading method to make sense of it and organize organisms from the fossil record into a phylogenetic story is the cladistic method. Practitioners choose a wide range of "characters" (such as the lengths, angles, holes, and other morphologies in the available bones) and tabulate their values from all the proposed species. Then they can mathematically just total up who is more distant from whom. Feduccia emphasizes that this is an excellent method for ordering closely related genera and species. But over the long run, evolution repeats itself a lot, making numerous flightless birds, for example, or similarly shaped swimming animals, not all of which are as closely related as they might look morphologically. Cladistics is a classic case of garbage-in-garbage-out analysis, and has routinely been overturned by molecular evidence when, among extant species, genomic data is available. Sadly, genomic data is not available for the fossils from the Mesozoic (the age of the dinosaurs, which encompasses, in order, the Triassic (245 mya to 208 mya), the Jurassic (208 to 144 mya) and the Cretaceous (144 to 65 mya) periods), nor from any living descendants of the dinosaurs... other than their putative descendants, birds.
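To make the "total up who is more distant from whom" step concrete, here is a minimal sketch in Python. The taxa, characters, and scores are all invented for illustration, and the tally is a plain count of differing characters. Real cladistic analyses typically use parsimony over shared derived characters rather than a raw distance, so treat this only as a toy version of the bookkeeping.

```python
# Toy character matrix: each taxon is scored (1 = present, 0 = absent) for the
# same list of characters. Taxa, characters, and scores are hypothetical.
characters = ["feathers", "teeth", "long_bony_tail", "three_fingered_hand"]
matrix = {
    "taxon_A": [1, 1, 1, 1],
    "taxon_B": [1, 0, 0, 1],
    "taxon_C": [0, 1, 1, 0],
}


def character_distance(a, b):
    """Count the characters on which two taxa differ (a Hamming distance)."""
    return sum(1 for x, y in zip(a, b) if x != y)


def pairwise_distances(matrix):
    """Return the character distance for every unordered pair of taxa."""
    names = sorted(matrix)
    return {
        (p, q): character_distance(matrix[p], matrix[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }


distances = pairwise_distances(matrix)
for pair, d in sorted(distances.items()):
    print(pair, d)
```

The garbage-in-garbage-out worry is visible even here: if "feathers" in taxon_A and taxon_B arose independently by convergence, the tally still counts it as a shared trait and pulls the two together.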

As an aside, molecular phylogenies are also at heart cladistic in their theory and method. They just have a lot more "characters"- i.e. the letters of the DNA sequences in homologous / aligned sequences. But even more importantly, since a large proportion of these characters are neutral, (to natural selection), and thus vary (in sort-of clock-like fashion) no matter what convergent evolution might happen morphologically, molecular phylogenies can easily resolve difficult questions of phylogeny on the short to medium geologic terms. When it comes to the deepest phylogenies, however, going over a billion years, neutral characters become wholly useless due to homogenization by the vast times that have passed, so for such time periods these methods become less incisive.

Crude cladogram illustrating the alternative hypotheses- that birds are descended from theropod dinosaurs, or that birds arise from a basal lineage of their own, directly from the common stem of archosaurs. In the latter hypothesis, numerous bird-like lineages currently construed as dinosaurs might be secondarily flightless birds.

It is cladistics (along with other evidence) that has enshrined birds within the dinosaur lineage, finding that theropods came first, and the avians came later on. (With a contrasting view, and a critique of the contrasting view.) Theropods and birds are certainly similar, compared to their crocodilian / archosaur antecedents. They are bipedal, with similar hip structures, neck structures, and hands/feet reduced from five to three toes. But if much of what we take to be the dinosaurs (those with feathers and the whole so-called "raptor" class) are actually secondarily flightless birds, then one can make a lot of sense of some of these similarities, while casting the origin of birds quite a bit farther back in time, more or less co-incident with the origin of true dinosaurs, as in the diagram above.

The problem with all this is again time. The early Jurassic and Triassic, amounting to almost one hundred million years before Archaeopteryx, provide a lot of evidence for dinosaurs. They first appear roughly 240 mya, and flourish after the major extinction event that ended the Triassic, at 201 mya. The stark lack of evidence for birds, and widespread evidence for dinosaurs, including the lineage (theropods) that is most closely related to birds, suggests strongly that birds did not originate back in the Triassic, in parallel with the core dinosaur lineages. It suggests, rather, that among the many theropod dinosaurs during the ten or twenty million years before Archaeopteryx were some small enough to take to the trees, grow longer arms, and be in position for flight. There were doubtless plenty of insects up there, at least until just about this time of the late-Jurassic, when birds started to eat them! Fossil record gaps are treacherous things, but this one indicates strongly that birds evolved in the middle Jurassic, along with (and within) the wider adaptive radiation of dinosaurs.

"Yet Archaeopteryx is still the classic urvogel- the oldest well-studied bird yet discovered, perhaps some 25 or more million years older than most of the Early Cretaceous Chinese fossils. As we saw in chapter 3, the Solnhofen urvogel is a mosaic of reptilian and avian features, a true bird, and the more it is studied, the more and more birdlike it is revealed to be. Ignoring the element of geologic time, however, many paleontologists have proposed that the Liaoning fossils provide evidence for all the stages of the evolution not only of birds and bird flight but also of feathers, from fiberlike protofeathers to pennaceous, asymmetrical flight remiges. Such a claim is remarkable and would be astounding in any fauna, but is especially so for a fauna so temporally removed from the time of avian origins, presumably before the Middle Jurassic and perhaps well back into the Triassic. 

University of Pennsylvania paleontologist Peter Dodson, remarking on the inadequacies of cladistic methodology, tells us: 'To maintain that the problem of the origin of birds has been solved when the fossil record of the Middle or Late Jurassic bird ancestors is nearly a complete blank is completely absurd. The contemporary obsession with readily available computer-assisted algorithms that yield seemingly precise results that obviate the need for clear-headed analysis diverts attention away from the effort that is needed to discover the very fossils that may be the true ancestors of birds. When such fossils are found, will cladistics be able to recognize them? Probably not.'"

Feduccia makes a lot of insightful points and hits some sensitive marks, in addition to all the trash-talk. Cladistics has problems, hairs are not feathers, and Cretaceous birds don't tell us much about the evolution of bird flight, which doubtless began as gliding between trees tens of millions of years earlier. And he is right that the hunt for clear antecedents of Archaeopteryx, whether far in the past or near, should be the focus of this field. But overall, it is hard to fully credit the "birds early" story.


Sunday, March 31, 2024

Nominee for Most Amazing Protein: RAD51

On the repair and resurrection of DNA, which gets a lot of help from a family of proteins including RAD51, DMC1, and RecA.

Proteins do all sorts of amazing things, from composing pores that can select a single kind of ion- even just a proton- to allow across a membrane, to massive polymerizing enzymes that synthesize other proteins, DNA, and RNA. There is really no end to it. But one of the most amazing, even incredible, things that happens in a cell is the hunt for DNA homology. Even over a genome of billions of base pairs, it is possible for one DNA segment to find the single other DNA segment that matches it. This hunt is executed for several reasons. One is to line up the homologous chromosomes at meiosis, and carry out the genetic cross-overs between them (when they are lined up precisely) that help scramble our genetic lineages for optimal mix-and-matching during reproduction. Another is for DNA repair, which is best done with a good copy for reference, especially when a full double-strand break has happened. Just this week, a fascinating article showed that memories in our brains depend in some weird way on DNA breaks occurring in neurons, some of which then use the homologous repair process, including homology search, to patch things up.

The protein that facilitates this DNA homology search is deeply conserved in evolution. It is called RecA in bacteria, radA and radB in archaea, and the RAD51 family in eukaryotes. Naturally, the eukaryotic family is most closely related to the archaeal versions (RAD51 and DMC1 evolving from radA, and a series of other, and poorly understood family members, from radB). In this post, I will mostly just call them all RAD51, unless I am referring to DMC1 specifically. The name comes from genetic screens for radiation-sensitive mutants in human and other eukaryotes, since RAD51 plays a crucial role in DNA repair, as noted above. RAD51 is not a huge protein, but it is an ATPase. It binds to itself, forming linear filaments with ATP at the junction points between units. It binds to a single strand of DNA, which is going to be what does the hunting. And it binds, in a complicated way, to another double-stranded DNA, which it helps to open briefly to allow its quality as a target to be evaluated. 

This diagram describes the repair of double strand breaks (DSB) in DNA. First the ends are covered with a bunch of proteins that signal far and wide that something terrible has happened- the cell cycle has to stop, and fire engines need to be called. One of these proteins is RPA, which simply binds all over single-stranded DNA and protects it. Then the RAD51 protein comes in, displaces RPA, and begins the homology search process. The second DNA shown, in dark black, doesn't just happen to be there, but is hunted for high and low throughout the nucleus to find the exact homolog of the broken end. When that exact match is found, the repair process can proceed, with continued DNA synthesis through the lesion, and resolution of the newly repaired double strands, either to copy up the homolog version, or exchange versions (GC, for gene conversion).

This diagram shows how the notorious (when mutated) tumor suppressor gene BRCA2 (in green) works. It binds RAD51 (in blue) and brings it, chain-gang style, to the breakpoints of DNA damage to speed up and specify repair.


There have been several structural studies by this point that clarify how RAD51 does its thing. ATP is simply required to form filaments on single-stranded DNA. When a match has been found and RAD51 is no longer needed, ATP is cleaved, and RAD51 falls off, back to reserve status. The magic starts with how RAD51 binds the single-stranded DNA. One RAD51 binds for every ~3 bases in the DNA, and it binds the phosphate backbone, so that the bases are nicely exposed in front, and all stretched out, ready to hunt for matching DNA.

A series of RAD51 molecules (in this case, RecA from bacteria) bound sequentially to single-stranded DNA (red). Note the ATP homolog chemicals in yellow, positioned between each protein unit. One can see that the DNA is stretched out a bit and the bases point outwards.

A closeup view of one of the RAD51 units from above, showing how the bases of the DNA (yellow) are splayed out into the medium, ready to find their partners. They are arranged in orientations similar to how they sit in normal (B-form) DNA, further enhancing their ability to find partners.

The second, and more mysterious, part of the operation is how RAD51 scans double-stranded DNA throughout the genome. It has binding sites for double-stranded DNA, away from the single-stranded DNA, and then it also has a little finger that splits open the double-stranded DNA, encouraging separation and allowing one strand to face up to the single-stranded DNA that is held firmly by the RAD51 polymer. The transient search happens in eight-base increments, with tighter capture of the double-strand DNA happening when nine bases are matched, and commitment to recombination or repair happening when a match of fifteen bases is found.
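The staged thresholds above (eight, nine, and fifteen bases) can be illustrated with a toy scan in Python. Only the three numbers come from the description above. The sequences, the function names, the stage labels, and the rule of counting a contiguous matching run from the start of the probe are assumptions made for illustration. For simplicity the probe is compared to the same-sense strand of the target instead of modeling base pairing, and every offset is checked in order, where the real search is diffusive and stochastic.

```python
def matched_run(probe, target, offset):
    """Length of the contiguous match between probe and target, starting at offset."""
    run = 0
    for p, t in zip(probe, target[offset:]):
        if p != t:
            break
        run += 1
    return run


def classify(run):
    """Map a match length onto the stages described in the text (labels are hypothetical)."""
    if run >= 15:
        return "committed"
    if run >= 9:
        return "captured"
    if run >= 8:
        return "sampled"
    return "rejected"


def search(probe, target):
    """Scan every offset and report the first longest match and its stage."""
    best_offset, best_run = None, 0
    for offset in range(len(target)):
        run = matched_run(probe, target, offset)
        if run > best_run:
            best_offset, best_run = offset, run
    return best_offset, best_run, classify(best_run)


probe = "GATTACAGGCTTACCGA"  # invented 17-base single strand
# Target holds a 9-base decoy (captured, but never committed), then the true homolog.
target = "CCC" + "GATTACAGG" + "TTTT" + probe + "AAAA"

print(matched_run(probe, target, 3), classify(matched_run(probe, target, 3)))
print(search(probe, target))
```

The decoy shows why a two-step threshold is useful: a nine-base match is common enough by chance in a large genome that it can only justify a closer look, not a commitment.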

These structures show an intermediate where a double-stranded DNA (ends in teal and lavender, and separated DNA segments in green and red) has been captured, making a twelve base match with the stable single-stranded DNA (brown). Note how the double-stranded DNA ends are held by outside portions of the RAD51 protein. Closeup on the right shows the dangling, non-paired DNA strand in red, and the newly matched duplex DNA with green-brown colored base interactions.

These structures can only give a hint of what is going on, since the whole process relies so clearly on the Brownian motion that allows super-rapid diffusion of the stabilized single-strand DNA+RAD51 over the genome, which it scans efficiently in one-dimensional fashion, despite all the chromatin and other proteins parked all over the place. And while the structures provide insight into how the process happens, it remains incredible that this search can happen, on what is clearly a quite reliable basis, day in and day out, as our genomes get hit by whatever the environment throws at us.

"Unfortunately, most RAD51 and RAD51 paralog point mutations that have been clinically identified are classified as variants of unknown significance (VUSs). Future studies to reclassify these RAD51 gene family VUSs as pathogenic or benign are desperately needed, as many of these genes are now included on hereditary breast and ovarian cancer screening panels. Reclassification of HR-deficient VUSs would enable these patients to benefit from therapies that specifically target HR deficiency, as do poly(ADP)-ribose polymerase (PARP) inhibitors in BRCA1/2-deficient cells."

Lastly, one paper made the point that clinicians need better understanding of the various mutations that can affect RAD51 itself. Genetic testing now is able to find all of our mutations, but we don't always know what each mutation is capable of doing. Thus deeper studies of RAD51 will have beneficial effects on clinical diagnosis, when particular mutations can be assigned as disease-causing, thus justifying specific therapies that would otherwise not be attempted.


Saturday, February 17, 2024

A New Form of Life is Discovered

An extremely short RNA is infectious and prevalent in the human microbiome.

While the last century might be called the DNA century, at least for molecular biology, the current century might be called that of RNA. A blizzard of new RNA types and potentials have been discovered in the normal eukaryotic milieu, including miRNA, eRNA, lincRNA. An RNA virus caused a pandemic, which was remedied by an RNA vaccine. Nobel prizes have been handed out in these fields, and we are also increasingly aware that RNA lies at the origin of life itself, as the first genetic and catalytic mechanism.

One of these Nobel prize winners recently undertook a hunt for small RNAs that might be lurking in the human microbiome- the soup of bacteria, fungi, and all the combined products that cover our surfaces, inside and out. What they found was astonishing- an RNA of merely 1164 nucleotides, which folds up into a rigid, linear rod, which they call "obelisks". This is not a product of the host genome, nor of any other known organism, but is rather some kind of extremely minimal pathogen that, like a transposon or self-splicing intron, is entirely nucleic-acid based. And the more they hunted, the more they found, ultimately finding thousands of obelisk-like entities hidden in the many databases of the world drawn from various environmental and microbiome samples. There is some precedent for this kind of structure, in the form of hepatitis D. This "viroid" of only 1682 nucleotides is a parasite of hepatitis B virus, depending on that virus for key replication functions. While normal viruses (like hepatitis B) encode many key functions of their own, like envelope proteins, genome packaging proteins, and replication enzymes, viroids tend to not encode anything, though hepatitis D does encode one antigenic protein, which exacerbates hepatitis B infections.

The obelisk RNA viroid-like species appear to encode one or two proteins, and possibly a ribozyme as well. The functions of all these are as yet unknown, but necessarily the RNAs rely entirely on some host cell (currently unknown) functions to do their thing, such as the RNA polymerase to create copies of themselves. Unknown also is whether they are dependent on other viruses, or only on cells for their propagation. Since obelisks have only just been discovered, the researchers can do a great deal of bioinformatics, such as predicting the structure of the encoded protein, and the structure of the RNA genome. But key biology, like how they interact with host cells, what functions the host provides, and how they replicate, not to mention possible pathogenic consequences, remains unknown.

The highly self-complementary structure of one obelisk RNA sequence, leading to its identification and naming. In green is one reading frame, which codes for the main protein, of unknown function.

The curious thing about these new obelisk viroid-like RNAs is that, while common in human microbiomes, both oral and gut-derived, they are found only in 5-10% of them, not in all samples. This sort of suggests that they may account for some of the variability traceable to microbiomes, such as autoimmune issues, chronic ailments, nutritional variations, even effects on mood, etc.

Once a lot of databases were searched, obelisk RNAs turn up everywhere, even in some bacteria.

This work was done entirely in silico. Not a single wet-lab experiment was performed. It is a testament to the power of having a lot of genomes at our disposal, and of modern computational firepower. This lab just had the idea that novel small viroid-like RNAs might exhibit certain types of (circular, self-complementary) structure, which led to this discovery of a novel form of "life". Are these RNAs alive? Certainly not. They are mere molecules and parasites that feed off, and transport themselves between, more fully functional cells. But they are part of the tapestry of life, which itself is wholly molecular, with many amazing emergent properties. Whether or not these obelisks turn out to have any medical or ecological significance, they are one more example of the lengths (and shorts) to which Darwinian selection has gone in the struggle for existence.


Saturday, January 27, 2024

Evolutionary Elaboration of mRNA Splicing

An RNA helicase scoots through the spliceosome to advance the process of mRNA splicing. And some other tricks.

In our cells, virtually all mRNAs transcribed from DNA have to go through an editing process to cut out intervening junk called introns. This process is called splicing, and its evolutionary origin, later elaborations, and current mechanism are all quite interesting. Life didn't start with introns, and only eukaryotes have them as a regular feature of their genomes. They appear to have arrived with the bacterium that became our mitochondria, which come from a lineage that has (relatively few of) what are called group II self-splicing introns. These are RNA segments that behave a bit like transposons, being able to jump into DNA, and then reverse-transcribe that segment into a copy of itself in genomic DNA. 

The ur-eukaryote seems to have had an incredibly prolific infection, which left its host genome riddled with these bits of DNA. A key point is that, in group II introns, their splicing out of a transcribed RNA message is auto-catalytic- entirely mediated by their own RNA structure. They are self-propagating parasites, which have, over time in eukaryotes, been tamed to become fertile aspects of our own gene regulation and evolution. For example, introns often fall between protein domains, allowing these relatively compact modules of protein structure to be replicated, moved, and plugged, via rare mutational events, into new settings to contribute new functions to existing or novel proteins.

A map of a group II self-splicing intron. In red are the ends of the host RNA (or DNA) which are to be either jumped into or excised out of. The rest is the structure of the intron, which carries its own catalytic ability to do these reactions. This kind of thing is what appears to have turned into our own splicing and intron/exon systems, since the core catalytic mechanisms, such as the use of lariats and branch points and RNA catalysis, are the same.

Representation of the core spliceosomal reactions in eukaryotes, which result in a free lariat form of the excised intron, and the joined exons, which go off to code for their protein. "SS" stands for splice site. The sequences in red characterize introns.

The mechanism of intron excision in our cells is, at its core, still RNA-based, even though there are now also hundreds of proteins involved in the rather massive machinery of what is now called the spliceosome. It is clear that over evolutionary time, what was originally an unwelcome and shocking invasion of proto-mitochondrial introns into the proto-nuclear host genome has been regulated, speeded up, accessorized, and integrated into our normal method of gene expression. The spliceosome is the result- a huge and dynamic complex that uses key bits descended from the original RNA catalytic components to guide and catalyze the splicing reaction.

Representation of the core splicing reactions, with the key small RNAs added in (U1, U2, U4, U5, U6). These both guide (by direct RNA-RNA hybrid formation) and perform catalysis at the two chemical bond-breaking/reforming steps.

There are three key locations seized upon by the spliceosome. First is the 5 prime splice site- the end of the coding exon and beginning of the intron, typically a "G" nucleoside at the start of the intron. Second is the branch point, an "A" near the end of the intron, which is where the chemistry of splicing begins. And third is the 3 prime splice site- the end of the intron, with another "G" nucleoside, next to the beginning of the next coding exon. While the first two sites are specifically recognized by RNA components of the spliceosome, (U1 at the 5 prime splice site and U2 at the branch site), the 3 prime splice site is simply recognized by scanning for the first "AG" downstream of the branch site.

The first reaction is to bring the branch site and the 5 prime splice site in proximity, such that the branch site A covalently invades the G at that site and displaces it, releasing the exon end and forming a loop (called a lariat, in red above) in the intronic RNA. The second reaction is to bring the 5 prime exon end over to the 3 prime exon end, and similarly prompt an invasion that links them, displacing the intron entirely.
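The site-recognition rule and the net result of the two reactions can be sketched in Python. The sequence is invented, the branch point position is handed to the function rather than discovered, and the yeast-like branch motif in the example is an assumption. In the cell this recognition is carried out by the U1 and U2 RNAs and many proteins, not by string matching.

```python
def find_three_prime_splice_site(rna, branch_index):
    """Index of the G in the first 'AG' downstream of the branch point, or None."""
    pos = rna.find("AG", branch_index + 1)
    return None if pos == -1 else pos + 1


def splice(rna, five_prime_index, branch_index):
    """Return (joined exons, excised intron) for a single intron.

    five_prime_index is the first base of the intron (its leading G).
    """
    three_prime_index = find_three_prime_splice_site(rna, branch_index)
    if three_prime_index is None:
        raise ValueError("no AG found downstream of the branch point")
    intron = rna[five_prime_index:three_prime_index + 1]
    exons = rna[:five_prime_index] + rna[three_prime_index + 1:]
    return exons, intron


# Hypothetical pre-mRNA assembled from labeled parts.
exon1 = "AUGGC"
intron_start = "GUAAGU"        # intron begins with G
branch_region = "CCUACUAAC"    # branch point is the last A in this stretch
intron_end = "UUUUCCCUUUCAG"   # intron ends with AG
exon2 = "GCUUAA"
rna = exon1 + intron_start + branch_region + intron_end + exon2

five_prime_index = len(exon1)
branch_index = len(exon1 + intron_start) + branch_region.rindex("A")

mrna, intron = splice(rna, five_prime_index, branch_index)
print(mrna)
print(intron)
```

Note that the intron's own start contains an "AG" (in "GUAAGU"). Scanning only downstream of the branch point is what keeps that earlier dinucleotide from being mistaken for the 3 prime splice site.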

So simple to describe, but not so easy to do. Accuracy is paramount, since the three-codon reading frame of mRNA would be destroyed by even a 1 nucleoside error in splicing. Splicing now gates the export of mRNA from the nucleus, so that only fully and accurately spliced transcripts get out to the ribosomes in the cytoplasm for translation to protein. This gating has been considered by some the very reason that the nucleus exists at all- a way to solve some of the knotty problems that arose in very early eukaryotic evolution when all these introns invaded. 
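A short Python sketch shows why a one-base slip is so destructive. The exon sequences are invented, and the "error" is modeled simply as the second exon starting one base too late.

```python
def codons(seq):
    """Split a sequence into complete three-base codons, dropping any remainder."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


exon1 = "AUGGCU"        # hypothetical upstream exon (two codons)
exon2 = "GAAUUCUGGUAA"  # hypothetical downstream exon, ending in a UAA stop codon

correct = exon1 + exon2
off_by_one = exon1 + exon2[1:]  # splice junction placed one base too far downstream

print(codons(correct))
print(codons(off_by_one))
```

Every codon after the junction changes, and the stop codon drops out of frame, so the ribosome would read through into whatever follows.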

Another reaction scheme of splicing, showing the key RNA and some other proteins along the way, principally the key helicases that help drive things forward. Note where PRP2/ATP comes into the picture, just as the complex is preparing for the first catalytic step.


Be that as it may, it is clear that the originally RNA-only mechanism changed over time by accreting proteins that each decided they had something useful to add to the process. At the same time, the RNA got separated into several pieces (on independent genes) that could then be carried and precisely manipulated by these helping proteins. The spliceosome now involves five distinct small RNAs and over 200 proteins, which engage in a complex ballet of sequential steps. A special class of these proteins, the helicases, are the subject of a recent paper that provides new structural information. Helicases are proteins that can use the power of ATP to unwind DNA or RNA, or just chug along it. At least eight such proteins participate in splicing. 

Structures from a recent paper, showing how PRP2 (at bottom, in violet) chugs its way along the mRNA intron (blue) into the very heart of the spliceosome complex, partially evicting the SF3B1 protein (green), among others, and prompting many other changes. At top is shown the 19 nucleoside stretch of the mRNA that was traversed, getting close to the branch site "A" in red. 

The paper makes the interesting observation that, structurally, most of the helicases reside on the periphery of the spliceosomal complexes, while the catalytic and guiding RNAs are, naturally, at the center. The authors use a mutant form of one of the helicases, Aquarius, to freeze spliceosomes in a key conformation just before the branch site and 5 prime splice sites are brought together. In combination with a bunch of other structural work by others in this and other labs, they show that one dynamic event is the tracking by a second helicase, PRP2 (violet, above), that brings it from its peripheral position (b, at bottom) along the intronic mRNA (blue strand) into the core of the spliceosome near the U2 RNA (c; U2 is not shown here). They show that PRP2 traverses 19 nucleosides (top, a), a rather remarkable trip that forms part of the sequence of events that brings the branch site and 5 prime splice site close to each other.

Further structures, focusing on the catalytic site and RNAs. Note how the branch site (red, "BS-A") is, after the action of the Aquarius helicase (third panel), brought in tightly close to the 5 prime splice site (green, "5'SS") in the C, or catalytic, complex. The U2 and U6 RNAs then have an easy job of bond-exchanging catalysis.

So it turns out that these helicases appear sometimes to be used as ratchets that start on the outside of the complex. Once activated by some prior trigger, they pull on a thread in a way that helps to move the overall process forward. The progression of PRP2 into the spliceosome core evicts a bunch of other proteins and activates the other helicase, Aquarius. That protein is likewise positioned peripherally but is hanging onto another thread of the intronic RNA and helps to further push the branch and 5 prime splice sites together, in a way that finally leads to the desired reaction. Note in the image above that it is the RNAs that occupy the central reaction site- the intron in blue (green), and the U6 and U2 RNAs, which catalyze this first key reaction of splicing.

RNAs are not great catalysts, so it is understandable that, as in the case of translation by the ribosome, a bunch of proteins shoehorned their way into the process (over evolutionary time) in ways that evidently made splicing more accurate and more rapid. Indeed, yeast cells get along without the Aquarius protein at all, though they otherwise have a very similar splicing apparatus, showing that the accretion of proteins on the spliceosome did not end in very early stages of eukaryotic evolution, but continued through the origin of metazoans, and may still be continuing. The added proteins did this through using their talents for precise spatial positioning, and for the use of energy (from ATP) to drive things ahead, if only by intricate conformational ballet rather than direct catalysis.


  • "Ron DeSantis should be forced to carry his presidential campaign to term."
  • Nones are not nuns.
  • Medical errors and AI.
  • The worse the better... GOP edition.
  • Two minutes hate.

Saturday, December 23, 2023

How Does Speciation Happen?

Niles Eldredge and the theory of punctuated equilibrium in evolution.

I have been enjoying "Eternal Ephemera", which is an end-of-career memoir/intellectual history from a leading theorist in paleontology and evolution, Niles Eldredge. In this genre, often of epic proportions and scope, the author takes stock of the historical setting of his or her work and tries to put it into the larger context of general intellectual progress, (yes, as pontifically as possible!), with maybe some gestures towards future developments. I wish more researchers would write such personal and deeply researched accounts, of which this one is a classic. It is a book that deserves to be in print and more widely read.

Eldredge's claim to fame is punctuated equilibrium, the theory (or, perhaps better, observation) that evolution occurs much more haltingly than in the majestic gradual progression that Darwin presented in "Origin of Species". This is an observation that comes straight out of the fossil record. And perhaps the major point of the book is that the earliest biologists, even before Darwin, but also including Darwin, knew about this aspect of the fossil record, and were thus led to concepts like catastrophism and "etagen". Only Lamarck had a steadfastly gradualist view of biological change, which Darwin eventually took up, while replacing Lamarck's mechanism of intentional/habitual change with that of natural selection. Eldredge unearths tantalizing and, to him, supremely frustrating, evidence that Darwin was fully aware of the static nature of most fossil series, and even recognized the probable mechanism behind it (speciation in remote, peripheral areas), only to discard it for what must have seemed a clearer, more sweeping theory. But along the way, the actual mechanism of speciation got somewhat lost in the shuffle.

Punctuated equilibrium observes that most species recognized in the fossil record do not gradually turn into their descendants, but are replaced by them. Eldredge's subject of choice is trilobites, which have a long and storied record for almost 300 million years, featuring replacement after replacement, with species averaging a few million years duration each. It is a simple fact, but one that is a bit hard to square with the traditional / Darwinian and even molecular account of evolution. DNA is supposed to act like a "clock", with constant mutational change through time. And natural selection likewise acts everywhere and always... so why the stasis exhibited by species, and why the apparently rapid evolution in between replacements? That is the conundrum of punctuated equilibrium.

There have been a lot of trilobites. This comes from a paper about their origin during the Cambrian explosion, arguing that only about 20 million years was enough for their initial speciation (bottom of image).

The equilibrium part, also termed stasis, is seen in the current / recent world as well as in the fossil record. We see species such as horses, bison, and lions that are identical to those drawn in cave paintings. We see fossils of animals like wildebeest that are identical to those living, going back millions of years. And we see unusual species in recent fossils, like saber-toothed cats, that have gone extinct. We do not typically see animals that have transformed over recent geological history from one (morphological) species into another, or really, into anything very different at all. A million years ago, wildebeest seem to have split off a related species, the black wildebeest, and that is about it.

But this stasis is only apparent. Beneath the surface, mutations are constantly happening and piling up in the genome, and selection is relentlessly working to ... do something. But what? This is where the equilibrium part comes in, positing that wide-spread, successful species are so hemmed in by the diversity of ecologies they participate in that they occupy a very narrow adaptive peak, which selection works to keep the species on, resulting in apparent stasis. It is a very dynamic equilibrium. The constant gene flow among all parts of the population that keeps the species marching forward as one gene pool, despite the ecological variability, makes it impossible to adapt to new conditions that do not affect the whole range. Thus, paradoxically, the more successful the species, and the more prominent it is in the fossil record, the less change will be apparent in those fossils over time.

The punctuated part is that these static species in the fossil record eventually disappear and are replaced by other species that are typically similar, but not the same, and do not segue from the original in a gradual way that is visible in the fossil record. No, most species and locations show sudden replacement. How can this be so if evolution by natural selection is true? As above, wide-spread species are limited in what selection can do. Isolated populations, however, are more free to adapt to local conditions. And if one of those local conditions (such as arctic cold) happens to be what later happens to the whole range (such as an ice age), then it is more likely that a peripherally (pre-)adapted population will take over the whole range, than that the resident species adapts with sufficient speed to the new conditions. Range expansion, for the peripheral species, is easier and faster than adaptation, for the wide-ranging originating species.

The punctuated equilibrium proposition came out in the 1970's, and naturally followed theories of speciation by geographic separation that had previously come out (also resurrected from earlier ideas) in the 1930's to 1950's, but which had not made much impression (!) on paleontologists. Paleontologists are always grappling with the difficulties of the record, which is partial, and does not preserve a lot of what we would like to know, like behavior, ecological relationships, and mutational history. But they did come to agree that species stasis is a real thing, not just, as Darwin claimed, an artifact of the incomplete fossil record. Granted- if we had fossils of all the isolated and peripheral locations, which is where speciation would be taking place by this theory, we would see the gradual change and adaptation taking place. So there are gaps in the fossil record, in a way. But as long as we look at the dominant populations, we will rarely see speciation taking place before our eyes, in the fossils.

So what does a molecular biologist have to say about all this? As Darwin insisted early in "Origin", we can learn quite a bit from domesticated animals. It turns out that wild species have a great amount of mostly hidden genetic variation. This is apparent whenever one is domesticated and bred for desired traits. We have bred dogs, for example, to an astonishingly wide variety of traits. At the same time, we have bred them out to very low genetic diversity. Many breeds are saddled with genetic defects that can not be resolved without outbreeding. So we have in essence exchanged the vast hidden genetic diversity of a wild species for great visible diversity in the domesticated species, combined with low genetic diversity.

What this suggests is that wild species have great reservoirs of possible traits that can be selected for the purposes of adaptation under selective conditions. Which suggests that speciation in range edges and isolated environments can be very fast, as the punctuated part of punctuated equilibrium posits. And again, it reinforces the idea that during equilibrium with large populations and ranges, species have plenty of genetic resources to adapt and change, but spend those resources reinforcing / fine tuning their core ecological "franchise", as it were.

In population genetics, it is well known that mutations arise and fix (that is, spread to 100% of the population on both alleles) at the same rate no matter how large the population, in theory. That is to say- bigger populations generate more mutations, but correspondingly hide them better in recessive form (if deleterious) and for neutral mutations, take much longer to allow any individual mutation to drift to either extinction or fixation. Selection against deleterious mutations is more relentless in larger populations, while relaxed selection and higher drift can allow smaller populations to explore wider ranges of adaptive space, perhaps finding globally higher (fitness) peaks than the parent species could find.
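The arithmetic behind that statement is short enough to sketch. This is only an illustration of the textbook neutral-theory result, assuming a diploid population of size N, a made-up mutation rate, and the standard approximations that a new neutral mutation fixes with probability 1/(2N) and, when it does fix, takes roughly 4N generations to do so. The function names are invented for this example.

```python
def neutral_substitution_rate(pop_size: int, mu: float) -> float:
    """Expected neutral substitutions per generation in a diploid population."""
    new_mutations = 2 * pop_size * mu      # gene copies in the population times mutation rate
    fixation_prob = 1 / (2 * pop_size)     # chance that any one new neutral copy drifts to fixation
    return new_mutations * fixation_prob   # population size cancels, leaving mu


def mean_fixation_time(pop_size: int) -> int:
    """Rough number of generations a fixing neutral mutation needs (about 4N)."""
    return 4 * pop_size


MU = 1e-8  # assumed, purely illustrative mutation rate per site per generation

for n in (1_000, 1_000_000):
    rate = neutral_substitution_rate(n, MU)
    print(f"N={n}: rate={rate:.1e} per generation, "
          f"fixation takes roughly {mean_fixation_time(n)} generations")
```

The rate comes out the same for both population sizes, while the waiting time for any single mutation to drift to fixation scales with the population, which is the sense in which small populations can shift faster.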

Eldredge cites some molecular work that claims that at least twenty percent of sequence change in animal lineages is due specifically to punctuational events of speciation, and not to the gradual background accumulation of mutations. What could explain this? The actual mutation rate is not at issue, (though see here), but the numbers of mutations retained, perhaps due to relaxed purifying selection in small populations, and founder effects and positive selection during the speciation process. This kind of phenomenon also helps to explain why the DNA "clock" mentioned above is not at all regular, but quite variable, making an uneven guide to dating the past.

Humans are another good example. Our species is notoriously low in genetic diversity, compared to most wild species, including chimpanzees. It is evident that our extremely low population numbers (over prehistoric time) have facilitated speciation, (that is, the fixation of variants which might be swamped in bigger populations), which has resulted in a bewildering branching pattern of different hominid forms over the last few million years. That makes fossils hard to find, and speciation hard to pinpoint. But now that we have taken over the planet with a huge population, our bones will be found everywhere, and they will be largely static for the foreseeable future, as a successful, wide-spread species (barring engineered changes). 

I think this all adds up to a reasonably coherent theory that reconciles the rest of biology with the fossil record. However, it remains frustratingly abstract, given the nature of fossils that rarely yield up the branching events whose rich results they record.


Saturday, December 16, 2023

Easy Does it

The eukaryotic ribosome is significantly slower than, and more accurate than, the bacterial ribosome.

Despite the focus, in molecular biology, on interesting molecules like genes and regulators, the most striking thing facing anyone who breaks open cells is the prevalence of ribosomes. Run the cellular proteins or RNAs out on a gel, and the bulk of the material is always ribosomal proteins and ribosomal RNAs, along with tRNAs. That is because ribosomes are critically important, immense in size, and quite slow. They are sort of the beating heart of the cell- not the brains, not the energy source, but the big lumpy, ancient, shape-shifting object that pumps out another essential form of life-blood- all the proteins the cell needs to keep going.

With the revolution in structural biology, we have gotten an increasingly clear view of the ribosome, and a recent paper took it up another notch with a structural analysis of how tRNA handling works and how / why it is that the eukaryotic ribosome is about ten times slower than its bacterial progenitor. One of their figures provides a beautiful (if partial) view of each kind of ribosome, showing how well-conserved this structure is, despite the roughly three billion or more years that have elapsed since their divergence into the bacterial and archaeal lineages, from which the eukaryotic ribosome comes. 

Above, the human ribosome, and below, the ribosome of E. coli, a bacterium, in partial views. The perspective is from the back, relative to conventional views, and only a small amount of the large subunit (LSU) appears at the top of each structure, with more of the small subunit (SSU) shown below. Between them is the cleft where tRNAs bind, in a dynamic sequence of incoming tRNA at the A (acceptor) site, then catalysis of peptide bond addition at the P (peptidyl transfer) site, and ejection of the last tRNA at the E (ejection) site. In concert with the conveyor belt of tRNAs going through, the nascent protein is being synthesized in the large subunit and the mRNA is going by, codon by codon, in the small subunit. Note the overall conservation of structure, despite quite a bit of difference in detail.

The ribosome is an RNA machine at its core, with a lot of accessory proteins that were added later on. And it comes in two parts, the large and small subunits. These subunits do different things, do a lot of rolling about relative to each other, and bind a conveyor belt of tRNAs between them. The tRNAs are pre-loaded with an amino acid on one end (top) and an anticodon on the other end (bottom). They also come with a helper protein (EF-Tu in bacteria, eEF1A in eukaryotes), which plays a role later on. The anticodon is a set of three nucleotides that constitute the genetic code, whereby this tRNA is always going to match one codon to a particular amino acid. 

The ribosome doesn't care what the code is or which tRNA comes in. It only cares that the tRNA matches the mRNA held by the small subunit, as transcribed from the DNA. This process is called decoding, and the researchers show some of the differences that make it slower, but also more accurate, in eukaryotes. In bacteria, ribosomes can work at up to 20 amino acids per second, while human ribosomes top out at about 2 amino acids per second. That is pretty slow, for an enzyme! Its accuracy is about one error per thousand to ten thousand codons.
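To put those numbers in perspective, here is a rough back-of-envelope sketch. The speeds and error rates are the ones quoted above; the protein length of 400 amino acids is an assumed, illustrative value, and the calculation treats every codon as an independent chance of error, which is a simplification.

```python
def translation_seconds(length_aa: int, rate_aa_per_s: float) -> float:
    """Time to synthesize one protein at a constant elongation rate."""
    return length_aa / rate_aa_per_s


def error_free_fraction(length_aa: int, error_per_codon: float) -> float:
    """Fraction of finished proteins with no misincorporated amino acid."""
    return (1 - error_per_codon) ** length_aa


PROTEIN_LENGTH = 400  # assumed length, in amino acids

for label, rate in (("bacterial", 20.0), ("human", 2.0)):
    secs = translation_seconds(PROTEIN_LENGTH, rate)
    print(f"{label} ribosome: {secs:.0f} s per protein")

for error_rate in (1e-3, 1e-4):
    frac = error_free_fraction(PROTEIN_LENGTH, error_rate)
    print(f"error rate {error_rate:g} per codon: {frac:.1%} of proteins error-free")
```

Under these assumptions the difference between the two ends of the accuracy range is substantial: roughly a third of such proteins would carry at least one error at one per thousand, versus a few percent at one per ten thousand.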

See text for description of this diagram of the ribosomal process. 50S is the large ribosomal subunit in bacteria (60S in eukaryotes). 30S is the small subunit in bacteria (40S in eukaryotes). S stands for Svedberg units, a unit of sedimentation in high-speed centrifugation, which was used to study proteins at the dawn of molecular biology.

Above is diagrammed the stepwise logic of protein synthesis. The first step is that a tRNA comes in and lands on the empty A site, and tests whether its anticodon sequence fits the codon on the mRNA being threaded through the bottom. This fitting and testing is the key quality control process, and the slower and more selective it is, the more accurate the resulting translation. The EF-Tu/eEF1A+GTP protein holds on to the tRNA at the acceptor (A) position, and only when the fit is good does that fit communicate back up from the small subunit to the large subunit and cause hydrolysis of GTP to GDP, and release of the top of the tRNA, which allows it to swing into position (accommodation) to the catalytic site of the ribosome. This is where the tRNA contributes its amino acid to the growing protein chain. That chain, previously attached to the tRNA in the P site, is now attached to the tRNA in the A site. Now another GTP-binding protein comes in, EF-G (eEF2 in eukaryotes), which bumps the tRNA from the A site to the P site, and simultaneously moves the mRNA one codon ahead. This also releases whatever was in the E site of the ribosome and frees up the A site to accept another new tRNA.
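The logic of that cycle can be caricatured in a few lines of code. This is a toy sketch, not a model of the real machinery: it assumes strict Watson-Crick pairing at all three positions, ignores initiation, stop codons, wobble, and the GTPases, and uses a made-up pool of three tRNAs. All names are invented for the illustration.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Anticodon written 5' to 3', i.e. the reverse complement of the codon."""
    return "".join(PAIR[base] for base in reversed(codon))


def elongate(mrna: str, trna_pool: list[tuple[str, str]]) -> list[str]:
    """Walk the A/P/E cycle along an mRNA and return the peptide made."""
    peptide = []
    a_site = p_site = e_site = None
    for start in range(0, len(mrna) - 2, 3):
        codon = mrna[start:start + 3]
        # Decoding: try tRNAs until one pairs with the codon sitting in the A site.
        for anticodon, amino_acid in trna_pool:
            if anticodon == anticodon_for(codon):
                a_site = (anticodon, amino_acid)  # good fit, so the tRNA is accommodated
                break
        else:
            raise ValueError(f"no tRNA in the pool decodes {codon}")
        # Peptidyl transfer: the chain is extended by the A-site amino acid.
        peptide.append(a_site[1])
        # Translocation: A moves to P, P moves to E, and the old E-site tRNA is released.
        e_site, p_site, a_site = p_site, a_site, None
    return peptide


# Hypothetical tRNA pool as (anticodon, amino acid) pairs.
POOL = [("CAU", "Met"), ("AGC", "Ala"), ("UUU", "Lys")]
print(elongate("AUGGCUAAA", POOL))
```

The only point where anything is checked is the decoding step, which is why slowing that step down is where the extra accuracy has to come from.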

See text for description. IC = initiation complex, CR = codon recognition complex, GA = GTPase activation complex, AC = accommodated complex. FRET = fluorescence resonance energy transfer. Head and shoulder refer to structural features of the small ribosomal subunit.

These researchers did both detailed structural studies of ribosomes stuck in various positions, and also mounted fluorescent labels at key sites in the P and A sites. These double labels allowed one to be flashed with light, (at its absorbance peak), and the energy to be transferred between them, resulting in fluorescence of light back out from the second fluorophore. The emitted energy from the second fluorophore provides an exquisitely sensitive measure of the distance between the two fluorophores, since the efficiency of energy transfer from the first fluorophore falls off very steeply with distance (as the inverse sixth power). The graph above (right) provides a trace of the fluorescence seen in one ribosomal cycle, as the distance between the two tRNAs changes slightly as the reaction proceeds and the two tRNAs come closer together. This technical method allows real-time analysis of the reaction as it is going along, especially one as slow as this one.
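For a feel of how sharp this ruler is, the standard Förster relation can be written out directly: transfer efficiency is 1 / (1 + (r/R0)^6), where R0 is the distance at which half the energy is transferred. The sketch below assumes an R0 of 5 nm, which is in the typical range for dye pairs but is not a value taken from the paper.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float = 5.0) -> float:
    """Fraction of donor excitation transferred to the acceptor (Forster relation)."""
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Efficiency swings from nearly 1 to nearly 0 within a few nanometers of R0.
for r in (2.5, 4.0, 5.0, 6.0, 10.0):
    print(f"{r:4.1f} nm -> efficiency {fret_efficiency(r):.2f}")
```

Because of the sixth-power dependence, a change of a nanometer or so near R0 shifts the signal by tens of percent, which is why small movements of the two tRNAs show up in the trace.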

Structures of the ribosome accentuating the tRNA positions in the A, P, and E sites. Note how the green tRNA in the A site starts bent over towards the eEF1A GTPase (blue), as the decoding and quality control are going on, after which it is released and swings over next to the P site tRNA, ready for peptide bond formation. Note also how the structure of the anticodon-codon pairing (pink, bottom) evolves from loose and disordered to tight after the tRNA straightens up.

Above is shown a gross level view in stop-motion of ribosomal progress, achieved with various inhibitors and altered substrates. The mRNA is in pink (insets), and shows how the codon-anticodon match evolves from loose to tight. Note how at first only two bases of the mRNA are well-paired, while all three are paired later on. This reflects in a dim way the genetic code, which has redundancies in the third position for many amino acids, and is thought to have first had only two letters, before transitioning to three letters.

Higher detail on the structures of the tRNAs in the P site and the A site as they progress through the proof-reading phase of protein synthesis. The fluorescence probes are pictured (red and green dots), as is more of the mRNA strand (pink).

These researchers have a great deal to say about the details of these structures- what differentiates the human from the E. coli ribosome, why the human one is slower and allows more time and more hindrance during the proof-reading step, thereby helping badly matched tRNAs to escape and increasing overall fidelity. For example, how does the GTPase eEF1A, docked to the large subunit, know when a match down at the codon-anticodon pair has been successful down in the small ribosomal subunit?

"Base pairing between the mRNA codon and the aa-tRNA anticodon stem loop (ASL) is verified through a network of ribosomal RNA (rRNA) and protein interactions within the SSU A site known as the decoding centre. Recognition of cognate aa-tRNA closes the SSU shoulder domain towards the SSU body and head domains. Consequent ternary complex engagement of the LSU GTPase-activating centre (GAC), including the catalytic sarcin-ricin loop12 (SRL), induces rearrangements in the GTPase, including switch-I and switch-II remodeling, that trigger GTP hydrolysis"

They note that there seem to be at least two proofreading steps, both in activating the eEF1A and also afterwards, during the large swing of the tRNA towards the P site. And they note novel rolling motions of the human ribosome compared with the bacterial ribosome, to help explain some of its distinctive proofreading abilities, which may be adjustable in humans by regulatory processes. Thus we are gaining an ever more detailed window on the heart of this process, which is foundational to the origin of life, central to all cells, and not without medical implications, since many poisons that bacteria have devised attack the ribosome, and several of our current antibiotics do likewise.


Saturday, December 9, 2023

The Way We Were: Origins of Meiosis and Sex

Sex is as foundational for eukaryotes as are mitochondria and internal membranes. Why and how did it happen?

Sexual reproduction is a rather expensive proposition. The anxiety, the dating, the weddings- ugh! But biologically as well, having to find mates is no picnic for any species. Why do we bother, when bacteria get along just fine just dividing in two? This is a deep question in biology, with a lot of issues in play. And it turns out that bacteria do have quite a bit of something-like-sex: they exchange DNA with each other in small pieces, for similar reasons we do. But the eukaryotic form of sex is uniquely powerful and has supported the rapid evolution of eukaryotes to be by far the dominant domain of life on earth.

A major enemy of DNA-encoded life is mutation. Despite the many DNA replication accuracy and repair mechanisms, some rate of mutation still occurs, and is indeed essential for evolution. But for larger genomes, the mutation rate always exceeds the replication rate, (and the purifying natural selection rate), so that damaging mutations build up and the lineage will inevitably die out without some help. This process is called Muller's ratchet, and is why all organisms appear to exchange DNA with others in their environment, either sporadically like bacteria, or systematically, like eukaryotes.
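The ratchet is easy to see in a toy simulation. The sketch below is a bare-bones illustration, not anything from the literature being discussed: it assumes a fixed-size asexual population, at most one new deleterious mutation per offspring, a fixed fitness cost per mutation, and no recombination or back-mutation. All parameter values are arbitrary.

```python
import random


def mullers_ratchet(pop_size=200, generations=300, mut_prob=0.3, cost=0.02, seed=1):
    """Track the mutation count of the least-loaded individual over time."""
    rng = random.Random(seed)
    loads = [0] * pop_size              # deleterious mutations carried by each individual
    least_loaded = []
    for _ in range(generations):
        # Selection: parents are sampled in proportion to fitness (1 - cost) ** load.
        weights = [(1 - cost) ** k for k in loads]
        parents = rng.choices(loads, weights=weights, k=pop_size)
        # Asexual reproduction: offspring inherit the parental load, plus maybe one more.
        loads = [k + (1 if rng.random() < mut_prob else 0) for k in parents]
        least_loaded.append(min(loads))
    return least_loaded


history = mullers_ratchet()
print("least-loaded class at generations 1, 100, 300:",
      history[0], history[99], history[-1])
```

Without exchange of DNA, an offspring can never carry fewer mutations than its parent, so once the best class is lost by chance it cannot be rebuilt. The minimum load can only stay put or click upward, which is the ratchet.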

An even worse enemy of the genome is unrepaired damage like complete (double strand) breaks in the DNA. These stop replication entirely, and are fatal. These also need to be repaired, and again, having extra copies of a genome is the way to allow these to be fixed, by processes like homologous recombination and gene conversion. So having access to other genomes has two crucial roles for organisms- allowing immediate repair, and allowing some way to sweep out deleterious mutations over the longer term.

Our ancestors, the archaea, which are distinct from bacteria, typically have circular, single molecule genomes, in multiple copies per cell, with frequent gene conversions among the copies and frequent exchange with other cells. They routinely have five to twenty copies of their genome, and can easily repair any immediate damage using those other copies. They do not hide mutant copies like we do in a recessive allele, but rather by gene conversion (which means, replicating parts of a chromosome into other ones, piecemeal) make each genome identical over time so that it (and the cell) is visible to selection, despite their polyploid condition. Similarly, taking in DNA from other, similar cells uses the target cells' status as live cells (also visible to selection) to ensure that the recipients are getting high quality DNA that can repair their own defects or correct minor mutations. All this ensures that their progeny are all set up with viable genomes, instead of genomes riddled with defects. But it comes at various costs as well, such as a constant race between getting a lethal mutation and finding the DNA that might repair it. 

Both mitosis and meiosis were eukaryotic innovations. In both, the chromosomes all line up for orderly segregation to descendants. But meiosis engages in two divisions, and features homolog synapsis and recombination before the first division of the parental homologs.

This is evidently a precursor to the process that led, very roughly 2.5 billion years ago, to eukaryotes, but is all done on a piecemeal basis, nothing like what we do now as eukaryotes. To get to that point, the following innovations needed to happen:

  • Linearized genomes, with centromeres and telomeres, and more than one chromosome.
  • Mitosis to organize normal cellular division, where multiple chromosomes are systematically lined up and distributed 1:1 to daughter cells, using extensive cytoskeletal rearrangements and regulation.
  • Mating with cell fusion, where entire genomes are combined, recombined, and then reduced back to a single complement, and packaged into progeny cells.
  • Synapsis, as part of meiosis, where all sister homologs are lined up and damaged to initiate DNA repair and crossing-over.
  • Meiosis division one, where the now-recombined parental homologs are separated.
  • Meiosis division two, which largely follows the same mechanisms as mitosis, separating the reshuffled and recombined sister chromosomes.

This is a lot of novelty on the path to eukaryogenesis, and is just a portion of the many other innovations that happened in this lineage. What drove all this, and what were some plausible steps in the process? The advent of true sex generated several powerful effects:

  1. A definitive solution to Muller's ratchet, by exposing every locus in a systematic way to partial selection and sweeping out deleterious mutations, while protecting most members of the population from those same mutations. Continual recombination of the parental genomes allows beneficial mutations to separate from deleterious ones and be differentially preserved.
  2. Mutated alleles are partially, yet systematically, hidden as recessive alleles, allowing selection when they come into homozygous status, but also allowing them to exist for limited time to buffer the mutation rate and to generate new variation. This vastly increases accessible genetic variation.
  3. Full genome-length alignment and repair by crossing over is part of the process, correcting various kinds of damage and allowing accurate recombination across arbitrarily large genomes.
  4. Crossing over during meiotic synapsis mixes up the parental chromosomes, allowing true recombination among the parental genomes, beyond just the shuffling of the full-length chromosomes. This vastly increases the power of mating to sample genetic variation across the population, and generates what we think of as "species", which represent more or less closed interbreeding pools of genetic variants that are not clones but diverse individuals.

The time point of 2.5 billion years ago is significant because this is the general time of the great oxidation event, when cyanobacteria were finally producing enough oxygen by photosynthesis to alter the geology of earth. (However our current level of atmospheric oxygen did not come about until almost two billion years later, with the rise of land plants.) While this mainly prompted the logic of acquiring mitochondria, either to detoxify oxygen or use it metabolically, some believe that it is relevant to the development of meiosis as well. 

There was a window of time when oxygen was present, but the ozone layer had not yet formed, possibly generating a particularly mutagenic environment of UV irradiation and reactive oxygen species. Such higher mutagenesis may have pressured the archaea mentioned above to get their act together- to not distribute their chromosomes so sporadically to offspring, to mate fully across their chromosomes, not just pieces of them, and to recombine / repair across those entire mated chromosomes. In this proposal, synapsis, as seen in meiosis I, had its origin in a repair process that solved the problem of large genomes under mutational load by aligning them more securely than previously. 

It is notable that one of the special enzymes of meiosis is Spo11, which induces the double-strand breaks that lead to crossing-over, recombination, and the chiasmata that hold the homologs together during the first division. This DNA damage happens at quite high rates all over the genome, and is programmed, via the structures of the synaptonemal complex, to favor crossing-over between (parental) homologs vs duplicate sister chromosomes. Such intensive repair, while now aimed at ensuring recombination, may have originally had other purposes.

Alternately, others suggest that it is larger genome size that motivated this innovation. This origin event involves many gene duplication events that ramified the capabilities of the symbiotic assemblage. Such gene duplications would naturally lead to recombinational errors in traditional gene conversion models of bacterial / archaeal genetic exchange, so there was pressure to generate a more accurate whole-genome alignment system that confined recombination to the precise homologs of genes, rather than to any similar relative that happened to be present. This led to the synapsis that currently is part of meiosis I, but it is also part of "parameiosis" systems in some eukaryotes, which, while clearly derived, might resemble primitive steps to full-blown meiosis.

It has long been apparent that the mechanisms of meiosis division one are largely derived from (or related to) the mechanisms used for mitosis, via gene duplications and regulatory tinkering. So these processes (mitosis and the two divisions of meiosis) are highly related and may have arisen as a package deal (along with linear chromosomes) during the long and murky road from the last archaeal ancestor to the last common eukaryotic ancestor, which possessed a much larger suite of additional innovations, from mitochondria to nuclei, mitosis, meiosis, cytoskeleton, introns / mRNA splicing, peroxisomes, other organelles, etc.  

Modeling of different mitotic/meiotic features. All cells modeled have 18 copies of a polyploid genome, with a newly evolved process of mitosis. Green = addition of crossing over / recombination of parental chromosomes, but no chromosome exchange. Red = chromosome exchange, but no crossing over. Blue = both crossing over and chromosome exchange, as occurs now in eukaryotes. The Y axis is fitness / survival and the X axis is time in generations after start of modeling.

A modeling paper points to the quantitative benefits of mitosis when combined with the meiotic suite of innovations. They suggest that in a polyploid archaeal lineage, the establishment of mitosis alone would have had revolutionary effects, ensuring accurate segregation of all the chromosomes, and that this would have enabled differentiation among those polyploid chromosome copies, since they would each be faithfully transmitted individually to offspring (assuming all, instead of one, were replicated and transmitted). Thus they could develop into different chromosomes, rather than remain copies. This would, as above, encourage meiosis-like synapsis over the whole genome to align all the (highly similar) genes properly.

"Modeling suggests that mitosis (accurate segregation of sister chromosomes) immediately removes all long-term disadvantages of polyploidy."

Additional modeling of the meiotic features of chromosome shuffling, and recombination between parental chromosomes, indicates (shown above) that these are highly beneficial to long-term fitness, which can rise instead of decaying with time, per the various benefits of true sex as described above. 

The field has definitely not settled on one story of how meiosis (and mitosis) evolved, and these ideas and hypotheses are tentative at this point. But the accumulating findings that the archaea that most closely resemble the root of the eukaryotic (nuclear) tree have many of the needed ingredients, such as active cytoskeletons, a variety of molecular antecedents of ramified eukaryotic features, and now extensive polyploidy to go with gene conversion and DNA exchange with other cells, make the momentous gap from archaea to eukaryotes somewhat narrower.