Showing posts with label naturalism. Show all posts

Saturday, December 30, 2023

Some Challenges of Biological Modeling

If modeling one small aspect of one cell is this difficult, how much more difficult is it to model whole cells and organisms?

While the biological literature is full of data and knowledge about how cells and organisms work, we remain far from true understanding- the kind of understanding that would allow computer modeling of their processes. This is a problem both of the kind of data, which is largely qualitative and descriptive, and of its amount- countless processes and enzymes have never had their detailed characteristics evaluated. In the human genome, I would estimate that roughly half the genes have been described (if at all) only in the most rudimentary way, typically by loose analogy to similar ones. And the rest, when studied more closely, present all sorts of other interesting issues that deflect researchers from core data such as enzymatic rate constants and binding constants to other proteins, as might occur under a plethora of different modification, expression, and other regulatory conditions. 

Then how do we get to usable models of cellular activities? Typically, a lot of guessing is involved to make anything that approaches a computer model. A recent paper offered a novel way down this path: ignore all the rate constants and even the interactions, and just focus on measurements we can make more conveniently- whole-metabolome assessments. These are experiments where mass spectrometry is used to evaluate the levels of all the small molecules in a cell. If such levels are known, perhaps under a few different conditions, then, these authors argue, we can derive models of their mutual regulation- disregarding all the details and just establishing that some sort of feedback system among these metabolic chemicals must exist to keep them at the observed concentrations.

Their experimental subject is a relatively understandable, but by no means simple, system- the management of iron concentrations in yeast cells. Iron is quite toxic, so keeping it at controlled concentrations and in various carefully-constructed complexes is important for any cell. It is used to make heme, which functions not only in hemoglobin, but in several core respiratory enzymes of mitochondria. It also gets placed into iron-sulfur clusters, which are used even more widely, in respiratory enzymes, in the DNA replication, transcription, protein synthesis, and iron assimilation machineries. It is iron's strong and flexible redox chemistry (and its ancient abundance in the rocks and fluids life evolved with) that make it essential as well as dangerous.

The authors' model for iron use and regulation in yeast cells. Outside is on the left, cytoplasm is blue, the vacuole is green, and the mitochondrion is yellow. See text below for abbreviations and description. O2 stands for the oxygen molecule. The various rate constants R refer to the transitions between each state or location.

Iron is imported from outside and forms a pool of free iron in the cytoplasm (FC, in the diagram above). From there, it can be stored in membrane-bound vacuoles (F2, F3), or imported into the mitochondria (FM), where it is incorporated into iron-sulfur clusters and heme (FS). Some of the mitochondrially assembled iron-sulfur clusters are exported back out to the cytoplasm to be integrated into a variety of proteins there (CIA). This is indeed one of the most essential roles of mitochondria- required even when metabolic respiration is for some reason not needed (in hypoxic or anaerobic conditions). If there is a dramatic overload of iron, it can build up as rust particles in the mitochondria (MP). And finally, the iron-sulfur complexes contribute to respiration of oxygen in mitochondria, and thus influence the respiration rate of the whole cell.

The task these authors set themselves was to derive a regulatory scheme using only the elements shown above, in combination with known levels of all the metabolites, under the conditions of 1) normal levels of iron, 2) low iron, and 3) a mutant condition- a defect in the yeast gene YFH1, which binds iron inside mitochondria and participates in iron-sulfur cluster assembly. A slew of differential equations later, and after selection through millions of possible regulatory circuits, they came up with the one shown above, where the red arrows indicate positive regulation, and the red lines ending with bars indicate repression. The latter is typically feedback repression, such as of the import of iron, repressed by the amount already in the cell, in the FC pool. 

They show that this model provides accurate control of iron levels at all the various points, with stable behavior, no singularities or wobbling, and the expected responses to the various conditions. In low iron, the vacuole is emptied of iron, and in the mutant case, iron nanoparticles (MP) accumulate in the mitochondrion, due in part to excess amounts of oxygen admitted to the mitochondrial matrix, which in turn is due to defects in metabolic respiration caused by a lack of iron-sulfur clusters. What seemed so simple at the outset does have quite a few wrinkles!
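To give a flavor of what this style of model looks like, here is a minimal sketch- not the authors' actual equations- of two coupled iron pools with feedback repression of import, integrated by simple Euler steps. The pool names FC and FM follow the figure, but all rate constants and functional forms here are invented for illustration:

```python
# Minimal sketch of a feedback-regulated metabolite model, loosely in the
# spirit of the paper's ODEs. FC = cytosolic free iron, FM = mitochondrial
# iron, following the figure; all rate constants are invented.

def simulate(steps=20_000, dt=0.01, import_max=1.0, K=0.5):
    FC = 0.0  # cytosolic free iron pool
    FM = 0.0  # mitochondrial iron pool
    for _ in range(steps):
        # Import into FC is repressed by FC itself (feedback repression,
        # the bar-headed lines in the diagram).
        v_import = import_max * K / (K + FC)
        v_to_mito = 0.8 * FC   # transfer FC -> FM
        v_use = 0.4 * FM       # consumption of FM into Fe-S clusters, heme
        FC += dt * (v_import - v_to_mito)
        FM += dt * (v_to_mito - v_use)
    return FC, FM

FC, FM = simulate()
```

The point of the feedback term is homeostasis: doubling import_max raises the steady-state FC pool by much less than twofold, which is exactly the kind of stable, non-wobbling control the authors select for among candidate circuits.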

The authors present their best regulatory scheme, selected from among millions, which provides accurate metabolite control in simulation, shown here as key transitions between conditions, one line per molecular species. See text and image above for abbreviations.


But note that none of this is actually biological. There are no transcription regulators, such as the AFT1/2 proteins known to regulate a large set of iron assimilation genes. There are no enzymes explicitly cited, and no other regulatory mechanisms like protein modifications, protein disposal, etc. Nor does the cytosolic level of iron actually regulate the import machinery- that is done by the level of iron-sulfur clusters in the mitochondria, as sensed by the AFT regulators, among other mechanisms.

Thus it is not at all clear what work like this has to offer. It takes the known concentrations of metabolites (which can be ascertained in bulk) to create a toy system that accurately reproduces a very restricted set of variations, limited to what the researchers could assess elsewhere, in lab experiments. It does not inform the biology of what is going on, since it is not based on the biology, and clearly even contravenes it. It does not inform diseases associated with iron metabolism- in this case Friedreich's ataxia, which is caused in humans by a defect in a gene related to YFH1- because again it is not biologically based. Knowing where some regulatory events might occur in theory, as one could have done almost as well (if not quantitatively!) on a cocktail napkin, is of little help when drugs need to be made against actual enzymes and actual regulators. It is a classic case of looking under the streetlight- working with the data one has, rather than the data one needs to do something useful.

"Like most ODE (ordinary differential equation)-based biochemical models, sufficient kinetic information was unavailable to solve the system rigorously and uniquely, whereas substantial concentration data were available. Relying on concentrations of cellular components increasingly makes sense because such quantitative concentration determinations are becoming increasingly available due to mass-spectrometry-based proteomic and metabolomics studies. In contrast, determining kinetic parameters experimentally for individual biochemical reactions remain an arduous task." ...

"The actual biochemical mechanisms by which gene expression levels are controlled were either too complicated to be employed in autoregulation, or they were unknown. Thus, we decided to augment every regulatable reaction using soft Heaviside functions as surrogate regulatory systems." ...

"We caution that applying the same strategy for selecting viable autoregulatory mechanisms will become increasing difficult computationally as the complexity of models increases."


But the larger point that motivated a review of this paper is the challenge of modeling a system so small as to be almost infinitesimal in the larger scheme of biology. If dedicated modelers, as this laboratory is, despair of getting the data they need for even such a modest system, (indeed, the mitochondrial iron- and sulfur-containing signaling compound that mediates repression of the AFT regulators is still referred to in the literature as "X-S"), then things are bleak indeed for the prospect of modeling higher levels of biology, such as whole cells. Unknowns are unfortunately gaping all over the place. As has been mentioned a few times, molecular biologists tend to think in cartoons, simplifying the relations they deal with to the bare minimum. Getting beyond that is going to take another few quantum leaps in data- the vaunted "omics" revolutions. It will also take better interpolation methods (dare one invoke AI?) that use all the available scraps of biology, not just mathematics, in a Bayesian ratchet that provides iteratively better models. 


Saturday, December 9, 2023

The Way We Were: Origins of Meiosis and Sex

Sex is as foundational for eukaryotes as are mitochondria and internal membranes. Why and how did it happen?

Sexual reproduction is a rather expensive proposition. The anxiety, the dating, the weddings- ugh! But biologically as well, having to find mates is no picnic for any species. Why do we bother, when bacteria get along just fine just dividing in two? This is a deep question in biology, with a lot of issues in play. And it turns out that bacteria do have quite a bit of something-like-sex: they exchange DNA with each other in small pieces, for similar reasons we do. But the eukaryotic form of sex is uniquely powerful and has supported the rapid evolution of eukaryotes to be by far the dominant domain of life on earth.

A major enemy of DNA-encoded life is mutation. Despite the many DNA replication accuracy and repair mechanisms, some rate of mutation still occurs, and is indeed essential for evolution. But for larger genomes, mutation outruns what replication fidelity (and purifying natural selection) can correct, so that damaging mutations build up and the lineage will inevitably die out without some help. This process is called Muller's ratchet, and is why all organisms appear to exchange DNA with others in their environment, either sporadically, like bacteria, or systematically, like eukaryotes.
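The ratchet is easy to watch in a toy simulation- a sketch, not a serious population-genetics model, with all parameters invented. Genomes accumulate deleterious mutations, selection favors the less-loaded, and the only difference between the two runs is whether offspring can recombine loci from two parents:

```python
import random

def evolve(pop_size=100, n_loci=50, mu=0.5, s=0.05, generations=200,
           recombine=False, seed=1):
    """Toy Muller's ratchet. Each genome is a list of per-locus deleterious
    mutation counts; fitness is multiplicative, (1-s) per mutation. Without
    recombination, the least-loaded class is eventually lost to drift and
    can never be rebuilt; recombination lets selection reassemble cleaner
    genomes from partly loaded parents."""
    rng = random.Random(seed)
    pop = [[0] * n_loci for _ in range(pop_size)]
    for _ in range(generations):
        weights = [(1 - s) ** sum(g) for g in pop]
        new_pop = []
        for _ in range(pop_size):
            if recombine:
                p1, p2 = rng.choices(pop, weights=weights, k=2)
                child = [p1[i] if rng.random() < 0.5 else p2[i]
                         for i in range(n_loci)]
            else:
                child = list(rng.choices(pop, weights=weights, k=1)[0])
            if rng.random() < mu:  # new deleterious mutation at a random locus
                child[rng.randrange(n_loci)] += 1
            new_pop.append(child)
        pop = new_pop
    return sum(sum(g) for g in pop) / pop_size  # mean mutation load

load_asexual = evolve(recombine=False)
load_sexual = evolve(recombine=True)
```

With these settings the asexual population's mutation load climbs steadily as the ratchet clicks, while the recombining population settles near a mutation-selection balance- the quantitative version of the argument above.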

An even worse enemy of the genome is unrepaired damage like complete (double strand) breaks in the DNA. These stop replication entirely, and are fatal. These also need to be repaired, and again, having extra copies of a genome is the way to allow these to be fixed, by processes like homologous recombination and gene conversion. So having access to other genomes has two crucial roles for organisms- allowing immediate repair, and allowing some way to sweep out deleterious mutations over the longer term.

Our ancestors, the archaea, which are distinct from bacteria, typically have circular, single-molecule genomes, in multiple copies per cell, with frequent gene conversions among the copies and frequent exchange with other cells. They routinely have five to twenty copies of their genome, and can easily repair any immediate damage using those other copies. They do not hide mutant copies as we do in a recessive allele, but rather, by gene conversion (which means replicating parts of a chromosome into other ones, piecemeal), make each genome identical over time, so that it (and the cell) is visible to selection, despite their polyploid condition. Similarly, taking in DNA from other, similar cells uses the donor cells' status as live cells (also visible to selection) to ensure that the recipients are getting high-quality DNA that can repair their own defects or correct minor mutations. All this ensures that their progeny are all set up with viable genomes, instead of genomes riddled with defects. But it comes at various costs as well, such as a constant race between getting a lethal mutation and finding the DNA that might repair it. 

Both mitosis and meiosis were eukaryotic innovations. In both, the chromosomes all line up for orderly segregation to descendants. But meiosis engages in two divisions, and features homolog synapsis and recombination before the first division of the parental homologs.

This is evidently a precursor to the process that led, very roughly 2.5 billion years ago, to eukaryotes, but it is all done on a piecemeal basis, nothing like what we do now as eukaryotes. To get to that point, the following innovations needed to happen:

  • Linearized genomes, with centromeres and telomeres, and more than one chromosome.
  • Mitosis to organize normal cellular division, where multiple chromosomes are systematically lined up and distributed 1:1 to daughter cells, using extensive cytoskeletal rearrangements and regulation.
  • Mating with cell fusion, where entire genomes are combined, recombined, and then reduced back to a single complement, and packaged into progeny cells.
  • Synapsis, as part of meiosis, where the parental homologs are all lined up and deliberately damaged to initiate DNA repair and crossing-over.
  • Meiosis division one, where the now-recombined parental homologs are separated.
  • Meiosis division two, which largely follows the same mechanisms as mitosis, separating the reshuffled and recombined sister chromatids.

This is a lot of novelty on the path to eukaryogenesis, and is just a portion of the many other innovations that happened in this lineage. What drove all this, and what were some plausible steps in the process? The advent of true sex generated several powerful effects:

  1. A definitive solution to Muller's ratchet, by exposing every locus in a systematic way to partial selection and sweeping out deleterious mutations, while protecting most members of the population from those same mutations. Continual recombination of the parental genomes allows beneficial mutations to separate from deleterious ones and be differentially preserved.
  2. Mutated alleles are partially, yet systematically, hidden as recessive alleles, allowing selection when they come into homozygous status, but also allowing them to exist for a limited time to buffer the mutation rate and to generate new variation. This vastly increases accessible genetic variation.
  3. Full genome-length alignment and repair by crossing over is part of the process, correcting various kinds of damage and allowing accurate recombination across arbitrarily large genomes.
  4. Crossing over during meiotic synapsis mixes up the parental chromosomes, allowing true recombination among the parental genomes, beyond just the shuffling of the full-length chromosomes. This vastly increases the power of mating to sample genetic variation across the population, and generates what we think of as "species", which represent more or less closed interbreeding pools of genetic variants that are not clones but diverse individuals.

The time point of 2.5 billion years ago is significant because this is the general time of the great oxidation event, when cyanobacteria were finally producing enough oxygen by photosynthesis to alter the geology of earth. (However, our current level of atmospheric oxygen did not come about until almost two billion years later, with the rise of land plants.) While this mainly prompted the logic of acquiring mitochondria, either to detoxify oxygen or use it metabolically, some believe that it is relevant to the development of meiosis as well. 

There was a window of time when oxygen was present, but the ozone layer had not yet formed, possibly generating a particularly mutagenic environment of UV irradiation and reactive oxygen species. Such higher mutagenesis may have pressured the archaea mentioned above to get their act together- to not distribute their chromosomes so sporadically to offspring, to mate fully across their chromosomes, not just pieces of them, and to recombine / repair across those entire mated chromosomes. In this proposal, synapsis, as seen in meiosis I, had its origin in a repair process that solved the problem of large genomes under mutational load by aligning them more securely than previously. 

It is notable that one of the special enzymes of meiosis is Spo11, which induces the double-strand breaks that lead to crossing-over, recombination, and the chiasmata that hold the homologs together during the first division. This DNA damage happens at quite high rates all over the genome, and is programmed, via the structures of the synaptonemal complex, to favor crossing-over between (parental) homologs rather than between duplicate sister chromatids. Such intensive repair, while now aimed at ensuring recombination, may have originally had other purposes.

Alternately, others suggest that it is larger genome size that motivated this innovation. The origin of eukaryotes involved many gene duplication events that ramified the capabilities of the symbiotic assemblage. Such gene duplications would naturally lead to recombinational errors in traditional gene conversion models of bacterial / archaeal genetic exchange, so there was pressure to generate a more accurate whole-genome alignment system that confined recombination to the precise homologs of genes, rather than to any similar relative that happened to be present. This led to the synapsis that currently is part of meiosis I, but is also part of "parameiosis" systems in some eukaryotes, which, while clearly derived, might resemble primitive steps to full-blown meiosis.

It has long been apparent that the mechanisms of meiosis division one are largely derived from (or related to) the mechanisms used for mitosis, via gene duplications and regulatory tinkering. So these processes (mitosis and the two divisions of meiosis) are highly related, and may have arisen as a package deal (along with linear chromosomes) during the long and murky road from the last archaeal ancestor to the last common eukaryotic ancestor- an ancestor that possessed a much larger suite of additional innovations, from mitochondria to nuclei, mitosis, meiosis, the cytoskeleton, introns / mRNA splicing, peroxisomes, and other organelles.  

Modeling of different mitotic/meiotic features. All cells modeled have 18 copies of a polyploid genome, with a newly evolved process of mitosis. Green = addition of crossing over / recombination of parental chromosomes, but no chromosome exchange. Red = chromosome exchange, but no crossing over. Blue = both crossing over and chromosome exchange, as occurs now in eukaryotes. The Y axis is fitness / survival and the X axis is time in generations after the start of modeling.

A modeling paper points to the quantitative benefits of mitosis when combined with the meiotic suite of innovations. They suggest that in a polyploid archaeal lineage, the establishment of mitosis alone would have had revolutionary effects, ensuring accurate segregation of all the chromosomes, and that this would have enabled differentiation among those polyploid chromosome copies, since each would be faithfully transmitted individually to offspring (assuming all, instead of one, were replicated and transmitted). Thus they could develop into different chromosomes, rather than remain copies. This would, as above, encourage meiosis-like synapsis over the whole genome to align all the (highly similar) genes properly.

"Modeling suggests that mitosis (accurate segregation of sister chromosomes) immediately removes all long-term disadvantages of polyploidy."

Additional modeling of the meiotic features of chromosome shuffling, and recombination between parental chromosomes, indicates (shown above) that these are highly beneficial to long-term fitness, which can rise instead of decaying with time, per the various benefits of true sex as described above. 

The field has definitely not settled on one story of how meiosis (and mitosis) evolved, and these ideas and hypotheses are tentative at this point. But the accumulating findings that the archaea most closely resembling the root of the eukaryotic (nuclear) tree have many of the needed ingredients- active cytoskeletons, a variety of molecular antecedents of ramified eukaryotic features, and now extensive polyploidy to go with gene conversion and DNA exchange with other cells- make the momentous gap from archaea to eukaryotes somewhat narrower.


Saturday, November 25, 2023

Are Archaea Archaic?

It remains controversial whether the archaeal domain of life is 1 or 4.5 billion years old. That is a big difference!

Back in the 1970's, the nascent technologies of molecular analysis and DNA sequencing produced a big surprise- that hidden in the bogs and hot springs of the world are micro-organisms so extremely different from known bacteria and protists that they were given their own domain on the tree of life. These are now called the archaea, and in addition to being deeply different from bacteria, they were eventually found to be the progenitors of the eukaryotic cell- the third (and greatest!) domain of life that arose later in the history of the biosphere. The archaeal cell contributed most of the nuclear, informational, membrane management, and cytoskeletal functions, while one or more assimilated bacteria (most prominently the future mitochondrion and chloroplast) contributed most of the metabolic functions, as well as membrane lipid synthesis and peroxisomal functions.

Carl Woese, who discovered and named archaea, put his thumb heavily on the scale with that name, (originally archaebacteria), suggesting that these new cells were not just an independent domain of life, totally distinct from bacteria, but were perhaps the original cell- that is, the LUCA, or last universal common ancestor. All this was based on the sequences of rRNA genes, which form the structural and catalytic core of the ribosome, and are conserved in all known life. But it has since become apparent that sequences of this kind, which were originally touted as "molecular clocks", or even "chronometers", are nothing of the kind. They bear the traces of mutations that happen along the way, and, being highly important and conserved, do not track the raw mutation rate, (which itself is not so uniform either), but rather the rate at which change is tolerated by natural selection. And this rate can be wildly different at different times, as lineages go through crises, bottlenecks, adaptive radiations, and whatever else happened in the far, far distant past.

Carl Woese, looking over films of 32P-labeled ribosomal RNA fragments from different species, after size separation by electrophoresis. This is how RNAs were analyzed back in 1976, and such rough analysis already suggested that archaea were something very different from bacteria.

There since has been a tremendous amount of speculation, re-analysis, gathering of more data, and vitriol in the overall debate about the deep divergences in evolution, such as where eukaryotes come from, and where the archaea fit into the overall scheme. Compared with the rest of molecular biology, where experiments routinely address questions productively and efficiently due to a rich tool chest and immediate access to the subject at hand, deep phylogeny is far more speculative and prone to subjective interpretation, sketchy data, personal hobbyhorses, and abusive writing. A recent symposium in honor of one of its more argumentative practitioners made that clear, as his ideas were being discarded virtually at the graveside.

Over the last decade, estimates of the branching date of archaea from the rest of the tree of life have varied from 0.8 to 4.5 Gya (billion years ago). That is a tremendous range, and is a sign of the difficulty of this field. The frustrations of doing molecular phylogeny are legion, just as the temptations are alluring. First, there are very few landmarks in the fossil record to pin all this down. There are stromatolites from roughly 3.5 Gya, which pin down the first documented life of any kind. Second are eukaryotic fossils, which start, at the earliest, about 1.5 Gya. Other microbial fossils pin down occasional sub-groups of bacteria, but archaea are not represented in the fossil record at all, being hardly distinguishable from bacteria in their remains. Then we get the Cambrian explosion of multicellular life, roughly 0.5 Gya. That is pretty much it for the fossil record, aside from the age of the moon, which is about 4.5 Gya and gives us the baseline of when the earth became geologically capable of supporting life of any kind.

The molecules of living organisms, however, form a digital record of history. Following evolutionary theory, each organism descends from others, and carries, in mutated and altered form, traces of that history. We have parts of our genomes that vary with each generation, (useful for forensics and personal identification), we have other parts that show how we changed and evolved from other apes, and we have yet other areas that vary hardly at all- that carry recognizable sequences shared with all other forms of life, and presumably with LUCA. This is a real treasure trove, if only we can make sense of it.

But therein lies the rub. As mentioned above, these deeply conserved sequences are hardly chronometers. So for all the data collection and computer wizardry, the data itself tells a mangled story. Rapid evolution in one lineage can make it look much older than it really is, confounding the whole tree. Over the years, practitioners have learned to be as judicious as possible in selecting target sequences, while getting as many as possible into the mix. For example, adding up the sequences of 50-odd ribosomal proteins gives more, and more diverse, data than assembling the two long-ish ribosomal RNAs. But the proteins have their problems as well, since some are much less conserved than others, and some were lost or gained along the way. 
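To see why raw sequence comparison misleads at these depths: observed differences saturate as substitutions overwrite each other, so a corrected distance must be inferred through a model. Here is a sketch using the classic Jukes-Cantor formula, the simplest such correction (the example sequences are made up). Crucially, any such correction assumes something about the substitution process- and a uniform rate is exactly what deep, turbulent lineages violate:

```python
import math

def p_distance(seq1, seq2):
    """Observed fraction of differing sites between two aligned sequences."""
    assert len(seq1) == len(seq2)
    return sum(a != b for a, b in zip(seq1, seq2)) / len(seq1)

def jukes_cantor(p):
    """Estimated substitutions per site given an observed difference
    fraction p, assuming all substitutions are equally likely. Valid only
    for p < 0.75; as p approaches 0.75 (random similarity), the estimate
    diverges- deep branches become effectively unmeasurable."""
    return -0.75 * math.log(1 - 4 * p / 3)

p = p_distance("GATTACAGATTACA", "GATCACAGTTTACA")
d = jukes_cantor(p)  # always >= p: some changes are hidden by later ones
```

A lineage that evolved rapidly for a while looks exactly like a lineage that evolved slowly for much longer- the corrected distance alone cannot tell them apart, which is the crux of the archaeal dating dispute.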

A partisan of the later birth of archaea provides a phylogenetic tree with countless microbial species, and one bold claim: "inflated" distances to the archaeal and eukaryotic stems. This is given as the reason that the archaea (lower part of the diagram, including eukaryotes, termed "archaebacteria") look very ancient, but really just sped away from their originating bacterial parent, (the red bacteria), estimated at about 1 Gya. This tree is based on an aligned concatenation of 26 universally conserved ribosomal protein sequences, (51 from eukaryotes), with custom adjustments.

So there has been a camp that claims that the huge apparent / molecular distance between the archaea and other cells is just such a chimera of fast evolution. Just as the revolution that led to the eukaryotic cell involved a lot of molecular change, including the co-habitation of countless proteins that had never seen each other before, duplications / specializations, and many novel inventions, whatever process led to the archaeal cell (from a pre-existing bacterial cell) might also have caused the key molecules we use to look into this deep time to mutate much more rapidly than is true elsewhere in the vast tree of life. What are the reasons to believe this? There is the general disbelief / unwillingness to accept someone else's work, and evidence like possible horizontal transfers of genes from chloroplasts to basal archaea, some large sequence deletion features that can be tracked through these lineages and interpreted to support late origination, some papering over of substantial differences in membrane and metabolic systems, and there are plausible (via some tortured logic) candidates for an originating, and late-evolving, bacterial parent. 

This thread of argument puts the origin of eukaryotes roughly at 0.8 Gya, which is, frankly, uncomfortably close to the origination of multicellular life, and gives precious little time for the bulk of eukaryotic diversity to develop, which exists largely, as shown above, at the microbial level. (Note that "Animalia" in the tree above is a tiny red blip among the eukaryotes.) All this is quite implausible, even to a casual reader, and makes this project hard to take seriously, despite its insistent and voluminous documentation.

Parenthetically, there was a fascinating paper that used the evolution of the genetic code itself to make a related point, though without absolute time attributions. The code bears hallmarks of some amino acids being added relatively late (tryptophan, histidine), while others were foundational from the start (glycine, alanine), when it may have consisted of two RNA bases (or even one) rather than three. All of this took place long before LUCA, naturally. This broad analysis of genetic code usage argued that bacteria tend to use a more ancient subset of the code, which may reflect their significantly more ancient position on the tree of life. While the full code was certainly in place by the time of LUCA, there may still have been, at this time, a bias in the inherited genome / pool of proteins against the relatively novel amino acids. This finding implies that the time of archaeal origination was later than the origination of bacteria, by some unspecified but significant amount.
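The statistic involved is simple to sketch. The early/late assignments below follow the text above- glycine and alanine early, tryptophan and histidine late- but the sequences are toy examples; the real study used whole proteomes and a fuller recruitment ordering:

```python
EARLY = set("GA")  # glycine, alanine: ancient additions to the code
LATE = set("WH")   # tryptophan, histidine: late additions

def late_to_early_ratio(proteome):
    """Ratio of late-recruited to early-recruited amino acid usage across
    a collection of protein sequences (one-letter codes). A lower ratio
    suggests a proteome biased toward the older part of the code."""
    early = late = 0
    for seq in proteome:
        for aa in seq:
            if aa in EARLY:
                early += 1
            elif aa in LATE:
                late += 1
    return late / early if early else float("nan")

ratio = late_to_early_ratio(["MGAGAWHAG", "GGAAHW"])
```

The argument then compares such ratios between bacterial and archaeal proteomes: a consistently lower ratio in bacteria would suggest their proteins crystallized earlier in the code's expansion.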

So, attractive as it would be to demote the archaea from their perch as super-ancient organisms, given their small sizes, small genomes, specialization in extreme environments, and peripheral ecological position relative to bacteria, that turns out to be difficult to do. I will turn, then, to a very recent paper that gives what I think is a much more reasoned and plausible picture of the deeper levels of the tree of life, and the best general picture to date. This paper is based on the protein sequences of the rotary ATPases, which are universal, and were present in LUCA despite their significant complexity. Indeed, the more we learn about LUCA, the more complete and complex this ancestor turns out to be. Our mitochondria use a (bacterial) F-type ATPase to synthesize ATP from the food-derived proton gradient. Our lysosomes use an (archaeal) V-type ATPase to drive protons into / acidify the lysosome in exchange for ATP. These are related, derived from one distant ancestor, and apparently each was present in LUCA. Additionally, each ATPase is composed of two types of subunits, one catalytic and one non-catalytic, which originated from an ancient protein duplication, also prior to LUCA. The availability of these molecular cousins / duplications provides helpful points of comparison throughout, particularly for locating the root of the evolutionary tree.

Phylogenetic trees based on ATP synthase enzymes that are present in all forms of life. On left is shown the general tree, with branch points of key events / lineages. On right are shown sub-trees for the major types of the ATP synthase, whether catalytic subunit (c), non-catalytic (n), F-type, common in bacteria, or V type, common in archaea. Note how congruent these trees are. At bottom right in the tiny print is a guide to absolute time, and the various last common ancestors.

This paper also works quite hard to pin the molecular data to the fossil and absolute time record, which is not always provided. The bottom line is that by this tree, the archaea arise quite early, (see above), coincident with or within about 0.5 Gy of LUCA, which was bacterial, at roughly 4.4 Gya. The bacterial and archaeal last common ancestors are dated to 4.3 and 3.7 Gya, respectively. The (fused) eukaryotic last common ancestor dates to about 1.9 Gya, with the proto-mitochondrion's individual last common ancestor among the bacteria some time before that, at roughly 2.4 Gya. 

This timeline makes sense on many fronts. First, it provides a realistic time frame for the formation and diversification of eukaryotes. It puts their origin right around the great oxidation event, which is when oxygen became dominant in earth's atmosphere, (about 2 to 2.4 Gya), which was a precondition for the usefulness of mitochondria to what are otherwise anaerobic archaeal cells. It places the origin of archaea (LACA) a substantial stretch after the origin of bacteria, which agrees with the critics' points above that bacteria are the truly basal lineage of all life, and that archaea, while highly different and pretty archaic, also share a lot of characteristics with bacteria, perhaps more so with certain early lineages than with others that came later. The distinction between LUCA and the last common bacterial ancestor (LBCA) is a technical one given the trees they were working from, and the two are not, given the ranges of age presented, (see figure above), significantly different.

I believe this field is settling down, and though this paper, working from only a subset of the most ancient sequences plus fossil set-points, is hardly the last word, it appears to represent a consensus view, and is the best picture to date of the most significant waypoints in the deep history of life. This is what comes from looking through microscopes, and finding entire invisible worlds that we had no idea existed. Genetic sequencing is another level over that of microscopy, looking right at life's code, and at its history, if darkly. What we see in the macroscopic world around us is only the latest act in a drama of tremendous scale and antiquity.


Sunday, November 12, 2023

Missing Links in Eukaryotic Evolution

The things you find in Slovenian mud! Like an archaeal cell that is the closest thing to the eukaryotic root organism.

Creationists and "intelligent" design advocates tirelessly point to the fossil record. Not to how orderly it is, or how it reveals the astonishingly sequential, slow, and relentless elaboration of life. No, they decry its gaps- places where fossils do not account for major evolutionary (er, designed) transitions to more modern forms. It is a sad kind of argument, lacking in imagination and dishonest in its unfairness and hypocrisy. Does the life of Jesus have gaps in the historical record? Sure enough! And are those historical records anywhere near as concrete and informative as fossils? No way. What we have as a record of Christianity's history is riven with fantasy, forgery, and uncertainty.

But enough trash talk. One thing that science has going for it is a relentlessly accumulating process by which new fossils appear, and new data from other sources, like newly found organisms and newly sequenced genomes, arise to clarify what were only imaginative (if reasonable) hypotheses previously. Darwin's theory of evolution, convincing and elegantly argued as it was originally, has gained such evidence without fail over the subsequent century and a half, from discoveries of the age of the earth (and thus the solar system) to the mechanics of genetic inheritance.

A recent paper describes the occurrence of cytoskeletal proteins and structures in an organism that is neither a bacterium nor a eukaryote, but appears to be within the family of Archaea that is the closest thing we have to the eukaryotic progenitor. These are the Asgard Archaea, a family discovered only in the last decade, as massive environmental sequencing projects have sampled the vast genetic diversity hidden in the muds, sediments, soils, rocks, and waters of the world. 

Sampling stray DNA is one thing, but studying these organisms in depth requires growing them in the lab. After trolling through the same muds in Slovenia where promising DNA sequences were found, this group fished out, and then carefully cultured, a novel archaeal cell. But growing these cells is notoriously difficult. They are anaerobic, never having made the transition to the oxygenated atmosphere of the later earth. They have finicky nutritional requirements. They grow very slowly. And they generally have to live with other organisms with which they have reciprocal metabolic relationships. In the ur-eukaryote, this was a relationship with the proto-mitochondrion, which was later internalized. For the species cultured by this research group, it is a pair of other free-living partners: one a relative of the sulfur-reducing bacterium Desulfovibrio, the other a relative of the archaeal methanogen Methanogenium, which uses hydrogen and CO2 or related simple carbon compounds to make methane. Anaerobic Asgard archaea generally have relatively simple metabolisms and make hydrogen from small organic compounds, through a kind of fermentation.

A phylogenetic tree showing relations between the newly found organisms (bottom) and eukaryotes (orange), other archaea, and the entirely separate domain of bacteria (red). This is based on a set of sequences of universally used / conserved ribosomal proteins. While the eukaryotes have strayed far from the root, that root is extremely close to some archaeal groups.

Micrographs of cultured lokiarchaeal cells, with a scale bar of 500 nanometers. These are rather amoeboid cells with extensive cytoskeletal and membrane regulation.

Another micrograph of part of a lokiarchaeal cell, showing not just its whacky shape, but a good bit of internal structure as well. The main scale bar is 100 nanometers. There are internal actin filaments (yellow arrowheads), lined up ribosomes (gray arrowhead) and cell surface proteins of some kind (blue arrowheads).

What they found after all this was pretty astonishing. They found cells that are quite unlike typical bacterial or even archaeal cells, which are compact round or rod shapes. These (termed lokiarchaeal) cells have luxurious processes extending all over the place, and a profusion of internal structural elements reminiscent of eukaryotic cells, though without membrane-bound internal organelles. But they have membrane-bound protrusions and what look like vesicles budding off. At only six million base pairs (compared to our three billion) and under five thousand genes, these cells have a small and streamlined genome. Yet it encodes a large number (258) of eukaryotic-related "signature" proteins (outlined below), particularly ones involved in cytoskeletal and membrane-trafficking functions. The researchers delved into the subcellular structures, labeling actin and obtaining structural data for both actin and ribosomes, confirming their archaeal affinity, with added features. 

A schematic of eukaryotic-like proteins in the newly cultured lokiarchaeal Asgard genome. Comparison (blue) is to a closely related organism isolated recently in Japan.


This work is the first time the cytoskeleton of Asgard cells has been visualized, along with its role in their amoeboid capabilities. What is it used for? That remains unknown. The lush protrusions may mediate contact with this organism's metabolic partners, or be used for sensing and locomoting to find new food within its sediment habitat, or for interacting with fellow lokiarchaeal cells, as shown above. Or all of these roles. Evolutionarily, this organism, while modern, appears to be a descendant of the closest thing we have to the missing link at the origin of eukaryotes (that is, the archaeal dominant partner of the founding symbiosis), and in that sense seems both ancient in its characteristics and possibly little changed from that time. Who would have expected such a thing? Well, molecular biologists and evolutionary biologists have been expecting it for a long time.


  • Fossil fuel consumption is still going up, not down.

Saturday, October 21, 2023

One Pump to Rule ... a Tiny Vesicle

Synaptic vesicles are powered by a single pump that has two speeds- on and off.

While some neural circuits are connected by direct electrical contact, via membrane pores, most use a synapse, where the electrical signal stops and is converted into the secretion of a neurotransmitter molecule, which crosses to the next cell, where receptors pick it up and boot up a new electrical signal. A slow and primitive system, doubtless thanks to some locked-in features of our evolutionary history. But it works, thanks to a lot of improvements and optimization over the eons.

The neurotransmitters, of which there are many types, sit ready and waiting at the nerve terminals in synaptic vesicles, which are tiny membrane bags that are specialized to hold high concentrations of their designated transmitter, and to fuse rapidly with the (pre-) synaptic membrane of their nerve terminal, to release their contents when needed, into the synaptic cleft between the two neurons. While the vesicle surfaces are mostly composed of membranes, it is the suite of proteins on their surfaces that provide all the key functions, such as transport of neurotransmitters, sensing of the activating nerve impulse (voltage), fusing with the plasma membrane, and later retrieval of the fused membrane patches/proteins and recycling into new synaptic vesicles.

Experimental scheme- synaptic vesicles are loaded with a pH-sensitive fluorescent dye that tells how the V-ATPase (pink) is doing pumping protons in, powered by ATP from the cytoplasm. The proton gradient is then used by the other transporters in the synaptic vesicle (brown) to load it with its neurotransmitter.

The neurotransmitters of whatever type are loaded into synaptic vesicles by proton antiporter pumps. That is, one or two protons are pumped out in exchange for a molecule of the transmitter being pumped in. They are all proton-powered. And there is one source of that power, an ATP-using proton pump called a V-type ATPase. These ATPases are deeply related to the F-type ATP synthase that does the opposite job, in mitochondria, making ATP from the proton gradient that mitochondria set up from our oxygen-dependent respiration / burning of food. Both are rotors, which spin around as they carefully let protons go by, while a separate domain of the protein- attached via stator and rotor segments- makes or breaks down ATP, depending on the direction of rotation. Both enzymes can go in either direction, as needed, to pump protons either in or out, and traverse the reaction ADP <=> ATP. It is just an evolutionary matter of duplication and specialization that the V-type and F-type enzymes have taken separate paths and turn up where they do.
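To put rough numbers on the gradient the V-ATPase sets up, here is a back-of-envelope calculation (mine, not the paper's; the pH and voltage values are typical textbook figures for acidified secretory vesicles, not measurements from this study) of the free energy stored per proton, which the antiporters then spend to concentrate neurotransmitter:

```python
# Free energy per proton in a vesicle's electrochemical gradient:
# an electrical term (membrane potential) plus a chemical term (delta pH).
# Assumed illustrative values: cytoplasm pH 7.2, vesicle lumen pH 5.7,
# membrane potential ~ +80 mV (lumen positive).

F = 96485.0   # Faraday constant, C/mol
R = 8.314     # gas constant, J/(mol*K)
T = 310.0     # body temperature, K

def proton_gradient_energy(delta_psi_mV, pH_out, pH_in):
    """kJ/mol available per proton moving out of the vesicle lumen."""
    electrical = F * (delta_psi_mV / 1000.0)       # J/mol from voltage
    chemical = 2.303 * R * T * (pH_out - pH_in)    # J/mol from delta pH
    return (electrical + chemical) / 1000.0        # total, kJ/mol

print(f"~{proton_gradient_energy(80.0, 7.2, 5.7):.1f} kJ/mol per proton")
```

With these assumed values the two terms contribute comparably, on the order of 16-17 kJ/mol in total, which is why one or two protons suffice to drive each transmitter molecule in against its concentration gradient.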

Intriguingly, synaptic vesicles are each served by a single V-type ATPase. One is enough. That means one molecule has to respond flexibly to a variety of loads, from the initial transmitter loading, to occasional replenishment, to lots of sitting around. A recent paper discussed the detailed function of the V-type ATPase, especially how it handles partial loads and resting states. For the vesicles spend most of their time full, waiting for the next nerve impulse to come along. The authors find that this ATPase switches between three states- pumping, resting, and leaking. 

Averaging over many molecules/vesicles, the V-type ATPase pump operates as expected. Add ATP, and it acidifies its vesicle. The Y-axis is the fluorescent signal of proton accumulation in the vesicle. Then when a poison of the ATPase is added (bafilomycin), the gradient dissipates in a few minutes.

They isolate synaptic vesicles directly from rat brains and then fuse them with smaller experimental vesicles that contain a fluorescent tracer that is sensitive to pH- just the perfect way to monitor what is going on in each vesicle, given a powerful enough microscope. The main surprise was the stochastic nature of the performance of single pumps. Comparing the average of hundreds of vesicles (above) with a trace from a single vesicle (below) shows a huge difference. The single vesicle comes up to full acidity, but then falls back for long stretches of time. These vesicles are properly loaded and maintained on average, but individually, they are a mess, falling back to pH / chemical baseline with alarming frequency.


On the other hand, at the single molecule level, the pump is startlingly stochastic. Over several hours, it pumps its vesicle full of protons, then quits, then restarts several times.

The authors checked that the protons had no other way out that would look like these stochastic unloading events, concluding that the background loss of protons was monotonic, and thus due to general leakage, not to some other channel that occasionally opens to let out a flood of protons. They then added an inhibitor that blocks the V-ATPase, which showed that the particularly (and peculiarly) rapid events of proton leakage come from the V-ATPase itself, not general membrane leakage. They have a hard time explaining this, discounting various theories, such as that it represents ATP synthesis (a backwards reaction, in the face of overwhelming ratios of ATP/ADP in their experiment), that the inactive mode of the pump switches to a leakage mode, or that the pump naturally leaks a bit while operating in the forward direction. It appears that only while the pump is on and churning through ATP can it occasionally fail catastrophically and leak out a flood of protons. But then it can go on as if nothing had happened, and either keep pumping or take a rest break.

Regulation by ATP is relatively minor, with a flood of ATP helping keep the pump more active longer. But physiological concentrations tend to be stable, so not very influential for pumping rates. These are two separate individual pumps/vesicles shown, top and bottom. It is good to see the control- the first segment of time when no ATP was present and the pump could not run at all. But then look at the bottom middle trace- plenty of ATP, but nothing going on- very odd. Lastly, the sudden unloading seen in some of these traces (bottom right) is attributed to an extremely odd leakage state of the same V-ATPase pump. Not something you want to see, generally.

The main finding is that this pump has quite long dwell times (about 3 minutes) under optimal conditions, switching on this time scale between active pumping and an inactive resting state. The pumping dwell time is regulated mostly not by the ambient ATP concentration, but by the proton gradient, which is expressed as some combination of the charge differential across the vesicle membrane and the relative proton concentration (the chemical gradient). It is a bit like a furnace, which has only two speeds- on or off- though in this case the thermostat is pretty rough. They note that other researchers have found that synaptic vesicles carry quite variable amounts of transmitter, which must derive from the pump variability seen here. But averaged over the many vesicles fused during each neuronal firing, this probably isn't a big deal.
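The furnace analogy can be made concrete with a toy simulation (an illustrative sketch, not the authors' model; all rates here are invented): a pump that flips between pumping and resting with exponentially distributed ~3-minute dwells, filling a vesicle that leaks passively. A single trace swings wildly, while an average over many vesicles looks steady- the same contrast seen between the single-vesicle and averaged experimental traces.

```python
# Toy two-state ("furnace") pump model. Fill level is normalized to [0, 1];
# fill_rate and leak_rate are invented illustrative values, not from the paper.
import random
import statistics

def simulate(seconds=3600, dwell=180.0, fill_rate=0.02, leak_rate=0.004, seed=None):
    rng = random.Random(seed)
    level, pumping = 0.0, True
    next_switch = rng.expovariate(1.0 / dwell)   # exponential dwell times
    trace = []
    for t in range(seconds):
        if t >= next_switch:                     # stochastic on/off switching
            pumping = not pumping
            next_switch = t + rng.expovariate(1.0 / dwell)
        level += (fill_rate if pumping else 0.0) - leak_rate * level
        level = max(0.0, min(1.0, level))        # clamp to physical range
        trace.append(level)
    return trace

single = simulate(seed=1)                        # one vesicle: erratic
ensemble = [sum(levels) / 200 for levels in
            zip(*(simulate(seed=s) for s in range(200)))]  # 200 vesicles: smooth
print(statistics.pvariance(single), statistics.pvariance(ensemble))
```

The ensemble variance comes out far smaller than the single-trace variance, which is the whole point: population averages hide the individual stop-and-go behavior.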

The behavior of this pump is a bit weird, however, since most machines we are familiar with show more gradual breakdowns under stress, straining and slowing down. But here, the pump just decides to shut down for long periods of time, generally when the vesicle is fully charged up, but sometimes when it is not. It is a reminder that we are near the quantum level here, dealing with molecules that are very large by molecular standards but still operate at the atomic scale, particularly at the key choke points of this kind of protein, which surely involve subtle shifts of just a few atoms to impart this regulatory switch from active to inactive. What is worse, the pump sometimes freaks out completely and, while in its on state, switches to a leaking state that lets out protons ten times faster than passive leakage through the rest of the vesicle membrane. The authors naturally urge deeper structural studies of what might be going on!


Saturday, September 9, 2023

Keeping Cellular Signals Straight

Cells often use the same signaling systems for different inputs and purposes. Scaffolds come to the rescue.

Eukaryotic cells are huge, at least compared with their progenitors, bacteria. Thanks to their mitochondria and other organizational apparatus, the typical eukaryotic cell has about 100,000 times the volume of a bacterium. These cells are virtual cities- partially organized by their membrane organelles, but with much more going on, and tremendous complexity to manage. One issue that was puzzling over the decades was how signals were kept straight. Eukaryotic cells use a variety of signaling systems, prototypically starting with a receptor at the cell surface, linking to a kinase (or series of kinases) that then amplifies and broadcasts the signal inside the cell, ending with the target phosphorylated proteins entering the nucleus and changing the transcription program of the cell. 

While our genome does have roughly 500 kinases, and one to two thousand receptors, a few of them (especially some kinases and their partners, which form "intracellular signaling systems") tend to crop up frequently in different systems and cell types, like the MAP kinase cascade, associated with growth and stress responses, and the AKT kinase, associated with nutrient sensing and growth responses. Not only do many different receptors turn these cellular signaling hubs on, but their effects can often be different as well, even from unrelated signals hitting the same cell.

If all these proteins diffused freely all over the cell, such specificity of signaling would be impossible. But it turns out that they are usually tethered in particular ways, by organizational helpers called scaffold proteins. These scaffolds may localize the signaling to some small volume within the larger cell, such as a membrane "raft" domain. They may also bind multiple actors of the same signaling cascade, bringing several proteins (kinases and targets) together to make signaling both efficient and (sterically) insulated from outside interference. And, as a recent paper shows, they can also tweak their binding targets allosterically, to the same insulating effect.

What is allosteric, as opposed to steric, regulation? If one protein (A) binds another (B) such that a phosphorylation or other site is physically hidden from other proteins, such as a kinase (C) that would activate it, that site is said to be sterically hidden- that is, by geometry alone. On the other hand, if that site remains free and accessible, but the binding of A rearranges protein B such that it no longer binds C very well, blocking the kinase event despite the site of phosphorylation being available, then A has allosterically regulated B. It has altered the shape of B in some subtle way that alters its behavior. While steric effects are dominant and occur everywhere in protein interactions and regulation, allostery comes up pretty frequently as well, proteins being very flexible gymnasts. 

GSK3 is part of insulin signaling. It is turned off by phosphorylation, which affects a large number of downstream functions, such as turning on glycogen synthase.

The current case turns on the kinase GSK3, which, according to Wikipedia... "has been identified as a protein kinase for over 100 different proteins in a variety of different pathways. ... GSK-3 has been the subject of much research since it has been implicated in ... diseases, including type 2 diabetes, Alzheimer's disease, inflammation, cancer, addiction and bipolar disorder." GSK3 was named for its kinase activity targeting glycogen synthase; this phosphorylation inactivates the synthase, shutting down production of glycogen, which is a way to store sugar for later use. Connected with this homeostatic role, the hormone insulin turns GSK3 off via phosphorylation, through a pathway downstream of the membrane-resident insulin receptor: the PI3 kinase / protein kinase B pathway. Insulin thus indirectly increases glycogen synthesis, mopping up excess blood sugar. The circuit reads: insulin --> kinases --| GSK3 --| glycogen synthase --> more glycogen.
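The sign conventions in that circuit can be checked mechanically: each activating arrow (-->) preserves the sign of an upstream change, and each inhibiting arrow (--|) flips it, so the net effect of a chain is just the product of its edge signs. A trivial sketch (mine, not anything from the paper):

```python
# Each edge in a signaling chain either activates (+1, "-->") or
# inhibits (-1, "--|"); the net effect of the chain is the product.
def net_effect(edges):
    sign = 1
    for edge in edges:
        sign *= edge
    return sign

# insulin --> kinases --| GSK3 --| glycogen synthase --> glycogen:
# the two inhibitions cancel, so insulin raises glycogen synthesis.
insulin_to_glycogen = net_effect([+1, -1, -1, +1])
print("+" if insulin_to_glycogen > 0 else "-")  # prints "+"
```

Double inhibition resolving to net activation is a recurring motif in signaling circuits, and this is why.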

GSK3 also functions in this totally different pathway, downstream of WNT and Frizzled. Here, GSK3 phosphorylates beta-catenin and turns it off, most of the time. WNT (like insulin) turns GSK3 off, which allows beta-catenin to accumulate and do its gene regulation in the nucleus. Cross-talk between these pathways would be very inopportune, and is prevented by the various functions of Axin, a scaffold protein. 


Another well-studied role of GSK3 is in a developmental signal called WNT, which promotes developmental decisions of cells during embryogenesis, wound repair, cancer, cell migration, proliferation, etc. GSK3 is central here for the phosphorylation of beta-catenin, which is a transcription regulator, among other things, and when active migrates to the nucleus to turn its target genes on. But when phosphorylated, beta-catenin is instead diverted to the proteasome and destroyed. This is the usual state of affairs, with WNT inactive, GSK3 active, and beta-catenin constantly made and then immediately disposed of. The complex responsible is called the "destruction" complex. But an incoming WNT signal, typically from neighboring cells carrying out some developmental program, alters the activity of a key scaffold in this pathway, Axin, which is destroyed and replaced by Dishevelled, which turns GSK3 off.

How is GSK3 kept on all the time for the developmental purposes of the WNT pathway, while cells remain responsive to insulin and other signals that also use GSK3 for their intracellular transmission? The current authors found that the Axin scaffold has the special property of allosterically preventing phosphorylation of its bound GSK3 by other upstream signaling systems. They even pared Axin down to a minimal 26-amino-acid fragment that binds GSK3, and this still performed the inhibition, showing that the binding doesn't sterically block phosphorylation by insulin signals, but blocks it allosterically. 

That is great, but what about the downstream connections? Keeping GSK3 on is great, but doesn't that make a mess of the other pathways it participates in? This is where scaffolds have a second job, which is to bring upstream and downstream components together, to keep the whole signal flow isolated. Axin also binds beta-catenin, the GSK3 substrate in WNT signaling, keeping everything segregated and straight. 

Scaffold proteins may not "do" anything, as enzymes or signaling proteins in their own right, but they have critical functions as "conveners" of specific, channeled communication pathways, and allow the re-use of powerful signaling modules, over evolutionary time, in new circuits and functions, even in the same cells.


  • The oceans need more help, less talk.
  • Is Trump your church?
  • Can Poland make it through?

Sunday, August 27, 2023

Better Red Than Dead

Some cyanobacteria strain for photosynthetic efficiency at the red end of the light spectrum.

The plant world is green around us- why green, and not some other color, like, say, black? That plants are green means they are letting green light through (or out, by reflection), giving up some energy. Chlorophyll absorbs both red and blue light, but not green, though all are near the peak of solar output. Some accessory pigments within the light-gathering antenna complexes can extend the range of wavelengths absorbed, but clearly a fair amount of green light gets through. A recent theory suggests that this use of two separated bands of light is an optimal solution to stabilize power output. At any rate, it is not just the green light- the extra energy of the blue light is also thrown away, as heat: its excitation is allowed to decay to the red level of excitation within the antenna complex of chlorophyll molecules, since the only excited state used in photosynthesis is the one at ~690 nm. This forms a uniform common denominator for all incoming light energy, which then induces charge separation at the oxygen reaction center (stripping water of electrons and protons) and sends newly energized electrons out to quinone molecules and on into the biosynthetic apparatus.

The solar output, which plants have to work with.

Fine. But what if you live deeper in the water, or in the veins of a rock, or in a mossy, shady nook? What if all you have access to is deeper red light, like at 720 nm, with lower energy than the standard input? In that case, you might want to re-engineer your version of photosynthesis to get by with slightly lower-energy light, while getting the same end results of oxygen splitting and carbon fixation. A few cyanobacteria (the same bacterial lineage that pioneered chlorophyll and the standard photosynthesis we know so well) have done just that, and a recent paper discusses the tradeoffs involved, which are of two different types.

The chlorophylls, with respective absorption spectra and partial structures. Redder light is toward the right. Chlorophyll a is the one used most widely in plants and cyanobacteria. Chlorophyll b is also widely used in these organisms as an additional antenna pigment that extends the range of absorbed light. Chlorophylls d and f are red-shifted, and are used in the specialized species discussed here. 

One of the species, Chroococcidiopsis thermalis, is able to switch states, from bright/white light absorption with a normal array of pigments, to a second state where it expresses chlorophylls d and f, which absorb light at the lower-energy 720 nm, in the far red. This "facultative" ability means that it can optimize the low-light state without much regard to efficiency or photo-damage protection, which it can address by switching back to the high-energy-wavelength pigment system. The other species is Acaryochloris marina, which has no bright-light system, only chlorophyll d. This bacterium lives inside the cells of bigger red algae, so it has a relatively stable, if shaded, environment to deal with.

What these and prior researchers found was that the ultimate quantum of energy used to split water to O2, and to send energized electrons off to photosystem I and carbon compound synthesis, is the same as in any other chlorophyll a-using system. The energetics of those parts of the system apparently cannot be changed. The shortfall needs to be made up at the front end, where there is a sharp drop between the energy absorbed- 1.82 electron volts (eV) from photons at 680 nm, but only 1.72 eV from far-red photons- and that needed at the next points in the electron transport chains (about 1.0 eV). This difference plays a large role in directing those electrons to where the plant wants them to go- down the gradient to the oxygen-evolving center, and to the quinones that ferry energized electrons to other synthetic centers. While it seems like more waste, a smaller difference allows the energized electrons to go astray, forming chemical radicals and other products dangerous to the cell. 
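The photon energies quoted above follow directly from E = hc/λ, which works out to roughly 1240 eV·nm divided by the wavelength in nanometers. A quick check:

```python
# Photon energy: E = hc / lambda, with hc approximately 1239.84 eV*nm.
def photon_eV(wavelength_nm):
    return 1239.84 / wavelength_nm

print(round(photon_eV(680), 2))  # 1.82 eV, the standard chlorophyll a band
print(round(photon_eV(720), 2))  # 1.72 eV, the far-red band
```

So the far-red systems are working with about 0.1 eV less per photon, which is exactly the deficit the rest of the post is about.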

Summary diagram, described in text. Energy levels are shown for photon excitation of chlorophyll (Chl, left axis), and for energy transitions through the reaction center (Phe- pheophytin) and the quinones (Q) that conduct energized electrons out to the other photosynthetic center and biosynthesis. On top are shown the respective system types- normal chlorophyll a from white-light-adapted C. thermalis, chlorophyll d in A. marina, and chlorophyll f in red-adapted C. thermalis. 

What these researchers summarize in the end is that both of the red-light-using cyanobacteria squeeze this middle zone of the power gradient, in different ways. There is an intermediate station on the trail from photon-induced electron excitation to the outgoing quinone (+ electron) and O2, which is the target of all the antenna chlorophylls- the photosynthetic reaction center. This typically contains chlorophyll a (called P680) and pheophytin, a chlorophyll-like molecule. It is at this chlorophyll a molecule that the key step takes place: the excitation energy (an electron bumped to a higher energy level), conducted in from the antenna of ~30 other chlorophylls, pops out its excited electron, which flits over to the pheophytin, and thence to the carrier quinone molecules and photosystem I. Simultaneously, an electron comes in to replace it from the oxygen-evolving center, which receives alternate units of photon energy, also from the chlorophyll/pheophytin reaction center. The figure above describes these steps in energetic terms, from the original excited state, to the pheophytin (Phe-, loss of 0.16 eV), to the exiting quinone state (Qa-, loss of 0.385 eV). In the organisms discussed here, chlorophyll d replaces a at this center, and since its structure and absorbance are different, its energized electron is about 0.1 eV less energetic. 

In A. marina (center in the diagram above), the energy gap between the pheophytin and the quinone is squeezed, losing about 0.06 eV. This has the effect of losing some of the downward "slope" on the energy landscape that prevents side reactions. Since A. marina has no choice but to use this lower-energy system, it needs all the efficiency it can get in the transfer from chlorophyll to pheophytin. But it then sacrifices some driving force from the next step, to the quinone. This has the ultimate effect of raising damage levels and side reactions when faced with more intense light. However, given its typically stable and symbiotic lifestyle, that is a reasonable tradeoff.

On the other hand, C. thermalis (right-most in the diagram above) uses its chlorophyll d/f system on an optional basis when the light is bad. So it can give up some efficiency (in driving pheophytin electron acceptance) for better damage control. It has dramatically squeezed the gap between chlorophyll and pheophytin, from 0.16 eV to 0.08 eV, while keeping the main pheophytin-to-quinone gap unchanged. This has the effect of keeping the pumping of electrons out to the quinones in good condition, with low side-effect damage, but restricts overall efficiency, slowing the rate of excitation transfer to pheophytin, which affects not only the quinone-mediated path of energy to photosystem I, but also the path to the oxygen evolving center. The authors mention that this cyanobacterium recovers some efficiency by making extra light-harvesting pigments that provide more inputs, under these low / far-red light conditions.
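Collecting the gaps quoted above into one place makes the two strategies easy to compare (my tabulation of the numbers in the text, all in eV; the A. marina pheophytin-to-quinone gap is taken as the standard 0.385 eV minus the ~0.06 eV squeeze):

```python
# (starting excitation, Chl->Phe gap, Phe->Q gap), all in eV, as quoted
# in the text for the three reaction-center variants discussed.
systems = {
    "chlorophyll a, white light":  (1.82, 0.16, 0.385),
    "A. marina, chlorophyll d":    (1.72, 0.16, 0.385 - 0.06),
    "C. thermalis far-red, chl f": (1.72, 0.08, 0.385),
}
for name, (start, chl_phe, phe_q) in systems.items():
    remaining = start - chl_phe - phe_q
    print(f"{name}: ~{remaining:.2f} eV left at the quinone")
```

Each organism gives up a different gap- A. marina the pheophytin-to-quinone drop, C. thermalis the chlorophyll-to-pheophytin drop- to land the outgoing electron at roughly the same energy despite the lower-energy input photon.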

The methods used to study all this were mostly based on fluorescence, which emerges from the photosynthetic system when electrons fall back from their excited states. A variety of inhibitors have been developed to prevent electron transfer, such as to the quinones, which bottles up the system and causes increased fluorescence and thermoluminescence, whose wavelengths reveal the energy gaps causing them. Thus it is natural, though also impressive, that light provides such an incisive and precise tool to study this light-driven system. There has been much talk that these far-red-adapted photosynthetic organisms validate the possibility of life around dim stars, including red dwarfs. But obviously these particular systems developed evolutionarily out of the dominant chlorophyll a-based system, so they wouldn't provide a direct path. There are other chlorophyll systems in bacteria, however, and systems that predate the use of oxygen as the electron source, so there are doubtless many ways to skin this cat.


  • Maybe humiliating Russia would not be such a bad thing.
  • Republicans might benefit from reading the Federalist Papers.
  • Fanny Willis schools Meadows on the Hatch act.
  • "The top 1% of households are responsible for more emissions (15-17%) than the lower earning half of American households put together (14% of national emissions)."

Sunday, July 30, 2023

To Sleep- Perchance to Inactivate OX2R

The perils of developing sleeping, or anti-sleeping, drugs.

Sleep- the elixir of rest and repose. While we know of many good things that happen during sleep- the consolidation of memories, the cardiovascular rest, the hormonal and immune resetting, the slow waves and glymphatic cleansing of the brain- we don't know yet why it is absolutely essential, and lethal if repeatedly denied. Civilized life tends to damage our sleep habits, given artificial light and the endless distractions we have devised, leading to chronic sleeplessness and a spiral of narcotic drug consumption. Some conditions and mutations, like narcolepsy, have offered clues about how sleep is regulated, which has led to new treatments, though to be honest, good sleep hygiene is by far the best remedy.

Genetic narcolepsy was found to be due to mutations in the second receptor for the hormone orexin (OX2R), or to auto-immune conditions that kill off a specialized set of neurons in the hypothalamus- a basal part of the brain that sits just over the brain stem. This region normally has ~50,000 neurons that secrete orexin (which itself comes in two kinds, 1 and 2), and project to areas all over the brain, especially basal areas like the basal forebrain and amygdala, to regulate not just sleep but feeding, mood, reward, memory, and learning. Like any hormone receptor, the orexin receptors can be approached in two ways- by turning them on (agonist) or by turning them off (antagonist). Antagonist drugs were developed that turn off both orexin receptors, and thus promote sleep. The first was named suvorexant, using the "orex" and "ant" lexical elements to mark its functions, as is now standard for generic drug names.

This drug is moderately effective, and is a true sleep enhancer, promoting falling asleep, restful sleep, and length of sleep, unlike some other sleep aids. Suvorexant antagonizes both receptors, but the researchers knew that only the deletion of OX2R, not OX1R (in dogs, mice, and other animals), generates narcolepsy, so they developed a drug specific to OX2R alone. But the result was that it was less effective. It turned out that binding and turning off OX1R was helpful to sleep promotion, and there were no particularly bad side effects from binding both receptors, despite the wide-ranging activities they appear to have. So while the trial of Merck's MK-1064 was successful, the drug was no better than their existing two-receptor drug, and its development was shelved. And we learned something intriguing about this system. While all animals have some kind of orexin, only mammals have the second orexin family member and receptor, suggesting that some interesting, but not complete, bifurcation happened in the functions of this system over evolution.
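The agonist/antagonist logic above can be put in rough quantitative terms. Below is a minimal sketch using the classic Gaddum equation for receptor occupancy in the presence of a competitive antagonist; the function name and all binding constants are hypothetical illustrations, not measured values for orexin or suvorexant:

```python
# Toy model of competitive antagonism at a receptor like OX2R,
# via the Gaddum equation. All constants are made up for
# illustration -- not real orexin or suvorexant parameters.

def fractional_occupancy(agonist_nM, Kd_nM, antagonist_nM=0.0, Kb_nM=1.0):
    """Fraction of receptors bound by agonist, with a competitive
    antagonist reducing the agonist's apparent affinity."""
    return (agonist_nM / Kd_nM) / (
        1.0 + agonist_nM / Kd_nM + antagonist_nM / Kb_nM
    )

# Orexin alone: the receptor is half-occupied at its Kd.
baseline = fractional_occupancy(agonist_nM=10.0, Kd_nM=10.0)

# With a tight-binding antagonist present at high concentration,
# occupancy by the natural agonist collapses:
blocked = fractional_occupancy(agonist_nM=10.0, Kd_nM=10.0,
                               antagonist_nM=100.0, Kb_nM=1.0)

print(f"occupancy without antagonist: {baseline:.2f}")  # 0.50
print(f"occupancy with antagonist:    {blocked:.2f}")   # 0.01
```

Of course, as the Merck result shows, occupancy math alone does not predict clinical efficacy- which receptors are worth blocking is an empirical question.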

What got me interested in this topic was a brief article from yet another drug company, Takeda, which was testing an agonist of the orexin receptors in an effort to treat narcolepsy. They created TAK-994, which binds OX2R specifically and showed a lot of promise in animal trials. It is an orally taken pill, in contrast to the existing treatment, danavorexton, which must be injected. In the human trial, it was remarkably effective, virtually eliminating cataplectic / narcoleptic episodes. But there was a problem- it caused enough liver toxicity that the trial was stopped and the drug shelved. Presumably, this company will try again, making variants of this compound that retain its affinity and activity but lack its toxicity.

This brings up an underappreciated peril in drug design- where drugs end up. Drugs don't just go into our systems, hopefully surviving the formidable gauntlet of our digestive system; they all need to go somewhere after they have done their jobs as well. Some drugs are hydrophilic enough, and generally inert enough, that they partition into the urine by dilution and undergo no further metabolic events. Most, however, are recognized by our internal detoxification systems as foreign (that is, hydrophobic, but not recognizable as the fats/lipids that are usual nutrients), and are derivatized by liver enzymes and sent out in the bile.
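That fork in the road- urine for the water-soluble, liver processing for the greasy- can be caricatured in a few lines of code. This is a toy heuristic only; the logP cutoff and route descriptions are illustrative assumptions, not real pharmacokinetic rules:

```python
# Toy caricature of drug elimination routes: hydrophilic compounds
# tend to leave unchanged in urine, hydrophobic ones get oxidized
# by liver CYP enzymes and excreted in bile. The logP threshold
# here is an arbitrary illustration, not a pharmacological rule.

def likely_elimination_route(logP):
    """Rough guess at a compound's main elimination route from its
    octanol/water partition coefficient (logP)."""
    if logP < 0:  # partitions preferentially into water
        return "renal: filtered into urine largely unchanged"
    return "hepatic: CYP oxidation (e.g. hydroxylation), then bile"

print(likely_elimination_route(-1.3))  # renal route
print(likely_elimination_route(3.5))   # hepatic route
```

A real prediction would weigh molecular weight, charge, transporter recognition, and much else- which is exactly why compounds like TAK-994 can surprise their designers.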

Structure of TAK-994, which treats narcolepsy, but at the cost of liver dysfunction.

As you can see from the chemical structure above, TAK-994 is not a normal compound that might be encountered in the body, or as food. The amino sulfate is quite unusual, and the fluorines sprinkled about are totally unnatural. This would be a red-flag substance, like the various PFAS materials we hear about in the news. The rings and fluorines create a relatively hydrophobic substance, which would need to be modified so that it can be routed out of the body. That is what a key liver enzyme, CYP3A4, does. It (and the many family members that have arisen over evolutionary time) oxidizes all manner of foreign hydrophobic compounds, using a heme cofactor to handle the oxygen. It can add OH groups (hydroxylation), break open double bonds (epoxidation), and break open phenol ring structures (aromatic oxidation).

But then what? Evolution has answered most of the toxic substances we encounter in nature with appropriate enzymes and routes out of the body. But the novel compounds we are making with modern chemistry are something else altogether. Some drugs are turned on by this process, only attaining their active form once modified in the liver. Others, apparently including this one, are converted by this process into toxic compounds (as yet unidentified) that damage the liver. That is why animal studies and safety trials are so important. This drug binds its target receptor and does what it is supposed to do, but that isn't enough to make a good drug.