Showing posts with label evolution. Show all posts

Saturday, February 14, 2026

We Have Rocks in Our Heads ... And Everywhere Else, Too

On the evolution and role of iron-sulfur complexes.

Some of the more persuasive ideas about the origin of life have it beginning in the rocks of hydrothermal vents. Here was a place with plenty of energy, interesting chemistry, and proto-cellular structures available to host it. Some kind of metabolism would by this theory have come first, followed by other critical elements like membranes and RNA coding/catalysis. This early earth lacked oxygen, so iron was easily available, not prone to oxidation as now. Thus life at this early time used many minerals in its metabolic processes, as they were available for free. Now, on today's earth, they are not so free, and we have complex processes to acquire and manage them. One of the major minerals we use is the iron-sulfur complex, (similar to pyrite), which comes in a variety of forms and is used by innumerable enzymes in our cells. 

The more common iron-sulfur complexes, with sulfur in yellow, iron in orange.


The principal virtue of the iron-sulfur complex is its redox flexibility. With the relatively electronically "soft" sulfur, iron forms semi-covalent bonds, while being able to absorb or give up an electron safely, without destroying nearby chemicals as free iron typically does. Depending on the structure and liganding, the voltage potential of such complexes can be tuned all over the (reduction potential) map, from -600 to +400 mV. Many other cofactors and metals are used in redox reactions, but iron-sulfur is by far the most common.

Reduction potentials (ability to take up an electron, given an electrical push) of various iron-sulfur complexes.
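The practical consequence of this tunability can be put in numbers: the free energy released when an electron passes between two redox centers follows directly from the difference in their reduction potentials (ΔG = -nFΔE). A minimal sketch, using illustrative potentials drawn from the range quoted above rather than measured values for any particular cluster:

```python
# Free energy of electron transfer between two redox centers,
# from the difference in their reduction potentials (Delta G = -nF * Delta E).
F = 96485.0  # Faraday constant, coulombs per mole of electrons

def delta_g_kj(donor_mv, acceptor_mv, n=1):
    """kJ/mol (negative = favorable) when n electrons flow from a
    donor at donor_mv to an acceptor at acceptor_mv (millivolts)."""
    delta_e_volts = (acceptor_mv - donor_mv) / 1000.0
    return -n * F * delta_e_volts / 1000.0

# An electron dropping from a low-potential cluster (-400 mV) to a
# higher-potential one (+100 mV) releases energy:
print(delta_g_kj(-400, +100))  # about -48 kJ/mol, a favorable transfer
```

This is why a chain of iron-sulfur clusters with staircased potentials can shuttle electrons step by step, each hop releasing a manageable parcel of energy rather than one destructive burst.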

Researchers had assumed that, given the abundance of these elements, iron-sulfur complexes were essentially freely acquired until the great oxidation event, about two to three billion years ago, when free oxygen started rising and free iron (and sulfur) disappeared, salted away into vast geological deposits. Life faced a dilemma- how to reliably construct minerals that were now getting scarce. The most common solution was a three-enzyme system in mitochondria that 1) strips a sulfur from the amino acid cysteine, a convenient source inside cells, 2) scaffolds the construction of the iron-sulfur complex, with iron coming from carrier proteins such as frataxin, and 3) employs several carrier proteins to transfer the resulting complexes to enzymes that need them.

But a recent paper described work that alters this story, finding archaeal microbes that live anaerobically and make do with only the second of these enzymes. A deep phylogenetic analysis shows that the (#2) assembly/scaffold enzymes are the core of this process, and have existed since the last common ancestor of all life. So they are incredibly ancient, and it turns out that iron-sulfur complexes cannot just be gobbled up from the environment, at least not by any reasonably advanced life form. Rather, these complexes need to be built and managed under the care of an enzyme.

The presented structures of the dimer of SmsB (orange) and SmsC (blue) that dimerize again to make up a full iron-sulfur scaffolding and production enzyme in the archaean Methanocaldococcus jannaschii. Note the reaction scheme where ATP comes in and evicts the iron-sulfur cluster. On right is shown how ATP fits into the structure, and how it nudges the iron-sulfur binding area (blue vs green tracing).

A recent paper from this group extended their analysis to the structure of the assembly/scaffold enzyme. They find that, though it is a symmetrical dimer of a complex of two proteins, it only deals with one iron-sulfur complex at a time. It also binds and cleaves ATP. But ATP seems to have more of an inhibitory role than one that stimulates assembly directly. The authors suggest that high levels of ATP signal that less iron-sulfur complex is needed to sustain the core electron transport chains of metabolism, making this ATP inhibition an allosteric feedback control mechanism in these archaeal cells. I might add, however, that ATP binding may well also have a role in extricating the assembled iron-sulfur cluster from the enzyme, as that complex is quite well coordinated, and could use a push to pop out into the waiting arms of target enzymes.

"These ancestral systems were kept in archaea whereas they went through stepwise complexification in bacteria to incorporate additional functions for higher Fe-S cluster synthesis efficiency leading to SUF, ISC and NIF." - That is, the three-component systems present in eukaryotes, which come in three types.

In the author's structure, the iron-sulfur complex, liganded by three cysteines within the SmsC protein. But note how, facing the viewer, the complex is quite exposed, ready to be taken up by some other enzyme that has a nice empty spot for it.

Additionally, these archaea, with this simple one-step iron cluster formation pathway, get their sulfur not from cysteine, but from ambient elemental sulfur. Which is possible, as they live only in anaerobic environments, such as deep sea hydrothermal vents. So they represent a primitive condition for the whole system as may have occurred in the last common ancestor of all life. This ancestor is located at the split between bacteria and archaea, so was a fully fledged and advanced cell, far beyond the earlier glimmers of abiogenesis, the iron sulfur world, and the RNA world.


Saturday, January 24, 2026

Jonathan Singer and the Cranky Book

An eminent scientist at the end of his career writes out his thoughts and preoccupations.

Jonathan Singer was a famous scientist at my graduate school. I did not interact with him, but he played a role in attracting me to the program, as I was interested in biological membranes at the time. Singer himself studied with Linus Pauling, and they were the first to identify a human mutation in a specific gene as a cause for a specific disease- sickle cell disease. After further notable work in electron microscopy, he reached a career triumph by developing, in 1972, the fluid mosaic model of biological membranes. This revolutionized and clarified the field, showing that cells are bounded by something incredibly simple- a bilayer of phospholipids that naturally order themselves into a remarkably stable sheet, (a bubble, one might say), all organized by their charged headgroups and hydrophobic fatty tails. This model also showed that proteins would be swimming around freely in this membrane, and could be integrated in various ways, either lightly attached on one side, or spanning it completely, thereby enabling complex channel and transporter functions. The model implied the typical length of a protein alpha helix that, by virtue of its hydrophobic side chains, would naturally be able to do this spanning function- a prediction that was spot-on. He could have easily won a Nobel for this work.

I was intrigued when I learned recently that Singer had written a book near the end of his career. It is just the kind of thing that a retired professor loves to do in the sunset of his career, sharing the wisdom and staving off the darkness by taking a stab at the book biz. And Singer's is a classic of the form- highly personal, a bit stilted, and ultimately meandering. I will review some of its high points, and then take a stab of my own at knitting together some of the interesting themes he grapples with.

For at base, Singer turns out to be a spiritual compadre of this blog. He claims to be a rationalist, in a world where, as he has it, no more than 9% of people are rational. Definition? It is the poll question of whether one believes that god created man, rather than the other way around. Singer recognizes that the world around him is crazy, and that the communities he has been a part of have been precious oases amid the general indifference and grasping of the world. But changing it? He is rather fatalistic about that, recognizing that reason is up against overwhelming forces.

His specific themes cover a great deal of biology, and then some more mystical reflections on balance and diversity in biology, and later, in capitalism and politics. He points out that the nature/nurture debate has been settled by twin studies. Nature, which is to say, genetics, is the dominant influence on human characteristics, including a wide variety of psychological traits, intelligence among them. Environment and nurture are critical for reaching one's highest potential, and for using it in socially constructive ways, but the limits of that potential are largely set by one's genes. Singer does not, however, draw the inevitable conclusion from these observations, which is that some kind of long-term eugenic approach would be beneficial to our collective future, assuming machines do not replace us forthwith. Biologists know that very small selective coefficients can have big effects, so nothing drastic is needed. But what criteria to use- that is the sticky part. Just as success in the capitalist system hardly signals high moral or personal qualities, incarceration by the justice system does not always show low ones. It is virtually an insoluble problem, so we muddle along, destined probably for continued cycles of Spenglerian civilizational collapse.

Turning to social affairs, Singer settles on "structural chaos" as his description of how the scientific enterprise works, and how capitalism at large works. With a great deal of waste and misdirected effort, it nevertheless ends up providing good results- better than those that top-down direction can provide. He seems to sigh a little that "scientific" methods of social organization, such as those in Soviet Russia, were so ineffective, and that the best we can do is to muddle along with the spontaneous entrepreneurship and occasional flashes of innovation that push the process along. Not to mention the "monstrous vulgarity" of advertising, etc. Likewise, democracy is a mess, with most people totally incapable of making the reasoned decisions needed to maintain it. Again, the chaos of democracy is sadly the best we can do, and the duty of rational people, in Singer's view, is to keep alive the flame of intellectual freedom while outside pressures constantly threaten.

Art, and science.

What can we do with this? I think that the unifying thread that Singer was groping for was competition. One can frame competition as a universal principle that shapes the physical, biological, and social worlds. Put two children on a teeter-totter, and you can see how physical forces (e.g. gravitation) compete all the time, subtly producing equilibria that characterize the universe. Chemical equilibria are likewise a product of constant competition, even including the perpetual jostling of phospholipids to find their lowest energy configuration amidst the biological membrane bilayer, which has the side-effect of creating such a stable, yet highly flexible, structure. With Darwin, competition reaches its apotheosis- the endless proliferation, diversification, and selection of organisms. Singer marvels at the fragility of individual life, at the same time that life writ large is so incredibly durable and prolific. Well, the mechanism behind that is competition. And naturally, economics of any free kind, including capitalism and grant-making in science, are based on competition as well- the natural principle that selects which products are useful, which employees are productive, and which technologies are helpful. Waste is part of the process, as diversity amidst excess production is the essential ingredient for subsequent selection. 

And yet ... something is missing. The earth's biosphere would still be a mere bacterial soup if competition were the only principle at work. Bacteria (and their viruses) are the most streamlined competition machines- battlebots of the living world. It took cooperation between a bacterial cell and an archaeal cell to make a revolutionary new entity- the eukaryotic cell. It then took some more cooperation for eukaryotic cells to band together into bodies, making plants and animals. And among animals, cooperation in modest amounts provides for reproduction, family structure, flock structures, and even complex insect societies. It is with humans that cooperation and competition reach their most complex heights, for we are able to regulate ourselves, rationally. We make rules.

Without rules, human society is anarchic mayhem- a trumpian, dystopian and corrupt nightmare. With them, it (ideally) balances competition with cooperation to harness the benefits of each. Our devotion to sports can be seen as a form of rule worship, and explicit management of the competitive landscape. Can there be too many rules? Absolutely, there are dangers on both sides. Take China as an example. In the last half-century, it revamped its system of rules to lower the instability of political competition, harness the power of economic competition, and completely transform its society. 

The most characteristic and powerful human institution may be the legislature, which is our ongoing effort to make rational rules regulating how the incredibly powerful motive force of competition shapes our lives. Our rules, in the US, were authored, at the outset, by the founders, who were- drumroll please- rationalists. To read the Federalist Papers is to see exquisite reasoning drawing on wide historical precedent, and particularly on the inspirations of the rationalist enlightenment, to formulate a new set of rules mediating between cooperation and competition. Not only were they more fair than the old rules, but they were designed for perpetual improvement and adjustment. The founding was, at base, a rationalist moment, when characters like Franklin, Hamilton, Madison, and Jefferson- deists at best, and rationalists through and through- led the new country into a hopeful, constitutional future. At the current moment, two hundred and fifty years on, as our institutions are being wantonly destroyed and anything resembling reason, civility, and truth is under particularly vengeful attack, we should appreciate and own that heritage, which informs a true patriotism against the forces of darkness.


Saturday, December 13, 2025

Mutations That Make Us Human

The ongoing quest to make biologic sense of genomic regions that differentiate us from other apes.

Some people are still, at this late date, taken aback by the fact that we are animals, biologically hardly more than cousins to fellow apes like the chimpanzee, and descendants through billions of years of other life forms far more humble. It has taken a lot of suffering and drama to get to where we are today. But what are those specific genetic endowments that make us different from the other apes? That, like much of genetics and genetic variation, is a tough question to answer.

At the DNA level, we are roughly one percent different from chimpanzees. A recent sequencing of great apes provided a gross overview of these differences. There are inversions, and larger changes in junk DNA that can look like bigger differences, but these have little biological importance, and are not counted in the sequence difference. A difference of one percent is really quite large. For a three-billion-base genome, that works out to 30 million differences. That is plenty of room for big things to happen.
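The arithmetic behind that figure is simple enough to check directly (a trivial sketch; the one percent and three-billion-base figures are the rounded values from above):

```python
genome_size = 3_000_000_000   # ~3 billion base pairs in the human genome
divergence = 0.01             # ~1% single-base difference vs chimpanzee

print(int(genome_size * divergence))  # 30000000, i.e. 30 million differences
```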

Gross alignment of one chromosome between the great apes. [HSA- human, PTR- chimpanzee, PPA- bonobo, GGO- gorilla, PPY- orangutan (Borneo), PAB- orangutan (Sumatra)]. Fully aligned regions (not showing smaller single nucleotide differences) are shown in blue. Large inversions of DNA order are shown in yellow. Other junk DNA gains and losses are shown in red, pink, purple. One large-scale jump of a DNA segment is shown in green. One can see that there has been significant rearrangement of genomes along the way, even as most of this chromosome (and others as well) are easily alignable and traceable through the evolutionary tree.


But most of those differences are totally unimportant. Mutations happen all the time, and most have no effect, since most positions (particularly the most variable ones) in our DNA are junk, like transposons, heterochromatin, telomeres, centromeres, introns, intergenic space, etc. Even in protein-coding genes, a third of the positions are "synonymous", with no effect on the coded amino acid, and even when an amino acid is changed, that protein's function is frequently unaffected. The next biggest group of mutations have bad effects, and are selected against. These make up the tragic pool of genetic syndromes and diseases, from mild to severe. Only a tiny proportion of mutations will have been beneficial at any point in this story. But those mutations have tremendous power. They can drag along their local DNA regions as they are positively selected, and gain "fixation" in the genome, which is to say, they are sufficiently beneficial to their hosts that they outcompete all others, with the ultimate result that the mutation becomes universal in the population- the new standard. This process happens in parallel, across all positions of the genome, all at the same time. So a process that seems painfully slow can actually add up to quite a bit of change over evolutionary time, as we see.
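The fixation process can be illustrated with a toy Wright-Fisher simulation (not from any paper discussed here; the population size, selection coefficient, and trial count are arbitrary choices for illustration). Classical population genetics says a single new mutant with a small advantage s fixes with probability of roughly 2s, far above the neutral expectation of 1/N, yet still low enough that most beneficial mutations are simply lost:

```python
import random

def fixation_fraction(s, pop=100, trials=500, seed=42):
    """Fraction of trials in which a single mutant with selective
    advantage s spreads to the whole population. Toy Wright-Fisher:
    each generation resamples all pop individuals, weighting the
    mutant allele by fitness 1+s."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(trials):
        count = 1  # one new mutant copy
        while 0 < count < pop:
            p = count * (1 + s) / (count * (1 + s) + (pop - count))
            count = sum(rng.random() < p for _ in range(pop))
        fixed += (count == pop)
    return fixed / trials

# A 5% advantage should fix in roughly 2*s = 10% of trials;
# a neutral mutant would fix only ~1/pop = 1% of the time.
print(fixation_fraction(0.05))
```

Running this with s = 0 confirms the neutral baseline, which is why even "very small selective coefficients" matter so much over evolutionary time: they multiply the odds of fixation many-fold.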

So the hunt was on to find "human accelerated regions" (HAR), which are parts of our genome that were conserved in other apes, but suddenly changed on the way to humans. There are roughly three thousand such regions, but figuring out what they might be doing is quite difficult, and there is a long tail from strong to weak effects. There are two general rationales for their occurrence. First, selection was lost over a genomic region, if that function became unimportant. That would allow faster mutation and divergence from the progenitors. Or second, some novel beneficial mutation happened there, bringing it under positive selection and to fixation. Some recent work found, interestingly, that clusters of mutations in HAR segments often have countervailing effects, with one major mutation causing one change, and a few other mutations (vs the ancestral sequence) causing opposite changes, in a process hypothesized to amount to evolutionary fine tuning.

A second property of HARs is that they are overwhelmingly not in coding regions of the genome, but in regulatory areas. They constitute fine tuning adjustments of timing and amount of gene regulation, not so much changes in the proteins produced. That is, our evolution was more about subtle changes in management of processes than of the processes themselves. A recent paper delved in detail into HAR5, one of the strongest such regions, (that is, strongest prior conservation, compared with changes in human sequence), which lies in the regulatory regions upstream of Frizzled8 (FZD8). FZD8 is a cell surface receptor, which receives signals from a class of signaling molecules called WNT (wingless and int). These molecules were originally discovered in flies, where they signal body development programs, allowing cells to know where they are and when they are in the developmental program, in relation to cells next door, and then to grow or migrate as needed. They have central roles in embryonic development, in organ development, and also in cancer, where their function is misused.

For our story, the WNT/FZD8 circuit is important in fetal brain development. Our brains undergo massive cell division and migration during fetal development, and clearly this is one of the most momentous and interesting differences between ourselves and all other animals. The current authors made mutations in mice that reproduce some of the HAR5 sequences, and investigated their effects. 

Two mouse brains at three months of age, one with the human version of the HAR5 region. Hard to see here, but the latter brain is ~7% bigger.

The authors claim that these brains, one with native mouse sequence, and the other with the human sequences from HAR5, have about a seven percent difference in mass. Thus the HAR5 region, all by itself, explains about one fourteenth of the gross difference in brain size between us and chimpanzees. 

HAR5 is a 619 base-pair region with only four sequence differences between ourselves and chimpanzees. It lies 300,000 bases upstream of FZD8, in a vast region of over a million base pairs with no genes. While this region contains many regulatory elements, (generally called enhancers or enhancer modules, only some of which are mapped), it is at the same time an example of junk DNA, where most of the individual positions in this vast sea of DNA are likely of little significance. The multifarious regulation by all these modules is of course important because this receptor participates in so many different developmental programs, and has doubtless been fine-tuned over the millennia not just for brain development, but for every location and time point where it is needed.

Location of the FZD8 gene, in the standard view of the genome at NIH. I have added an arrow that points to the tiny (in relative terms) FZD8 coding region (green), and a star at the location of HAR5, far upstream among a multitude of enhancer sequences. One can see that this upstream region is a vast area (of roughly 1.5 million bases) with no other genes in sight, providing space for extremely complicated and detailed regulation, little of which is as yet characterized.

Diving into the HAR5 functions in more detail, the authors show that it directly increases FZD8 gene expression, (about 2-fold, in very rough terms), while deleting the region from mice strongly decreases expression. Of the four individual base changes in the HAR5 region, two have strong (additive) effects increasing FZD8 expression, while the other two have weaker, but still activating, effects. Thus, no compensatory regulation here ... it is full speed ahead at HAR5 for bigger brain size. Additionally, a variant in human populations that is associated with autism spectrum disorders also resides in this region, and the authors show that this change decreases FZD8 expression about 20%. Small numbers, sure, but for a process that directs cell division over many cycles in early brain development, this kind of difference can have profound effects.
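To see why a 20% expression change is not really a "small number" in a proliferating tissue, consider a toy compounding model (illustrative numbers only, not the paper's measurements): if expression level nudges the per-cycle division probability even modestly, the effect multiplies over the many division cycles of fetal brain growth.

```python
def relative_cells(cycles, division_prob):
    """Expected relative cell count after `cycles` rounds, if each cell
    divides with probability `division_prob` per round (toy model)."""
    return (1 + division_prob) ** cycles

baseline = relative_cells(30, 0.50)   # hypothetical per-cycle division odds
boosted = relative_cells(30, 0.55)    # a modest bump in those odds

print(round(boosted / baseline, 1))   # ~2.7-fold more cells after 30 cycles
```

The exponent does the work: small per-cycle differences in proliferation, sustained over development, yield large differences in final organ size.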


The HAR5 region causes increased transcription of FZD8, in mice, compared to the native version and a deletion.

The HAR5 region causes increased cell proliferation in embryonic day 14.5 brain areas, stained for neural markers.

"This reveals Hs-HARE5 modifies radial glial progenitor behavior, with increased self-renewal at early developmental stages followed by expanded neurogenic potential. ... Using these orthogonal strategies we show four human-specific variants in HARE5 drive increased enhancer activity which promotes progenitor proliferation. These findings illustrate how small changes in regulatory DNA can directly impact critical signaling pathways and brain development."

So there you have it. The nuts and bolts of evolution, from the molecular to the cellular, the organ, and then the organismal, levels. Humans do not just have bigger brains, but better brains, and countless other subtle differences all over the body. Each of these is directed by genetic differences, as the combined inheritance of the last six million years since our divergence from chimpanzees. Only with the modern molecular tools can we see Darwin's vision come into concrete focus, as particular, even quantum, changes in the code, and thus biology, of humanity. There is a great deal left to decipher, but the answers are all in there, waiting.


Saturday, November 22, 2025

Ground Truth for Genetic Mutations

Saturation mutagenesis shows that our estimates of the functional effects of uncharacterized mutations are not so great.

Human genomes can now be sequenced for less than $1,000. This technological revolution has enabled a large expansion of genetic testing, used for cancer tissue diagnosis and tracking, and for genetic syndrome analysis both of embryos before birth and affected people after birth. But just because a base among the 3 billion of the genome is different from the "reference" genome, that does not mean it is bad. Judging whether a variant (the modern, more neutral term for mutation) is bad takes a lot of educated guesswork.

A recent paper described a deep dive into one gene, where the authors created and characterized the functional consequence of every possible coding variant. Then they evaluated how well our current rules of thumb and prediction programs for variant analysis compare with what they found. It was a mediocre performance. The gene is CDKN2A, one of our more curious oddities. This is an important tumor suppressor gene that inhibits cell cycle progression and promotes DNA repair- it is often mutated in cancers. But it encodes not one, but two entirely different proteins, by virtue of a complex mRNA splicing pattern that uses distinct exons in some coding portions, and parts of one sequence in two different frames, to encode these two proteins, called p16 and p14. 

One gene, two proteins. CDKN2A has a splicing pattern (mRNA exons shown as boxes at top, with pink segments leading to the p14 product, and the blue segments leading to the p16 product) that generates two entirely different proteins from one gene. Each product has tumor suppressing effects, though via distinct mechanisms.

Undeterred by the complex splicing and overlapping reading frames, the authors generated every possible amino acid variant at each coded position (156 amino acids in all, as both encoded proteins are relatively short). Since the primary roles of these proteins are in cell cycle and proliferation control, it was possible to assay function by their effect when expressed in cultured pancreatic cells. A deleterious effect on the protein was revealed as, paradoxically, increased growth of these cells. They found that about 600 of the 3,000 different variants in their catalog had such an effect, or 20%.

This is an expected rate of effect, on the whole. Most positions in proteins are not that important, and can be substituted by several similar amino acids. For a typical enzyme, for instance, the active site may be made up of a few amino acids in a particular orientation, and the rest of the protein is there to fold into the required shape to form that active site. Similar folding can be facilitated by numerous amino acids at most positions, as has been richly documented in evolutionary studies of closely-related proteins. These p16 and p14 proteins interact with a few partners, so they need to maintain those key interfacial surfaces to be fully functional. Additionally, the assay these researchers ran, of a few generations of growth, is far less sensitive than a long-term true evolutionary setting, which can sift out very small effects on a protein, so they were setting a relatively high bar for seeing a deleterious effect. They did a selective replication of their own study, and found a reproducibility rate of about 80%, which is not great, frankly.

"Of variants identified in patients with cancer and previously reported to be functionally deleterious in published literature and/or reported in ClinVar as pathogenic or likely pathogenic (benchmark pathogenic variants), 27 of 32 (84.4%) were functionally deleterious in our assay"

"Of 156 synonymous variants and six missense variants previously reported to be functionally neutral in published literature and/or reported in ClinVar as benign or likely benign (benchmark benign variants), all were characterized as functionally neutral in our assay "

"Of 31 VUSs previously reported to be functionally deleterious, 28 (90.3%) were functionally deleterious and 3 (9.7%) were of indeterminate function in our assay."

"Similarly, of 18 VUSs previously reported to be functionally neutral, 16 (88.9%) were functionally neutral and 2 (11.1%) were of indeterminate function in our assay"

Here we get to the key issues. Variants are generally classified as benign, pathogenic/deleterious, or "variant of unknown/uncertain significance". The latter are particularly vexing to clinical geneticists. The whole point of sequencing a patient's tumor or genomic DNA is to find causal variants that can illuminate their condition, and possibly direct treatment. Seeing lots of "VUS" in the report leaves everyone in the dark. The authors pulled in all the common prediction programs that are officially sanctioned by the ACMG- the American College of Medical Genetics and Genomics, which is the foremost guide to clinical genetics, including the functional prediction of otherwise uncharacterized sequence variants. There are seven such programs, including one driven by AI, AlphaMissense, which is related to the Nobel prize-winning AlphaFold.

These programs strain to classify uncharacterized mutations as "likely pathogenic", "likely benign", or, if unable to make a conclusion, VUS/indeterminate. They rely on many kinds of data, like amino acid similarity, protein structure, evolutionary conservation, and known effects in proteins of related structure. They can be extensively validated against known mutations, and against new experimental work as it comes out, so we have a pretty good idea of how they perform. Thus they are trusted to some extent to provide clinical judgements, in the absence of better data. 

Each of seven programs (on bottom) gives estimations of variant effect over the same pool of mutations generated in this paper. This was a weird way to present simple data, but each bar contains the functional results the authors developed in their own data (numbers at the bottom, in parentheses, vertical). The bars were then colored with the rate of deleterious (black) vs benign (white) prediction from the program. The ideal case would be total black for the first bar in each set of three (deleterious) and total white in the third bar in each set (benign). The overall lineup/accuracy of all program predictions vs the author data was then overlaid by a red bar (right axis). The PrimateAI program was specially derived from comparison of homologous genes from primates only, yielding a high-quality dataset about the importance of each coded amino acid. However, it only gave estimates for 906 out of the whole set of 2964 variants. On the other hand, cruder programs like PolyPhen-2 gave less than 40% accuracy, which is quite disappointing for clinical use.

As shown above, the algorithms gave highly variable results, from under 40% accurate to over 80%. It is pretty clear that some of the lesser programs should be phased out. Of programs that fielded all the variants, the best were AlphaMissense and VEST, which each achieved about 70% accuracy. This is still not great. The issue is that, if a whole genome sequence is run for a patient with an obscure disease or syndrome, and variants vs the reference sequence are seen in several hundred genes, then a gene like CDKN2A could easily be pulled into the list of pathogenic (and possibly causal) variants, or be left out, on very shaky evidence. That is why even small increments in accuracy are critically important in this field. Genetic testing is a classic needle-in-a-haystack problem- a quest to find the one mutation (out of millions) that is driving a patient's cancer, or a child's inherited syndrome.
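For concreteness, the accuracy figure being compared here is essentially a concordance rate between each program's calls and the functional assay. A small sketch with made-up labels (not the paper's data or file formats):

```python
def concordance(assay_calls, predicted_calls):
    """Fraction of variants where the predictor's call matches the
    functional assay, skipping predictions left as indeterminate."""
    scored = [(a, p) for a, p in zip(assay_calls, predicted_calls)
              if p != "indeterminate"]
    if not scored:
        return 0.0
    return sum(a == p for a, p in scored) / len(scored)

assay = ["deleterious", "deleterious", "benign", "benign", "deleterious"]
preds = ["deleterious", "benign", "benign", "indeterminate", "deleterious"]
print(concordance(assay, preds))  # 0.75: 3 of 4 scored calls agree
```

Note how the handling of indeterminate calls matters: a program that declines to score most variants (as PrimateAI did here, covering 906 of 2964) can look accurate on the remainder while being far less useful in practice.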

Still outstanding is the issue of non-coding variants. Genes are not just affected by mutations in their protein coding regions (indeed many important genes do not code for proteins at all), but by regulatory regions nearby and far. This is a huge area of mutation effects that are not really algorithmically accessible yet. As a prediction problem, it is far more difficult than predicting effects on a coded protein. It will require modeling of the entire gene expression apparatus, much of which remains shrouded in mystery.


Saturday, October 18, 2025

When the Battery Goes Dead

How do mitochondria know when to die?

Mitochondria are the energy centers within our cells, but they are so much more. They are primordial bacteria that joined with archaea to collaborate in the creation of eukaryotes. They still have their own genomes, RNA transcription and protein translation. They play central roles in the life and death of cells, they divide and coalesce, they motor around the cell as needed, kiss other organelles to share membranes, and they can get old and die. When mitochondria die, they are sent to the great garbage disposal in the sky, the autophagosome, which is a vesicle that is constructed as needed, and joins with a lysosome to digest large bits of the cell, or of food particles from the outside.

The mitochondrion spends its life (only a few months) doing a lot of dangerous reactions and keeping an electric charge elevated over its inner membrane. It is this charge, built up from metabolic breakdown of sugars and other molecules, that powers the ATP-producing rotary enzyme. And the decline of this charge is a sign that the mitochondrion is getting old and tired. A recent paper described how one key sensor protein, PINK1, detects this condition and sets off the disposal process. It turns out that the membrane charge powers not only ATP synthesis, but protein import into the mitochondrion as well. Over the eons, most of the mitochondrion's genes have been taken over by the nucleus, so all but a few of the mitochondrion's proteins arrive via import- about 1500 different proteins in all. And this is a complicated process, since mitochondria have inner and outer membranes, (just as many bacteria do), and proteins can be destined for any of four compartments- either membrane, the inside (matrix), or the inter-membrane space. 

Figure 12-26. Protein import by mitochondria.
Textbook representation of mitochondrial protein import, with a signal sequence (red) at the front (N-terminus) of the incoming protein (green), helping it bind successively to the TOM and TIM translocators. 

The outer membrane carries a protein import complex called TOM, while the inner membrane carries an import complex called TIM. These can dock to each other, easing the whole transport process. The PINK1 protein is a somewhat weird product of evolution, spending its life being synthesized, transported across both mitochondrial membranes, and then partially chopped up in the mitochondrial matrix before its remains are exported again and fully degraded. That is when everything is working correctly! When the mitochondrial charge declines, PINK1 gets stuck, threaded through TOM but unable to transit the TIM complex. PINK1 is a kinase that phosphorylates itself as well as ubiquitin. When it is stuck, two PINK1 kinases meet on the outside of the outer membrane, activate each other, and ultimately activate another protein, PARKIN. PARKIN's name derives from its importance in Parkinson's disease, which can be caused by an excess of defective mitochondria in sensitive tissues, specifically certain regions and neurons of the brain. PARKIN is a ubiquitin ligase, which attaches the degradation signal ubiquitin to many proteins on the surface of the aged mitochondrion, thus signaling the whole mess to be gobbled up by an autophagosome.
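The logic of this surveillance circuit can be caricatured in a few lines of Python. The threshold value and return strings are invented for illustration; real mitochondria read their membrane potential continuously, not as a boolean.

```python
# Toy sketch of the PINK1/PARKIN decision logic described above.
# The cutoff value is hypothetical; this is illustration, not biophysics.

def mitochondrion_fate(membrane_potential_mv, stalled_pink1_count):
    """Healthy mitochondria import and degrade PINK1 continuously;
    depolarized ones strand PINK1 on the TOM complex, and two stranded
    PINK1 kinases can trans-activate, switching on PARKIN."""
    HEALTHY_POTENTIAL = -140  # mV, inside negative (hypothetical cutoff)
    if membrane_potential_mv <= HEALTHY_POTENTIAL:
        return "PINK1 imported and degraded"  # the futile cycle continues
    if stalled_pink1_count >= 2:
        return "PARKIN activated -> ubiquitination -> mitophagy"
    return "PINK1 stalled on TOM, awaiting a partner"
```

The key design feature the paper illuminates is the middle branch: a single stalled PINK1 is not enough, so the dimerization requirement acts as a noise filter against transient dips in charge.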

A data-rich figure 1 from the paper shows purification of the tagged complex (top), and then the EM structure at bottom. While the purifications (B, C) show the presence of TIM subunits, these did not show up in the EM structures, perhaps because they were not stable enough or frequent enough in proportion to the TOM subunits. But the PINK1+TOM+VDAC2 structures are stunning, helping explain how PINK1 dimerizes so easily when its translocation is blocked.

The current authors found that PINK1 had convenient cysteine residues that allowed it to be experimentally crosslinked in the paired state, and thus freeze the PARKIN-activating conformation. They isolated large amounts of such arrested complexes from human cells, and used electron microscopy to determine the structure. They were amazed to see not just PINK1 and the associated TOM complex, but also VDAC2, which is the major transporter that lets smaller molecules easily cross the outer membrane. The TOM complexes were beautifully laid out, showing the front end (N-terminus) of PINK1 threaded through each TOM complex, specifically the TOM40 ring structure.

What was missing, unfortunately, was any of the TIM complex, though some TIM subunits did co-purify with the whole complex. Nor was PARKIN or ubiquitin present, leaving out a good bit of the story. So what is VDAC2 doing there? The authors really don't know, though they note that reactive oxygen byproducts of mitochondrial metabolism would build up during loss of charge, acting as a second signal of mitochondrial age. These byproducts are known to encourage dimerization of VDAC channels, which, via the complex seen here, naturally leads to dimerization and activation of the PINK1 protein. Additionally, VDACs are very prevalent in the outer membrane and are prominent ubiquitination targets for autophagy signaling.

To actually activate PARKIN ubiquitination, PINK1 needs to dissociate again, a process that the authors speculate may be driven by binding of ubiquitin by PINK1, which might be bulky enough to drive the VDACs apart. This part was quite speculative, and the authors promise further structural studies to figure out this process in more detail. In any case, what is known is quite significant- that the VDACs template the joining of two PINK1 kinases in mid-translocation, which, when the inner membrane charge dies away, prompts the stranded PINK1 kinases to activate and start the whole disposal cascade. 

Summary figure from the authors, indicating some speculative steps, such as where reactive oxygen species excreted by VDAC2 sensitize PINK1, perhaps by dimerizing the VDAC channel itself, and where ubiquitin binding by PINK1 and/or VDAC prompts dissociation, allowing PARKIN to come in, get activated by PINK1, and spread the death signal around the surface of the mitochondrion.

It is worth returning briefly to the PINK1 life cycle. This is a protein whose whole purpose, as far as we know, is to signal that mitochondria are old and need to be given last rites. But it has a curiously inefficient way of doing that, being synthesized, transported, and degraded continuously in a futile and wasteful cycle. Evolution could hardly have come up with a more cumbersome, convoluted way to sense the vitality of mitochondria. Yet there we are, doubtless trapped by some early decision which was surely convenient at the time, but results today in a constant waste of energy, only made possible by the otherwise amazingly efficient and finely tuned metabolic operations of PINK1's target, the mitochondrion.

Saturday, October 11, 2025

The Role of Empathy in Science

Jane Goodall's career was not just a watershed in ethology and primate psychology, but in the way science is done.

I vividly remember reading Jane Goodall's descriptions of the chimpanzees in her Gombe project. Here we had been looking for intelligent alien life with SETI, and wondering about life on Mars. But she revealed that intelligent, curious personalities exist right here, on Earth, in the African forest. Alien, but not so alien. Indeed, they loved their families, suffered heartbreaking losses, and fought vicious battles. They had cultures, and tools, deviousness and generosity. 

What was striking was not just the implications of all this for us as humans and as conservationists, but also what it overturned about scientific attitudes. Science had traditionally had a buttoned-up attitude- "hard science", as it were. This reached a crescendo with behaviorism, where nothing was imputed to the psychology of others, whether animals or children, other than machine-like input/output reflexes. Machines were the reigning model, as though we had learned nothing since Descartes. 

Ask a simple question, get a simple answer.

This was appalling enough on its own terms, but it really impoverished scientific progress as well. Goodall helped break open this box by showing in a particularly dramatic way the payoff possible from having deep empathy with one's scientific object. Scientists have always engaged with their questions out of interest and imagination. It is a process of feeling one's way through what is essentially a fantasy world, until the rules one has divined can be confirmed by some concrete demonstration- doing an experiment, or observing the evidence of tool use by chimpanzees. It is intrinsically an empathetic process, even if the object of that empathy is a geological formation, or a sub-atomic particle. 

But discipline is needed too. Mathematics reigns supreme in physics, because, luckily, physics follows extremely regular rules. That is what is so irritating and uncomfortable about quantum mechanics. That is a field where empathy sort of fails- notoriously, no one really "understands" quantum mechanics, even though the math certainly works out. But in most fields, it is understanding we are after, led by empathy and followed by systematization of the rules at work, if any. This use of empathy has methodological implications. We become attached to the objects of our work, and to our ideas about them. So discipline involves doing things like double-blind trials to insulate a truth-finding process from bias. And transparency with open publication followed by open critique.

In the 20th century, science was being overwhelmed by the discipline and the adulation of physics, and losing the spark of inspiration. Jane Goodall helped to right that ship, reminding us that scientific methods and attitudes need to match the objects we are working with. Sure, math might be the right approach to electrons. But our fellow animals are an entirely different kettle of fish. For example, all animals follow their desires. The complexities of mating among animals mean that they are all driven just as we are- by emotions, by desire, by pain, by love. The complexity may differ, but the intensity of these emotions can not possibly be anything but universal.


Saturday, September 13, 2025

Action at the Heart of Action

How myosin works as a motor against actin to generate motion.

We use our muscles a lot, but do we know how they work? No one does, fully, but quite a bit is known. At the core is a myosin motor protein, which levers against actin filaments that are ordered in almost crystalline arrays inside muscle cells. This system long predates the advent of muscles, however, since all of our cells contain actin and myosin, which jointly help cells move around, and move cargoes around within cells. Vesicles, for instance, often traffic to where they are needed on roads of actin. The human genome encodes forty different forms of myosin, specialized for all sorts of different tasks. For example, hearing (and balance) depends on tiny rod-like projections of hair cells, which are filled with tight bundles of actin. Several myosin genes have variants associated with severe hearing loss, because they have important developmental roles in helping these structures form. Actin/myosin is one of the ancient transportation systems of life (the other is the dynein motor and microtubules).

Myosin uses ATP to power motion, and a great deal of work has gone into figuring out how this happens. A recent paper took things to a new level by slowing down the action significantly. They used a mutant form of myosin that is specifically slower in the power stroke. And they used a quick mix-and-spray method that cut the time between adding actin to the cocked myosin and freezing the sample ready for cryo-electron microscopy down to 10 milliseconds. The cycle of the myosin motor goes like this:

  • End of power stroke, myosin bound to actin
  • ATP binds to myosin, unbinds from actin
  • Lever arm of myosin cocks back to a primed state, as ATP is hydrolyzed to ADP + Pi
  • ADP is present, and myosin binds to actin again
  • Actin binding triggers both power stroke of the lever, and release of Pi and ADP
  • End of power stroke, myosin bound to actin
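The steps above can be sketched as a simple looping state machine. The state labels paraphrase the bullets; this is an illustration of the cycle's circularity, not a model of the mechanism.

```python
# Minimal sketch of the myosin ATPase cycle above as a state machine.
# State labels paraphrase the bullet list; purely illustrative.
CYCLE = [
    "bound to actin (post-power stroke)",
    "ATP bound, myosin detached from actin",
    "lever primed, ATP hydrolyzed to ADP + Pi",
    "rebound to actin (still holding ADP + Pi)",
    "power stroke, Pi then ADP released",
]

def step(state_index):
    """Advance one step around the cycle (repeats as long as ATP lasts)."""
    return (state_index + 1) % len(CYCLE)

# Five steps return the motor to the start, consuming one ATP per lap.
i = 0
for _ in range(len(CYCLE)):
    i = step(i)
```

The modular arithmetic captures the essential point: the first and last bullets are the same state, so each ATP buys exactly one lap and one power stroke.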

A schematic of the myosin/actin cycle. Actin is in pink, myosin in gray and green, with cargoes (if any, or bundle of other myosins as in muscle) linked below the green lever.

The structure that these researchers came up with is:

Basic structure of myosin (colors) with actin (gray), in two conformations- primed or post-power stroke. The blue domain at top (converter) is where the lever extension is attached, and is the place where the motion/force is focused. But note how the rest of the myosin structure (lavender, green, yellow, red) also shifts subtly to assist the motion. 

They also provide a video of these transformations, based on molecular dynamics simulations.

Sampling times between 10 milliseconds and 120 milliseconds, they saw structures in each of the before and after configurations, but none in intermediate states. That indicates that the motor action is very fast, and the cocking/priming event puts the enzyme in an unstable configuration. The power stroke may not look like much, but the converter domain is typically hitched to a long element that binds to cargos, leading (below) to quite a bit of motion per stroke and per ATP. About 13 actin units can be traversed along the filament in a single bound, in fact. It is also noteworthy that this mechanism is very linear. The converter domain flips in the power stroke without twisting much, so that cargoes progress linearly along the actin road, without much loss of energy from side-to-side motion.
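As a back-of-envelope check, 13 actin subunits per step works out to roughly 36 nm of travel per ATP, assuming the textbook actin rise of about 2.75 nm per subunit along the filament (a standard value, not a number from this paper):

```python
# Rough step-length arithmetic, assuming the textbook actin subunit
# rise of ~2.75 nm along the filament (not a figure from the paper).
ACTIN_RISE_NM = 2.75
subunits_per_step = 13

step_nm = subunits_per_step * ACTIN_RISE_NM  # ~36 nm of travel per ATP
```

That 13-subunit spacing also matches the pseudo-helical repeat of the actin filament, which is why the next landing spot presents the same face of actin to the incoming motor.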

Fuller picture of how myosin (colored) with its lever extensions (blue) walks along actin (gray) by large steps, that cross up to 13 actin subunits at a time. The inset describes the very small amount of twist that happens, small enough that myosin walks in a rather straight line and easily finds the next actin landing spot without a lot of feeling about.

Finally, these authors delved into a few more details about the big structural transition of the power stroke. Each of these shows subtle shifts in the structure that help the main transition along. In f/g, the HCM loop dips down to bind actin more tightly. In h/i, the black segment already bound to actin squinches down into a new loop, probably swinging myosin slightly over to the right. This segment is at the base of the green segment, so it has strong transmission effects on the power stroke. In j/k, the ATP binding site, now holding ADP and Pi, loses the phosphate Pi, and there are big re-arrangements of all the surrounding loops- green, lavender, and blue. These images do not really do justice to the whole motion, nor really communicate how the ATP site sends power through the green domain to the converter (top, blue) domain, which flips for the power stroke. The video referenced above gives more details, though without much annotation.

Detailed closeups of the before/after power stroke structures. Coloring is consistent with the structures above.

Saturday, September 6, 2025

How to Capture Solar Energy

Charge separation is handled totally differently by silicon solar cells and by photosynthetic organisms.

Everyone comes around sooner or later to the most abundant and renewable form of energy, which is the sun. The current administration may try to block the future, but solar power is the best power right now and will continue to gain on other sources. Likewise, life started by using some sort of geological energy, or pre-existing carbon compounds, but inevitably found that tapping the vast powers streaming in from the sun was the way to really take over the earth. But how does one tap solar energy? It is harder than it looks, since it so easily turns into heat and lost energy. Some kind of separation and control are required, to isolate the power (that is to say, the electron that was excited by the photon of light), and harness it to do useful work.

Silicon solar cells and photosynthesis represent two ways of doing this, and are fundamentally, even diametrically, different solutions to this problem. So I thought it would be interesting to compare them in detail. Silicon is a semiconductor, torn between trapping its valence electrons in silicon atoms, or distributing them around in a conduction band, as in metals. With elemental doping, silicon can be manipulated to bias these properties, and that is the basis of the solar cell.

Schematic of a silicon solar cell. A static voltage exists across the N-type to P-type boundary, sweeping electrons freed by the photoelectric effect (light) up to the conducting electrode layer.


Solar cells have one side doped to N status, and the bulk set to P doping status. While the bulk material is neutral on both sides, at the boundary, a static charge scheme is set up where electrons are attracted into the P-side, and removed from the N-side. This static voltage has very important effects on electrons that are excited by incoming light and freed from their silicon atoms. These high energy electrons enter the conduction band of the material, and can migrate. Due to the prevailing field, they get swept towards the N side, and thus are separated and can be siphoned off with wires. The current thus set up can exert a pressure of about 0.6 volt. That is not much, nor is it equivalent to the 2 to 3 electron volts received from each visible photon. So a great deal of energy is lost as heat.
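The scale of that heat loss is easy to estimate from the figures above. This sketch is a simplification, treating the cell voltage as the energy delivered per electron and ignoring current-dependent losses:

```python
# Rough accounting of energy lost as heat in a silicon cell, using the
# figures in the text: ~0.6 V output vs 2-3 eV per visible photon.

def fraction_captured(photon_ev, cell_voltage=0.6):
    """Each collected electron delivers ~cell_voltage eV of electrical
    energy, regardless of the photon energy that freed it; the rest
    degrades to heat in the material."""
    return cell_voltage / photon_ev

green = fraction_captured(2.3)  # roughly a quarter of a green photon's energy
blue = fraction_captured(3.0)   # a fifth of a blue photon's energy
```

So even before reflection and resistive losses, most of each photon's energy never reaches the wires, which is one reason practical silicon cells top out in the low twenties of percent efficiency.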

Solar cells do not care about capturing each energized electron in detail. Their purpose is to harvest a bulk electrical voltage + current with which to do some work in our electrical grids. Photosynthesis takes an entirely different approach, however. This may be mostly for historical and technical reasons, but also because part of its purpose is to do chemical work with the captured electrons. Biology tends to take a highly controlling approach to chemistry, using precise shapes, functional groups, and electrical environments to guide reactions to exact ends. While some of the power of photosynthesis goes toward pumping protons across the membrane, setting up a gradient later used to make ATP, about half is used for other things, like splitting water to replace lost electrons and making reducing chemicals like NADPH.

A portion of a poster about the core processes of photosynthesis. It provides a highly accurate portrayal of the two photosystems and their transactions with electrons and protons.

In plants, photosynthesis is a chain of processes focused around two main complexes, photosystems I and II, and all occurring within membranes- the thylakoid membranes of the chloroplast. Confusingly, photosystem II comes first, accepting light, splitting water, pumping some protons, and sending out a pair of electrons on mobile plastoquinones, which eventually find their way to photosystem I, which jacks up their energy again using another quantum of light, to produce NADPH. 

Photosystem II is full of chlorophyll pigments, which are what get excited by visible photons. But most of them are "antenna" chlorophylls, passing the excitation along to a pair of centrally located chlorophylls. Note that the light energy is at this point passed as a molecular excitation, not as a free electron. This passage may happen by Förster resonance energy transfer, but is so fast and efficient that stronger Redfield coupling may be involved as well. Charge separation only happens at the reaction center, where an excited electron is popped out to a chain of recipients. The chlorophylls are organized so that the pair at the reaction center have a slightly lower energy of excitation, and thus serve as a funnel for excitation energy from the antenna system. These transfers are extremely rapid, on the picosecond time scale.
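For reference, the Förster mechanism mentioned above has a famously steep distance law: transfer efficiency falls with the sixth power of the donor-acceptor separation, relative to the characteristic Förster radius R0. The numbers below are illustrative, not measured chlorophyll parameters:

```python
# Förster (FRET) efficiency vs distance: E = 1 / (1 + (r/R0)^6).
# Values used here are illustrative, not chlorophyll measurements.

def fret_efficiency(r_nm, r0_nm):
    """Fraction of excitations transferred from donor to acceptor
    at separation r_nm, given Förster radius r0_nm."""
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)

fret_efficiency(5.0, 5.0)  # exactly 0.5 at r = R0, by definition
```

The sixth-power falloff is why antenna chlorophylls must be packed so densely: halving the separation pushes the transfer efficiency from 50% to well above 90%.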

It is interesting to note tangentially that only red light energy is used. Chlorophylls have two excitation states, excited by red light (680 nm = 1.82 eV) and blue light (400-450 nm, ~2.76 eV) (note the absence of green absorbance). The significant extra energy from blue light is wasted, dissipated to let the excited electron relax to the lower excitation state, which is then passed through the antenna complex as though it had come from red light. 
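The photon energies quoted here follow directly from E = hc/λ, which works out to roughly 1239.84 eV·nm divided by the wavelength:

```python
# Converting the quoted wavelengths to photon energies, E = hc / lambda.
# The constant hc ~= 1239.84 eV*nm is standard physics, not paper data.

def photon_ev(wavelength_nm):
    return 1239.84 / wavelength_nm

photon_ev(680)  # ~1.82 eV, the red absorption peak used by photosystem II
photon_ev(450)  # ~2.76 eV, blue light; the ~0.9 eV excess is discarded
```

The gap between the two, nearly an electron volt per blue photon, is the wasted energy the text describes.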

Charge separation is managed precisely at the photosystem II reaction center through a series of pigments of graded energy capacity, sending the excited electron first to a neighboring chlorophyll, then to a pheophytin, then to a pair of iron-coordinated quinones, which then pass two electrons to a plastoquinone that is released to the local membrane, to float off to the cytochrome b6f complex. In photosystem II, another two photons of light are separately used to power the splitting of one water molecule (giving two electrons and pumping two protons). So the whole process, just within photosystem II, yields, per four light quanta, four protons pumped from one side of the membrane to the other. Since the ATP synthase uses about three protons per ATP, this nets just over one ATP per four photons. 
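The proton/ATP bookkeeping in that paragraph amounts to simple arithmetic, using the approximate three-protons-per-ATP cost from the text:

```python
# The photosystem II energy bookkeeping from the paragraph above,
# using the approximate figures quoted in the text.
photons = 4
protons_pumped = 4            # per four light quanta, within photosystem II
protons_per_atp = 3           # approximate cost at the ATP synthase

atp_per_four_photons = protons_pumped / protons_per_atp  # just over one ATP
```

Against the ~1.82 eV carried by each red photon, that is a modest conversion, but unlike the silicon cell, the remainder is doing chemical work (water splitting, NADPH) rather than simply heating the leaf.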

Some of the energetics of photosystem II. The orientations and structures of the reaction center paired chlorophylls (Pd1, Pd2), the neighboring chlorophyll (Chl), and then the pheophytin (Ph) and quinones (Qa, Qb) are shown in the inset. Energy of the excited electron is sacrificed gradually to accomplish the charge separation and channeling, down to the final quinone pairing, after which the electrons are released to a plastoquinone and sent to another complex in the chain.

So the principles of silicon and biological solar cells are totally different in detail, though each gives rise to a delocalized field, one of electrons flowing with a low potential, and the other of protons used later for ATP generation. Each energy system must have a way to pop off an excited electron in a controlled, useful way that prevents it from recombining with the positive ion it came from. That is why there is such an ornate conduction pathway in photosystem II to carry that electron away. Overall, points go to the silicon cell for elegance and simplicity, and we in our climate crisis are the beneficiaries, if we care to use it. 

But the photosynthetic enzymes are far, far older. A recent paper pointed out that not only are photosystems II and I clearly cousins of each other, but it is likely that, contrary to the consensus heretofore, photosystem II is the original version, at least of the various photosystems that currently exist. All the other photosystems (including those in bacteria that lack oxygen stripping ability) carry traces of the oxygen evolving center. It makes sense that getting electrons is a fundamental part of the whole process, even though that chemistry is quite challenging. 

That in turn raises a big question- if oxygen evolving photosystems are primitive (originating very roughly with the last common ancestor of all life, about four billion years ago) then why was earth's atmosphere oxygenated only from two billion years ago onward? It had been assumed that this turn in Earth history marked the evolution of photosystem II. The authors point out additionally that there is also evidence for the respiratory use of oxygen from these extremely early times as well, despite the lack of free oxygen. Quite perplexing, (and the authors decline to speculate), but one gets the distinct sense that possibly life, while surprisingly complex and advanced from early times, was not operating at the scale it does today. For example, colonization of land had to await the buildup of sufficient oxygen in the atmosphere to provide a protective ozone layer against UV light. It may have taken the advent of eukaryotes, including cyanobacterial-harnessing plants, to raise overall biological productivity sufficiently to overcome the vast reductive capacity of the early earth. On the other hand, speculation about the evolution of early life based on sequence comparisons (as these authors do) is notoriously prone to artifacts, since what evolves at vanishingly slow rates today (such as the photosystem core proteins) must have originally evolved at quite a rapid clip to attain the functions now so well conserved. We simply can not project ancient ages (at the four billion year time scales) from current rates of change.


Saturday, August 23, 2025

Why Would a Bacterium Commit Suicide?

Our innate immune system, including suicide of infected cells, has antecedents in bacteria.

We have a wide variety of defenses against pathogens, from our skin and its coating of RNase and antimicrobial peptides, to the infinite combinatorial firepower of the adaptive immune system, which is primed by vaccines. In between is something called the innate immune system, which is built-in and static rather than adaptive, but is very powerful nonetheless. It is largely built around particular proteins that recognize common themes in pathogens, like the free RNA and DNA of viral genomes, or the lipopolysaccharide that coats most bacteria. There are also internal damage signals, such as cellular proteins that have leaked out and are visible to wandering immune cells, that raise similar alarms. The alarms lead to inflammation, the gathering of immune cells, and hopefully to resolution of the problem. 

One powerful defensive strategy our cells have is apoptosis, or cellular suicide. If the signals from an incoming infection are too intense, a cell, in addition to activating its specific antiviral defenses, goes a few steps further and generates a massive inflammasome that rounds up and turns on a battery of proteases that chew up the cell, destroying it from inside. The pieces are then strewn around to be picked up by the macrophages and other cleanup crews, which hopefully can learn something from the debris about the invading pathogen. Particular targets of these proteases are the gasdermins, which are activated via this proteolysis and then assemble into huge pores that plant themselves into the plasma membrane and mitochondrial membranes, rapidly killing the cell by collapsing all the ion gradients across these membranes. 

A human cell committing apoptosis, and falling apart.

A recent paper showed that key parts of this apparatus are present in bacteria as well. It was both technically interesting, since the authors relied on a lot of AI tools to discern the rather distant relations between our innate immune proteins and the bacterial receptors for their own pathogens (that is to say, phages- the viruses of bacteria), and generally intriguing, because suicide is generally thought to be a civilized behavior of cells in multicellular organisms, protecting the rest of the body from spread of the pathogen. Bacteria, despite living in mucky biofilms and other kinds of colonies, are generally thought to be loners, only out for their own reproduction. Why would they kill themselves? Well, anytime they are in a community, that community is almost certainly composed of relatives, probably identical clones of a single founding cell. So it would be a highly related community indeed, and well worth protecting in this way. 

A bacterial gasdermin outruns phages infecting the cell. Two kinds of cells are mixed together here, ones without a gasdermin (green) and ones with (black). All are infected at zero time, and a vital dye is added (pink) that only gets into cells through large pores, like the gasdermin pore. At 45 minutes and after, the green (control) cells are dying and getting blown apart by escaping phages. On the other hand, the gasdermin+ cells develop pores and get stained pink, showing that they are dead too. But they don't blow up, indicating that they have shut down phage propagation.

The researchers heard that some bacteria have gasdermins, so they wondered whether they have the other parts of the system- the proteases and the sensor proteins. And indeed, they do. While traditional sequence similarity analysis didn't say so, structural comparison courtesy of the AlphaFold program showed that a protease in the same operon as gasdermin had CARD domains. These domains are signatures of caspases and of caspase interacting proteins, like the sensor proteins in the human innate immune system. They bind other CARD domains, thus mediating assembly of the large complexes that lead to inflammation and apoptosis.

Structure of the bacterial CARD domain, courtesy of AlphaFold, showing some similarity with a human CARD domain, which was not very apparent on the sequence level.

The operon of this bacterium, which encodes the whole system- gasdermin, protease (two of them), and sensor.

The researchers then raised their AI game by using another flavor of AlphaFold to predict interactions that the bacterial CARD/protease protein might have. This showed an interaction with another protein in the same operon, with similarity to NLR sensor proteins in humans, which they later confirmed happened in vitro as well. This suggests that this bacterium, and many bacteria, have the full circuit of sensor for incoming phage, activatable caspase-like protease, and then cleavable gasdermin as the effector of cell suicide.

A comparison of related operons from several other bacteria.

Looking at other bacteria, they found that many have similar systems. Some link to other effectors, rather than a pore-forming gasdermin. But most share a similar sensor-to-protease circuit that is the core of this defense system. Lastly, they also asked what triggers this whole system from the incoming phage. The answer, in this case, is a phage protein called rIIB. Unfortunately, it is not clear either what rIIB does for the phage or whether it triggers the CARD/gasdermin system by activating the bacterial NLR sensor protein, as would be assumed. What is known, however, is that rIIB has a function in defending phage against another bacterial defense system called RexAB. Thus it looks as though this particular arms race has ramified into a complicated back and forth as bacteria try as best they can to insure themselves against mass infection.