Biophilia: evolution

Saturday, November 22, 2025

Ground Truth for Genetic Mutations

Saturation mutagenesis shows that our estimates of the functional effect of uncharacterized mutations are not so great.

Human genomes can now be sequenced for less than $1,000. This technological revolution has enabled a large expansion of genetic testing, used for cancer tissue diagnosis and tracking, and for genetic syndrome analysis both of embryos before birth and affected people after birth. But just because a base among the 3 billion of the genome is different from the "reference" genome, that does not mean it is bad. Judging whether a variant (the modern, more neutral term for mutation) is bad takes a lot of educated guesswork.

A recent paper described a deep dive into one gene, where the authors created and characterized the functional consequence of every possible coding variant. Then they evaluated how well our current rules of thumb and prediction programs for variant analysis compare with what they found. It was a mediocre performance. The gene is CDKN2A, one of our more curious oddities. This is an important tumor suppressor gene that inhibits cell cycle progression and promotes DNA repair- it is often mutated in cancers. But it encodes not one, but two entirely different proteins, by virtue of a complex mRNA splicing pattern that uses distinct exons in some coding portions, and parts of one sequence in two different frames, to encode these two proteins, called p16 and p14.

One gene, two proteins. CDKN2A has a splicing pattern (mRNA exons shown as boxes at top, with pink segments leading to the p14 product, and the blue segments leading to the p16 product) that generates two entirely different proteins from one gene. Each product has tumor suppressing effects, though via distinct mechanisms.

Regardless of the complex splicing and protein coding characteristics, the authors generated all possible variants in every possible coded amino acid (156 amino acids in all, as both produced proteins are relatively short). Since the primary roles of these proteins are in cell cycle and proliferation control, it was possible to assay function by their effect when expressed in cultured pancreatic cells. A deleterious effect on the protein was revealed as, paradoxically, increased growth of these cells. They found that about 600 of the 3,000 different variants in their catalog had such an effect, or 20%.
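
The counts here can be checked with a little arithmetic. The Python sketch below is only an illustration, assuming that each of the 156 positions was changed to each of the 19 other standard amino acids, with synonymous and stop variants left out. The function name is hypothetical, and the 600 is the rounded count quoted above.

```python
# Rough count of missense variants in a saturation mutagenesis experiment.
# Assumption: every residue is replaced by each of the 19 other standard
# amino acids; synonymous and nonsense changes are not counted here.

STANDARD_AMINO_ACIDS = 20


def missense_variant_count(n_residues: int) -> int:
    """Number of single amino acid substitutions possible at n_residues positions."""
    return n_residues * (STANDARD_AMINO_ACIDS - 1)


total = missense_variant_count(156)  # 156 residues, from the text
deleterious = 600                    # approximate count quoted in the text

print(total)
print(f"{deleterious / total:.1%} of variants scored as deleterious")
```

Under that assumption the total comes to 2964, which matches the size of the variant set cited further on, and the deleterious fraction lands at about 20%.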

This is an expected rate of effect, on the whole. Most positions in proteins are not that important, and can be substituted by several similar amino acids. For a typical enzyme, for instance, the active site may be made up of a few amino acids in a particular orientation, and the rest of the protein is there to fold into the required shape to form that active site. Similar folding can be facilitated by numerous amino acids at most positions, as has been richly documented in evolutionary studies of closely-related proteins. These p16 and p14 proteins interact with a few partners, so they need to maintain those key interfacial surfaces to be fully functional. Additionally, the assay these researchers ran, of a few generations of growth, is far less sensitive than a long-term true evolutionary setting, which can sift out very small effects on a protein, so they were setting a relatively high bar for seeing a deleterious effect. They did a selective replication of their own study, and found a reproducibility rate of about 80%, which is not great, frankly.

"Of variants identified in patients with cancer and previously reported to be functionally deleterious in published literature and/or reported in ClinVar as pathogenic or likely pathogenic (benchmark pathogenic variants), 27 of 32 (84.4%) were functionally deleterious in our assay"
"Of 156 synonymous variants and six missense variants previously reported to be functionally neutral in published literature and/or reported in ClinVar as benign or likely benign (benchmark benign variants), all were characterized as functionally neutral in our assay "
"Of 31 VUSs previously reported to be functionally deleterious, 28 (90.3%) were functionally deleterious and 3 (9.7%) were of indeterminate function in our assay."
"Similarly, of 18 VUSs previously reported to be functionally neutral, 16 (88.9%) were functionally neutral and 2 (11.1%) were of indeterminate function in our assay"

Here we get to the key issues. Variants are generally classified as benign, pathogenic/deleterious, or "variant of unknown/uncertain significance". The latter are particularly vexing to clinical geneticists. The whole point of sequencing a patient's tumor or genomic DNA is to find causal variants that can illuminate their condition, and possibly direct treatment. Seeing lots of "VUS" in the report leaves everyone in the dark. The authors pulled in all the common prediction programs that are officially sanctioned by the ACMG- American College of Medical Genetics, which is the foremost guide to clinical genetics, including the functional prediction of otherwise uncharacterized sequence variants. There are seven such programs, including one driven by AI, AlphaMissense, which is related to the Nobel prize-winning AlphaFold.

These programs strain to classify uncharacterized mutations as "likely pathogenic", "likely benign", or, if unable to make a conclusion, VUS/indeterminate. They rely on many kinds of data, like amino acid similarity, protein structure, evolutionary conservation, and known effects in proteins of related structure. They can be extensively validated against known mutations, and against new experimental work as it comes out, so we have a pretty good idea of how they perform. Thus they are trusted to some extent to provide clinical judgements, in the absence of better data.

Each of seven programs (on bottom) gives estimations of variant effect over the same pool of mutations generated in this paper. This was a weird way to present simple data, but each bar contains the functional results the authors developed in their own data (numbers at the bottom, in parentheses, vertical). The bars were then colored with the rate of deleterious (black) vs benign (white) prediction from the program. The ideal case would be total black for the first bar in each set of three (deleterious) and total white in the third bar in each set (benign). The overall lineup/accuracy of all program predictions vs the author data was then overlaid by a red bar (right axis). The PrimateAI program was specially derived from comparison of homologous genes from primates only, yielding a high-quality dataset about the importance of each coded amino acid. However, it only gave estimates for 906 out of the whole set of 2964 variants. On the other hand, cruder programs like PolyPhen-2 gave less than 40% accuracy, which is quite disappointing for clinical use.

As shown above, the algorithms gave highly variable results, from under 40% accurate to over 80%. It is pretty clear that some of the lesser programs should be phased out. Of programs that fielded all the variants, the best were AlphaMissense and VEST, which each achieved about 70% accuracy. This is still not great. The issue is that, if a whole genome sequence is run for a patient with an obscure disease or syndrome, and variants vs the reference sequence are seen in several hundred genes, then a gene like CDKN2A could easily be pulled into the list of pathogenic (and possibly causal) variants, or be left out, on very shaky evidence. That is why even small increments in accuracy are critically important in this field. Genetic testing is a classic needle-in-a-haystack problem- a quest to find the one mutation (out of millions) that is driving a patient's cancer, or a child's inherited syndrome.
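
To make "accuracy" and "coverage" concrete, here is a minimal Python sketch of how a predictor's calls could be scored against assay results. It is an assumption about the general approach, not the paper's actual scoring method. The function name and the toy labels are hypothetical; only the 906 and 2964 figures come from the text.

```python
from typing import List, Optional, Tuple


def score_predictor(assay: List[str],
                    predicted: List[Optional[str]]) -> Tuple[float, float]:
    """Return (coverage, accuracy) for one predictor against assay labels.

    A prediction of None means the program gave no call for that variant.
    Accuracy is computed only over the variants that received a call.
    """
    called = [(a, p) for a, p in zip(assay, predicted) if p is not None]
    coverage = len(called) / len(assay)
    correct = sum(1 for a, p in called if a == p)
    accuracy = correct / len(called) if called else float("nan")
    return coverage, accuracy


# Hypothetical toy labels, NOT data from the paper.
assay = ["deleterious", "neutral", "neutral", "deleterious", "neutral"]
predicted = ["deleterious", "deleterious", "neutral", None, "neutral"]

coverage, accuracy = score_predictor(assay, predicted)
print(f"coverage {coverage:.0%}, accuracy {accuracy:.0%}")

# Coverage implied by the PrimateAI figures in the text: 906 of 2964 variants.
print(f"PrimateAI coverage: {906 / 2964:.1%}")
```

Separating the two numbers matters, because a program that only answers for about 31% of variants can look accurate while leaving most of the clinical question open.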

Still outstanding is the issue of non-coding variants. Genes are not just affected by mutations in their protein coding regions (indeed many important genes do not code for proteins at all), but by regulatory regions nearby and far. This is a huge area of mutation effects that are not really algorithmically accessible yet. As a prediction problem, it is far more difficult than predicting effects on a coded protein. It will require modeling of the entire gene expression apparatus, much of which remains shrouded in mystery.

Oh, no! Not the shopping cart!
Jevons and Malthus will have their say, if we do not get our act together.
It is time to scrap GDP, and the mania for growth.
Blaming the victim.
Everything crypto will be just perfect.
Can I play piano with my mind?
We are killers.

Saturday, October 18, 2025

When the Battery Goes Dead

How do mitochondria know when to die?

Mitochondria are the energy centers within our cells, but they are so much more. They are primordial bacteria that joined with archaea to collaborate in the creation of eukaryotes. They still have their own genomes, RNA transcription and protein translation. They play central roles in the life and death of cells, they divide and coalesce, they motor around the cell as needed, kiss other organelles to share membranes, and they can get old and die. When mitochondria die, they are sent to the great garbage disposal in the sky, the autophagosome, which is a vesicle that is constructed as needed, and joins with a lysosome to digest large bits of the cell, or of food particles from the outside.

The mitochondrion spends its life (only a few months) doing a lot of dangerous reactions and keeping an electric charge elevated over its inner membrane. It is this charge, built up from metabolic breakdown of sugars and other molecules, that powers the ATP-producing rotary enzyme. And the decline of this charge is a sign that the mitochondrion is getting old and tired. A recent paper described how one key sensor protein, PINK1, detects this condition and sets off the disposal process. It turns out that the membrane charge does not only power ATP synthesis, but it powers protein import to the mitochondrion as well. Over the eons, most of the mitochondrion's genes have been taken over by the nucleus, so all but a few of the mitochondrion's proteins arrive via import- about 1500 different proteins in all. And this is a complicated process, since mitochondria have inner and outer membranes (just as many bacteria do), and proteins can be destined to any of four compartments- either membrane, the inside (matrix), or the inter-membrane space.

Figure 12-26. Protein import by mitochondria.

Textbook representation of mitochondrial protein import, with a signal sequence (red) at the front (N-terminus) of the incoming protein (green), helping it bind successively to the TOM and TIM translocators.

The outer membrane carries a protein import complex called TOM, while the inner membrane carries an import complex called TIM. These can dock to each other, easing the whole transport process. The PINK1 protein is a somewhat weird product of evolution, spending its life being synthesized, transported across both mitochondrial membranes, and then partially chopped up in the mitochondrial matrix before its remains are exported again and fully degraded. That is when everything is working correctly! When the mitochondrial charge declines, PINK1 gets stuck, threaded through TOM, but unable to transit the TIM complex. PINK1 is a kinase, which phosphorylates itself as well as ubiquitin, so when it is stuck, two PINK1 kinases meet on the outside of the outer membrane, activate each other, and ultimately activate another protein, PARKIN. PARKIN's name derives from its importance in Parkinson's disease, which can be caused by an excess of defective mitochondria in sensitive tissues, specifically certain regions and neurons of the brain. PARKIN is a ubiquitin ligase, which attaches the degradation signal ubiquitin to many proteins on the surface of the aged mitochondrion, thus signaling the whole mess to be gobbled up by an autophagosome.

A data-rich figure 1 from the paper shows purification of the tagged complex (top), and then the EM structure at bottom. While the purification (B, C) shows the presence of TIM subunits, they did not show up in the EM structures, perhaps because they were not stable enough or frequent enough in proportion to the TOM subunits. But the PINK1+TOM_VDAC2 structures are stunning, helping explain how PINK1 dimerizes so easily when its translocation is blocked.

The current authors found that PINK1 had convenient cysteine residues that allowed it to be experimentally crosslinked in the paired state, and thus freeze the PARKIN-activating conformation. They isolated large amounts of such arrested complexes from human cells, and used electron microscopy to determine the structure. They were amazed to see, not just PINK1 and the associated TOM complex, but also VDAC2, which is the major transporter that lets smaller molecules easily cross the outer membrane. The TOM complexes were beautifully laid out, showing the front end (N-terminus) of PINK1 threaded through each TOM complex, specifically the TOM40 ring structure.

What was missing, unfortunately, was any of the TIM complex, though some TIM subunits did co-purify with the whole complex. Nor was PARKIN or ubiquitin present, leaving out a good bit of the story. So what is VDAC2 doing there? The authors really don't know, though they note that reactive oxygen byproducts of mitochondrial metabolism would build up during loss of charge, acting as a second signal of mitochondrial age. These byproducts are known to encourage dimerization of VDAC channels, which, via the complex seen here, would naturally lead to dimerization and activation of the PINK1 protein. Additionally, VDACs are very prevalent in the outer membrane and prominent ubiquitination targets for autophagy signaling.

To actually activate PARKIN ubiquitination, PINK1 needs to dissociate again, a process that the authors speculate may be driven by binding of ubiquitin by PINK1, which might be bulky enough to drive the VDACs apart. This part was quite speculative, and the authors promise further structural studies to figure out this process in more detail. In any case, what is known is quite significant- that the VDACs template the joining of two PINK1 kinases in mid-translocation, which, when the inner membrane charge dies away, prompts the stranded PINK1 kinases to activate and start the whole disposal cascade.

Summary figure from the authors, indicating some speculative steps, such as where the reactive oxygen species excreted by VDAC2 sensitise PINK1, perhaps by dimerizing the VDAC channel itself. And where ubiquitin binding by PINK1 and/or VDAC prompts dissociation, allowing PARKIN to come in and get activated by PINK1 and spread the death signal around the surface of the mitochondrion.

It is worth returning briefly to the PINK1 life cycle. This is a protein whose whole purpose, as far as we know, is to signal that mitochondria are old and need to be given last rites. But it has a curiously inefficient way of doing that, being synthesized, transported, and degraded continuously in a futile and wasteful cycle. Evolution could hardly have come up with a more cumbersome, convoluted way to sense the vitality of mitochondria. Yet there we are, doubtless trapped by some early decision which was surely convenient at the time, but results today in a constant waste of energy, only made possible by the otherwise amazingly efficient and finely tuned metabolic operations of PINK1's target, the mitochondrion.

More on mitochondria.
The Hitler thing is becoming.. a thing.
What Mormonism is all about.
Injustice reigns supreme.
Death on the roads.
Graph of the week- sea level over the last four million years.

Note that at the glacial maxima, sea levels were almost 500 feet (150 meters) lower than today. And today, we are hitting a 3 million year peak level.

Saturday, October 11, 2025

The Role of Empathy in Science

Jane Goodall's career was not just a watershed in ethology and primate psychology, but in the way science is done.

I vividly remember reading Jane Goodall's descriptions of the chimpanzees in her Gombe project. Here we had been looking for intelligent alien life with SETI, and wondering about life on Mars. But she revealed that intelligent, curious personalities exist right here, on Earth, in the African forest. Alien, but not so alien. Indeed, they loved their families, suffered heartbreaking losses, and fought vicious battles. They had cultures, and tools, deviousness and generosity.

What was striking was not just the implications of all this for us as humans and as conservationists, but also what it overturned about scientific attitudes. Science had traditionally had a buttoned-up attitude- "hard science", as it were. This reached a crescendo with behaviorism, where nothing was imputed to the psychology of others, whether animals or children, other than machine-like input/output reflexes. Machines were the reigning model, as though we had learned nothing since Descartes.

Ask a simple question, get a simple answer.

This was appalling enough on its own terms, but it really impoverished scientific progress as well. Goodall helped break open this box by showing in a particularly dramatic way the payoff possible from having deep empathy with one's scientific object. Scientists have always engaged with their questions out of interest and imagination. It is a process of feeling one's way through essentially a fantasy world, until one shows that the rules one has divined actually hold, via some concrete demonstration- doing an experiment, or observing the evidence of tool use by chimpanzees. It is intrinsically an empathetic process, even if the object of that empathy is a geological formation, or a sub-atomic particle.

But discipline is needed too. Mathematics reigns supreme in physics, because, luckily, physics follows extremely regular rules. That is what is so irritating and uncomfortable about quantum mechanics. That is a field where empathy sort of fails- notoriously, no one really "understands" quantum mechanics, even though the math certainly works out. But in most fields, it is understanding we are after, led by empathy and followed by systematization of the rules at work, if any. This use of empathy has methodological implications. We become attached to the objects of our work, and to our ideas about them. So discipline involves doing things like double-blind trials to insulate a truth-finding process from bias. And transparency with open publication followed by open critique.

In the 20th century, science was being overwhelmed by the discipline and the adulation of physics, and losing the spark of inspiration. Jane Goodall helped to right that ship, reminding us that scientific methods and attitudes need to match the objects we are working with. Sure, math might be the right approach to electrons. But our fellow animals are an entirely different kettle of fish. For example, all animals follow their desires. The complexities of mating among animals mean that they are all driven just as we are- by emotions, by desire, by pain, by love. The complexity may differ, but the intensity of these emotions can not possibly be anything but universal.

"He will pretend false votes, foul play, hold possession of the reins of government."

Saturday, September 13, 2025

Action at the Heart of Action

How myosin works as a motor against actin to generate motion.

We use our muscles a lot, but do we know how they work? No one does, fully, but quite a bit is known. At the core is a myosin motor protein, which levers against actin filaments that are ordered in almost crystalline arrays inside muscle cells. This system long predates the advent of muscles, however, since all of our cells contain actin and myosin, which jointly help cells move around, and move cargoes around within cells. Vesicles, for instance, often traffic to where they are needed on roads of actin. The human genome encodes forty different forms of myosin, specialized for all sorts of different tasks. For example, hearing (and balance) depends on tiny rod-like hair cells filled with tight bundles of actin. Several myosin genes have variants associated with severe hearing loss, because they have important developmental roles in helping these structures form. Actin/myosin is one of the ancient transportation systems of life (the other is the dynein motor and microtubules).

Myosin uses ATP to power motion, and a great deal of work has gone into figuring out how this happens. A recent paper took things to a new level by slowing down the action significantly. The authors used a mutant form of myosin that is specifically slower in the power stroke. And they used a quick mix and spray method that cut the time between adding actin to the cocked myosin, and getting it frozen in a state ready for cryo-electron microscopy, down to 10 milliseconds. The cycle of the myosin motor goes like this:

End of power stroke, myosin bound to actin
ATP binds to myosin, unbinds from actin
Lever arm of myosin cocks back to a primed state, as ATP is hydrolyzed to ADP + Pi
ADP is present, and myosin binds to actin again
Actin binding triggers both power stroke of the lever, and release of Pi and ADP
End of power stroke, myosin bound to actin
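
The cycle listed above can be sketched as a simple loop of states in Python. This is only a schematic of the listed steps, not a kinetic model. The state labels paraphrase the list, and the names MYOSIN_CYCLE, next_state and run_cycle are hypothetical.

```python
from typing import List

# The steps of the cycle as listed above. The last listed step repeats
# the first, so only the five distinct states are kept.
MYOSIN_CYCLE = (
    "end of power stroke, myosin bound to actin",
    "ATP bound, myosin released from actin",
    "lever cocked (primed), ATP hydrolyzed to ADP + Pi",
    "ADP (and Pi) present, myosin rebinds actin",
    "power stroke, Pi and ADP released",
)


def next_state(index: int) -> int:
    """Advance one step, wrapping around to the start of the cycle."""
    return (index + 1) % len(MYOSIN_CYCLE)


def run_cycle(start: int = 0, steps: int = len(MYOSIN_CYCLE)) -> List[str]:
    """Return the sequence of states visited over the given number of steps."""
    states = []
    i = start
    for _ in range(steps):
        states.append(MYOSIN_CYCLE[i])
        i = next_state(i)
    return states


# Six steps bring the motor back to where it began, one ATP later.
for state in run_cycle(steps=6):
    print(state)
```

Writing it as a wrap-around index makes the point that there is no real start or end. Each turn of the loop consumes one ATP and produces one power stroke.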

A schematic of the myosin/actin cycle. Actin is in pink, myosin in gray and green, with cargoes (if any, or bundle of other myosins as in muscle) linked below the green lever.

The structure that these researchers came up with is:

Basic structure of myosin (colors) with actin (gray), in two conformations- primed or post-power stroke. The blue domain at top (converter) is where the lever extension is attached and is the place where the motion / force is focused. But note how the rest of the myosin structure (lavender, green, yellow, red) also shifts subtly to assist the motion.

They also provide a video of these transformations, based on molecular dynamics simulations.

Sampling times between 10 milliseconds and 120 milliseconds, they saw structures in each of the before and after configurations, but none in intermediate states. That indicates that the motor action is very fast, and the cocking/priming event puts the enzyme in an unstable configuration. The power stroke may not look like much, but the converter domain is typically hitched to a long element that binds to cargos, leading (below) to quite a bit of motion per stroke and per ATP. About 13 actin units can be traversed along the filament in a single bound, in fact. It is also noteworthy that this mechanism is very linear. The converter domain flips in the power stroke without twisting much, so that cargoes progress linearly along the actin road, without much loss of energy from side-to-side motion.

Fuller picture of how myosin (colored) with its lever extensions (blue) walks along actin (gray) by large steps, that cross up to 13 actin subunits at a time. The inset describes the very small amount of twist that happens, small enough that myosin walks in a rather straight line and easily finds the next actin landing spot without a lot of feeling about.

Finally, these authors delved into a few more details about the big structural transition of the power stroke. Each of these panels shows subtle shifts in the structure that help the main transition along. In f/g the HCM loop dips down to bind actin more tightly. In h/i the black segment already bound to actin squinches down into a new loop, probably swinging myosin slightly over to the right. This segment is at the base of the green segment, so has strong transmission effects on the power stroke. In j/k the ATP binding site, now holding ADP and Pi, loses the phosphate Pi, and there are big re-arrangements of all the surrounding loops- green, lavender, and blue. These images do not really do justice to the whole motion, nor really communicate how the ATP site sends power through the green domain to the converter (top, blue) domain which flips for the power stroke. The video referenced above gives more details, though without much annotation.

Detailed closeups of the before/after power stroke structures. Coloring is consistent with the structures above.

Reaping what one sows.
Oh, and about guns.
A room of one's own.

Saturday, September 6, 2025

How to Capture Solar Energy

Charge separation is handled totally differently by silicon solar cells and by photosynthetic organisms.

Everyone comes around sooner or later to the most abundant and renewable form of energy, which is the sun. The current administration may try to block the future, but solar power is the best power right now and will continue to gain on other sources. Likewise, life started by using some sort of geological energy, or pre-existing carbon compounds, but inevitably found that tapping the vast powers streaming in from the sun was the way to really take over the earth. But how does one tap solar energy? It is harder than it looks, since it so easily turns into heat and lost energy. Some kind of separation and control are required, to isolate the power (that is to say, the electron that was excited by the photon of light), and harness it to do useful work.

Silicon solar cells and photosynthesis represent two ways of doing this, and are fundamentally, even diametrically, different solutions to this problem. So I thought it would be interesting to compare them in detail. Silicon is a semiconductor, torn between trapping its valence electrons in silicon atoms, or distributing them around in a conduction band, as in metals. With elemental doping, silicon can be manipulated to bias these properties, and that is the basis of the solar cell.

Schematic of a silicon solar cell. A static voltage exists across the N-type to P-type boundary, sweeping electrons freed by the photoelectric effect (light) up to the conducting electrode layer.

Solar cells have one side doped to N status, and the bulk set to P doping status. While the bulk material is neutral on both sides, at the boundary, a static charge scheme is set up where electrons are attracted into the P-side, and removed from the N-side. This static voltage has very important effects on electrons that are excited by incoming light and freed from their silicon atoms. These high energy electrons enter the conduction band of the material, and can migrate. Due to the prevailing field, they get swept towards the N side, and thus are separated and can be siphoned off with wires. The current thus set up can exert a pressure of about 0.6 volt. That is not much, nor is it equivalent to the 2 to 3 electron volts received from each visible photon. So a great deal of energy is lost as heat.
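
The size of that loss is easy to estimate. The Python sketch below is a back-of-envelope illustration, assuming one electron is collected per absorbed photon and using the 0.6 volt and 2 to 3 electron volt figures above. The function name is hypothetical.

```python
def fraction_retained(cell_voltage_v: float, photon_energy_ev: float) -> float:
    """Share of a photon's energy carried by one electron delivered at the cell voltage.

    An electron delivered at V volts carries V electron volts of energy.
    Assumption: exactly one electron is collected per absorbed photon.
    """
    return cell_voltage_v / photon_energy_ev


for photon_ev in (2.0, 3.0):
    kept = fraction_retained(0.6, photon_ev)
    print(f"{photon_ev} eV photon: {kept:.0%} retained, {1 - kept:.0%} lost")
```

On these assumptions, only about 20% to 30% of each visible photon's energy comes out as electrical work from a single junction.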

Solar cells do not care about capturing each energized electron in detail. Their purpose is to harvest a bulk electrical voltage + current with which to do some work in our electrical grids. Photosynthesis takes an entirely different approach, however. This may be mostly for historical and technical reasons, but also because part of its purpose is to do chemical work with the captured electrons. Biology tends to take a highly controlling approach to chemistry, using precise shapes, functional groups, and electrical environments to guide reactions to exact ends. While some of the power of photosynthesis goes toward pumping protons out of the membrane, setting up a gradient later used to make ATP, about half is used for other things like splitting water to replace lost electrons, and making reducing chemicals like NADPH.

A portion of a poster about the core processes of photosynthesis. It provides a highly accurate portrayal of the two photosystems and their transactions with electrons and protons.

In plants, photosynthesis is a chain of processes focused around two main complexes, photosystems I and II, and all occurring within membranes- the thylakoid membranes of the chloroplast. Confusingly, photosystem II comes first, accepting light, splitting water, pumping some protons, and sending out a pair of electrons on mobile plastoquinones, which eventually find their way to photosystem I, which jacks up their energy again using another quantum of light, to produce NADPH.

Photosystem II is full of chlorophyll pigments, which are what get excited by visible photons. But most of them are "antenna" chlorophylls, passing the excitation along to a pair of centrally located chlorophylls. Note that the light energy is at this point passed as a molecular excitation, not as a free electron. This passage may happen by Förster resonance energy transfer, but is so fast and efficient that stronger Redfield coupling may be involved as well. Charge separation only happens at the reaction center, where an excited electron is popped out to a chain of recipients. The chlorophylls are organized so that the pair at the reaction center have a slightly lower energy of excitation, thus serve as a funnel for excitation energy from the antenna system. These transfers are extremely rapid, on the picosecond time scale.

It is interesting to note tangentially that only red light energy is used. Chlorophylls have two excitation states, excited by red light (680 nm = 1.82 eV) and blue light (400-450 nm, 2.76 eV) (note the absence of green absorbance). The significant extra energy from blue light is wasted, radiated away to let it (the excited electron) relax to the lower excitation state, which is then passed through the antenna complex as though it had come from red light.
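
The wavelength-to-energy conversions here follow from E = hc/λ. A minimal Python sketch, using the defined SI values of the Planck constant and the speed of light (the function name is hypothetical):

```python
PLANCK_EV_S = 4.135667696e-15   # Planck constant, in eV * s
LIGHT_SPEED_M_S = 299_792_458   # speed of light, in m / s


def photon_energy_ev(wavelength_nm: float) -> float:
    """Photon energy in electron volts for a wavelength given in nanometers."""
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_EV_S * LIGHT_SPEED_M_S / wavelength_m


for nm in (680, 450, 400):
    print(f"{nm} nm -> {photon_energy_ev(nm):.2f} eV")
```

This gives 1.82 eV at 680 nm and 2.76 eV at 450 nm, so the blue figure quoted above corresponds to the long end of the 400-450 nm range. At 400 nm the energy is closer to 3.10 eV.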

Charge separation is managed precisely at the photosystem II reaction center through a series of pigments of graded energy capacity, sending the excited electron first to a neighboring chlorophyll, then to a pheophytin, then to a pair of iron-coordinated quinones, which then pass two electrons to a plastoquinone that is released to the local membrane, to float off to the cytochrome b6f complex. In photosystem II, another two photons of light are separately used to power the splitting of one water molecule (giving two electrons and pumping two protons). So the whole process, just within photosystem II, yields, per four light quanta, four protons pumped from one side of the membrane to the other. Since the ATP synthase uses about three protons per ATP, this nets just over one ATP per four photons.
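
The ATP bookkeeping in that last step is simple division. The Python sketch below uses only the figures given above (four protons per four photons, about three protons per ATP). The function name is hypothetical, and the three-proton cost is the round number from the text rather than a measured stoichiometry.

```python
def atp_yield(protons_moved: float, protons_per_atp: float = 3.0) -> float:
    """ATP made from a given number of protons crossing the membrane."""
    return protons_moved / protons_per_atp


photons = 4
protons = 4  # protons moved per four photons, from the text

atp = atp_yield(protons)
print(f"{atp:.2f} ATP per {photons} photons")
print(f"{atp / photons:.2f} ATP per photon")
```

That works out to about 1.33 ATP per four photons, or roughly a third of an ATP per photon, from photosystem II alone.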

Some of the energetics of photosystem II. The orientations and structures of the reaction center paired chlorophylls (Pd1, Pd2), the neighboring chlorophyll (Chl), and then the pheophytin (Ph) and quinones (Qa, Qb) are shown in the inset. Energy of the excited electron is sacrificed gradually to accomplish the charge separation and channeling, down to the final quinone pairing, after which the electrons are released to a plastoquinone and sent to another complex in the chain.

So the principles of silicon and biological solar cells are totally different in detail, though each gives rise to a delocalized field, one of electrons flowing with a low potential, and the other of protons used later for ATP generation. Each energy system must have a way to pop off an excited electron in a controlled, useful way that prevents it from recombining with the positive ion it came from. That is why there is such an ornate conduction pathway in photosystem II to carry that electron away. Overall, points go to the silicon cell for elegance and simplicity, and we in our climate crisis are the beneficiaries, if we care to use it.

But the photosynthetic enzymes are far, far older. A recent paper pointed out that not only are photosystems II and I clearly cousins of each other, but it is likely that, contrary to the consensus heretofore, photosystem II is the original version, at least of the various photosystems that currently exist. All the other photosystems (including those in bacteria that lack oxygen stripping ability) carry traces of the oxygen evolving center. It makes sense that getting electrons is a fundamental part of the whole process, even though that chemistry is quite challenging.

That in turn raises a big question- if oxygen evolving photosystems are primitive (originating very roughly with the last common ancestor of all life, about four billion years ago) then why was earth's atmosphere oxygenated only from two billion years ago onward? It had been assumed that this turn in Earth history marked the evolution of photosystem II. The authors point out additionally that there is also evidence for the respiratory use of oxygen from these extremely early times as well, despite the lack of free oxygen. Quite perplexing, (and the authors decline to speculate), but one gets the distinct sense that possibly life, while surprisingly complex and advanced from early times, was not operating at the scale it does today. For example, colonization of land had to await the buildup of sufficient oxygen in the atmosphere to provide a protective ozone layer against UV light. It may have taken the advent of eukaryotes, including cyanobacterial-harnessing plants, to raise overall biological productivity sufficiently to overcome the vast reductive capacity of the early earth. On the other hand, speculation about the evolution of early life based on sequence comparisons (as these authors do) is notoriously prone to artifacts, since what evolves at vanishingly slow rates today (such as the photosystem core proteins) must have originally evolved at quite a rapid clip to attain the functions now so well conserved. We simply can not project ancient ages (at the four billion year time scales) from current rates of change.

Light can do other wonderful things for us.
Congressional acts, Official acts, and criminal acts.

Saturday, August 23, 2025

Why Would a Bacterium Commit Suicide?

Our innate immune system, including suicide of infected cells, has antecedents in bacteria.

We have a wide variety of defenses against pathogens, from our skin and its coating of RNase and antimicrobial peptides, to the infinite combinatorial firepower of the adaptive immune system, which is primed by vaccines. In between is something called the innate immune system, which is built-in and static rather than adaptive, but is very powerful nonetheless. It is largely built around particular proteins that recognize common themes in pathogens, like the free RNA and DNA of viral genomes, or lipopolysaccharide that coats most bacteria. There are also internal damage signals, such as cellular proteins that have leaked out and are visible to wandering immune cells, that raise similar alarms. The alarms lead to inflammation, the gathering of immune cells, and hopefully to resolution of the problem.

One powerful defensive strategy our cells have is apoptosis, or cellular suicide. If the signals from an incoming infection are too intense, a cell, in addition to activating its specific antiviral defenses, goes a few steps further and generates a massive inflammasome that rounds up and turns on a battery of proteases that chew up the cell, destroying it from inside. The pieces are then strewn around to be picked up by the macrophages and other cleanup crews, which hopefully can learn something from the debris about the invading pathogen. One particular target of these proteases is the gasdermin family, whose members are activated via this proteolysis and then assemble into huge pores that plant themselves into the plasma membrane and mitochondrial membranes, rapidly killing the cell by collapsing all the ion gradients across these membranes.

A human cell committing apoptosis, and falling apart.

A recent paper showed that key parts of this apparatus are present in bacteria as well. It was both technically interesting, since they relied on a lot of AI tools to discern the rather distant relations of the bacterial receptors for pathogens (that is to say, phages- the viruses of bacteria), and generally intriguing, because suicide is generally thought to be a civilized behavior of cells in multicellular organisms, protecting the rest of the body from spread of the pathogen. Bacteria, despite living in mucky biofilms and other kinds of colonies, are generally thought to be loners, only out for their own reproduction. Why would they kill themselves? Well, anytime they are in a community, that community is almost certainly composed of relatives, probably identical clones of a single founding cell. So it would be a highly related community indeed, and well worth protecting in this way.

A bacterial gasdermin outruns phages infecting the cell. Two kinds of cells are mixed together here, ones without a gasdermin (green) and ones with (black). All are infected at zero time, and a vital dye is added (pink) that only gets into cells through large pores, like the gasdermin pore. At 45 minutes and after, the green (control) cells are dying and getting blown apart by escaping phages. On the other hand, the gasdermin+ cells develop pores and get stained pink, showing that they are dead too. But they don't blow up, indicating that they have shut down phage propagation.

The researchers heard that some bacteria have gasdermins, so they wondered whether they have the other parts of the system- the proteases and the sensor proteins. And indeed, they do. While traditional sequence similarity analysis didn't say so, structural comparison courtesy of the AlphaFold program showed that a protease in the same operon as gasdermin had CARD domains. These domains are signatures of caspases and of caspase interacting proteins, like the sensor proteins in the human innate immune system. They bind other CARD domains, thus mediating assembly of the large complexes that lead to inflammation and apoptosis.

Structure of the bacterial CARD domain, courtesy of AlphaFold, showing some similarity with a human CARD domain, which was not very apparent on the sequence level.

The operon of this bacterium, which encodes the whole system- gasdermin, protease (two of them), and sensor.

The researchers then raised their AI game by using another flavor of AlphaFold to predict interactions that the bacterial CARD/protease protein might have. This showed an interaction with another protein in the same operon, with similarity to NLR sensor proteins in humans, which they later confirmed happened in vitro as well. This suggests that this bacterium, and many bacteria, have the full circuit of sensor for incoming phage, activatable caspase-like protease, and then cleavable gasdermin as the effector of cell suicide.

A comparison of related operons from several other bacteria.

Looking at other bacteria, they found that many have similar systems. Some link to other effectors, rather than a pore-forming gasdermin. But most share a similar sensor-to-protease circuit that is the core of this defense system. Lastly, they also asked what triggers this whole system from the incoming phage. The answer, in this case, is a phage protein called rIIB. Unfortunately, it is not clear either what rIIB does for the phage or whether it triggers the CARD/gasdermin system by activating the bacterial NLR sensor protein, as would be assumed. What is known, however, is that rIIB has a function in defending phage against another bacterial defense system called RexAB. Thus it looks as though this particular arms race has ramified into a complicated back and forth as bacteria try as best they can to insure themselves against mass infection.

The move to quash mRNA vaccines is reprehensible. A black day.
A craven show of idiocy, weakness, and moral ignorance. But better than talking about Epstein, amiright?
Followed by a craven, treacly exercise in begging.
Who runs things around here?
Injustice on the court.
Absolutely, no way.
Swampy Don.
Not a quid pro quo, at all.
Climate (and coal) graphs of the week, electricity sources of US vs China. Everyone has a very long way to go, especially China:

Saturday, August 9, 2025

A Wonderland of RNA

A snoRNA mates with the 7SL RNA and mRNA to promote protein secretion.

As molecular biologists wander through the wilderness of the cell, they keep stumbling across RNAs. From early on, the ribosomal RNA (rRNA) and amino acid transfer RNA (tRNA) were obviously incredibly abundant, in their somewhat inefficient job of carrying on translation. Messenger RNAs (mRNA) were less abundant, but recognized from the start for their key role relaying information from the genome. But over the decades, more and more types of RNA kept popping up. Here is one tabulation of genes by type in humans:

22,700 protein-coding (along with 19,000 derelict "pseudogenes")
820 rRNA
659 tRNA
1,960 miRNA
2,100 snRNA
1,390 snoRNA
51,000 ncRNA
65 scaRNA, scRNA, piRNA
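To put those counts in proportion, here is a small sketch in Python that totals the tabulation and works out the protein-coding share. The numbers are copied from the list above; leaving the 19,000 pseudogenes out of the total is a choice made here for illustration, and the category keys are just labels.

```python
# Gene counts by type, copied from the tabulation above.
# Pseudogenes (19,000) are left out of the total, which is a choice of this sketch.
gene_counts = {
    "protein-coding": 22_700,
    "rRNA": 820,
    "tRNA": 659,
    "miRNA": 1_960,
    "snRNA": 2_100,
    "snoRNA": 1_390,
    "ncRNA": 51_000,
    "scaRNA/scRNA/piRNA": 65,
}

total_genes = sum(gene_counts.values())
rna_genes = total_genes - gene_counts["protein-coding"]
protein_coding_share = gene_counts["protein-coding"] / total_genes

if __name__ == "__main__":
    # Print each category, largest first, with its share of the total.
    for kind, count in sorted(gene_counts.items(), key=lambda kv: -kv[1]):
        print(f"{kind:>20}: {count:>6} ({count / total_genes:.1%})")
    print(f"Total genes: {total_genes}, of which RNA genes: {rna_genes}")
```

On this tally, protein-coding genes are a bit over a quarter of the total, and the loosely defined ncRNA category alone outnumbers them by more than two to one.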

One big step in the realization of the prevalence of RNA was the ENCODE project, done as part of the human genome project. They found that most of the genome is transcribed to RNA, one way or another. Not all those products are important, or abundant, but just the fact that all this RNA is floating around was startling. This does not mean that there isn't junk DNA, (or junk RNA), but it does mean that a lot of potential function lurks waiting to be found. And the last couple of decades have seen many such finds.

From the list above, microRNAs are small fragments that bind to matching mRNAs and repress their translation to protein. They have wide-ranging networks of regulation, mostly of a fine-tuning nature, but sometimes quite decisive and relevant to human biology and pathology. snRNAs are small nuclear RNAs, some of which function in RNA splicing. snoRNAs are small nucleolar RNAs, some of which mate with various sections of the ribosomal RNA as it is being assembled in the nucleolus, and guide chemical modifications made by enzymes, such as attachment of methyl and uridine groups. The non-coding (nc) RNAs are typically products of protein coding genes that, due to splicing or altered start sites, happen to not code for anything, and occasionally have significant regulatory roles.

In general, RNAs may have a few different mechanisms of action: guide characteristics, where they mate with their antisense sequence in a target RNA and direct some other process like sequestration, cleavage, or chemical modification. Or they may bind to specific proteins, such as the RNAs that bind to chromatin and regulate X-linked dosage compensation. Or they have structural, even catalytic roles, like the ribosomal and spliceosomal RNAs.

What should be clear is that many more genes are recognizable by sequence than we understand. Only a couple hundred snoRNA genes are understood by their targets and activity. But there are well over a thousand in the genome. What do the rest do? A recent paper took on this quest, devising a novel way to isolate these snoRNAs and their partners from the welter of other material. They did this by crosslinking everything, ligating the RNAs locally to each other (which linked the snoRNAs to their targets) and then reverse-transcribing the RNAs before trying to capture them individually by custom anti-sense DNA probes, one per gene. It was a complicated procedure, but far more productive than trying to capture them directly as RNA with antisense RNA probes, since these snoRNAs are intensely structured (lots of hairpins and other duplexes) and expected to be tightly bound to other things.

Taking the most abundant snoRNAs, these researchers then looked for novel partners and functions. After seeing that they recovered plenty of the known interactions, the most interesting novel interaction they came up with was of a gene called SNORA73. This was found linked to two other RNAs, 7SL RNA and various mRNAs.

Just another holdover from the RNA world. The SRP particle (in red) is built around the 7SL snRNA (helix). This particle detects the signal peptide (green) of the nascent protein emerging from the ribosome (beige, blue), and clamps on (right) to arrest translation. Translation is later resumed after the whole complex has successfully docked with the membrane receptor, allowing the SRP to be released, and the peptide to be threaded through the membrane.

Funny story ... 7SL RNA is yet another snRNA that has a key role in translation. It is the core of the signal recognition particle (SRP), which binds to "signal" sequences in proteins as they come off the ribosome. These are a special code segment at the start that says "I want to be secreted across (or into) a membrane, not just located in the cytoplasm". The SRP captures this signal segment, and then sticks its head into the ribosome, stalling its translation. Then the whole mess goes off to the membrane (endoplasmic reticulum in eukaryotes, or plasma membrane in bacteria) where it docks with the SRP receptor complex. This is the signal for translation to restart, the SRP to come off, and the nascent protein to thread its way through the membrane to the other side.

Incidentally, it is notable also that SRP is scaffolded by a large RNA, with a few proteins stuck on for decoration / specificity. This makes sense as an echo of early evolution, where not only did RNAs likely arise before proteins ever existed, but those RNAs had gotten quite large while the earliest proteins were still relatively small. The genetic code appears to have started as a two letter code, before the third letter was munged onto the end, vastly expanding the chemical repertoire of proteins and making them premier catalysts.

A few results, indicating that knockdown of SNORA73 (with the anti-RNAs LNA-1 and LNA-2) dramatically decreases secretion of the proteins CLU and LGAL3BP. On left are signals from proteins isolated from inside and outside the cells, as indicated. On right is a graph of the same data. Neither the mRNA levels nor the protein levels are changed. Only the level of secretion is altered.

So the implication of all this was that SNORA73 affects protein translation/secretion. This is indeed the case, when these authors assayed the secretion of one of the SNORA73-bound mRNA-encoded proteins in the presence of an inhibitor of SNORA73 (above). The mechanism is that SNORA73 serves as a special glue between the 7SL snRNA and the translating mRNA, with parts of its RNA sequence complementary to both a segment of the 7SL snRNA, and also to a small 10 base-long segment of the mRNA. The mRNA segment is hanging off the ribosome while the beginning of the message is being translated. The whole setup helps the SRP find these mRNAs efficiently and hold on to them effectively, increasing not their translation rate, but their secretion rate.
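The "glue" idea rests on ordinary base complementarity, which can be sketched in a few lines of Python. The sequences below are invented for illustration and are not the real SNORA73 or target mRNA sequences. The sketch also assumes strict Watson-Crick pairing and ignores the G-U wobble pairs and structural context that real RNA duplexes involve.

```python
# Toy illustration of how a guide RNA segment can pick out a short
# complementary site in an mRNA. All sequences here are hypothetical.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the strand that would pair with `rna`, written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def find_target_sites(guide_segment, mrna):
    """Return 0-based start positions in `mrna` that pair perfectly with the guide."""
    target = reverse_complement(guide_segment)
    sites = []
    start = mrna.find(target)
    while start != -1:
        sites.append(start)
        start = mrna.find(target, start + 1)
    return sites


if __name__ == "__main__":
    guide = "GGACUUCAGC"                  # hypothetical 10-base guide segment
    mrna = "AUGGCCAAAGCUGAAGUCCUUUGA"     # hypothetical mRNA carrying one matching site
    print("Site the guide can pair with:", reverse_complement(guide))
    print("Start positions in the mRNA:", find_target_sites(guide, mrna))
```

A snoRNA with two such segments, one matching the 7SL RNA and one matching the 10-base mRNA motif, could in principle bridge the two in this way, though the paper's actual motif definition may well tolerate mismatches that this exact-match search does not.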

Models of the structures of SNORA73 (which is made by a pair of similar genes, A and B), as they bind to the 7SL snRNA, and the target mRNAs. These binding areas are far apart, to allow the mRNA tail (that is not yet in the ribosome) to reach the MBM binding site. The psi pocket is of uncertain function, but in other snoRNAs directs the uridine addition to target rRNA.

The mRNAs that have this 10 base (MBM) signal that binds to SNORA73 are a subset of those that express secreted proteins, though it is not really clear from this work what kind of a subset this is. Perhaps this mechanism makes up for weak signal sequences, or some other defect in the protein's access to the secretion machinery. Whatever that logic, we have here a conjunction of four RNAs, (7SL snRNA, the SNORA73 snoRNA, the mRNA target, and the ribosomal RNA structure) all collaborating to promote the secretion of a target protein. This is just one of thousands of uncharacterized and conserved RNAs visible in our genome. It is startling to think what else might be going on.

PLOS goes on a mole hunt.
We are going to be without key vaccines quite soon.
Is lithium a vital nutrient?
Doctors are fed up with a system they created.
Credit isn't free- Krugman on predatory finance.
Like a battered wife, Trump keeps going back for more abuse.

Saturday, August 2, 2025

The Origin of Life

What do we know about how it all began? Will we ever know for sure?

Of all the great mysteries of science, the origin of life is maybe the one least likely to ever be solved. It is a singular event that happened four billion years ago in a world vastly different from ours. Scientists have developed a lot of ideas about it and increased knowledge of this original environment, but in the end, despite intense interest, the best we will be able to do is informed speculation. Which is, sure, better than uninformed speculation, (aka theology), but still unsatisfying.

A recent paper about sugars and early metabolism (and a more fully argued precursor) piqued my interest in this area. It claimed that there are non-enzymatic ways to generate most or all of the core carbohydrates of glycolysis and CO2 fixation around pentose sugars, which are at the core of metabolism and the supply of sugars like ribose that form RNA, ATP, and other key compounds. The general idea is that at the very beginning of life, there were no enzymes and proteins, so our metabolism is patterned on reactions that originally happened naturally, with some kind of kick from environmental energy sources and mineral catalysts, like iron, which was very abundant.

That is wonderful, but first, we had better define what we mean by life, and figure out what the logical steps are to cross this momentous threshold. Life is any chemical process that can accomplish Darwinian evolution. That is, it replicates in some fashion, and it has to encode those replicated descendants in some way that is subject to mutation and selection. With those two ingredients, we are off to the races. Without them, we are merely complex minerals. Crystals replicate, sometimes quite quickly, but they do not encode descendent crystals in a way that is complex at all- you either get the parent crystal, or you get a mess. This general theory is why the RNA world hypothesis was, and remains, so powerful.

The RNA world hypothesis is that RNA is likely the first genetic material, before DNA (which is about 200 times more stable) was devised. RNA also has catalytic capabilities, so it could encode in its own structure some of the key mechanisms of life, therefore embodying both of the critical characteristics of life specified above. The fact that some key processes remain catalyzed by RNA today, such as ribosomal synthesis of proteins, spliceosomal re-arrangement of RNAs, and cutting of RNAs by RNase P, suggests that proteins (as well as DNA) were the Johnny-come-latelies of the chemistry of life, after RNA had, in its lumbering, inefficient way, blazed the trail.

In this image of the ribosome, RNA is gray, proteins are yellow. The active site is marked with a bright light. Which came first here- protein or RNA?

But what kind of setting would have been needed for RNA to appear? Was metabolism needed? Does genetics come first, or does metabolism come first? If one means a cyclic system of organic transformations encoded by protein or RNA enzymes, then obviously genetics had to come first. But if one means a mess of organic chemicals that allowed some RNA to be made and provide modest direction to its own chemical fate, and to a few other reactions, then yes, those chemicals had to come first. A great deal of work has been done speculating what kind of peculiar early earth conditions might have been conducive to such chemistries. Hydrothermal vents, with their constant input of energy, and rich environment of metallic catalysts? Clay particles, with their helpful surfaces that can faux-crystalize formation of RNAs? Warm ponds, hot ponds, UV light.... the suggestions are legion. The main thing to realize is that early earth was surely highly diverse, had a lot of energy, and had lots of carbon, with a CO2-rich atmosphere. UV would have created a fair amount of carbon monoxide, which is the feedstock of the Fischer-Tropsch reactions that create complex organic compounds, including lipids, which are critical for formation of cells. Early earth very likely had pockets that could produce abundant complex organic molecules.

Thus early life was surely heterotrophic, taking in organic chemicals that were given by the ambient conditions for free. And before life really got going, there was no competition- there was nothing else to break those chemicals down, so in a sort of chemical pre-Darwinian setting, life could progress very slowly (though RNA has some instability in water, so there are limits). Later, when some of the scarcer chemicals were eaten up by other already-replicating life forms, then the race was on to develop those enzymes, of what we now recognize as metabolism, which could furnish those chemicals out of more common ingredients. Onwards the process then went, hammering out ever more extensive metabolic sequences to take in what was common and make what was precious- those ribose sugars, or nucleoside rings that originally had arrived for free. The first enzymes would have been made of RNA, or metals, or whatever was at hand. It was only much later that proteins, first short, then longer, came on the scene as superior catalysts, extensively assisted by metals, RNAs, vitamins, and other cofactors.

Where did the energy for all this come from? To cross the first threshold, only chemicals (which embodied outside energy cycles) were needed, not energy. Energy requirements accompanied the development of metabolism, as the complex chemicals become scarcer and they needed to be made internally. Only when the problem of making complex organic chemicals from simpler ones presented itself did it also become important to find some separate energy source to do that organic chemistry. Of course, the first complex chemicals absolutely needed were copies of the original RNA molecules. How that process was promoted, through some kind of activated intermediates, remains particularly unclear.

All this happened long before the last universal common ancestor, termed "LUCA", which was already an advanced cell just prior to the split into the archaeal and bacterial lineages, (much later to rejoin to create the most amazing form of life- eukaryotes). There has been quite a bit of analysis of LUCA to attempt to figure out the basic requirements of life, and what happened at the origin. But this ("top-down") approach is not useful. The original form of life was vastly more primitive, and was wholly re-written in countless ways before it became the true bacterial cell, and later still, LUCA. Only the faintest traces remain in our RNA-rich biochemistry. Just think about the complexity of the ribosome as an RNA catalyst, and one can appreciate the ragged nature of the RNA world, which was probably full of similar lumbering catalysts for other processes, each inefficient and absurdly wasteful of resources. But it could reproduce in Darwinian fashion, and thus it could improve.

Today we find on earth a diversity of environments, from the bizarre mineral-driven hydrothermal vents under the ocean to the hot springs of Yellowstone. The geology of earth is wondrously varied, making it quite possible to credit one or more of the many theories of how complex organic molecules may have become a "soup" somewhere on the early Earth. When that soup produces ribose sugars and the other rudiments of RNA, we have the makings of life. The many other things that have come to characterize it, such as lipid membranes and metabolism of compounds, are fundamentally secondary, though critically important for progress beyond that so-pregnant moment.

Notes on US history.
The fascist playbook.
Chemosynthetic life is doing very well.
Decimation leads to decline.
FYI, a discussion of phosphate.
FYI, a discussion of hypothetical steps toward RNA self-replication.

Saturday, July 5, 2025

Water Sensing by WNKs

WNK kinases sense osmotic condition as well as chloride concentration to keep us hydrated.

"Water, water, everywhere, nor any drop to drink." This line from Coleridge evokes the horror of thirst on the ghost ship, as its crew can not drink salt water. Other species can, but ocean water is too strong for us, roughly four times as salty as our blood. Nevertheless, our bodies have exquisite mechanisms to manage salt concentrations, with each cell managing its own traffic, and the kidneys managing most electrolytes in the blood. It is a very difficult task that has led to clever evolutionary solutions like counter-current exchange across the nephron loops, and stark differences in those nephron cell membranes, over water or salt permeability, to maximize use of passive ion gradients. But at the heart of the system, one has to know what is going on- one has to monitor all of the electrolyte levels and overall osmotic stress.
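The "roughly four times" figure can be checked on the back of an envelope. Below is a minimal sketch in Python, assuming typical open-ocean salinity of about 35 grams of salt per liter, physiological saline (0.9% NaCl, 9 grams per liter) as a stand-in for blood, all salt counted as NaCl, and ideal full dissociation into two ions. Those inputs are assumptions added here rather than figures from the post, and the ideal-solution osmolarities come out somewhat higher than measured values.

```python
# Rough comparison of seawater and blood saltiness.
# Assumptions of this sketch: all salt is NaCl, it dissociates fully into
# two ions, and physiological saline stands in for blood.
NACL_MOLAR_MASS = 58.44          # g/mol
SEAWATER_SALT_G_PER_L = 35.0     # typical open-ocean salinity (assumed)
SALINE_SALT_G_PER_L = 9.0        # 0.9% NaCl physiological saline (assumed proxy for blood)


def ideal_osmolarity(grams_per_litre, molar_mass=NACL_MOLAR_MASS, ions_per_formula=2):
    """Ideal-solution osmolarity in osmol/L for a fully dissociated salt."""
    return ions_per_formula * grams_per_litre / molar_mass


def salinity_ratio(seawater=SEAWATER_SALT_G_PER_L, blood=SALINE_SALT_G_PER_L):
    """How many times saltier seawater is than the blood proxy."""
    return seawater / blood


if __name__ == "__main__":
    print(f"Seawater: {ideal_osmolarity(SEAWATER_SALT_G_PER_L):.2f} osmol/L (ideal)")
    print(f"Blood proxy: {ideal_osmolarity(SALINE_SALT_G_PER_L):.2f} osmol/L (ideal)")
    print(f"Ratio: {salinity_ratio():.1f}x")
```

Under these assumptions the ratio comes out a little under four, consistent with the rough figure above.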

One such monitoring thermostat for chemical balances turns out to be the WNK kinases- a family of four proteins in humans that control (by phosphorylating them) a secondary set of regulators, which in turn control many salt transporters, such as SLC12A2 and SLC12A4. These latter are passive, though regulated, co-transporters that allow chloride across the membrane when combined with a matching cation like sodium or potassium. The cations drive the process, because they are normally kept (pumped) to strong gradients across cell membranes, with high sodium outside, and high potassium inside. Thus when these co-transporters are turned on (or off), they use the cation gradients to control the chloride level in the cell, in either direction, depending on the particular transporter involved. Since the sodium and potassium levels are held at relatively static, pumped levels, it is the chloride level that helps control the overall osmotic pressure in a finely tuned way.

A few of the ionic transactions done in the kidney.

The WNK kinases were discovered genetically, in families that showed hypertension and raised levels of chloride and potassium in the blood. These syndromes mirrored complementary syndromes caused by mutations in SLC12A2, the Na/Cl co-transporter, indicating the WNK kinases inhibit SLC12A2. It turns out that WNKs, which are named for an unusual catalytic site (with no lysine [K]), are sensors for both chloride, which inhibits them, and for osmotic pressure, which activates them. They are expressed in different locations and have slightly different activities, (and control many more transporters and processes than discussed here), but I will treat them interchangeably here. The logic of all this is that, if osmotic pressure is low, that means that internal salt levels are low, and chloride needs to be let into the cell, by activating the cation/chloride co-transporters. Likewise, if chloride levels inside the cell are high, the WNK kinase needs to be inhibited, reducing chloride influx.

A recent paper (and prior work from the same lab) discussed structures of the WNK regulators that explain some of this behavior. WNK kinases are dimers at rest, and in that state mutually inhibit their auto-phosphorylation. It is separation and auto-phosphorylation that turns them on, after which they can then phosphorylate their target proteins, such as the secondary kinases STK39 and OSR1. The authors had previously found a chloride binding site right at the active site of the enzyme that promotes dimerization. In the current paper, they reveal a couple of clusters of water molecules which similarly affect the dimerization, and thus activity, of the enzyme.

Location of the inhibitory chloride (green) binding site in WNK1. This is right in the heart of the protein, near the active kinase site and dimerization interface with the other WNK1 partner.

While X-ray crystal structures rarely show or care much about water molecules, (they are extremely small and hard to track), here, those waters were hypothesized to be important, since WNK kinases are responsive to osmotic pressure. One way to test this is to add PEG400 to the reaction. This is a polymer (400 molecular weight) that is water-like and inert, but large in a way that crowds out water molecules from the solution. At 15% or 25% of the volume, PEG400 displaces a lot of water, lowers the water activity of a solution, and thus increases the osmotic pressure- that is, its tendency to draw water in from outside. Plants use osmotic pressure as turgor pressure, and our cells, not having cell walls, need to always be at an osmotic pressure similar to the outside, lest they swell up, or conversely shrink away. Anyhow, WNK kinases can be switched from an inactive to active state just by adding PEG400- a sure sign that they are sensors for osmotic pressure.

Water network (blue dots) within the WNK1 kinase protein. Most of the protein is colored teal, while the active site kinase area is red, and a tiny amount of the dimer partner is colored green. When this crystal is osmotically challenged, the water network collapses from 14 waters to 5, changing the structure and promoting dissociation of the dimer. In B is shown a sequence alignment over a wide evolutionary range where the amino acids that coordinate the water network (yellow) are clearly very well conserved, thus quite important.

Above is shown a closeup of the WNK1 protein, showing in teal the main backbone, including the catalytic loop. In red is the activation loop of the kinase, and in green is a little bit from the other WNK1 protein in the dimer pair. The chloride, if bound, would be located right at top center, at K375. Shown in blue are a series of fourteen water molecules that make up one so-called water network. Another smaller one was found at the interface between the two WNK1 proteins. The key finding was that, if crystallized with PEG400, this water network collapsed to only five water molecules, thereby changing the structure of the protein significantly and accounting for the dissolution of the dimer.

Superposition of WNK1 with PEG400 (purple) and activated vs WNK1 without, in an inactive state (teal). Most of the blue waters would be gone in the purple state as well. This shows the significant structural transition, particularly in the helixes above the active site, which induce (in the purple state) dissociation of the dimer, auto-phosphorylation, and activation.

Thus there is a delicate network of water molecules tentatively held together within this protein that is highly sensitive to the ambient water activity (aka osmotic pressure). This dynamic network provides the mechanism by which the WNK proteins sense and transmit the signal that the cell requires a change in ionic flows. Generally the point is to restore homeostatic balance, but in the kidney these kinases are also used to control flows for the benefit of the organism as a whole, by regulating different transporters in different parts of the same cell- either on the blood side, or the urine side.

A little respect for the Americas.
California is not alone. Housing is going sclerotic everywhere.
Remember climate change? It is still getting worse ever faster.
Zohran has a history.
So long, Ukraine. Bonus.