Showing posts with label cell biology. Show all posts

Saturday, November 9, 2024

Rings of Death

We make pore-forming proteins that poke holes in cells and kill them. Why?

Gasdermin proteins are part of the immune system, and exist in bacteria as well. It was only in 2016 that their mechanism of action was discovered, as forming unusual pores. The function of these pores was originally assumed to be offensive, killing enemy cells. But it quickly became apparent that they more often kill the cells that make them, as the culmination of a process called pyroptosis, a form of (inflammatory) cell suicide. Further work has only deepened the complexity of this system, showing that gasdermin pores are more dynamic and tunable in their action than originally suspected.

The structure is quite striking. The protein starts as an auto-inhibited storage form, sitting around in the cell. When the cell comes under attack, a cascade of detection and signaling occurs that winds up activating a family of proteases called caspases. Some of these caspases can cut the gasdermin proteins, removing their inhibitory domain and freeing them to assemble into multimers. About 26 to 32 of these activated proteins can form a ring on top of a membrane (say, the plasma membrane), and then cooperatively jut their tails down into the membrane, opening a massive hole in it.

Overall structure of assembled gasdermin protein pores.


Simulations of pore assembly, showing how the trapped membrane lipids would pop out of the center, once pore assembly is complete.


These holes, or pores, are big enough to allow small proteins through, and certainly all sorts of chemicals. So one can understand that researchers thought that these were lethal events. And gasdermins are known to directly attack bacterial cells, being responsible in part for defense against Shigella bacteria, among others. But then it was found that gasdermins are the main way that important cytokines like the highly pro-inflammatory IL-1β get out of the cell. This was certainly an unusual mode of secretion, and the gasdermin D pore seems specifically tailored, in terms of shape and charge, to conduct the mature form of IL-1β out of the cell. 

It also turned out that gasdermins don't always kill their host cells. Indeed, they are far more widely used for temporary secretion purposes than for cell killing. And this secretion can apparently be regulated, though the details of that remain unclear. In structural terms, gasdermins can apparently form partial and mini-pores that are far less lethal to their hosts, allowing, by way of their own expression levels, a sensitive titration of the level of response to whatever danger the cell is facing.

Schematic of how lower concentrations of gasdermin D (lower path, blue) allow smaller pores to form with less lethality.

Equally interesting, the bacterial forms of gasdermin have just begun to be studied. While they may have other functions, they certainly can kill their host cell in a suicide event, and researchers have shown that they can shut down phage infection of a colony or lawn of bacterial cells. That is, if a phage-infected cell can signal and activate its gasdermin proteins fast enough, it can commit suicide before the phage has time to fully replicate, beating the phage at its own race of infection and propagation. 

Bacteria committing suicide for the good of the colony or larger group? That introduces the theme of group selection, since committing suicide certainly doesn't do the individual bacterium any good. It is only in a family group, clonal colony, or similar community that suicide for the sake of the (genetically related) group makes sense. We, as multicellular organisms, are way past that point. Our cells are fully devoted to the good of the organism, not themselves. But to see this kind of heroism among bacteria is, frankly, remarkable.

Bacteria have even turned around to attack the attacker. The Shigella bacteria mentioned above, which are directly killed by gasdermins, have evolved an enzymatic activity that tags gasdermin with ubiquitin, sending it to the cellular garbage disposal and saving themselves from destruction. It is an interesting validation of the importance of gasdermins and the arms race that is afoot, within our bodies.


  • A tortured ballot.
  • Great again? Corruption and degradation is our lot.
  • We may be in a (lesser) Jacksonian age. Populism, bad taste, big hair, and mass deportation.
  • Beautiful Jupiter.
  • Bill Mitchell on our Depression job guarantee: "So for every $1 outlaid the total societal benefits were around $6 over the lifetime of the participant."
  • US sanctions are scrambling our alliances and the financial system.
  • Solar works for everyone.


Saturday, October 26, 2024

A Hunt for Causes of Atherosclerosis

Using the most advanced tools of molecular biology to sift through the sands of the genome for a little gold.

Blood vessels have a hard life. Every time you put on shoes, the vessels in your feet get smashed and smooshed, for hours on end. And do they complain? Generally, not much. They bounce back and make do with the room you give them. All through the body, vessels are subject to the pumping of the heart, and to variations in blood volume brought on by our salt balance. They have to move when we do, and deal with it whenever we sit or lie on them. Curiously, it is the veins in our legs and calves, which are least likely to be crushed in daily life, that accumulate valve problems and go varicose. Atherosclerosis is another, much more serious problem of larger vessels, also brought on by age and injury, where injury and inflammation of the lining endothelial cells can lead to thickening, lipid/cholesterol accumulation, necrosis, calcification, and then flow restriction and risk of rupture.

Cross-section of a sclerotic blood vessel. LP stands for lipid pool, while the box shows necrotic and calcified bits of tissue.

The best-known risk factors for atherosclerosis are lipid-related, such as lack of liver re-capture of blood lipids, or lack of uptake around the body, keeping cholesterol and other lipid levels high in the blood. But genetic studies have found hundreds of areas of the genome with risk-conferring (or risk-reducing) variants, most of which are not related to lipid management. These genome-wide association studies (GWAS) look for correlations between genetic markers and disease in large populations. They pick up a lot of low-impact variants that are difficult to study, both because of their sheer number and because low impact often implies a peripheral or indirect function. High-impact variants (mutations) tend not to survive in the population very long, but when found tend to be far more directly involved and informative.
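As a toy illustration of the kind of test a GWAS runs at each of millions of markers, here is a single-marker allele-count comparison between cases and controls. The counts are invented for illustration, not taken from any real study.

```python
# Toy single-marker association test (hypothetical allele counts).
# Rows: cases, controls; columns: risk-allele count, other-allele count.
table = [[620, 380],   # 1000 case chromosomes
         [540, 460]]   # 1000 control chromosomes

row = [sum(r) for r in table]
col = [sum(c) for c in zip(*table)]
n = sum(row)

# Pearson chi-square statistic against the no-association expectation
chi2 = sum((table[i][j] - row[i] * col[j] / n) ** 2 / (row[i] * col[j] / n)
           for i in range(2) for j in range(2))

# Odds ratio: the effect size of carrying the risk allele
odds_ratio = (table[0][0] * table[1][1]) / (table[0][1] * table[1][0])

print(f"chi2 = {chi2:.1f}, OR = {odds_ratio:.2f}")
# chi2 > 3.84 rejects no-association at p < 0.05 for a single marker, but a
# genome-wide scan demands the far stricter conventional threshold of p < 5e-8.
```

The multiple-testing correction is why GWAS hits tend to be common, low-impact variants: only associations strong enough to survive millions of parallel tests are reported.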

A recent paper harnessed a variety of modern tools and methods to extract more from the poor information provided by GWAS. They come up with a fascinating tradeoff / link between atherosclerosis and cerebral cavernous malformation (CCM), which is a distinct blood vessel syndrome that can also lead to rupture and death. The authors set up a program of analysis that was prodigious, and only possible with the latest tools.

The first step was to select a cell line that could model the endothelial cells at issue. Then they loaded these cells with custom expression-reducing RNA regulators against each one of the ~1600 genes found in the neighborhood of the mutations uncovered by the GWAS analyses above, plus 600 control genes. Then they sequenced all the RNA messages from these single cells, each of which had received one of these "knock-down" RNA regulators. This involved a couple hundred thousand cells and billions of sequencing reads- no simple task! The point was to gather comprehensive data on what other genes were being affected by the genetic lesion found in the GWAS population, and then to (algorithmically) assemble them into coherent functional groups and pathways which could both identify which genes were actually being affected by the original mutations, and also connect them to the problems resulting in atherosclerosis.
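The program-finding step can be caricatured as follows: genes whose expression responds the same way across many knockdowns get grouped into one "program". This is a toy with planted structure and invented numbers, not the authors' actual pipeline, which used far more sophisticated factorization on hundreds of thousands of cells.

```python
# Toy "program" discovery: group genes by the similarity of their
# responses across a panel of knockdowns.
import numpy as np

rng = np.random.default_rng(0)
n_knockdowns = 50

# Two hidden programs; each knockdown perturbs each program by some amount
program_activity = rng.normal(size=(n_knockdowns, 2))

# Six genes: the first three follow program 0, the last three program 1
loadings = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], float)
expression = program_activity @ loadings.T + 0.1 * rng.normal(size=(n_knockdowns, 6))

# Correlate gene response profiles across knockdowns; genes that co-vary
# strongly are assigned to the same program
corr = np.corrcoef(expression.T)
programs = [sorted(np.flatnonzero(corr[g] > 0.8)) for g in range(6)]
print(programs[0], programs[5])
```

With the planted structure, genes 0-2 cluster together and genes 3-5 cluster together, recovering the two hidden programs from the perturbation data alone.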

Not to be outdone, they went on to harness the AlphaFold program to hunt for interactions among the proteins participating in some of the pathways they resolved through this vast pipeline, to confirm that the connections they found make sense.

They came up with about fifty different regulated molecular programs (or pathways), of which thirteen were endothelial cell specific. Things like angiogenesis, wound healing, flow response, cell migration, and osmoregulation came up, and are naturally of great relevance. Five of these latter programs were particularly strongly connected to coronary artery disease risk, and mostly concerned endothelial-specific programs of cell adhesion. Which makes sense, as the lack of strong adhesion contributes to injury and invasion by macrophages and other detritus from the blood, and adhesion among the endothelial cells plays a central role in their ability / desire to recover from injury, adjust to outside circumstances, reshape the vessel they are in, etc.

Genes near GWAS variations and found as regulators of other endothelial-related genes are mapped into a known pathway (a) of molecular signaling. The color code of changed expression refers to the effect that the marked gene had on other genes within the five most heavily disease-linked programs/pathways. The numbers refer to those programs (8 = angiogenesis and osmoregulation; 48 = cell adhesion; 35 = focal adhesion, related to cell adhesion; 39 = basement membrane, related to cell polarity and adhesion; 47 = angiogenesis, or growth of blood vessels). At bottom (c) is a layout of 41 regulated genes within the five disease-related programs, and how they are regulated by knockdown of the indicated genes on the X axis. Lastly, in d, some of these target genes have known effects on atherosclerosis or vascular barrier syndromes when mutated, and this appears to generally correlate with the regulatory effects of the highlighted pathway genes.

"Two regulators of this (CCM) pathway, CCM2 and TLNRD1, are each linked to a CAD (coronary artery disease) risk variant, regulate other CAD risk genes and affect atheroprotective processes in endothelial cells. ... Specifically, we show that knockdown of TLNRD1 or CCM2 mimics the effects of atheroprotective laminar blood flow, and that the poorly characterized gene TLNRD1 is a newly identified regulator in the CCM pathway."

On the other hand, excessive adhesiveness and angiogenesis can be a problem as well, as revealed by the reverse correlation they found with CCM syndrome. The interesting thing was that the gene CCM2 came up as one of the strongest regulators of the five core programs associated with atherosclerosis risk mutations. As can be guessed from its name, it can harbor mutations that lead to CCM. CCM is a relatively rare syndrome (at least compared with coronary artery disease) of localized patches of malformed vessels in the brain, which are prone to rupture, sometimes lethally. CCM2 is part of a protein complex, with KRIT1 and PDCD10, and part of a known pathway leading from fluid flow-sensing receptors to transcription regulators (TFs) that turn on genes relevant to endothelial cells. As shown in the diagram above, this pathway is full of genes that surfaced in the program analysis of the atherosclerosis GWAS variants. Note that there is a repression step in the diagram above (a) between the CCM complex and the MAP kinase cascade that sends signals downstream, accounting for the color reversal at this stage of the diagram.

Not only did they find that this known set of three CCM genes is implicated in the atherosclerosis mutation results, but one of the genes they dug up through their pipeline, TLNRD1, turned out to be a fourth, hitherto unknown, member of the CCM complex, shown via the AlphaFold program to dock very neatly with the others. It is loss-of-function mutations in the genes encoding this complex, which normally inhibits the expression of endothelial pro-cell adhesion and pro-angiogenesis gene sets, that cause CCM, unleashing those angiogenesis genes to do too much.

The logic of this pathway overall is that proper fluid flow at the cell surface, as expected in well-formed blood vessels, activates the pathway to the CCM complex, which then represses programs of new or corrective angiogenesis and cell adhesion- the tissue is OK as it is. Conversely, when turbulent flow is sensed, the CCM complex is turned down, and its target genes are turned up, activating repair, revision, and angiogenesis pathways that can presumably adjust the vessel shape to reduce turbulence, or simply strengthen it.

Under this model, malformations may occur during brain development when and where turbulent flow occurs, reducing CCM activation, a process abetted by mutations that help the CCM complex fall apart, resulting (rarely) in run-away angiogenesis. The common variants dealt with in this paper, which decrease the risk of cardiovascular disease / atherosclerosis, appear to have similar but much weaker effects, promoting angiogenesis, recovery from injury, and adhesion between endothelial cells. In this way, they keep the endothelium tighter and more resistant to injury, invasion by macrophages, and all the downstream sequelae that result in atherosclerosis. Thus strong reduction of CCM gene function is dangerous, causing CCM syndrome, while more modest reductions are protective against atherosclerosis, setting up a sensitive evolutionary tradeoff that we are clearly still on the knife's edge of. I won't get into the nature of the causal mutations themselves, but in the atherosclerosis case they are likely to be diffuse and regulatory.

Image of the CCM complex, which regulates response to blood flow, and whose mutations are relevant both to CCM and to atherosclerosis. The structures of TLNRD1 and the docking complex are provided by AlphaFold. 


This method is particularly powerful by being unbiased in its downstream gene and pattern finding, because it samples every expressed gene in the cell and automatically creates related pathways from this expression data, given the perturbations (knockdown of expression) of single target genes. It does not depend on using existing curated pathways and literature that would make it difficult to find new components of pathways. (Though in this case the "programs" it found align pretty closely with known pathways.) On the other hand, while these authors claim that this method is widely applicable, it is extremely arduous and costly, as evidenced by the contribution of 27 authors at top-flight institutions, an unusually large number in this field. So, for diseases and GWAS data sets that are highly significant, with plenty of funding, this may be a viable method of deeper analysis. Otherwise, it is beyond the means of a regular lab.

  • A backgrounder on sedition, treason, and insurrection.
  • And why it matters.
  • Jan 6 was an attempted putsch.
  • Trumpies for Putin.
  • Solar is a no-brainer.
  • NDAs are blatantly illegal and immoral. One would think we would value truth over lies.

Saturday, September 28, 2024

Dangerous Memories

Some memory formation involves extracellular structures, DNA damage, and immune component activation / inflammation.

The physical nature of memories in the brain is under intensive scrutiny. The leading general theory is that of positive reinforcement, where neurons that are co-activated strengthen their connections, enhancing their ability to co-fire and thus to express the same pattern again in the future. The nature of these connections has been somewhat nebulous, assumed to just be the size and stability of their synaptic touch-points. But it turns out that there is a great deal more going on.

A recent paper started with a fishing expedition, looking at changes in gene expression in mouse neurons at various time points after the animals were subjected to a fear learning regimen. They took this out to much longer time points (up to a month) than had been contemplated previously. At short times, a bunch of well-known signaling and growth-oriented genes were induced. At the longest time points, organization of a structure called the perineuronal net (PNN) was read out of the gene expression signals. This is an extracellular matrix sheath that appears to stabilize neuronal connections and play a role in long-term memory and learning.

But the real shocker came at the intermediate time point of about four days. Here, there was overexpression of TLR9, which is an immune system detector of broken / bacterial DNA, and inducer in turn of inflammatory responses. This led the authors down a long rabbit hole of investigating what kind of DNA fragmentation is activating this signal, how common this is, how influential it is for learning, and what the downstream pathways are. Apparently, neuronal excitation, particularly over-excitation that might be experienced under intense fear conditions, isn't just stressful in a semiotic sense, but is highly stressful to the participating neurons. There are signs of mitochondrial over-activity and oxidative stress, which lead to DNA breakage in the nucleus, and even nuclear perforation. It is a shocking situation for cells that need to survive for the lifetime of the animal. Granted, these are not germ cells that prioritize genomic stability above all else, but getting your DNA broken just for the purpose of signaling a stress response that feeds into memory formation? That is weird.

Some neuronal cell bodies after fear learning. The red dye is against a marker of DNA repair proteins, which form tight dots around broken DNA. The blue is a general DNA stain, and the green is against a component of the nuclear envelope, showing here that nuclear envelopes have broken in many of these cells.

The researchers found classic signs of DNA breakage, such as concentrated double-strand break repair complexes, and this broken DNA is what turns on the TLR9 protein. All this stress also turned on proteases called caspases, though not the cell suicide program that these caspases typically initiate. Many of the DNA break and repair complexes were, thanks to nuclear perforation, located diffusely at the centrosome, not in the nucleus. TLR9 turns on an inflammatory response via NFKB / RELA. This is clearly a huge event for these cells: not sending them into suicide, but setting off all the alarms short of that.

The interesting part was when the researchers asked whether, by deleting the TLR9 or related genes in the pathway, they could affect learning. Yes, indeed- the fear memory was dependent on the expression of this gene in neurons, and on this cell stress pathway, which appears to be the precondition of setting up the perineural net structures and overall stabilization. Additionally, the DNA damage still happened, but was not properly recognized and repaired in the absence of TLR9, creating an even more dangerous situation for the affected neurons- of genomic instability amidst unrepaired DNA.

When TLR9 is knocked out, DNA repair is cancelled. At bottom are wild-type cells, and at top are mouse neurons after fear learning that have had the TLR9 gene deleted. The red dye is against DNA repair proteins, as is the blue dye in the right-most frames. The top row is devoid of these repair activities.

This paper and its antecedent literature are making the case that memory formation (at least under these somewhat traumatic conditions- whether this is true for all kinds of memory formation remains to be seen) has commandeered ancient, diverse, and quite dangerous forms of cell stress response. It is no picnic in the park with madeleines. It is an all-hands-on-deck disaster scene that puts the cell on a permanently altered trajectory, and carries a variety of long-term risks, such as cancer formation from all the DNA breakage and end-joining repair, which is not very accurate. They mention in passing that some drugs have recently been developed against TLR9, which are being used to dampen inflammatory activities in the brain. But this new work indicates that such drugs are likely double-edged swords that could impair both learning and the long-term health of treated neurons and brains.

Saturday, August 24, 2024

Aging and Death

Our fate was sealed a very long time ago.

Why do we die? It seems like a cruel and wasteful way to run a biosphere, not to mention a human life. After we have accumulated a lifetime of experience and knowledge, we age, decline, and sign off, whether to go to our just reward, or into oblivion. What is the biological rationale and defense for all this, which the biblical writers assigned to the fairy tale of the snake and the apple?

A recent paper ("A unified framework for evolutionary genetic and physiological theories of aging") discusses evolutionary theories of aging, but in typical French fashion, is both turgid and uninteresting. Aging is widely recognized as the consequence of natural selection, or more precisely, the lack thereof after organisms have finished reproducing. Thus we are at our prime in early adulthood, when we seek mates and raise young. Evolutionarily, it is all downhill from there. In professional sports, athletes are generally over the hill at 30, retiring around 35. Natural selection is increasingly irrelevant after we have done the essential tasks of life- surviving to mate and reproduce. We may participate in our communities, and do useful things, but from an evolutionary perspective, genetic problems at this phase of life have much less impact on reproductive success than those that hit earlier. 

All this is embodied in the "disposable soma" theory of aging, which is that our germ cells are the protected jewels of reproduction, while the rest of our bodies are, well, disposable, and thus experience all the indignities of age once their job of passing on the germ cells is done. The current authors try to push another "developmental" theory of aging, which posits that the tradeoffs between youth and age are not so much the resources or selective constraints focused on germ cell propagation vs the soma, but that developmental pathways are, by selection, optimized for the reproductive phase of life, and thus may be out of tune for later phases. Some pathways are over-functional, some under-functional for the aged body, and that imbalance is sadly uncorrected by evolution. Maybe I am not doing justice to these ideas, which may feed into therapeutic options against aging, but I find this distinction uncompelling, and won't discuss it further.

A series of unimpressive distinctions in the academic field studying aging from an evolutionary perspective.

Where did the soma arise? Single cell organisms are naturally unitary- the same cell that survives also mates and is the germ cell for the next generation. There are signs of aging in single cell organisms as well, however. In yeast, "mother" cells have a limited lifespan and ability to put out daughter buds. Even bacteria have "new" and "old" poles, the latter of which accumulate inclusion bodies of proteinaceous junk, which apparently doom the older cell to senescence and death. So all cells are faced with processes that fail over time, and the only sure bet is to start as a "fresh" cell, in some sense. Plants have taken a distinct path from animals, by having bodies and death, yes, but being able to generate germ cells from mature tissues instead of segregating them very early in development into stable and distinct gonads.

Multicellularity began innocently enough. Take slime molds, for example. They live as independent amoebae most of the time, but come together to put out spores, when they have used up the local food. They form a small slug-like body, which then grows a spore-bearing head. Some cells form the spores and get to reproduce, but most don't, being part of the body. The same thing happens with mushrooms, which leave a decaying mushroom body behind after releasing their spores. 

We don't shed a lot of tears for the mushrooms of the world, which represent the death-throes of their once-youthful mycelia. But that was the pattern set at the beginning- that bodies are cells differentiated from the germ cells, that provide some useful, competitive function, at the cost of being terminal, and not reproducing. Bodies are forms of both lost energy and material, and lost reproductive potential from all those extra cells. Who could have imagined that they would become so ornate as to totally overwhelm, in mass and complexity, the germ cells that are the point of the whole exercise? Who could have imagined that they would gain feelings, purposes, and memories, and rage against the fate that evolution had in store for them?

On a more mechanistic level, aging appears to arise from many defects. One is the accumulation of mutations, which in somatic cells leads to defective proteins being made and defective regulation of cell processes. An extreme form is cancer, as is progeria. Bad proteins and other junk, like odd chemicals and chemically modified cell components, can accumulate, which is another cause of aging. Cataracts are one example, where the proteins in our lenses wear out from UV exposure. We have quite intricate trash disposal processes, but they can't keep up with everything, as we have learned from the advent of modern chemistry and its many toxins. Another cause is more programmatic: senescent cells, which are aged-out and have the virtue that they are blocked from dividing, but the defect that they put out harmful signals to the immune system that promote inflammation, another general cause of aging.

Aging research has not found a single magic bullet, which makes sense from the evolutionary theory behind it. A few things may be fixable, but mostly the breakdowns were never meant to be remedied or fixed, nor can they be. In fact, our germ cells are not completely immune from aging either, as we learn from older fathers whose children have higher rates of autism. We as somatic bodies are as disposable as any form of packaging, getting those germ cells through a complicated, competitive world, and on to their destination.


Sunday, August 11, 2024

Modeling Cell Division

Is molecular biology ready to use modeling to inform experimental work?

The cell cycle is a holy grail of biology. The first mutants that dissected some of its regulatory apparatus, the CDC mutants of Saccharomyces cerevisiae (yeast), electrified the field and led to a Nobel prize. These were temperature-sensitive mutants, carrying small changes in protein sequence that rendered the affected protein inactive at high temperature (thus inducing a cell cycle arrest phenotype), while allowing wild-type growth at normal temperatures. In the fifty years since, a great deal of the circuitry has been worked out, with the result that it is now possible, as a recent paper describes, to make a detailed mathematical model of the process that claims to be useful in the sense of explaining existing findings in a unified model and making predictions of places to look for additional actors.

At the center of this regulatory scheme are transcription activators, SBF/MBF, that are partly controlled by, and in turn control the synthesis of, a series of cyclins. Cyclins are proteins that were observed (another Nobel prize) to have striking variations in abundance during the cell cycle. There are characteristic cyclins for each phase of the cell cycle, which goes from G1, a resting phase, to S, which is DNA replication, to G2, a second resting phase, and then M, which is mitosis, which brings us back to G1. Cyclins work by binding to a central protein kinase, Cdc28, which, as regulated by each distinct cyclin, phosphorylates and thus regulates distinct sets of target proteins. The key decision a cell has to make is whether to commit to DNA replication, i.e. S phase. No cell wants to run out of energy during this process, so its size and metabolic state need to be carefully monitored. That is done by Cyclin 3 (Cln3), Whi5, and Bck2, which each influence whether the SBF/MBF regulators are active.

Some highly simplified elements of the yeast cell cycle. Cyclins (Cln and Clb) are regulators of a central protein kinase, Cdc28, that direct it to regulate appropriate targets at each stage of the cell cycle. Cyclins themselves are regulated by transcriptional control (here, the activators SBF and MBF), and then destroyed at appropriate times by proteolysis, rendering them abundant only at specific times during the cell cycle. Focusing on the "START" transition that takes the cell from rest (G1 phase) to new bud formation and DNA replication (S phase), Cln3 and Bck2 respond to upstream nutritional and size cues, and each activate the SBF/MBF transcription activators.

As outlined in the figure above, Cyclin 3 is the G1 cyclin, which, in complex with Cdc28, phosphorylates Whi5, turning it off. Whi5 is an inhibitor that binds to SBF/MBF, so Cyclin 3 activation turns these regulators on, and thus starts off the cell cycle under the proper conditions. Incidentally, the mammalian counterpart of Whi5, Rb (for retinoblastoma), is a notorious tumor suppressor that, when mutated, releases cells from regulatory control over cell division. SBF and MBF bind to the genes for the next series of cyclins: Cln1, Cln2, Clb5, and Clb6. The first two are further G1 cyclins that orchestrate the end of G1. They induce phosphorylation and inactivation of Sic1 and Cdc6, which are inhibitors of Clb5 and Clb6. These latter two are then the initiators of S phase and DNA replication. Meanwhile, Cln3 stays around till M phase, but is then degraded in definitive fashion by the proteases that end M phase. Starvation conditions lead to rapid degradation of Cln3 at all times, and thus to no chance of starting a new cell cycle.

Charts of the abundance of some cyclins through the cell cycle. Each one has its time to shine, after which it is ubiquitinated and sent off to the recycling center / proteasome.

Bck2 is another activator of SBF/MBF that is unrelated to the Cln3/Whi5 system, but also integrates cell size and metabolic status information. Null mutants of Cln3 (or Bck2) are viable, if altered in cell cycle, while double null mutants of Cln3 and Bck2 are dead, indicating that these regulators are each important, in a complementary way, in cell cycle control. Given that little is known about Bck2, the modelers in this paper assume various properties and hope for the best down the line, predicting that cell size (at the key transition to S phase) is more affected in the Cln3 null mutant than in the Bck2 null mutant, since in the former, excess active Whi5 soaks up most of the available SBF/MBF, requiring extra-high and active levels of Bck2 to overcome this barrier and activate the G1 cyclins and other genes.
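The flavor of such models can be conveyed with a drastically simplified sketch of the START circuit: Cln3 inactivates the inhibitor Whi5, freeing SBF to drive Cln2, which feeds back to inactivate more Whi5. Every rate constant and the Hill exponent below are invented for illustration; the real model has dozens of coupled equations fit to mutant data.

```python
# Toy ODE model of the START switch, integrated by simple Euler steps.
# All parameters are invented for illustration, not fit to any data.
def simulate(cln3, steps=20000, dt=0.01):
    whi5, cln2 = 1.0, 0.0              # start with Whi5 fully active, no Cln2
    for _ in range(steps):
        sbf = 1.0 - whi5               # SBF activity = uninhibited fraction
        # Cln3 and Cln2 (as Cdc28 complexes) phosphorylate and inactivate
        # Whi5; a phosphatase term slowly reactivates it
        d_whi5 = -(cln3 + cln2) * whi5 + 1.0 * (1.0 - whi5)
        # SBF drives CLN2 transcription (made switch-like here with a Hill
        # term); Cln2 is continuously degraded
        d_cln2 = sbf**4 / (0.3**4 + sbf**4) - 0.5 * cln2
        whi5 += dt * d_whi5
        cln2 += dt * d_cln2
    return cln2

print(f"Cln2 at low Cln3:  {simulate(0.01):.4f}")
print(f"Cln2 at high Cln3: {simulate(1.0):.2f}")
```

Below a certain Cln3 level the system idles with SBF inhibited and almost no Cln2; above it, the positive feedback commits the cell and Cln2 jumps to a high level. It is this kind of threshold behavior that the modelers tune against the null-mutant phenotypes.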

The modelers are working from the accumulated, mostly genetic, data, and in turn validate their models against the same genetic data, plus a few extra mutants they or others have made. The models are mathematical representations of how each node (i.e., protein or gene) in the system responds to the others, but since there are a multitude of unknowns (such as what really regulates Bck2 from upstream, to cite just one example), the system is not really able to make predictions, but rather fine-tunes/reconciles what knowledge there is, and, at best, points to gaps in knowledge. It is a bit like AI, which magically recombines and regurgitates material from a vast corpus based on piece-wise cues, but is not going to find new data, other than through its notorious hallucinations.

For example, a new paper came out after this modeling, which finds that Cln3 affects Cln2 abundance by mechanisms quite apart from its SBF/MBF transcriptional control, and that it regulates cell size in large part at M phase, not through its G1/S gating. All this comes from new experimental work, unanticipated by the modeling. So, in the end, experimental work always trumps modeling, which is a bit different than how things are in, say, physics, where sometimes the modeling can be so strong that it predicts new particles, forces, and other phenomena, to be validated later experimentally. Biology may have its master predictive model in the theory of evolution, but genetics and molecular biology remain much more of an empirical slog through the resulting glorious mess.


  • Bitcoin isn't a currency, but rather just another asset class, one without any fundamental or socially positive value. A little like gold, actually, except without gold's resilience against social / technological disruption.
  • The disastrous post-Soviet economic transition, on our advice.
  • The enormous labor drain, and resource drain, from global South to North.

Saturday, June 15, 2024

The Quest for the Perfect Message, in E. coli

Translation efficiency has some weird rules, and a tortured history.

One would think we know everything there is to know about the workhorse of bacterial molecular biology, Escherichia coli. And that would be especially true for its technological applications, like the expression of engineered genes, which is at the very heart of molecular biology and much of biotechnology. Getting genes you put into E. coli expressed at high levels is critical for making drugs, and for making enough protein for structural and biochemical studies. For decades, the wisdom of the field was to design introduced genes using the codon adaptation index (CAI). This is a measure of which three-letter codes (codons of the genetic code) are used in highly expressed genes. They tend to correspond to tRNAs that are more abundant in the cell. So, for example, the amino acid leucine is encoded by six different codons, any of which can be chosen at each leucine position in the intended protein. In E. coli, however, CTG is used over ten times more frequently than CTA. Thus, even though they code for the same amino acid, one is more common, perhaps because its cognate tRNA is more common and more easily used during translation. This is basically a diffusion-based argument, that translation will be easier if the tRNA that carries the next amino acid is easier to find.
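For concreteness, the CAI is just the geometric mean, over a gene's codons, of each codon's frequency relative to its most-used synonym. Here is a minimal sketch; the leucine numbers below are illustrative placeholders, not measured E. coli frequencies, beyond CTG being the most common.

```python
from math import prod

# Hypothetical relative usage values for the six leucine codons in highly
# expressed genes -- placeholder numbers for illustration only.
leucine_usage = {"CTG": 50.0, "CTT": 10.0, "CTC": 10.0,
                 "CTA": 4.0, "TTA": 11.0, "TTG": 12.0}

def relative_adaptiveness(usage):
    """w_i = frequency of codon i / frequency of its most-used synonym."""
    best = max(usage.values())
    return {codon: f / best for codon, f in usage.items()}

def cai(codons, weights):
    """CAI = geometric mean of the w values over all codons in the gene."""
    ws = [weights[c] for c in codons]
    return prod(ws) ** (1.0 / len(ws))

w = relative_adaptiveness(leucine_usage)
print(w["CTG"])                      # 1.0 by definition (the top synonym)
print(cai(["CTG", "CTG", "CTA"], w)) # penalized by the one rare codon
```

A gene built entirely from top-ranked codons scores 1.0; rare codons drag the score down geometrically, which is why the index was thought to predict translation efficiency.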

A recent paper provides a remarkable review of this field. For one thing, it turns out that use of the CAI has virtually no effect on translation efficiency. Whether using rare or common codons, translation is equally efficient for introduced genes. Needless to say, this is quite surprising. It seems as though the role of common vs uncommon tRNAs/codons is more to manage the health of the cell by relieving bottlenecks to translation in a global sense and managing the free pool of ribosomes, rather than regulating the efficiency of translation of any particular mRNA message. tRNAs are highly abundant generally, so there are significant savings possible by managing their levels judiciously, and reducing investment in some versus others.

So what does affect the efficiency of translation? Some messages are better translated than others, after all. The authors point to a completely different mechanism, which is the melting stability of the first ten codons of the mRNA message. RNA can form hairpin and other secondary structures / shapes, and this can apparently strongly affect the ability of ribosomes to find initiation sites. While eukaryotic ribosomes scan in from the 5' cap of the mRNA, bacterial ribosomes bind directly to a sequence slightly upstream of the initiating AUG codon. And this binding can be inhibited by mRNAs that are not neatly ironed out, but knotted up in hairpins and loops.

Ratio of occurrence of nucleosides in the third codon position of the first ten codons of high versus low expressing genes in E. coli. This was not run on native E. coli genes, but on a large panel of transgenes engineered from outside. The strong bias towards A at this position in high expressing genes shows a preference for initiating sequences to have weak secondary structure, allowing better ribosome access.
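As a toy illustration of that bias (my sketch, not the paper's actual scoring metric): a function tallying the fraction of A at the wobble (third) position of the first ten codons, where a high score suggests weak 5'-end secondary structure and easy ribosome access.

```python
def third_position_a_fraction(cds):
    """Fraction of A at the third (wobble) position of the first ten
    codons -- a crude proxy for weak 5'-end mRNA structure, not the
    paper's actual metric."""
    codons = [cds[i:i + 3] for i in range(0, 30, 3)]
    return sum(1 for c in codons if c.endswith("A")) / len(codons)

# An A-rich start versus a GC-rich (hairpin-prone) start:
print(third_position_a_fraction("ATGAAAGCAGAAACAGAAACAGCAAAAGAA"))  # 0.9
print(third_position_a_fraction("ATGCCGGCGGTGCTGGGCCCGGTGCTGGCG"))  # 0.0
```

The ATG start codon itself ends in G, so even a maximally A-rich opening scores 0.9 here; a real analysis would also fold the sequence to estimate hairpin stability directly.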


Use of A-rich sequences around the ribosomal initiation sites and the first ten codons, then, dramatically increases the translation efficiency (via the initiation efficiency) of introduced genes, and provides a much more robust method to control their expression. But then the authors make another observation, which is that the bacteria themselves do not seem to use this mechanism for their own genes. In a massive analysis of data from other labs, (below), there is actually a negative correlation between the quality of the initiation region (X- axis) and the abundance of the respective protein (Y- axis). Again, quite a surprising result, which the authors can only speculate about.

There is negative correlation between the initiation codon quality (X- axis), as shown above, and the native E. coli gene expression level (Y- axis). So these cells are not optimizing their translation at all in accordance with the findings above.

The picture that they paint is that highly expressed genes in E. coli benefit from consistent, smooth translation. This depends less on maximal initiation speed than on the holistic picture of translation. The CAI optimal codons (called translationally optimal in this paper, or TO) tend to be poor at initiation, but have good codon-anticodon pairing and thus low A content. So there are conflicting pressures at work, in basic chemical terms, where different codons are intrinsically good for initiation, and complementary ones for elongation. The obvious solution is to use the initiation-optimal codons for the first ten codons, and translationally optimal codons the rest of the way. But that is not what is found either. The authors claim that, for native proteins, lower levels of initiation are actually beneficial for smoother protein production with less noise from time to time and cell to cell. 

Additionally, lower initiation rates preserve free ribosome levels globally, another important goal for the cell, via evolutionary selection. The authors find, for instance, a correlation between low variability of initiation (low noise) and low initiation rate. This is a bit mystifying, since ribosomes should always be present in excess, and it is not immediately apparent why holdups to translation initiation would lend themselves to more even initiation. Perhaps the search process by which ribosomes find free mRNAs is inefficient, so that those with slower initiation sequences have a constant backlog of incoming, bound and poised ribosomes, while after they get past the initiation region, those ribosomes progress rapidly and rejoin the free pool. That would be one way of setting up a smooth production process, suitable for essential protein products, that is relatively insensitive to the free ribosome concentration and other variations in the cell.

Technologists trying to express some drug-associated protein in bacteria don't care about smoothness and noise, but just want to maximize production while not killing the cell (or before killing the cell). So all these subtle considerations that go into the evolution of the native gene complement of E. coli and its high or low expression levels don't apply. But for researchers trying to predict the expression level of a given natural gene, it is maddening, since it seems currently impossible to predict the expression level (via translation) of a gene from its sequence. It is one more case where modeling of what is going on inside cells is surprisingly difficult, even for a system we had thought we understood, in one of the simplest and most well-studied bacteria. As researchers never tire of saying ... more research is needed.


Saturday, June 8, 2024

A Membrane Transistor

Voltage sensitive domains can make switches out of ion channels, antiporters, and other enzymes.

The heart of modern electronics is the transistor. It is a valve or switch, using a small electrical signal to control the flow of other electrical signals. We have learned that the simple logic this mechanism enables can be elaborated into hugely complex, even putatively intelligent, computers, databases, applications, and other paraphernalia of modernity. The same mechanism has a very long history in biology, quite apart from its use in neurons and brains, since membranes are typically charged, well-poised to be sensitive to changes in charge for all sorts of signaling.

The voltage sensitive domain (VSD) in proteins is an ancient (going back to archaea) bundle of four alpha helices, first found attached to voltage-sensitive ion channels, including sodium, potassium, and calcium channels. But later it became fascinatingly apparent that it can control other protein activities as well. A recent paper discussed the mechanism and structure of a sodium/hydrogen antiporter with a role in sperm navigation, which uses a VSD to control its signaling. But there are also voltage-sensitive phosphatases, and other kinds of effectors hooked up to VSD domains.

Schematic of a basic VSD, with helix 4 in pink, moving against the other three helices colored teal. Imagine a membrane going horizontally over these embedded proteins. When voltage across the local membrane changes, (hyperpolarized or de-polarized), helix 4 can plunge by one helical repeat unit in either direction, up or down.

One of the helices (#4) in the VSD bundle has positive charges, while the others have specifically positioned negative charges. This creates a structure in which changes in the ambient voltage across the membrane can cause helix #4 to plunge down by one or two steps (that is, turns of the alpha helix) versus its partners. This movement can then be propagated out along extensions of helix #4 to other domains of the protein, switching their activities on or off.

The helices of numerous proteins that have a VSD domain (in red) are drawn out, showing the diversity of how this domain is used.

While the studied protein, SLC9C1, is essential in mammalian sperm for motility, the paper studied its workings in sea urchin sperm, a common model system. The logic (as illustrated below) is that (female) chemoattractants bind to receptors on the sperm surface. These receptors generate cyclic GMP, which turns on potassium channels that increase the voltage across the membrane. This broadcasts the signal locally, and it is received by the SLC9C1 transporter, which does two things. It activates a linked soluble adenylate cyclase enzyme, making the further signaling molecule cAMP. And it activates the transporter itself, pumping protons out (in 1:1 exchange for sodium ions coming in) and causing cytoplasmic alkalinization. The cAMP activates sodium ion channels that cancel the high membrane voltage (a fast process), and the alkalinization activates calcium channels that direct the sperm's directional swimming- the ultimate response. The latter is relatively slow, so the whole cascade has timing characteristics that allow the signal to be dampened while the response persists a bit longer as the sperm moves through a variable and stochastic gradient.

A schematic of the logic of this pathway, and of the SLC9C1 anti-porter. At top, the transport mechanism is crudely illustrated as a rocking motion that ensures that only one H+ is exchanged for one Na+ for each cycle of transport. The transport is driven thermodynamically by the higher concentration of Na+ outside.


But these researchers weren't interested in what the sperm were thinking, but rather in how this widely used protein domain became hitched to this unusual protein and how it works there, turning on a sodium/hydrogen antiporter rather than the usual ion channel. They estimate that the #4 helix of the VSD moves by 10 angstroms, or 1 nm, upon voltage activation- a substantial movement, roughly equivalent to the width of these helices. In their final model, this movement significantly reshapes the intracellular domain of the transporter, which in turn releases its hold on the transporter's throat, allowing it to move cyclically as it exchanges hydrogen ions for sodium ions. This protein is known to bind and activate an adenylyl cyclase, which produces cAMP, one key next actor in the signaling cascade. This activation may be physically direct, or it may act through the local change in pH- that part is as yet unknown. cAMP also, incidentally, binds to and turns up the activity of this transporter, providing a bit of positive feedback.

Model of the SLC9C1 protein, with the VSD in teal and a predicted activation mechanism illustrated (only the third panel is activated/open). Upon voltage activation, the very long helix 4 dips down and changes orientation, dramatically opening the intracellular portion of the transporter (purple and orange portion). This in turn lets go of the bottom of the actual transporter portion of the protein (gray), allowing alkalinization of the cytoplasm to go forth. At the bottom sides, in brown, is the cAMP binding domain, which lowers the voltage threshold for activation.

There are a variety of interesting lessons from this work. One is that useful protein domains like the VSD are often duplicated and propagated to unexpected places to regulate new processes. Another is that the new cryo-electron microscopy methods have made structural biology like this far easier and more common than it used to be, especially for membrane proteins, which are exceedingly difficult to crystallize. A third is that signaling systems in biology are shockingly complex. One would think that getting sperm cells to where they are going would take a bare minimum of complexity, yet we are studying a five-or-more-part cascade involving two cyclic nucleotides, four ions, intricate proteins to manage them all, and who knows what else in the mix. It is difficult to account for all this, other than to say that when you have a few billion years to tinker with things, and eons of desperate races to the egg for selective pressure, they tend to get more ornate. And a fourth is that it is regulatory switches all the way down.


Saturday, May 25, 2024

Nascent Neurons in Early Animals

Some of the most primitive animals have no nerves or neurons... how do they know what is going on?

We often think of our brains as computers, but while human-made computers are (so far) strictly electrical, our brains have a significantly different basis. The electrical component is comparatively slow, and confined to conduction along the membranes of single cells. Each of these neurons communicates with others using chemicals, mostly at specialized synapses, but also via other small compounds, neuropeptides, and hormones. That is why drugs have so many interesting effects, from anesthesia to anti-depression and hallucination. These properties suggest that the brain and its neurons began, evolutionarily speaking, as chemically excitable cells, before they became somewhat reluctant electrical conductors.

Thankfully, a few examples of early stages of animal evolution still exist. The main branches of the early divergence of animals are sponges (porifera), jellies and corals (ctenophora, cnidaria), bilaterians (us), and an extremely small family of placozoa. Neural-type functions appear to have evolved independently in each of these lineages, from origins that are clearest in what appears to be the most primitive of them, the placozoa. These are pancake-like organisms of three cell layers, hardly more complex than a single-celled paramecium. They have about six cell types in all, and glide around using cilia, engulfing edible detritus. They have no neurons, let alone synaptic connections between them, yet they have excitable cells that secrete what we would call neuropeptides, which tell nearby cells what to do. Substances like enkephalins, vasopressin, neurotensin, and the famous glucagon-like peptide are part of the menagerie of neuropeptides at work in our own brains and bodies.

A placozoan, about a millimeter wide. They are sort of a super-amoeba, attaching to and gliding over surfaces underwater and eating detritus. They are heavily ciliated, with only a few cell types divided in top, middle, and bottom cell layers. The proto-neural peptidergic cells make up ~13% of cells in this body.


The fact is that excitable cells long predate neurons. Even bacteria can sense things from outside, orient, and respond to them. As eukaryotes, placozoans inherited a complex repertoire of sense and response systems, such as G-protein coupled receptors (GPCRs) that link sensation of external chemicals with cascades of internal signaling. GPCRs are the dominant signaling platforms, along with activatable ion channels, in our nervous systems. So a natural hypothesis for the origin of nervous systems is that they began with chemical sensing and inter-cell chemical signaling systems that later gained electrical characteristics to speed things up, especially as more cells were added, body size increased, and local signaling could not keep up. Jellies, for instance, have neural nets that are quite unlike, and evolutionarily distinct from, the centralized systems of animals, yet use a similar molecular palette of signaling molecules, receptors, and excitation pathways. 

Placozoans, which date to maybe 800 million years ago, don't even have neurons, let alone neural nets or nervous systems. A recent paper labored to catalog what they do have, however, finding a number of pre-neural characteristics. For example, the peptidergic cell type, which secretes peptides that signal to neighboring cells, expresses 25 or more GPCRs- receptors for those same peptides and other environmental chemicals. The authors state that these GPCRs are not detectably related to those of other animals, so placozoans underwent their own radiation, diversifying a primordial receptor into the hundreds that exist in their genome today. The researchers even go so far as to employ the AI program AlphaFold to model which GPCRs bind to which endogenously produced peptides, in an attempt to figure out the circuitry that these organisms employ.

This peptidergic cell type also expresses other neuron-like proteins: neuropeptide processing enzymes, the transcription regulators Sox, Pax, Jun, and Fos, a neural-specific RNA polyadenylation enzyme, a suite of calcium-sensitive channels and signaling components, and many components of the presynaptic scaffold, which organizes the secretion of neuropeptides and other transmitters in neurons, and in placozoa presumably organizes the secretion of their quasi-neuropeptides. So of the six cell types, the peptidergic cell appears to be specialized for signaling, is present in low abundance, and expresses a set of proteins that in other lineages became far more elaborated into the neural system. Peptidergic cells do not make synapses or extended cell processes, for example. What they offer this millimeter-sized organism is a primitive signaling and response capacity: in response to environmental cues, they distribute neuropeptides to nearby effector cells, prompting changes in shape and movement- the gliding and eating that the peptidergic cells can't do themselves.

A schematic of neural-like proteins expressed in placozoa, characteristic of more advanced presynaptic secretory neural systems. These involve both secretion of neuropeptides (bottom left and middle), the expression of key ion channels used for cell activation (Ca++ channels), and the expression of cell-cell adhesion and signaling molecules (top right).

Why peptides? The workhorses of our brain synapses are simpler chemicals like serotonin, glutamate, and norepinephrine. Yet the chemical palette of such simple compounds is limited, and each one requires its own enzymatic machinery for synthesis. Neuropeptides, in contrast, are typically generated by cleavage of larger proteins encoded from the genome. Thus the same mechanism (translation and cleavage) can generate a virtually infinite variety of short and medium sized peptide sequences, each of which can have its own meaning, and have a GPCR or other receptor tailored to detecting it. The scope of experimentation is much greater, given normal mutation and duplication events through evolutionary time, and the synthetic pipeline much easier to manage. Our nervous systems use a wide variety of neuropeptides, as noted above, and our immune system uses an even larger palette of cytokines and chemokines, upwards of a hundred, each of which has a particular regulatory meaning.
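The combinatorial advantage is easy to put numbers on. A back-of-envelope sketch: with 20 standard amino acids, an n-residue peptide has 20^n possible sequences, so even short peptides vastly outnumber the handful of small-molecule transmitters.

```python
# Back-of-envelope: the sequence space available to peptide signals.
# Each position can be any of the 20 standard amino acids, so an
# n-residue peptide has 20**n possible sequences.
for n in (3, 5, 10):
    print(f"{n}-mer peptides: {20 ** n:,} possible sequences")
```

Even a 5-mer offers over three million distinct signals, each cleavable from a genomically encoded precursor, versus one dedicated enzymatic pathway per small-molecule transmitter.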


An evolutionary scheme describing the neural and proto-neural systems observed among primitive animals.


The placozoan relic lineages show that nervous systems arose in gradual fashion from already-complex systems of cell-cell signaling that focused on chemical rather than electrical signaling. But very quickly, with the advent of only slightly larger and more complex body plans, like those of hydra or jellies, the need for speed forced an additional mode of signaling- the propagation of electrical activity within cells (the proto-neurons), and their physical extension to capitalize on that new mode of rapid conduction. But never did nervous systems leave behind their chemical roots, as the neurons in our brains still laboriously conduct signals from one neuron to the next via the chemical synapse, secreting a packet of chemicals from one side, and receiving that signal across the gap on the other side.


  • The mechanics of bombing a population back into the stone age.
  • The Saudis and 9/11.
  • Love above all.
  • The lower courts are starting to revolt.
  • Brain worms, Fox news, and delusion.
  • Notes on the origins of MMT, as a (somewhat tedious) film about it comes out.

Saturday, May 18, 2024

Emergency- Call UCP!

Uncoupling proteins in mitochondria provide a paradoxical safety valve.

One of the great insights of biochemistry in the last century was the chemiosmotic theory, which finally described the nature of power flows in the mitochondrion. Everyone knew that energetic electrons were spun off the metabolism (burning) of food via the electron transport chain, ending up re-united with oxygen to form water (the carbon, meanwhile, leaves as the CO2 we breathe out). But how was that power transmitted to ATP? The key turned out to be a battery-like state across the mitochondrial membrane, where protons are pumped out by the electron transport chain, and then come back in while turning the motor of the ATP synthase to phosphorylate ADP into ATP. It is the (proton) concentration and charge difference (that is, the chemiosmotic gradient) across the inner mitochondrial membrane that stores and transmits this power- a clever and flexible system for energizing the mitochondrion and, indirectly, the rest of the cell.
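The "battery voltage" of this gradient, the proton-motive force, combines the charge difference and the pH difference across the inner membrane. A rough sketch, using typical textbook values rather than measurements from any particular experiment:

```python
# Proton-motive force across the inner mitochondrial membrane:
#   delta-p = delta-psi - (2.303 RT / F) * (pH_in - pH_out)
R = 8.314      # gas constant, J / (mol*K)
F = 96485.0    # Faraday constant, C / mol
T = 310.0      # body temperature, K

def proton_motive_force(delta_psi_mv, ph_in, ph_out):
    """Return delta-p in millivolts; delta_psi_mv is the membrane
    potential (matrix negative), ph_in/ph_out are matrix/intermembrane pH."""
    z = 2.303 * R * T / F * 1000.0   # ~61.5 mV per pH unit at 37 C
    return delta_psi_mv - z * (ph_in - ph_out)

# Matrix ~0.5 pH units more alkaline, membrane potential ~ -160 mV:
print(proton_motive_force(-160.0, 7.8, 7.3))   # roughly -190 mV
```

Note how the membrane potential dominates: the half-unit pH difference contributes only ~30 mV of the total driving force that the ATP synthase taps.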

Schematic view of the electron transport chain proteins, as well as the consumer of its energy, the ATP synthase. The inside of the mitochondrial matrix is at top, where core metabolism takes place to generate electrons, resulting in protons pumped out towards the bottom. Protons return through the ATP synthase (right) to power the phosphorylation (so-called oxidative phosphorylation) of ADP to ATP.

Chemiosmotic theory taught us that mitochondria are always charged up, keeping a balance of metabolism and ATP production going, all dependent on the tightness of the inner mitochondrial membrane, which was the "plate" that keeps the protons and other ions sealed apart. But over the years, leaks kept cropping up. In the human genome, there are at least six uncoupling proteins, or UCPs, which let protons through this membrane, on purpose. What is the deal with that?

One use of these proteins is easy enough to understand- the generation of heat in brown fat. Brown fat is brown because it has a lot of mitochondria, which are brown because of the many iron- and other metal-hosting enzymes that operate at the core of metabolism. UCP1 is present in brown fat to generate heat by letting the engine run free, as it were. It is as simple as that. But most of the time, inefficiency is not really the point, and the other UCP proteins have very different roles. On the whole, however, it is estimated that proton leaks from all sources eat up about a fourth of our metabolic energy, and thus evidently play a role in making us warm-blooded, even apart from specialized brown fat.

A more general schematic that adds UCP proteins to the view above. Leaks also happen through other channels, such as the membrane itself, and also the ANT protein, at low and non-regulated rates.

One big problem of mitochondria is that they are doing some quite dangerous chemistry. The electrons liberated from metabolism of food have a lot of energy, and the electron transport chain is really more like a high voltage power station. The proteins in this chain are all structured to squeeze all the power they can out of the electrons and into the proton gradient. But that runs the risk of squeezing too hard. If there is a holdup anywhere, things can back up and electrons leak out. If that happens, they are likely to combine with oxygen in an uncontrolled way that generates compounds like peroxide, superoxide, and hydroxy radicals. These are highly reactive (customarily termed ROS, for reactive oxygen species) and can do a great deal of damage in the cell. ROS is used in some signaling systems, such as the pathway by which glucose stimulates insulin secretion in the pancreas, but generally, ROS is very bad for the cell and rises exponentially with the severity of blockages in the electron transport chain. Many theories relating to aging and how to address it revolve around the ongoing damage from ROS.

Thus the more important role for the other UCP proteins is to function as a safety valve for overall power flow through mitochondrial metabolism- a metaphorical steam valve. UCP proteins are known to be inducible by ROS, and when activated, allow protons to run back into the matrix, which relieves the pressure upstream on all the electron transport chain proteins, which are furiously pumping out protons in response to the overall metabolic rate of fat/sugar usage. While metabolism is regulated at innumerable points, it is evident that, on a moment-to-moment basis, an extra level of regulation, i.e. relief, is needed at this UCP level to keep the system humming with minimal chemical damage to the rest of the cell.


Sunday, May 5, 2024

Neutrophils Ask: How Did I Get Here?

With apologies to the Talking Heads... how the amoeboid cells of our immune system travel around in response to outside cues like cytokines.

Amoeboid cells seem so alive and even conscious. They seek out prey, engulf, and kill it. How is that done, and what are they thinking? Molecular biologists naturally come at this from a molecular perspective, asking what the signals are, how are they received, what pathways relay them to the cytoskeleton, and so forth. No soul is assumed, and none has been found, despite the great complexity of these cells and their activities.

The story starts with receptors at the surface, which can sense many of the cytokines of the immune system, of which there are roughly a hundred. These have many roles, including pro-inflammatory and anti-inflammatory effects. Neutrophils, which are the subject of today's paper, also have receptors that directly sense pathogens, like bacterial cell coats, viral double stranded RNA, and also broken cells, like DNA out in the environment where it shouldn't be. One question is how these cells sense shallow gradients- they can orient properly with as little as two percent difference in concentration between back and front. This is thought to involve pretty strong feedback systems that accentuate the stronger signal and then keep strengthening it in concert with the cytoskeleton that the receptors ultimately organize and orient. But that then leads to the next question of what turns this feedback process off, preventing locking on one target, so that neutrophils can turn on a dime and pursue a new target, if needed?

The molecular basics of cell orientation in eukaryotes have taken a long time to establish. The cell surface receptors typically activate G proteins, specifically the beta/gamma subunit, which can activate an enzyme called PI3 kinase (PI3K). This enzyme adds a phosphate group to a membrane inositol lipid (PIP2), generating phosphatidylinositol trisphosphate, or PIP3. This lipid is a sort of beacon, which attracts a variety of other proteins to the membrane, among them DOCK2 and other members of its family of guanine nucleotide exchange factors, which in turn activate RAC by encouraging it to release GDP and bind GTP. RAC, a key node here, is active when bound to GTP. RAC then activates other proteins like WAVE and PAK1, which go on to activate the ARP2/3 complex, which is, finally, the machine that nucleates extension and branching of actin filaments- the actual power behind cell protrusions and movement.

A sketch of the signaling cascade from outside the cell to cytoskeletal re-orientation. R stands for receptor. One form of feedback is shown, which is positive reinforcement from locally active Rac and actin, back to PI3K. This helps the local front stay coherent in pursuit of prey or gradients of signals.

It has also been found that both RAC and actin have some kind of local positive feedback effect in neutrophils, allowing migrating cells to establish stable fronts that respond to gradients of stimulating molecules. At the same time, there is a global negative regulation system, mostly due to tension in the cell membrane generated by actin, which encourages retraction of cellular fronts that are not experiencing stimulating signals. All this obviously contributes to the ability of cells to go one way, and have their back ends follow.

The current paper asked in finer-grained detail how the front mechanism works: how does it avoid locking up from positive feedback, and how does it allow other areas of the cell to take over if they see stimulation on their sides? The researchers set up a remarkable system of light-activated PI3 kinase, in which they could shine blue light on one side of the engineered cells and watch them move in that direction, driven by the excess PI3K activity. This system derives from an obscure bacterial protein that rearranges a flavin cofactor under blue light, in a way that can allow binding surfaces to be hidden or revealed.

In the key experiment, they shined light on one side of their cells, then turned it off for a bit, and then shined light on the entire cell. This tests whether there is a residual effect from the prior stimulation. Would the cells be entrained to keep going where they were going before? Or would they not care, or try something new? The answer, clearly and reproducibly, was that they struck off in a new direction. This shows that there is a habituation or inhibition mechanism at work, over some slow time period, which acts in recently activated regions.


 The source for this video is the main paper behind this post. The dashed circle indicates where the researchers shined their blue light which induces local PI3K activity. Note how at first, they are leading the cell by just the front. When this cell gets to the midline, they switch to illuminating the whole cell, to ask whether there is residual activation or inhibition from the earlier illumination. The observation that the cell then veers off opposite to the original stimulation indicates that inhibition is the residual effect from the former activation.

 

Such habituation is a critical piece of gradient-following behavior. The front gets used to what it just saw, and if the next unit of stimulation is the same intensity, it doesn't care that much (though it will probably keep going). If the next unit of stimulation is increased, it will keep going. But if it is decreased, then the inhibition kicks in and the front slows down, allowing other areas of the cell to expand if they are seeing increasing gradients. Thus temporal and spatial gradients can both be negotiated, using a finely tuned mix of positive and negative feedbacks.
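This logic can be caricatured in a few lines. Below is a toy adaptation model (my sketch, not the paper's model): the front's response is the stimulus minus a slowly accumulating local inhibition, so a constant stimulus habituates while a step increase transiently re-excites the front.

```python
# Toy habituation model: response = stimulus - slowly tracking inhibition.
# A constant input habituates; a step increase re-excites the front.
def simulate(stimulus, k_inhib=0.1):
    inhibition, responses = 0.0, []
    for s in stimulus:
        response = max(0.0, s - inhibition)
        inhibition += k_inhib * (s - inhibition)  # inhibition slowly catches up
        responses.append(response)
    return responses

r = simulate([1.0] * 30 + [2.0] * 30)
print(r[0], r[29], r[30])  # strong at first, habituated, re-excited by the step
```

With constant input the response decays toward zero as inhibition catches up; doubling the input at step 30 produces a fresh strong response, just as the illuminated cell front responds to increases but stalls on a flat or falling gradient.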


Saturday, April 6, 2024

Mopping up Around the Cell

What happens when proteins can't find their partners?

Cells have a lot of garbage disposal issues. There are lysosomes to digest large things like viruses, proteasomes to dispose of individual proteins, and lots of surveillance mechanisms to check that things are going as they should- that proteins coming off the ribosome are complete, that mRNAs are being spliced, that mitochondria are charged up as they should be, that the endoplasmic reticulum is making, folding, and secreting proteins as it should be, among many others. One basic problem that arises when cells have a lot of proteins that assemble and cooperate in the form of complexes is that some of those subunits may be present in excess, or fail to join their intended complexes for other reasons, such as misfolding. This can have very bad effects. Most protein binding makes use of hydrophobic surfaces, and having these floating around freely can lead to indiscriminate binding / agglomeration, like amyloid plaque formation, and cell death.

Bacteria have one partial solution, which is to encode proteins that are destined for the same complex on the same mRNA, made from what is called an "operon" of genes, like a train with successive gene-carriages. The subunits encoded by such an operon are thus necessarily equally abundant at the mRNA level, and, assuming similar ribosomal rates of protein synthesis, the proteins should also be produced in equal quantities, providing at least one method to balance their abundance in the cell. But there are many other issues- proteins may have different life-spans, or different ribosomal production rates, or assembly into the complex may be slow and difficult- so bacteria still are not out of the woods. Eukaryotes do not use operons anyhow, so our more finely regulated gene control mechanisms are called on to properly equalize (or adjust) the ultimate subunit concentrations. 
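The bookkeeping behind the operon trick is simple enough to sketch in a few lines (my own illustration, not from any source cited here): since every transcript carries all the genes, translation yields balanced subunit counts for free, whereas separately transcribed genes are only balanced if regulation tunes their rates.

```python
from collections import Counter

def translate(transcripts):
    """Count protein molecules made, assuming one protein per gene per transcript."""
    made = Counter()
    for genes in transcripts:
        for gene in genes:
            made[gene] += 1
    return made

# Operon: one mRNA species encoding subunits A, B, and C together.
operon = [("A", "B", "C")] * 50
balanced = translate(operon)        # A, B, C all at 50 -- balanced for free

# Separate genes with independent transcription rates:
separate = [("A",)] * 80 + [("B",)] * 50 + [("C",)] * 20
skewed = translate(separate)        # imbalanced unless regulation equalizes
```

This ignores the real-life complications the paragraph mentions- differing translation rates, life-spans, and assembly kinetics- which is exactly why operons are only a partial solution.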

But when all this fails, and there is more of some complex subunit than needed, what happens then? When experimenters over-produce some complex component in cells, it is typically short-lived. And if they impair its production, the rest of the complex tends to be short-lived. This implies mechanisms in the cell to dispose of incomplete complexes and their components. It turns out that there are some specific chaperone proteins that detect such orphan subunits, and tag them to be destroyed. Several prominent complexes, such as ribosomes and proteasomes, even have specifically dedicated mop-up chaperones. A recent paper described a chaperone protein dedicated to mopping up the excess or misfolded subunits of another large and abundant complex - the chaperonin complex. That makes this protein, ZNRD2, a sort of metachaperone.

Some structural (though not dynamic) views of the CCT complex: top and side views (a), the layout of the subunits around the equator of the complex (c), the ATP binding sites at the ring-ring interfaces (d), and a cut-away view of the interior (e), where substrate proteins are enclosed and encouraged to fold correctly.

The chaperonin complex (also called CCT) is a large, hollow, barrel-shaped assembly of two stacked rings that actively helps other proteins to fold correctly. The structural proteins actin and tubulin are its most prominent targets. When first synthesized, they are bound by adapters that ferry them to the chaperonin complex, which lifts its lid to allow the protein in. Then ATP is used to drive dramatic cycling of the chaperonin structure, shifting its internal surface from hydrophobic to more hydrophilic. This allows the unfolded protein to alternately splay open over the hydrophobic surface and then fold in piece-wise fashion, for as long as it takes, until the substrate is fully folded and no longer sticks to the internal surfaces.

In the current work, the researchers drove the expression of several individual CCT subunits in cell lysates, then sent the products into a mass spectrometer to find out what was sticking to these "orphan" proteins. They found two major associated proteins: HERC2 and ZNRD2. HERC2 is a known ubiquitin ligase, one of a large family of enzymes that tag proteins with ubiquitin, targeting them for disposal. But ZNRD2 was totally uncharacterized, known only as an auto-antigen recognized by antibodies from some people with Sjogren's syndrome or scleroderma. The question then was: does HERC2 directly sense the presence of free-floating CCT subunits, or does it need a helper to do so, such as perhaps ZNRD2?

"... a sizable population of multiple CCT subunits are orphaned even under normal conditions, and the degradation of a subset of these can be stimulated by HERC2."

The researchers showed that deleting HERC2 strongly impaired the cleanup of most orphan CCT subunits. It is evident, however, that there are other chaperones, not covered in this work, that help clean up some of the remaining CCT subunits. They then found that HERC2's interaction with the CCT proteins was dependent on ZNRD2, but not the reverse- ZNRD2 binds CCT subunits regardless. This, along with other experiments, including mapping the location within the HERC2 protein that binds ZNRD2, showed that ZNRD2 is the adapter that does the detailed detection of orphaned CCT subunits. At only 199 amino acids, there is not much to it, and searches for domain signatures do not yield much. Its name reflects a structure that uses zinc ions for stabilization, but much of the protein is also disordered. It is notable for a high proportion of hydrophobic amino acids (alanine, leucine) and lots of prolines (15), which would contribute to a disordered structure. 

Thankfully, with the advent of AI and AlphaFold, these researchers could also investigate and model how ZNRD2 interacts with both the HERC2 ubiquitin ligase and one of the CCT subunits, CCT4- all without doing any laborious experimental structure determinations.


AI-calculated structures of the complex of the ubiquitin ligase HERC2 with the adaptor ZNRD2 and the target subunit CCT4. At right, the hydrophobic residues of CCT4 are colored yellow, showing that ZNRD2, the orphan-subunit detector and adaptor, binds to a hydrophobic pocket that would otherwise be completely buried within the full CCT structure. The interacting domain of HERC2, in green, is termed a 7-bladed beta propeller.

"In the fully assembled CCT double ring, all potential ZNRD2 interaction sites are completely buried because they form the interface between the two individual rings."

 

They found that ZNRD2 binds to a hydrophobic pocket of CCT4, a pocket that is otherwise buried in the fully assembled CCT. This patch would also be exposed on partially assembled CCT complexes, indicating that this interaction is not only relevant for mopping up the individual subunit, but for several kinds of incomplete assembly of the entire complex, perhaps explaining why other subunits are also mopped up by this system. 

This kind of work is a good example of normal science. A gene about which nothing was previously known (ZNRD2) is now given a function in the cell, and a process circumstantially known to exist is fleshed out with actors and structures that explain it. Of the ~20,000 human protein-coding genes, roughly ten percent still have no annotation, and many more have only tenuous annotation, perhaps drawn only from structural analogy rather than direct study. So there is a great deal more work needed to evaluate our parts list at even the most basic level, before getting into the complexities of how these proteins act and interact in tissues and pathways. 

