Saturday, March 1, 2025

The Train Tracks of Synapsis

Structures that align and tether the chromosomes in meiosis are now understood in some molecular detail.

It has been one of the wonders of biology- the synaptonemal complex that aligns homologous chromosomes during meiosis. While chromosomes regularly line up in the middle of the cell during mitosis, so that they can be evenly divided between the daughter cells, in this process they only have to join at their centromeres, which get dragged to the midline of the cell, and then pulled back apart at cell division. In meiosis, on the other hand, not only do the sister chromosomes that have just replicated stick together at their centromeres, but the homologous chromosomes, which have never bothered about each other since sperm fused with egg, suddenly seek each other out and pair up in an elaborate dance of DNA breakage, alignment, cross-over, and repair. Then in the first division, these cross-over-joined homologs line up at the midline and get pulled apart as their crossovers are repaired. The second division follows, much more like mitosis, where the duplicated sister chromosomes line up at the midline based on their centromere attachments, and then separate into haploid gametes.

Comparison of mitosis vs meiosis, which goes through an extra division and alternate chromosome pairing and separation processes in the first division.

The two divisions are fundamentally different, with the first involving novel chromosome pairings and attachments. The opening act of all this, which I won't go into further, is a sprinkling of ~400 DNA strand breaks induced specifically all over the genome, which sets up a repair process at each site, where the chromosomes (using Rad51) seek out good copies of the damaged DNA- that is, another, matching, DNA molecule. There are specific processes that appear to prevent use of the recently replicated "sister", which would be the most closely identical copy that could be used. Instead, there is a bias to use the "homologous" copy from the other parent. But these homologous chromosomes have just been replicated as well. How to line all this up so that the chromosomes all line up neatly and separate neatly during the first meiotic division? The answer is the synaptonemal complex.

Schematic of the synaptonemal complex joining two homologous chromosomes. The lateral elements are on each side, and the central element lines up the center. Crossing the gap are the transverse elements, now known to be composed of the SYCP1 protein. At bottom is a diagram from its atomic structure of how SYCP1 coils together, and how its ends join to zip up the synaptonemal gap.

This is a train track of connecting proteins between the homologous chromosomes. It is evident that the DNA breaks come first, followed by the search for matching homologs, followed by the radiating and progressive assembly of the synaptonemal complex out from the break repair sites. The components of its major structures have been mostly characterized- the lateral element where the DNA loops line up; the transverse element that spans the gap between the homologous chromosomes, and the central element, proteins at the midline that help the transverse elements assemble. A paper from 2023 characterized the transverse element protein, SYCP1, which is a long coil of a protein that dimerizes to make a strong coil, and then dimerizes again head-to-head to create the symmetric bridge over the whole width of the synaptonemal complex, which is about 100 nanometers.

These authors then focus on a series of experiments using key mutations at the dimer-dimer head-to-head interaction area, to demonstrate how this head-to-head zippering works in detail. Mutating just two amino acids in this contact region eliminates the head-to-head interaction, making synapsis impossible. In these cases, the homologous chromosomes (from mice) remain in proximity, especially at crossover sites, but are no longer zippered up and closely aligned.

Spreads of mouse meiotic chromosomes, labeled as shown with antibodies against two synaptonemal proteins. From the top, wild-type SYCP1, then single individual mutations in the end-joining region, and at bottom SYCP1 with two point mutations that eliminate its function entirely. The chromosomes at the bottom are aligned only by virtue of their crossover points, but not by a zippered up synaptonemal complex. Needless to say, mice like this are not fertile.


Thus what was once a hazy mystery in the highest power microscopes has been defined in molecular terms, highlighting once again the power of curiosity, and the essentially moral aim of truth-seeking- to reveal what is true, rather than dictate it. But who cares about all that? Truth, knowledge, science... these values are now not only in question, but under active attack. Who is making America great, and who is diminishing it? Those in our institutions of power who have a voice will hopefully see the consequences and act on them, before our history and values are entirely corrupted.


  • Sociopaths at work.
  • Evidently the model is that we become a version of China/Russia, and make a tripolar world. Not a little Orwellian. And who knows, perhaps we will offer Russia a deal to partition Canada. That is, after we get done partitioning Ukraine.
  • A black day.
  • Oh, wait, the next day was even worse.
  • Shades of Stalin, with a sad sartorial hat-tip to Steve Jobs.
  • Unlawful and vindictive destruction at the NIH, and of biological research in general.
  • And all for love.

Saturday, February 22, 2025

Impeachment is Inevitable

Whether congress wants to or not, it will be forced to defend its role in government.

Looking out over the incredible destruction the new president has already wrought at home and abroad, it is hard to see this continuing for a full four-year term. There is a honeymoon now, and a shock campaign. There is delirium in hard-right circles that their fondest dreams of rampant chaos in the bureaucracy, with racism and fascism ascendant, are coming true. But there will come a time when the costs begin to appear, the appetite for dysfunction will wane, and the tide turns. Congress has small Republican margins, and it won't take many members to face up to our rapidly expanding constitutional crisis.

Maybe I am spinning a fantasy here, but one thing seems certain. The current president is constitutionally (pardon the expression) unable to follow directions. His oath of office was barely out of his mouth before he started violating the constitution and running roughshod over the explicit authorizations and appropriations of Congress. Not to mention direct assertions that the constitution doesn't mean what it plainly says, about birthright citizenship. This is not going to stop, and the only way our system of government is going to survive is that the other branches, specifically congress, use their powerful tools to reset the balance.

Article 2

Harder to judge are the attitudes of the congresspeople who are on the spot. The Republicans have largely rolled over in approving the first, abysmal slate of cabinet nominees. Again, there is a honeymoon of sorts. Party discipline is particularly strong on the conservative side, and the president has eagerly used his tools of intimidation and hatred to obtain obedience. So it is hard to say when they will crack. But as the functions of government degrade, the country is laughed at and reviled around the world, the economic damage accumulates, and constituents line up to complain, the equation will change. And anyhow, they would merely be elevating the vice president, who is hardly an opponent of their ideological aims, and is part of the Senate community (however disliked on both sides). So impeachment becomes a much less imposing action than it might otherwise be. 

As they say, the third time's the charm!


  • Presidents day.
  • Oh the irony. Science comes up with a vaccine that saves millions, who turn into idiots.

Saturday, February 15, 2025

Cloudy, With a Chance of RNA

Long RNAs play structural and functional roles in regulation of chromosome replication and expression.

One of the wonderful properties of the fruit fly as a model system of genetics and molecular biology has been its polytene chromosomes. These are hugely expanded bundles of chromosomes, replicated thousands of times, which have been observed microscopically since the late 1800's. They exist in the larval salivary gland, where huge amounts of gene expression are needed, thus the curious evolutionary solution of expanding the number of templates, not only of the gene needed, but of the entire genome. 

These chromosomes were closely mapped and investigated, almost like runic keys to the biology of the fly, especially in the day before molecular biology. Genetic translocations, loops, and other structural variations could be directly observed. The banding patterns of light, dark, expanded, and compressed regions were mapped in excruciating detail, and mapped to genetic correlates and later to gene expression patterns. These chromosomes provided some of the first suggestions of heterochromatin- areas of the genome whose expression is shut down (repressed). They may have genes that are shut off, but they may also be structural components, such as centromeres and telomeres. These latter areas tend to have very repetitive DNA sequences, inherited from old transposons and other junk.

A diagram of polytene chromosomes, bunched up by binding at the centromeres. The banding pattern is reproducible and represents differences in proteins bound to various areas of the genome, and gene activity.

It has become apparent that RNA plays a big role in managing these areas of our chromosomes. The classic case is the XIST RNA, which is a long (17,000 bases) non-coding RNA that forms a scaffold by binding to lots of "heterogeneous" RNA-binding proteins, and most importantly, stays bound near the site of its creation, on the X chromosome. Through a regulatory cascade that is only partly understood, the XIST RNA is turned off on one of the X chromosomes, and turned on on the other one (in females), leading the XIST molecule to glue itself to its chromosome of origin, and then progressively coat the rest of that chromosome and turn it off. That is, one entire X is turned into heterochromatin by a process that requires XIST scaffolding all along its length. That results in "dosage compensation" in females, where one X is turned off in all their cells, allowing dosage (that is, the gene expression) of its expressed genes to approximate that of males, despite the presence of the extra X chromosome. Dosage is very important, as shown by Down Syndrome, which originates from a duplication of one of the smallest human chromosomes, creating imbalanced gene dosage.

A recent paper described work on "ASAR" RNAs, which similarly arise from highly repetitive areas of human chromosomes, are extremely long (180,000 bases), and control expression and chromosome replication in an allele-specific way on (at least) several non-X chromosomes. These RNAs, again, like XIST, specifically bind a bunch of heteronuclear binding proteins, which is presumably central to their function. Indeed, these researchers dissected out the 7,000 base segment of ASAR6 that is densest in protein binding sites, and found that, when transplanted into a new location, this segment has dramatic effects on chromosome condensation and replication, as shown below.

The intact 7,000 base core of ASAR6 was transplanted into chromosome 5, and mitotic chromosomes were spread and stained. The blue is a general DNA stain. The green is a stain for newly synthesized DNA, and the red is a specific probe for the ASAR6 sequence. One can see on the left that this chromosome 5 is replicating more than any other chromosome, and shows delayed condensation. In contrast, the right frame shows a control experiment where an anti-sense version of the ASAR6 7,000 base core was transplanted to chromosome 5. The antisense sequence not only does not have the wild-type function, but also inhibits any molecule that does by tightly binding to it. Here, the chromosome it resides on (arrows) is splendidly condensed, and hardly replicating at all (no green color).


Why RNA? It has become clear over the last two decades that our cells, and particularly our nuclei, are swimming with RNAs. Most of the genome is transcribed in some way or other, despite a tiny proportion of it coding for anything. 95% of the RNAs that are transcribed never get out of the nucleus. There has been a growing zoo of different kinds of non-coding RNAs functioning in translational control, ribosomal maturation, enhancer function, and here, in chromosome management. While proteins tend to be compact bundles, RNAs can be (as these ASARs are) huge, especially in one dimension, and thus capable of physically scaffolding the kinds of structures that can control large regions of chromosomes.

Chromosomes are sort of cloudy regions in our cells, long a focus of observation and clearly also a focus of countless proteins and now RNAs that bind, wind, disentangle, transcribe, replicate, and congregate around them. What all these RNAs and especially the various heteronuclear proteins actually do remains pretty unclear. But they form a sort of organelle that, while it protects and manages our DNA, remarkably also allows access to it for sequence-specific binding proteins and the many processes that plow through it.

"In addition, recent studies have proposed that abundant nuclear proteins such as HNRNPU nonspecifically interact with ‘RNA debris’ that creates a dynamic nuclear mesh that regulates interphase chromatin structure."


Saturday, February 8, 2025

Sugar is the Enemy

Diabetes, cardiovascular health, and blood glucose monitoring.

Christmas brought a book titled "Outlive: The Science and Art of Longevity". Great, I thought- something light and quick, in the mode of Gwyneth Paltrow or Deepak Chopra. I have never been into self-help or health fad and diet books. Much to my surprise, however, it turned out to be a rather rigorous program of preventative medicine, with a side of critical commentary on our current medical system. A system that puts various thresholds, such as blood sugar and blood pressure, at levels that represent serious disease, and cares little about what led up to them. Among the many recommendations and areas of focus, blood glucose levels stand out, both for their pervasive impact on health and aging, and also because there are new technologies and science that can bring their dangers out of the shadows.

Reading: 

Where do cardiovascular problems, the biggest source of mortality, come from? Largely from metabolic problems in the control of blood sugar. Diabetics know that uncontrolled blood sugar is lethal, in both the acute and the long term. But the rest of us need to realize that the damage done by swings in blood sugar is more insidious and pervasive than commonly appreciated. Both microvascular damage (what is commonly associated with diabetes, in the form of problems with the small vessels of the kidney, legs, and eyes) and macrovascular damage (atherosclerosis) are due to high and variable blood sugar. The molecular biology of this was impressively unified in 2005 in the paper above, which argues that excess glucose clogs the mitochondrial respiration mechanisms. Their membrane voltage maxes out, reactive forms of oxygen accumulate, and glucose intermediates pile up in the cell. This leads to at least four different and very damaging consequences for the cell, including glucose modification (glycation) of miscellaneous proteins, a reduction of redox damage repair capacity, inflammation, and increased fatty acid export from adipocytes to endothelial (blood vessel) cells. Not good!

Continuous glucose monitored concentrations from three representative subjects, over one day. These exemplify the low, moderate, and severe variability classes, as defined by the Stanford group. Line segments are individually classed as to whether they fall into those same categories. There were 57 subjects in the study, of all ages, none with an existing diagnosis of diabetes. Yet five of them had diabetes by traditional criteria, and fourteen had pre-diabetes by those criteria. By this scheme, 25 had severe variability as their "glucotype", 25 had moderate variability, and only 7 had low variability. As these were otherwise random subjects selected to not have diabetes, this is not great news about our general public health, or the health system.

Additionally, a revolution has occurred in blood glucose monitoring, where anyone can now buy a relatively simple device (called a CGM) that gives continuous blood glucose monitoring to a cell phone, and associated analytical software. This means that the fasting blood glucose level that is the traditional test is obsolete. The recent paper from Stanford (and the literature it cites) suggests, indeed, that it is variability in blood glucose that is damaging to our tissues, more so than sustained high levels.
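To make the contrast between a single average and variability concrete, here is a minimal sketch in Python. The two traces are invented for illustration, not data from the Stanford study, and the coefficient of variation used here is just one simple, commonly used measure of spread; it is not the classification method that group used for its "glucotypes".

```python
from statistics import mean, pstdev

def glucose_summary(readings_mg_dl):
    """Summarize a series of glucose readings (mg/dL)."""
    avg = mean(readings_mg_dl)
    sd = pstdev(readings_mg_dl)
    return {
        "mean": avg,
        "sd": sd,
        # Coefficient of variation: spread relative to the mean, in percent.
        "cv_percent": sd / avg * 100,
        "range": max(readings_mg_dl) - min(readings_mg_dl),
    }

# Hypothetical traces, made up so that both have the same average.
steady = [95, 100, 105, 100, 95, 105, 100, 100]
swingy = [70, 150, 80, 140, 75, 130, 65, 90]

for name, trace in (("steady", steady), ("swingy", swingy)):
    s = glucose_summary(trace)
    print(f"{name}: mean={s['mean']:.0f} cv={s['cv_percent']:.1f}% range={s['range']}")
```

Both invented traces average 100 mg/dL, so a test that reports only an average would not tell them apart, while the spread measures differ widely.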

One might ask why, if blood glucose is such a damaging and important mechanism of aging, evolution hasn't developed tighter control over it. Other ions and metabolites are kept within much tighter ranges. Sodium ranges between 135 and 145 mM, and calcium from 8.8 to 10.7 mg/dL (roughly 2.2 to 2.7 mM). Well, glucose is our food, and our need for glucose internally is highly variable. Our livers are tiny brains that try very hard to predict what we need, based on our circadian rhythms, our stress levels, our activity both current and expected. It is a difficult job, especially now that stress rarely means physical activity, and nor does travel, in our automobiles. But mainly, this is a problem of old age, so evolution cares little about it. Getting a bigger spurt of energy for a stressful event when we, in our youth, are in crisis may, in the larger scheme of things, outweigh the slow decay of the cardiovascular system in old age. Not to mention that traditional diets were not very generous at all, certainly not in sugar and refined carbohydrates.


Saturday, February 1, 2025

Proving Evolution the Hard Way

Using genomes and codon ratios to estimate selective pressures was so easy... why is it not working?

The fruits of evolution surround us with abundance, from the tallest tree to the tiniest bacterium, and the viruses of that bacterium. But the process behind it is not immediately evident. It was relatively late in the enlightenment before Darwin came up with the stroke of insight that explained it all. Yet that mechanism of natural selection remains an abstract concept requiring an analytical mind and due respect for the very inhuman scales of time and space in play. Many people remain dumbfounded, and in denial, while evolutionary biology has forged ahead, powered by new discoveries in geology and molecular biology.

A recent paper (with review) offered a fascinating perspective, both critical and productive, on the study of evolutionary biology. It deals with the opsin protein that hosts the visual pigment 11-cis-retinal, by which we see. The retinal molecule is the same across all opsins, but different opsin proteins can "tune" the light wavelength of greatest sensitivity, creating the various retinal-opsin combinations for all visual needs, across the cone cells and rod cells. This paper considered the rhodopsin version of opsin, which we use in rod cells to perceive dim light. They observed that in fish species, the sensitivity of rhodopsin has been repeatedly adjusted to accommodate light at different depths of the water column. At shallow levels, sunlight is similar to what we see, and rhodopsin is tuned to about 500 nm, while deeper down, when the light is more blue-ish, rhodopsin is tuned towards about 480 nm maximum sensitivity. There are also special super-deep fish who see by their own red-tinged bioluminescence, and their rhodopsins are tuned to 526 nm. 

This "spectrum" of sensitivities of rhodopsin has a variety of useful scientific properties. First, the evolutionary logic is clear enough, matching the fish's vision to its environment. Second, the molecular structure of these opsins is well-understood, the genes are sequenced, and the history can be reconstructed. Third, the opsin properties can be objectively measured, unlike many sequence variations which affect more qualitative, difficult-to-observe, or impossible-to-observe biological properties. The authors used all this to carefully reconstruct exactly which amino acids in these rhodopsins were the important ones that changed between major fish lineages, going back about 500 million years.

The authors' phylogenetic tree of the fish and other species whose rhodopsin molecules they analyzed. Note how mammals occupy the bottom small branch, indicating how deeply the rest of the tree reaches. The numbers in the nodes indicate the wavelength sensitivity of each (current or imputed) rhodopsin. Many branches carry the authors' inference, from a reconstructed and measured protein molecule, of what precise changes happened, via positive selection, to get that lineage.

An alternative approach to evolutionary inference is a second target of these authors. That is a codon-based method that evaluates the rate of change of DNA sites under selection versus sites not under selection. In protein coding genes (such as rhodopsin), every amino acid is encoded by a triplet of DNA nucleotides, per the genetic code. With 64 codons for ~20 amino acids, it is a redundant code where many DNA changes do not change the protein sequence. These changes are called "synonymous". If one studies the rate of change of synonymous sites in the DNA (which form sort of a control in the experiment), compared with the rate of change of non-synonymous sites, one can get a sense of evolution at work. Changing the protein sequence is something that is "seen" by natural selection, and especially at important positions in the protein, some of which are "conserved" over billions of years. Such sites are subject to "negative" selection, which is to say rapid elimination due to the deleterious effect of that DNA and protein change.

Mutations in protein coding sequence can be synonymous, (bottom), with no effect, or non-synonymous (middle two cases), changing the resulting protein sequence and having some effect that may be biologically significant, thus visible to natural selection.


This analysis has been developed into a high art, also being harnessed to reveal "positive" selection. In this scenario, if the rate of change of the non-synonymous DNA sites is higher than that of the synonymous sites, or even just higher than one would expect by random chance, one can conclude that these non-synonymous sites were not just not being selected against, but were being selected for, an instance of evolution establishing change for the sake of improvement, instead of avoiding change, as usual.
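The distinction between synonymous and non-synonymous change can be sketched in a few lines of Python. This is only a raw count of differing codons between two aligned sequences, under the standard genetic code. Real codon-based methods go much further, normalizing by the number of possible synonymous and non-synonymous sites and fitting models over a phylogeny, none of which is attempted here. The function name and the two toy sequences are made up for illustration and are not rhodopsin sequences.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def count_codon_differences(seq_a, seq_b):
    """Count differing codons between two aligned, in-frame coding sequences.

    Returns (synonymous, nonsynonymous): codon differences that keep the
    amino acid versus those that change it.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, equal length, and in frame")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous

# Toy example: GCT -> GCC keeps alanine; AAA -> AGA turns lysine into arginine.
print(count_codon_differences("ATGGCTAAAGGT", "ATGGCCAGAGGT"))
```

An excess of the second number over what the first (the "control") would predict is the kind of signal that gets read as positive selection.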

Now back to the rhodopsin study. These authors found that a very small number of amino acids in this protein, only 15, were the ones that influenced changes to the spectral sensitivity of these protein complexes over evolutionary time. Typically only two or three changes occurred over a shift in sensitivity in a particular lineage, and would have been the ones subject to natural selection, with all the other changes seen in the sequence being unrelated, either neutral or selected for other purposes. It is a tour de force of structural analysis, biochemical measurement, and historical reconstruction to come up with this fully explanatory model of the history of piscine rhodopsins.

But then they went on to compare what they found with what the codon-based methods had said about the matter. And they found that there was no overlap whatsoever. The amino acids identified by the "positive selection" codon-based methods were completely different than the ones they had found by spectral analysis and phylogenetic reconstruction over the history of fish rhodopsins. The accompanying review is particularly harsh about the pseudoscientific nature of this codon analysis, rubbishing the entire field. There have been other, less drastic, critiques as well.

But there is method to all this madness. The codon-based methods were originally conceived in the analysis of closely related lineages. Specifically, various Drosophila (fly) species that might have diverged over a few million years. On this time scale, positive selection has two effects. One is that a desirable amino acid (or other) variation is selected for, and thus swept to fixation in the population. The other, and corresponding effect, is that all the other variations surrounding this desirable variation (that is, which are nearby on the same chromosome) are likewise swept to fixation (as part of what is called a haplotype). That dramatically reduces the neutral variation in this region of the genome. Indeed, the effect on neutral alleles (over millions of nearby base pairs) is going to vastly overwhelm the effect from the newly established single variant that was the object of positive selection, and this imbalance will be stronger the stronger the positive selection. In the limit case, the entire genomes of those without the new positive trait/allele will be eliminated, leaving no variation at all.

Yet, on the longer time scale, over hundreds of millions of years, as was the scope of visual variation in fish, all these effects on the neutral variation level wash out, as mutation and variation processes resume, after the positively selected allele is fixed in the population. So my view of this tempest in an evolutionary teapot is that these recent authors (and whatever other authors were deploying codon analysis against this rhodopsin problem) are barking up the wrong tree, mistaking the proper scope of these analyses. Which, after all, focus on the ratio between synonymous and non-synonymous change in the genome, and thus intrinsically on recent change, not deep change in genomes.


  • That all-American mix of religion, grift, and greed.
  • Christians are now in charge.
  • Mechanisms of control by the IMF and the old economic order.
  • A new pain med, thanks to people who know what they are doing.

Saturday, January 25, 2025

The Climate is Changing

Fires in LA, and a puff of smoke in DC.

An ill wind has blown into Washington, a government of whim and spite, eager to send out the winged monkeys to spread fear and kidnap the unfortunate. The order of the day is anything that dismays the little people. The wicked witch will probably have melted away by the time his most grievous actions come to their inevitable fruition, of besmirching and belittling our country, and impoverishing the world. Much may pass without too much harm, but the climate catastrophe is already here, burning many out of their homes, as though they were made of straw. Immoral and spiteful contrariness on this front will reap the judgement and hatred of future generations.

But haven't the biosphere and the climate always been in flux? Such is the awful refrain from the right, in a heartless conservatism that parrots greedy, mindless propaganda. In truth, Earth has been blessed with slowness. The tectonic plates make glaciers look like race cars, and the slow dance of Earth's geology has ruled the evolution of life over the eons, allowing precious time for incredible biological diversification that covers the globe with its lush results.

A stretch of relatively unbroken rain forest, in the Amazon.

Past crises on earth have been instructive. Two of the worst were the end-Permian extinction event, about 252 million years ago (mya), and the end-Cretaceous extinction event, about 66 mya. The latter was caused by a meteor, so was a very sudden event- a shock to the whole biosphere. Following the initial impact and global fire, it is thought to have raised sun-shielding dust and sulfur, with possible acidification, lasting for years. However, it did not have very large effects on CO2, the main climate-influencing gas.

On the other hand, the end-Permian extinction event, which was significantly more severe than the end-Cretaceous event, was a more gradual affair, caused by intense volcanic eruptions in what is now Siberia. Recent findings show that this was a huge CO2 event, turning the climate of Earth upside down. CO2 went from about 400 ppm, roughly what we are at currently, to 2500 ppm. The only habitable regions were the poles, while the tropics were all desert. But the kicker is that this happened over the surprisingly short (geologically speaking) time of about 80,000 years. CO2 then stayed high for the next roughly 400,000 years, before returning slowly to its former equilibrium. This rate of rise was roughly 2.7 ppm per 100 years, yet that change killed off 90% of all life on Earth.

The momentous analysis of the end-Permian extinction event, in terms of CO2, species, and other geological markers, including sea surface temperature (SST). This paper was when the geological brevity of the event was first revealed.

Compare this to our current trajectory, where atmospheric CO2 has risen from about 280 ppm at the dawn of the industrial age to 420 ppm now. That is a rate of maybe 100 ppm per 100 years, and rising steeply. It is a rate far too high for many species, and certainly the process of evolution itself, to keep up with, tuned as it is to geologic time. As yet, this Anthropocene extinction event is not quite at the level of either the end-Permian or end-Cretaceous events. But we are getting there, going way faster than the former, and creating a more CO2-based long-term climate mess than the latter. While we may hope to forestall nuclear war and thus a closer approximation to the end-Cretaceous event, it is not looking good for the biosphere, purely from a CO2 and warming perspective, putting aside the many other plagues we have unleashed including invasive species, pervasive pollution by fertilizers, plastics and other forever chemicals, and the commandeering of all the best land for farming, urbanization, and other unnatural uses.
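The comparison of rates can be checked with a little arithmetic, sketched here in Python. The end-Permian numbers are the rounded ones quoted in this post, and the modern rate is simply the rough estimate given above, taken as is rather than derived from measurements. The function name is made up for this illustration.

```python
def ppm_per_century(start_ppm, end_ppm, years):
    """Average rate of CO2 change, in ppm per 100 years."""
    return (end_ppm - start_ppm) / years * 100

# End-Permian figures as quoted in the post: 400 -> 2500 ppm over ~80,000 years.
permian_rate = ppm_per_century(400, 2500, 80_000)

# The post's own rough estimate of the current rate, in ppm per 100 years.
modern_rate = 100

print(f"end-Permian: about {permian_rate:.1f} ppm per century")
print(f"modern estimate: about {modern_rate / permian_rate:.0f} times faster")
```

With these rounded inputs the end-Permian rise works out to about 2.6 ppm per century, in line with the figure quoted earlier, which puts the current estimate at something like 38 times faster.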

CO2 concentrations, along with emissions, over recent time.

We are truly out of Eden now, and the only question is whether we have the social, spiritual, and political capacity to face up to it. For the moment, obviously not. Something disturbed about our media landscape, and perhaps our culture generally, has sent us for succor, not to the Wizard who makes things better, but to the Wicked Witch of the East, who delights in lies, cruelty and destruction.


Saturday, January 18, 2025

Eking Out a Living on Ammonia

Some archaeal microorganisms have developed sophisticated nano-structures to capture their food: ammonia.

The earth's nitrogen cycle is a bit unheralded, but critical to life nonetheless. Gaseous nitrogen (N2) is all around us, but inert, given its extraordinary chemical stability. It can be broken down by lightning, but little else. It must have been very early in the history of life that the nascent chemical-biological life forms tapped out the geologically available forms of nitrogen, despite being dependent on nitrogen for countless critical aspects of organic chemistry, particularly of nucleic acids, proteins, and nucleotide cofactors. The race was then on to establish a way to capture it from the abundant, if tenaciously bound, dinitrogen of the air. It was thus very early bacteria that developed a way (heavily dependent, unsurprisingly, on catalytic metals like molybdenum and iron) to fix nitrogen, meaning breaking up the triple N≡N bond, and making ammonia, NH3 (or ammonium, NH4+). From there, the geochemical cycle of nitrogen is all down-hill, with organic nitrogen being oxidized to nitric oxide (NO), nitrite (NO2-), nitrate (NO3-), and finally denitrification back to N2. Microorganisms obtain energy from all of these steps, some living exclusively on either nitrite or nitrate, oxidizing them as we oxidize carbon with oxygen to make CO2.

Nitrosopumilus, as imaged by the authors, showing its corrugated exterior, a layer entirely composed of ammonia collecting elements (can be hexameric or pentameric). Insets show an individual hexagonal complex, in face-on and transverse views. Note also the amazing resolution of other molecules, such as the ribosomes floating about.

A recent paper looked at one of these denizens beneath our feet, an archaeal species that lives on ammonia, converting it to nitrite, NO2-. It is a dominant microbe in its field, in the oceans, in soils, and in sewage treatment plants. The irony is that after we spend prodigious amounts of fossil fuels fixing huge amounts of nitrogen for fertilizer, most of which is wasted, and which today exceeds the entire global budget of naturally fixed nitrogen, we are faced with excess and damaging amounts of nitrogen in our effluent, which is then processed in complex treatment plants by our friends the microbes down the chain of oxidized states, back to gaseous N2.

Calculated structure of the ammonia-attracting pore. At right are various close-up views including the negatively charged amino acids (D, E) concentrated at the grooves of the structure, and the pores where ammonium can transit to the cell surface. 

The Nitrosopumilus genus is so successful because it has a remarkable way to capture ammonia from the environment, a way that is roughly two hundred times more efficient than that of its bacterial competitors. Its surface is covered by a curious array of hexagons, which turn out to be ammonia capture sites. In effect, its skin is a (relatively) enormous chemical antenna for ammonia, which is naturally at low concentration in sea water. These authors do a structural study, using the new methods of particle electron microscopy, to show that these hexagons have intensely negatively charged grooves and pores, to which positively charged ammonium ions are attracted. Within this outer shell, but still outside the cell membrane, enzymes at the cell surface transform the captured ammonium to other species such as hydroxylamine, which are then pumped inside. This conversion maintains the ammonium concentration gradient towards the cell surface.

Cartoon model of the ammonium attraction and transit mechanisms of this cell wall. 

It is a clever nano-material and micro-energetic system for concentrating a specific chemical- a method that might inspire human applications for other chemicals that we might need- chemicals whose isolation demands excessive energy, or whose geologic abundance may not last forever.


Saturday, January 11, 2025

A Housing Guarantee

A proposal for an updated poor house.

I agree with MMT economists who propose a job guarantee. That would put a floor on the labor market with an offer to anyone who wants to work for a low, but living wage, probably set below the minimum wage mandated for the private sector. State and local governments would run cleanups, environmental restoration, and care operations as needed, requiring basic discipline and effort, but no further skills. But they could use higher skilled workers as they come along for more beneficial, complex tasks.

Similarly, I think we could offer a housing guarantee, putting a floor on homelessness and misery. In the state of California, homelessness is out of control, and we have not found solutions, despite a great deal of money spent. Housing in the private market is extremely expensive, far out of reach of those with even median incomes. The next level down is housing vouchers and public housing, of which there is not enough to go around, and which are extremely expensive. And below that are shelters, which are heavily adverse settings. They are not private. They are chaotic, unpleasant, meant to be temporary, and can be closed much of the time. And they also do not have enough space. 

A local encampment, temporarily approved during the pandemic under the freeway.

As uncompassionate as it sounds, it is unacceptable, and should be illegal, for public spaces to be commandeered by the homeless for their private needs. Public spaces have many purposes, specifically not including squatting and vagrancy. It is a problem in urban areas, because that is where people are, and where many services exist at the intersection of public and private spaces- food, bathrooms, opportunities to beg, get drugs, etc. Just because we have been, as governments and citizens, neglectful of our public spaces, does not mean we should give them over to anyone who wants to camp on them. I was recently at San Francisco city hall and the beautiful park surrounding it. But at lunch time, I realized that there was nowhere to sit. The plague of homelessness had rendered park benches untenable. We deserve to keep these public spaces functional, and that means outlawing the use of public spaces by the homeless. At the same time, provision must be made for the homeless, who by this policy would have nowhere to go in fully zoned areas. Putting them on busses to the next town, as some jurisdictions do, is also not a solution. As a rich country, we can do more for the homeless even while we preserve public spaces.

I think we need to rethink the whole lower end of housing / shelter to make it a more regular, accessible, and acceptable way to catch those who need housing at a very basic level. The model would be a sort of cross between a hostel, an SRO (single room occupancy hotel) and army barracks. It would be publicly funded, and provide a private room as well as food, all for free. It would not throw people out, or lock them in.

This poor house would not demand work, though it would offer centralized services for finding jobs and other places to live. It would be open to anyone, including runaway teens, battered women, tourists, etc. It would be a refuge for anyone for any reason, on an unlimited basis. The space and the food would be very basic, motivating clients to seek better accommodation. It would be well-policed and its clients would have to behave themselves. The next step down in the ladder of indigent care would not be homelessness, which would be outlawed in areas offering this kind of poorhouse, but would be institutionalization, in increasingly stringent settings for either criminal or mental issues. 

Such a poor house might become a community center, at least for the indigent. It would be quite expensive, but given the level of inequality and lack of care for people in various desperate straits, we need to furnish a humane level of existence between the market housing system and institutionalization. Why not give everyone a house? That is neither financially practical, nor would that co-exist well with the market housing system. Certainly, more housing needs to be built and everything done to bring prices down. But to address the current issues, stronger housing policy is needed.

Why not go back to a public housing model? It turned out that public housing was somewhat unrealistic, promising far more than it could deliver. It promised fully functional neighborhoods and housing, pretty much the equivalent of market housing, but without the ongoing discipline from the market via private financial responsibility by the residents or from the programs via their bureaucratic structures and funding, to follow through on the long term. The public authorities generally took a hands-off approach to residents and their environment, in line with the (respectful) illusion that this was the equivalent of market housing. And the long-term is what counts in housing, since it is ever in need of repair and renovation, not to mention careful use and protection by its residents. Building is one thing, but maintaining is something quite different, and requires carefully thought-out incentives. 

With a public poorhouse model, the premises and residents are extensively policed. Individual rooms may descend to squalor, but the whole is built, run and maintained by the public authorities with intensive surveillance and intervention, keeping the institution as a whole functioning and growing as needed for its mission. There is going to be a sliding scale of freedom vs public involvement via financing and policing. The less functional a person is, the more control they will have to accept. We can not wash our hands of the homeless by granting them "freedom" to thrash about in squalor and make dumps of public spaces.


  • Or you could join the squid game.
  • Economic policy should not be about efficiency alone, let alone rewarding capital and management, but about long-term cultural and environmental sustainability.
  • Could AI do biology?
  • Carter was an evangelical. But that was a different time.

Saturday, January 4, 2025

Drilling Into the Transcriptional Core

Machine learning helps to tease out the patterns of DNA at promoters that initiate transcription.

One of the holy grails of molecular biology is the study of transcriptional initiation. While there are many levels of regulation in cells, the initiation of transcription is perhaps, of all of them, the most powerful. An organism's ability to keep the transcription of most genes off, and turn on genes that are needed to build particular tissues, and regulate others in response to other urgent needs, is the very soul of how multicellular organisms operate. The decision to transcribe a gene into its RNA message (mRNA) represents a large investment, as that transcript can last hours or more and during that time be translated into a great many protein copies. Additionally, this process identifies where, in the otherwise featureless landscape of genomic DNA, genes are located, which is another significant process, one that it took molecular biologists a long time to figure out.

Control over transcription is generally divided into two conceptual and physical regions- enhancers and promoters. Enhancers are typically far from the start site of transcription, and are modules of DNA sequences that bind innumerable regulatory proteins which collectively tune, in fine and rough ways, initiation. Promoters, in contrast, are at the core and straddle the start site of transcription (TSS, for short). They feature a much more limited set of motifs in the DNA sequence. The promoter is the site where the proteins bound to the various enhancers converge and encourage the formation of a "preinitiation complex", which includes the RNA polymerase that actually carries out transcription, plus a lot of ancillary proteins. The RNA polymerase can not initiate on its own or find a promoter on its own. It requires direction by the regulatory proteins and their promoter targets before finding its proper landing place. So the study of promoter initiation and regulation has a very long history, as a critical part of the central flow of information in molecular biology, from DNA to protein.

A schematic of a promoter, where initiation of transcription of Gene A happens, with the start site (+1) right at the boundary of the orange and green colors. At this location, the RNA polymerase will melt the DNA strands, and start synthesizing an RNA strand using the (bottom) template strand of the DNA. Regulatory proteins bound to enhancers far away in the genomic DNA bend through space to activate proteins bound at the core promoter to load the polymerase and initiate this process.

A recent paper provided a novel analysis of promoter sequences, using machine learning to derive a relatively comprehensive account of the relevant sequences. Heretofore, many promoters had been dissected in detail and several key features found. But many human promoters had none of them, showing that our knowledge was incomplete. This new approach started strictly from empirical data- the genome sequence, plus large experimental compilations of nascent RNAs, as they are expressed in various cells, and mapped to the precise base where they initiated from- that is, their respective TSS. These were all loaded into a machine learning model that was supplemented with explanatory capabilities. That is, it was not just a black box, but gave interpretable results useful to science, in the form of small sequence signatures that it found are needed to make particular promoters work. These signatures presumably bind particular proteins that are the operational engines of regulatory integration and promoter function.

The TATA motif, found about 30 base pairs upstream of the transcription start site in many promoters. This is a motif view, where the statistical prevalence of the base is reflected in the height of the letter (top, in color) and its converse is reflected below in gray. Regular patterns like this found in DNA usually mean that some protein typically binds to this site, in this case TFIID.


For example, the grand-daddy of them all is the TATA box, which dates back to bacteria / archaea and was easily dug up by this machine learning system. The composition of the TATA box is shown above in a graphical form, where the probability of occurrence (of a base in the DNA) is reflected in the height of the base over the axis line. A few G/C bases surround a central motif of T/A, and the TSS is typically 30 base pairs downstream. What happens here is that one of the central proteins of the RNA polymerase positioning complex, TFIID, binds strongly to this sequence, and bends the DNA here by ninety degrees, forming a launchpad of sorts for the polymerase, which later finds and opens DNA at the transcription start site. TFIID and the TATA box are well known, so it certainly is reassuring that this algorithmic method recovered it. TATA boxes are common at regulated promoters, being highly receptive to regulation by enhancer protein complexes. This is in contrast to more uniformly expressed (housekeeping) genes which typically use other promoter DNA motifs, and incidentally tend to have much less precise TSS positions. They might have start sites that range over a hundred base pairs, more or less stochastically.
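For readers who like to see the mechanics, a motif view like the one above can be thought of as a small table of base frequencies per position. The sketch below is only an illustration of that idea, not the paper's method: the aligned sites and the toy promoter are invented, the pseudocount and flat background are assumptions, and the information-content measure is one common logo convention that may differ from the scale used in the figure. It builds a frequency matrix, computes the letter-stack height per position, scans a sequence for the best match, and then places the start site 30 bp downstream as described in the text.

```python
import math

BASES = "ACGT"

# Hypothetical aligned TATA-like sites, invented for illustration only.
TOY_SITES = [
    "TATAAAAG",
    "TATATAAG",
    "TATAAATA",
    "TATATATA",
    "TATAAAGG",
    "CATAAAAG",
]

def frequency_matrix(sites, pseudocount=0.5):
    """Per-position base frequencies, with a pseudocount to avoid zeros."""
    matrix = []
    for i in range(len(sites[0])):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        matrix.append({b: counts[b] / total for b in BASES})
    return matrix

def information_content(column):
    """Bits of information at one position (total letter height in a logo)."""
    entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
    return 2.0 - entropy  # 2 bits is the maximum for four bases

def log_odds_score(matrix, window, background=0.25):
    """Sum of log2(frequency / background) over a window of motif width."""
    return sum(math.log2(col[base] / background)
               for col, base in zip(matrix, window))

def best_match(matrix, sequence):
    """Return (start index, score) of the highest-scoring window."""
    width = len(matrix)
    scored = [(log_odds_score(matrix, sequence[i:i + width]), i)
              for i in range(len(sequence) - width + 1)]
    score, start = max(scored)
    return start, score

PWM = frequency_matrix(TOY_SITES)

# Toy promoter: GC-rich filler with a TATA-like site planted at index 10.
TOY_PROMOTER = ("GCGCGGCCGC" + "TATAAAAG"
                + "GCCGGCGCGGCCGCGGCGCCGGCGCGGCCGCGGCGC")
START, SCORE = best_match(PWM, TOY_PROMOTER)

# Spacing taken from the text: start site roughly 30 bp downstream.
PREDICTED_TSS = START + 30

print("best TATA-like match at index", START)
print("predicted start site at index", PREDICTED_TSS)
print("logo heights (bits):",
      [round(information_content(col), 2) for col in PWM])
```

In this toy setup the strongly conserved positions (the leading TATA) get tall letter stacks, while the more variable trailing positions get short ones, which is the same visual logic as the logo in the figure.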

The main advance of this paper was to find more DNA sites, and new types of sites, which collectively account for the positioning and activation of all promoters in humans. Instead of the previously known three or four factors, they found nine major DNA sequences, and a smattering of weaker patterns, which they combine into a predictive model that matches empirical data. Most of these DNA sequences were previously known, but not as part of core promoters. For example, one is called YY1, because it binds the YY1 protein, which has long been appreciated to be a transcriptional repressor, from enhancer positions. But now it turns out to also be a core promoter participant, identifying and turning on a class of promoters that, as for most of the new-found sequence elements, tend to operate genes that are not heavily regulated, but rather universally expressed and with delocalized start sites. 

Motifs and initiator elements found by the current work. Each motif, presumably matched by a protein that binds it, gets its own graph of relation of the motif location (at 0 on the X axis) vs the start site of transcription that it directs, which for TATA is about 30 base pairs downstream. Most of the newly discovered motifs are bi-directional, directing start sites and transcription both upstream and downstream. This wastes a lot of effort, as the upstream transcripts are typically quickly discarded. The NFY motif has an interesting pattern of 10.5 bp periodicity of its directed start sites, which suggests that the protein complex that binds this site hugs one side of the DNA quite closely, setting up start sites on that side of the helix.

Secondly, these authors find that most of the new sequences they identify have bidirectional effects. That is, they set up promoters to fire in both directions, both typically about forty base pairs downstream and also upstream from their binding site. This explains a great deal of transcription data derived from new sequencing technologies, which shows that many promoters fire in both directions, even though the "upstream" or non-gene side transcript tends to be short-lived.
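The geometry here, and the 10.5 bp periodicity noted for the NFY motif, can be sketched in a few lines. This is a back-of-envelope illustration under stated assumptions, not anything from the paper: the 40 bp offset and 10.5 bp repeat come from the text, while the function names and the 45 degree tolerance for "same face of the helix" are hypothetical choices of mine.

```python
HELICAL_REPEAT_BP = 10.5  # periodicity quoted for the NFY motif's start sites

def bidirectional_start_sites(motif_position, offset=40):
    """Expected start sites on either side of a motif.

    The ~40 bp offset is the typical spacing described in the text.
    """
    return {"upstream": motif_position - offset,
            "downstream": motif_position + offset}

def helical_phase(distance_bp, repeat=HELICAL_REPEAT_BP):
    """Angle in degrees around the helix after moving distance_bp along it."""
    return (distance_bp % repeat) / repeat * 360.0

def same_face(distance_bp, tolerance_deg=45.0, repeat=HELICAL_REPEAT_BP):
    """True if a site distance_bp away lies on roughly the same helix face.

    The tolerance is an arbitrary illustrative cutoff.
    """
    phase = helical_phase(distance_bp, repeat)
    return min(phase, 360.0 - phase) <= tolerance_deg

print(bidirectional_start_sites(100))
print([d for d in range(20, 61) if same_face(d)])
```

Distances that are near-multiples of 10.5 bp come back on the same side of the DNA as the bound protein, so a complex that hugs one face of the helix would be expected to favor start sites in clusters spaced about one helical turn apart.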


Overview of the new results, summarized by type of DNA sequence pattern. The total machine learning prediction was composed of predictions for larger motifs, which were the dominant pattern, plus a small contribution from "initiators", which comprise a few patterns right at the start site, plus a large but diffuse contribution from tiny trinucleotide patterns, such as the CG pattern known to mark active genes and carry regulatory DNA methylation marks.


A third finding was the set of trinucleotide motifs that serve as the sort of fudge factor for their machine learning model, filling in details to make the match to empirical data come out better. The length was set more or less arbitrarily, but they play a big part in the model fit. They note that one common example is the CG pattern, which is one of the stronger trinucleotide motifs. This pattern is known as CpG, and is the target of chemical methylation of DNA by regulatory enzymes, which helps to mark and regulate genes. The current work suggests that there may be more systems of this kind yet to be discovered, which play a modulating role in gene/promoter selection and activation.
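As a rough illustration of what a trinucleotide contribution could look like, the sketch below counts overlapping three-base patterns in a sequence, reports what fraction contain a CG, and sums a weighted score. It is a toy under my own assumptions: the weights are hypothetical and the simple weighted sum is a stand-in, not the model used in the paper.

```python
from collections import Counter

def trinucleotide_counts(sequence):
    """Count overlapping trinucleotides in a DNA sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + 3] for i in range(len(seq) - 2))

def cpg_fraction(sequence):
    """Fraction of trinucleotides that contain a CG dinucleotide."""
    counts = trinucleotide_counts(sequence)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    with_cg = sum(n for tri, n in counts.items() if "CG" in tri)
    return with_cg / total

def trinucleotide_term(sequence, weights):
    """Weighted sum over trinucleotide counts; unlisted patterns score zero."""
    counts = trinucleotide_counts(sequence)
    return sum(weights.get(tri, 0.0) * n for tri, n in counts.items())

# Hypothetical weights, chosen only to show the mechanics.
TOY_WEIGHTS = {"ACG": 0.5, "CGT": 0.25}

print(trinucleotide_counts("ACGCGT"))
print(cpg_fraction("ACGCGT"))
print(trinucleotide_term("ACGCGT", TOY_WEIGHTS))
```

Each individual pattern contributes little, but because they are scattered all over a promoter region, their summed effect can be large, which fits the "large but diffuse" description in the overview figure.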

The accuracy of this new learning and modeling system exemplifies some of the strengths of AI, of which machine learning is a sub-discipline. When there is a lot of data available, and a problem that is well defined and on the verge of solution (like the protein folding problem), then AI, or these machine learning methods, can push the field over the edge to a solution. AI / ML are powerful ways to explore a defined solution space for optimal results. They are not "intelligent" in the normal sense of the word (at least not yet), which would imply having generalized world models that would allow them to range over large areas of knowledge, solve undefined problems, and exercise common sense.