Saturday, March 31, 2018

How Can Cells Divide When DNA Looks Like Spaghetti?

Topoisomerases untangle the mess, very carefully, with itty-bitty molecular scissors.

DNA is incredibly elegant as a solution to information storage and heredity. But it is also an enormous mess, with the genome of humans extending to five feet in combined length. (Imagine 9,000 miles of garden hose.) So each cell, which contains this whole amount, has a nucleus resembling an incredibly convoluted nest of spaghetti. Yet at mitosis, the chromosomes condense, separate, and neatly partition to each new cell. Some of the solution consists in how the DNA lies in the interphase cell- it is already somewhat pre-organized there. But most of the solution comes from enzymes that do not bother disentangling it- they cut the Gordian knot with enzymatic swords. Genomes were able to surmount the length problem over evolutionary time by the development of topoisomerases, which cut and religate DNA with extraordinary precision.

There are two main types of topoisomerase (named for altering topology, or the organization / twisting of DNA, without changing its energy, sequence, or composition). Topoisomerase I cuts only one strand of the double-stranded DNA, and can thus relieve coiling tension. Some forms can wind up the tension, using ATP. Topoisomerase II cuts both strands, and is the main enzyme that allows complete de-catenation / detangling of DNA during replication, transcription, and meiosis, as well as mitosis. A recent paper looked deeply into the mechanism of this class of enzymes.

As one can imagine, the minimal requirement of a Topoisomerase II is that it hold on to both ends of the DNA it has cut, while passing through the other DNA strand which has, by virtue of general tangling, come up against it. This condition of collision needs to be detected, prior to being resolved, so that the enzyme is properly positioned. The complex also has to detect that the process has finished, and reset to the starting state, including religation of the cleaved strand of DNA. It is a tall order for a mere chemical confection to carry out, frankly.

But it turns out that enzymes can have hands, if not brains. The authors provide, on the basis of a great deal of past work as well as their own, a compelling model of how this topoisomerase performs its amazing feats.

Molecular structure of a typical topoisomerase II, composed of two copies of Top6A (green and red; second copy in gray) and two copies of Top6B (yellow, purple, and orange; second copy in gray). The G-segment DNA (~70 bp) is strung along the underside, and the T-segment DNA will shortly be accommodated in the middle, upon which the top portions clamp together. Cylinders represent alpha helices, the common secondary structure of proteins. Key domains for the activity of this protein complex are noted- the H2TH domain, which notifies Top6B that a G-segment is present- the KGRR domain, which notifies the same enzyme that a T-segment is present, and to keep the clamp closed. And lastly, the stalk/WKxY domain, which in addition to helping to bind the G-segment communicates between Top6B and Top6A that cleavage of the G-segment can happen. The G-segment will be cleaved in half at the bottom of the structure, later to be re-ligated after the T-segment has passed through. 

The enzyme (it is a tetramer made up of dimers of two separate proteins, Top6A and Top6B) forms a large hoop, with arms outstretched that will join during its action. One DNA segment, the one to be cut (the G-segment, for gate) is first bound by the underside of the hoop, centered at the middle active site which does the cutting, holding, and religation. The outstretched arms encompass the other DNA strand (the T-segment, for transit). Both top and bottom of the complex have ATPase activity, though for different purposes.

The key finding made by these authors is that the G-segment DNA is bound not only near the cleavage site (in Top6A), but by the entire arm structure, up to a domain in Top6B that the authors call H2TH. About 70 basepairs of the G-segment DNA are bound, overall. This not only stabilizes and holds this DNA while it is being cut and the T-segment is being passed through, but it also allows the Top6B portion of the enzyme to sense the status of the whole complex, so that it can properly sequence its activities.

The KGRR feature functions to sense T-segment DNA and keep the clasp closed and ATP unhydrolyzed while the T-segment is present. The bottom graphs show ATP hydrolysis in the mutants diagrammed above, while the gel images show relaxation of the supercoiled DNA by the enzyme (moving it from the bottom to the top, left to right). ATP hydrolysis is increased to a free-wheeling state, while DNA relaxation fails to happen, in two mutant versions of the KGRR finger.

For example, the authors identify another feature near the top of Top6B, called KGRR, which is a finger that points inwards to touch the T-segment DNA. When they mutate it, they find that ATP is now freely digested in the presence of supercoiled DNA, much more actively than by the intact (wild-type) enzyme. But the enzyme is inactive ... no strand passage takes place, and supercoils are not relieved. The mutant enzyme is spinning its wheels, clasping and opening without doing anything. What this indicates is that in the normally functioning enzyme, the KGRR domain is a sensor that keeps the complex locked up till the T-segment passes out the other side, via the cleavage in the G-segment. Only then can ATP be digested by both halves of the enzyme, re-ligating the G-segment, and opening the Top6B arms to allow a new round of stress relief to take place.

Similarly, they conclude that the function of the H2TH sensor is in part to notify the Top6B part of the enzyme that a G-segment DNA is bound on the underside, allowing ATP to be bound and the clasp to close, if a T-segment also happens along. T-segments should not bind unless a G-segment is bound first. Secondly, the dramatic DNA bend adopted by the G-segment in this protein structure, especially in the locked-up conformation, draws on the supercoiling / torsional state of the DNA that is the target of action. Supercoiled DNA binds with 60-fold higher affinity than unstrained DNA.

Overall schematic of the mechanism of the enzyme- see text.

To recapitulate, the overall model is that G-segment DNA, in torsionally stressed condition, binds to the broad binding area of the underside of the enzyme. This notifies the Top6B domain that T-segment binding is acceptable. When that happens, the clasp is closed and ATP is bound, but not hydrolyzed, setting up the next step. A hinge between the two protein halves notifies Top6A that it can cut the G-segment. When that is done and the T-segment passes through, the KGRR sensor notifies the Top6B that the clasp is empty, so the ATP is hydrolyzed and the clasp releases, ready for another round.
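The cycle recapped above can be read as a small state machine. What follows is only a toy sketch of that reading, not the authors' model: the class name `Topo6Model`, the `kgrr_intact` flag, and the outcome labels are all hypothetical, and the many intermediate steps (ATP binding, cleavage, religation) are collapsed into a single call. It is meant to show why losing the KGRR sensor gives ATP turnover without strand passage.

```python
from dataclasses import dataclass


@dataclass
class Topo6Model:
    """Toy model of one enzyme complex (names are illustrative assumptions)."""
    kgrr_intact: bool = True   # False mimics a KGRR-finger mutant
    atp_hydrolyzed: int = 0    # ATP turnover events so far
    strand_passages: int = 0   # completed T-segment passages so far

    def cycle(self, g_bound: bool, t_present: bool) -> str:
        """Attempt one clamp cycle and return a label for what happened."""
        if not g_bound:
            # No G-segment on the underside, so Top6B is not licensed
            # to capture a T-segment and nothing happens.
            return "idle"
        if self.kgrr_intact:
            if not t_present:
                # The intact sensor holds the enzyme until a T-segment arrives.
                return "waiting for T-segment"
            # Clamp closes on ATP, the G-segment is cut, the T-segment passes,
            # and only then is ATP hydrolyzed and the clamp reopened.
            self.strand_passages += 1
            self.atp_hydrolyzed += 1
            return "strand passage"
        # Mutant: the T-segment is never sensed, so ATP is burned and the
        # clamp reopens without any strand passage.
        self.atp_hydrolyzed += 1
        return "futile cycle"


wild_type = Topo6Model()
mutant = Topo6Model(kgrr_intact=False)
for _ in range(5):
    wild_type.cycle(g_bound=True, t_present=True)
    mutant.cycle(g_bound=True, t_present=True)

print("wild type:", wild_type.atp_hydrolyzed, "ATP,", wild_type.strand_passages, "passages")
print("mutant:   ", mutant.atp_hydrolyzed, "ATP,", mutant.strand_passages, "passages")
```

In this sketch the wild-type enzyme spends one ATP per strand passage, while the mutant spends ATP on every cycle and never passes a strand, which is the "spinning its wheels" behavior described for the KGRR mutants.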

It is an intricate molecular mouse-trap, built of ratchets and sensors of various kinds, using the jostling motion universal at this scale, plus key inputs of energy (ATP) to accomplish what on the large scale looks like an amazing feat of re-organization.


  • We have a sleaze bag as president, with sleazy personal, international, business, governing, and legal ethics. Is that news?
  • Why haven't we had self-driving cars for the last decade?
  • Some places have fewer guns, and better policy. Some have more, and utter corruption.
  • Should greed and profit be the highest societal values?
  • Another introduction to MMT economics.
  • Humanism.
  • The new class structure, and the new left.

Sunday, March 25, 2018

GWAS: Complexity in Genetic Variation and Selection

Genetic studies show that most traits have many influences, most genes affect many traits, and most variants have small effects.

Once the human genome was all sequenced, and once lots of alleles (aka variants, aka mutations) were collected from human populations, scientists started doing large scale genetic studies, called genome-wide association studies (GWAS). The dream was that now, at last, we could find the "genes for" schizophrenia, and alcoholism, and depression, and autism, and height, and cardiovascular disease, and countless other syndromes and traits which are known to be highly heritable.

But this project pretty much came to naught, for reasons that have gradually become clear, and which a recent paper (review) provides some more explicit modeling for. The variants that have been found through GWAS have generally had very low effects on the studied trait, and even adding all of them up, the heritability that is known by other genetic methods was not accounted for. This became known as "the missing heritability". Height is clearly heritable, as the path leading to Yao Ming shows. Yet add up all the known variants contributing to height, and they do not add up to that known heritability.

Firstly, these studies focused on common variants, necessarily because data was so hard to come by. If 1,000 genomes are sequenced, out of the human population, and the study requires that the variation occur more than once so that its association can be validated, that variation must be a common one. That implies in turn that it can not have a very strong selective effect, otherwise it would not be common. And that implies in turn that any effect it has on any trait is likely to be weak.
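The sampling argument above can be made concrete with a little binomial arithmetic. This is a rough sketch under stated assumptions: 1,000 diploid genomes are treated as 2,000 independently sampled chromosomes, and "more than once" is taken to mean at least two copies of the variant in the sample. The function name `prob_seen_at_least_twice` is hypothetical.

```python
def prob_seen_at_least_twice(freq: float, n_genomes: int = 1000) -> float:
    """Chance that a variant at population frequency `freq` turns up
    two or more times among the chromosomes of `n_genomes` diploid people."""
    n = 2 * n_genomes                               # two chromosomes per person
    p_zero = (1 - freq) ** n                        # variant never sampled
    p_one = n * freq * (1 - freq) ** (n - 1)        # variant sampled exactly once
    return 1 - p_zero - p_one


for freq in (0.0001, 0.001, 0.01, 0.05):
    print(f"allele frequency {freq:.4f}: P(seen twice or more) = "
          f"{prob_seen_at_least_twice(freq):.3f}")
```

Under these assumptions, a variant carried on one chromosome in ten thousand is almost never seen twice, while one at a frequency of 1% is seen twice essentially every time, which is why such a study can only validate common variants.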

Secondly, we have been somewhat blinded by the archetypal Mendelian model of traits. The wrinkled peas, human eye color- these are simple traits, with one or a few alleles. Blue eye color is due to complete lack of the enzyme to make brown- it is either on or off. But most of our genes are more important than that, and can not be turned off without dire consequences. Most of our genes make products that participate in large pathways and networks where they intrinsically will affect many traits and have strong effects if significantly defective. Indeed, it is estimated that about 1/3 of amino acid positions in the coding genome have strongly deleterious effects if changed.

Network of genes with variants found to be genetically associated with autism. Each one, naturally, has very small effects.

This implies that most of the variation that exists around these genes will not have dramatic on/off effects, but rather be slight modifications of the sequence, or of expression- up or down, or in modestly altered locations or times- consistent with the high variability and degeneracy of that regulatory code/system. In addition, if a variant has an effect on the trait one is studying, it will likely also have effects elsewhere, given the complexity of most circuits (called pleiotropy). Thus its overall selective effect may be substantially larger than that focused solely on the trait of interest, dampening yet again one's ability to find such variation from studies on particular traits.

We are now in the world of "quantitative traits", as opposed to Mendelian traits. Not that they do not obey Mendel's laws, but that their complexity is such that a whole new form of statistics and analysis is needed to deal with them. Quantitative traits vary in a continuous way (like height), and are composed genetically of many genes, whose many variants (at least those which occur commonly) each have small effects.

Modeling is now getting more accurate predictions of heritability explanation based on effect sizes of individual variants, and a study's ability to find them based on its size. The left panel shows how more heritability is explained (lower levels unexplained) as the study threshold captures more variance (more alleles, with smaller effect sizes) towards the right. Overall heritability of height is supposed to be around 70%. The curve on the right models how big studies (in terms of thousands of individual subjects) would have to be to reach that figure, but it is unlikely ever to get there, so the modeling remains incomplete. This is even more true for BMI, whose total heritability is roughly 60%. Even with the statistics deployed here, they are not modeling the full heritability, even with extrapolation to infinite study size.

The conclusion from all this is that the missing heritability is not missing, just hidden. If we could sequence everyone, and analyze all their variants, we would find all the heritability that lineage and twin studies know is there. The paper makes the significant point that the problem is not epistasis- the non-linear interaction of different genes and variants. No, the large numbers of small effects tend to add up linearly, but just because they are so small and there are so many of them, which studies up till now are not powerful enough to find, they remain out of reach.
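To give a feel for why small additive effects stay out of reach, here is a back-of-envelope power calculation. It is a textbook-style approximation, not the model used in the paper: it assumes a single additive variant explaining a small fraction of trait variance, a test statistic that is roughly normal with mean sqrt(N x variance explained), and the conventional genome-wide significance threshold of 5e-8. The function name `gwas_power` and the example numbers are illustrative.

```python
from statistics import NormalDist


def gwas_power(variance_explained: float, n_subjects: int,
               alpha: float = 5e-8) -> float:
    """Approximate power to detect one additive variant.

    variance_explained: fraction of trait variance due to this variant
                        (assumed small, so the approximation holds).
    n_subjects:         number of people in the study.
    alpha:              two-sided significance threshold (assumed 5e-8).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)            # critical |z| value
    z_expected = (n_subjects * variance_explained) ** 0.5  # expected z-score
    # Probability that the observed z-score lands beyond either threshold.
    return (std_normal.cdf(z_expected - z_crit)
            + std_normal.cdf(-z_expected - z_crit))


# A variant explaining 0.01% of the variance in a trait.
for n in (10_000, 100_000, 300_000, 1_000_000):
    print(f"N = {n:>9,}: power = {gwas_power(0.0001, n):.4f}")
```

Under these assumptions, a variant explaining one ten-thousandth of the variance is essentially invisible in a study of ten thousand people and only becomes reliably detectable with something like a million, which is the sense in which the heritability is hidden rather than missing.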

This is disappointing from a medical standpoint, but also biologically. One goal of finding key genes for common diseases was to understand them mechanistically, as well as to treat them. But if no one gene, or even a few, is the key to complex diseases and traits, then the climb to understand their biology, and gain practical insight to alter their course, gets that much steeper.


  • Florida's bridge/political/environmental/traffic/population disaster.
  • Evangelicalism as simple patrician politics.
  • Millions have been killed in Iraq.. was that OK?
  • My data is your data... as usual, the crime isn't what is illegal.
  • A philosophical memoir of science and physical therapy.
  • Corruption seems to know no bounds.

Saturday, March 17, 2018

Periplasmic Space: The Final Frontier

The outer layer of bacteria, as a complex sensory and protective skin.

A theme of evolution is that sensors are better than armor. Mammals have very thin skin, with hair that has both sensory and protective roles. The reptilian and dinosaurian armor has been left far behind, in favor of a wide variety of short-range sensors innervating the skin, and long-range sensors, all choreographed with large brains. Among bacteria, a loosely analogous process took place in the development of Gram negative lineages.

Gram positive bacteria (purple) have a very thick armor of peptidoglycan, while Gram negative bacteria (lighter pink) have a second outer membrane that keeps out the Gram stain, and surrounds a much thinner peptidoglycan wall.

The Gram stain is a complex antibacterial (and purple) molecule that binds to the outer peptidoglycan wall of bacteria. This is a meshwork which is constructed outside the plasma membrane, which is the key chemical, ionic, and electrical barrier between the cell and the outside. Like the lignin cell walls of plants, the peptidoglycan wall helps to protect the cell from physical abuse and from osmotic shock / swelling. Many antibiotics, like penicillin, impair the construction of this wall, causing the target cells to lyse. Since the construction takes place on the outside, access by antibiotics is easy, and a great deal of microbial warfare has happened at this interface.

Diagram of the periplasmic space. LPS is lipopolysaccharide, composing much of the outermost leaflet. IM is inner membrane, where sensors to its own stress as well as stresses on the peptidoglycan and outer membrane reside. LPP is Braun's lipoprotein. The porin is an example of a semi-selective outer membrane channel that lets in some nutrients and ions while keeping out other, larger chemicals.

But what if some bacteria came up with an innovation to construct another membrane on the outside, encasing the peptidoglycan layer within a protective semipermeable membrane that keeps (at least some) antibiotics out? These are the Gram negative bacteria. They have an outer membrane with a specially robust outward-facing lipopolysaccharide (LPS) surface, and a very thin peptidoglycan layer, before you get to the cell's critical plasma membrane. While the Gram positive bacteria have a huge, thick armor-like peptidoglycan layer, Gram negative bacteria have a much thinner but more complicated structure, whose construction, homeostasis, and sensory capabilities are still under study. Our mitochondria are descendants of Gram negative bacteria, and also have a double-membrane, which makes the transport of the many nuclear-encoded proteins that compose them a rather involved process.

https://www.ncbi.nlm.nih.gov/books/NBK26828/

What keeps the outer membrane in place? That turns out to be the most abundant protein in bacteria like Escherichia coli, which is the prototypical Gram negative bacterium- Braun's lipoprotein. This protein serves as the strut/rivet that spans between the outer membrane and the thin peptidoglycan layer. If this protein is engineered to be longer, this tiny space widens in proportion.

Increasing the length of Braun's lipoprotein (lpp) widens the periplasmic space, in this case from 25 nm to 28.5 nm. In the second panel, a mutant which renders lpp incapable of attaching to the peptidoglycan layer renders the periplasmic space disorganized and the outer membrane prone to vesiculation.

The whole space between the inner and outer membranes is called the periplasm, or periplasmic space. It has evolved not only for protection, but to host many processes best kept outside the cytoplasm, like pre-digestion of some nutrients, ionic and redox control, scavenging for iron and other key nutrients, stabilization of the flagellum, not to mention the construction and maintenance of its own components, including the outer membrane and the peptidoglycan wall.

This all requires quite a bit of sensory capacity, for instance to sense when the outer membrane needs more lipids to accommodate cell growth, or how thick to make the peptidoglycan. Few of these sensory capacities are understood, but a recent paper discussed one sensor, RcsF, that somehow senses stress on both the outer membrane and peptidoglycan. This protein sticks a finger into the outer membrane, and either spans the entire periplasm or moves across it upon stress events. It then interacts with proteins on the inner, or plasma, membrane- IgaA and RcsC- which transmit its stress signal to a cascade of interior protein phosphorylation events that end up reducing cell movement and activating protective programs against stresses such as acid and membrane insufficiency, and increasing biofilm production and synthesis of proteoglycan and lipopolysaccharide.

The authors showed specifically that if the Braun lipoprotein was lengthened, widening the periplasm, then RcsF would only work (protecting against antibiotic treatment) if it was similarly lengthened. That indicates that RcsF spans at least between the outer membrane and the peptidoglycan layer, if not all the way to the inner membrane. The latter would be a total distance of 25 nm, which is equivalent, in a protein alpha helix, to 166 amino acids, which is longer than the entire RcsF protein. So some kind of migration or transfer through the periplasm must be taking place during the sensing event. It also suggests that one thing RcsF may be sensing specifically is the distance between the peptidoglycan and the outer membrane. Other stress sensors and mediators exist, so there remains a great deal to learn here. While Gram negative bacteria may not have a brain, they have a very smart skin that actively protects and defends them.
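The helix arithmetic in the paragraph above is easy to check. The sketch below assumes the standard rise of about 0.15 nm (1.5 angstroms) per residue for an ideal alpha helix, which is a textbook value and not a number taken from the paper; the function name `residues_to_span` is hypothetical.

```python
RISE_PER_RESIDUE_NM = 0.15  # assumed ideal alpha-helix rise per amino acid


def residues_to_span(distance_nm: float) -> float:
    """Number of residues a straight alpha helix needs to cover a distance."""
    return distance_nm / RISE_PER_RESIDUE_NM


# Normal and widened periplasm widths quoted in the text and figure caption.
for width_nm in (25.0, 28.5):
    print(f"{width_nm} nm needs about {residues_to_span(width_nm):.1f} residues")
```

With this rise, 25 nm works out to between 166 and 167 residues, matching the figure in the text, and the widened 28.5 nm periplasm would need about 190.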


Saturday, March 10, 2018

Americans, Plain and Simple

How about doing away with the term "African-American"?

It has taken me a while to realize that African Americans are far, far more American than I am. I am a naturalized citizen and immigrant. Yet the Protestant, white, suburban Boy-Scout culture fit like a glove- I was assimilated into 60's-70's America with plenty of personal and family issues, but no larger political or cultural issues.

How different that is from the black experience, where whole political parties remain dedicated to keeping black Americans down! A small part of that social antagonism and "othering" is furthered by the distinct names that have been applied to the black community. While the term "African American" is about as neutral as can be, in strict analogy to the many other ethnic terms like Irish-American, Jewish-American, German-American, Chinese-American, etc., there have in practice been some distinctions.

First, "Irish-American" is not frequently used. Most ethnic groups, especially those of such long vintage, have simply melted in to the pot of generic Americans- have assimilated or had America assimilate to them. So the continued intensive use of the term "African American" does not flow from a lack of assimilation, at least not from an African originating culture, but something quite different. Second, why is "African" lumped together so promiscuously, as if a continent as large as three Europes contained only one culture? "Latino" suffers from the same syndrome, hiding vast differences and diversity for the convenience of the dominant culture. It is a natural problem with naming and grouping of any kind, but is another sign that the "African" in "African American" doesn't really refer to Africa.

What all this does signify is continued segregation in all sorts of dimensions- social, physical, economic- based on a long cultural history of fear, disgust, hate, and social and economic oppression/powerlessness. Pride in an African heritage is admirable, but that seems so distant as to be mostly contrived; there is very little such heritage afoot in contemporary America, in any way that is distinct to one community, beyond genetics. (Though Wakanda may change all that!) A more accurate designation might be "formerly enslaved Americans", though that hardly trips off the tongue either. There have been many attempts at labels, more or less successful, (Negro, colored, minority, Urban, Afro-American, ghetto, racialized people, diverse, people of color). I would suggest the preferred usage just be "Americans" when and where possible, without further ado or elaboration.

A word-cloud of my own creation, text drawn from Wikipedia and other history sites focusing on the black experience. This appears to militate against the thesis presented, showing "African" with high usage, and as perhaps the primary locus of identity. But the corpus was very backward-looking, perhaps not reflective of the current cultural setting.

Obviously, from the very nature of this very article, some term is needed to refer to Americans descended from those who were formerly under bondage and even more formerly kidnapped from West Equatorial Africa. "Black" seems to fit that best, if still very uncomfortably. Despite all the etymological / symbological freight, simplification, and label-i-fication, it is simple and widely used. It is also part of a deeply unifying symbology. The Yin/Yang symbol is an example, showing light and dark as part of all things, and all cycles and processes. Ebony, Jet, Black power, Black is beautiful... all have been ways to rectify the dominant-culture valence of this term.


Saturday, March 3, 2018

TP53: On a Knife Edge of Death

The difficult tradeoffs made by TP53, between tumor suppression and premature aging.

TP53 is a gene whose mutated forms are perhaps the most frequently found causes of human cancer. Its product is a transcription regulator and also interacts with a large number of other proteins to orchestrate a graded response to cellular stress. One job is to halt the cell cycle in the presence of DNA damage. But it can also order cell suicide, which makes it one of the key defenses against cancer, which is fundamentally caused by escape from these kinds of tight mechanisms of cellular surveillance and control.

Interestingly, some researchers have created over-active mutations of TP53, which in mice confer higher resistance to cancer, but also a rapid aging phenotype. By this point, there are many ways to make aging go faster, by eliminating various cell repair pathways. One lab has deleted SIRT6, a gene that is an upstream inhibitor of TP53, and has its own complex role in stress response and promoting proper DNA and other forms of cell repair. It was one of the regulators thought to be part of the "red wine" effect, such as it is. This deletion dramatically reduces the lifespan of mice, from three years to just under one year. Since some of this protein's effect goes through TP53, the researchers created mice with a half-dose of TP53, which substantially rescued the mutant mice's phenotype, increasing longevity to one and a half years and raising health and body weight.

Comparison of mutant mice. SIRT6 is part of several cellular repair pathways, and its deletion (Sirt6-/-) causes mice (middle) to have much shorter and less healthy lives, their cells going south at increased rates. This defect can be corrected in part by adding another mutation, partially reducing the amount of TP53, one key target of SIRT6 inhibition.

This all demonstrates, in part, that there can be too much of a good thing, i.e. TP53. On the other hand, other studies have shown that simple over-expression of normal TP53 in otherwise normal backgrounds strongly decreases cancer rates while not affecting longevity. Thus if TP53 is under normal regulation, it does what it is supposed to do- signaling cell repair or suicide, while not attacking normal or slightly stressed cells, which seems to be the problem in aging, causing loss of tissue and especially stem cell reservoirs.

Some other clues come from naked / blind mole rats, which have evolved a subterranean and highly social existence, combined with great longevity (twenty years, which for such small animals is extremely unusual) and virtually complete resistance to cancer. Here, the TP53 genes have lost some function, becoming less capable of inducing cell suicide. This is thought to be connected to the longevity phenotype. Separate mechanisms have evolved to fight cancer, such as a dramatic and thorough necrosis of any cell population that over-proliferates, induced by interferon gamma.

The message at the end of all this is that there is great scope in human biology for manipulating our longevity and health in later life. The molecular mechanisms we currently have are good, since a life span of eighty-odd years is nothing to sneeze at. But there is room for improvement, though the complexity of the networks involved in our internal surveillance and repair processes is so high that it will be some time before we have a theoretical handle on what can be done, let alone practical interventions to implement such theories.