Saturday, November 30, 2019

Metrics of Muscles

How do the microstructures of the muscle work, and how do they develop to uniform sizes?

Muscles have been gaining stature of late, from being humble servants for getting around to being recognized as a core metabolic organ with a role in mental health. They are also one of the most interesting tissues from a cell biology and cytoarchitectural standpoint, with their electrically triggered contractions and their complex, intensely regimented organization. How does that organization happen? There has been a lot of progress on this question, one example of which is a recent paper on the regulation of Z-disk formation, using flies as a model system.

A section of muscle, showing its regimented structure. Wedged in the left middle is a cell nucleus. The rest of these cells are given over to sarcomeres- the repeating structure of muscles, with dark myosin central zones, and the sharp Z-lines in the light regions that anchor actin and separate adjacent sarcomeres.

The basic repeating unit of muscle is the sarcomere, which occurs end-to-end within myofibrils, which are bundled together into muscle fibers, each of which is a single muscle cell. Those cells are in turn bundled into fascicles and whole muscles. The sarcomere is bounded by end-plates called Z-disks, which anchor actin filaments that travel lengthwise into the sarcomere (to variable depths, depending on contraction). In the center of the sarcomere, interdigitated with the actin filaments, are myosin filaments, which look much thicker in the microscope. Myosin contains the ATP-driven motor that pulls along the actin, causing the whole sarcomere to contract. The two assemblies engage each other like two combs with interdigitating teeth.

Some molecular details of the sarcomere. Myosin is in green, actin in red. Titin is in blue, and nebulin in teal. The Z-disks are in light blue at the sides, where actin and titin attach. Note how the titin molecules extend from the Z-disks right through the myosin bundles and meet in the middle. Titin is highly elastic, unfolding like an accordion, and is also stress-sensitive, containing a protein kinase domain (located in the central M-band region) that can transmit mechanical stress signals. The diagram at bottom shows the domain structure of nebulin, which has the significant role of metering the length of the actin bundles. It is also typical in containing various domains that interact with numerous other proteins, in addition to the repetitive elements that contribute to its length.

There are over a hundred other molecules involved in this structure, but some of the more notable ones are huge structural proteins, the biggest in the genome, which provide key guides for the sizes of some sarcomeric dimensions. Nebulin is a ~800 kDa protein that wraps around the actin filaments as they are assembled outward from the Z-disk and sets the length of the actin polymer. The sizes of all the components of the sarcomere are critical, so that the actin filaments don't run into each other during contraction, the myosins don't run into the Z-disk wall, etc. Everything naturally has to be carefully engineered. Likewise, titin is a protein of ~4,000 kDa (over 34,000 amino acids long) that is highly elastic and spans from the Z-disk, through the myosin bundles, to a pairing site at the M-line. In addition to forming the core around which the myosin motors cluster, thus determining the length of the myosin region, it appears to set the size of the whole sarcomere, and forms a spring that stores elastic force, among much else.

Many of these proteins come together at the Z-disk. Actin attaches to alpha-actinin there, and to numerous other proteins. One of these is ZASP, the subject of the current paper. ZASP joins the Z-disk very early, and contains a domain (PDZ) that binds to alpha-actinin, a key protein that anchors actin filaments, as well as other domains (ZM and LIM) that bind to each other. To make things interesting, ZASP comes in several forms, arising from a couple of gene duplications and also from alternative splicing that includes or discards various exons during the processing of transcripts from these genes. In humans, ZASP has 14 exons and at least 12 differently spliced forms. Some of these forms include more or fewer of the self-interacting LIM domains. These authors figured that if the ZASP protein plays an early and guiding role in controlling Z-disk size, it may do so by arriving early in development in its full-length, fully interlocking version, and later in shorter "blocking" versions that lack the self-interacting domains, thereby terminating growth of the Z-disks.
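
To make that proposal concrete, here is a toy sketch (my own illustration, not the paper's model) of how a late-arriving "blocking" splice form could meter the size of a growing assembly. Full-length units are imagined to expose new self-interaction sites as they join, while truncated units occupy a site without adding any; the switch time and probabilities below are arbitrary stand-ins for the developmental shift in splice forms.

```python
import random

def grow_z_disk(blocking_onset, total_steps=2000, seed=0):
    """Toy branching model of Z-disk growth (an illustration, not the paper's model).

    Full-length ZASP units occupy a free assembly site and expose two new
    self-interaction sites; truncated "blocking" units occupy a site and expose
    none. After `blocking_onset` steps, blocking units dominate the available
    pool, so growth terminates soon afterward.
    """
    random.seed(seed)
    free_sites = 1          # a single nucleation site
    size = 0                # number of ZASP units incorporated
    for t in range(total_steps):
        if free_sites == 0:
            break                                         # fully capped: growth stops
        p_blocking = 0.05 if t < blocking_onset else 0.8  # hypothetical expression switch
        free_sites -= 1                                   # incoming unit occupies one site
        size += 1
        if random.random() > p_blocking:
            free_sites += 2                               # full-length unit adds new sites
    return size

if __name__ == "__main__":
    for onset in (50, 100, 200):
        sizes = [grow_z_disk(onset, seed=s) for s in range(200)]
        print(f"blocking forms arrive at t={onset:3d}: mean final size ~ {sum(sizes)/len(sizes):.0f}")
```

The later the blocking forms appear, the larger the final structure- qualitatively the behavior the authors propose for Z-disk growth, without any need to measure size directly.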

Overexpression of the ZASP protein (bottom panels) causes visibly larger, yet also somewhat disorganized, Z-disks in fly muscles. Note how beautifully regular the control muscle tissue is, top. Left sides show fluorescence labels for both actin and ZASP, while right sides show fluorescence only from ZASP for the same field.

The authors show (above) that overexpressing ZASP makes Z-disks grow larger and somewhat disorganized, while conversely, overexpressing truncated versions of ZASP leads to smaller Z-disks. They then show (below) that in the wild-type state, the truncated forms (from a couple of diverged gene duplicates) tend to reside toward the outsides of the Z-disks, relative to the full-length forms. They also show, in connection with this, that the truncated forms are expressed later in fly development, in concordance with the theory.

Images of Z-disks, end-on. These cells are not mutant, but express fluorescently labelled ZASP proteins: either the major full-length form (Zasp52, c and d), or the endogenous gene duplicates that express the shortened "blocking" forms (Zasp66 and Zasp67, panels in d). By their merged image analysis (right), the authors find that full-length ZASP resides with higher probability near the centers of the disks, while the shorter forms reside more toward the outsides.

Compared with what else is known (and unknown), this is a tiny step. It also raises a lot of questions. Could gene expression be so finely controlled as to create the extremely regimented Z-disk pattern? (Unlikely.) And in any case, what controls all this gene expression and alternative splicing, both in normal development and in wound repair and other times when muscle needs to be rebuilt? Such control cannot be solely time-dependent, but appears, from the regularity of the pattern, to follow some independent metric of ideal Z-disk size. It is likely that there is far more to this story that will come out with further analysis.

It is notable that the Z-disk is a hotbed of genes that cause myopathies of various sorts when mutated. Thus the study of these structures, while fascinating in its own right and a window into the wonders of biology and our own bodies, is also informative in medical terms. While unlikely to lead to significant treatments until the advent of gene therapy, it may at least provide understanding of syndromes that might otherwise be thought of as acts of a cruel god.


Saturday, November 23, 2019

Redistribution is Not Optional, it is Essential

Physics-inspired economic models of inequality.

Thomas Piketty marveled at the way wealth concentrates under normal capitalist conditions, as if by magic. He chalked it up to the maddening persistence of positive interest rates, even under conditions where capital is in vast excess. Once you have a certain amount of wealth, and given even modest interest, money just breeds on its own, certainly without labor, and almost without thinking.

A recent Scientific American article offered a different explanation, cast in a more physics-style framework. It recounts what is called a "yard sale" model of perfectly free economic exchange, where each transaction is voluntary and transfers net wealth in a random direction. Even under such conditions, wealth concentrates inexorably, till one agent owns everything. Why? The treatment is a bit like the statistical mechanics of gases, which follows the random walks of individual particles. But where gases are subject to the constant balancing force of pressure, which strongly discourages undue concentrations, the economic system contains the opposite- ratchets by which each agent greedily holds on to what it has. At the same time, poorer agents can only transact from what little they have, and stand to lose more (relatively) when they do. They thus have a stricter limit on how often they can play the game, and are driven to penury long before wealthier players. Even a small wealth advantage insulates that player against random adversity. Put that through a lengthy random walk, and the inevitable result is that all the wealth ends up in one place.
"In the absence of any kind of wealth redistribution, Boghosian et al. proved that all of the wealth in the system is eventually held by a single agent. This is due to a subtle but inexorable bias in favor of the wealthy in the rules of the YSM [yard sale model]: Because a fraction of the poorer agent’s wealth is traded, the wealthy do not stake as large a fraction of their wealth in any given transaction, and therefore can lose more frequently without risking their status. This is ultimately due to the multiplicative nature of the transactions on the agents’ wealth, as pointed out by Moukarzel." - Boghosian, Devitt-Lee, Wang, 2016
"If we begin at the point 1/2, the initial step size is 1/4. Suppose the first move is to the right, reaching the point 3/4. Now the step size is 1/8. If we turn back to the left, we do not return to our starting point but instead stop at 5/8. Where will we wind up after n steps? The probability distribution for this process has an intricate fractal structure, so there is no simple answer, but the likeliest landing places get steadily closer to the end points of the interval as n increases. This skewed probability distribution is the ratchetlike mechanism that drives the yard-sale model to states of extreme imbalance." ... "If some mechanism like that of the yard-sale model is truly at work, then markets might very well be free and fair, and the playing field perfectly level, and yet the outcome would almost surely be that the rich get richer and the poor get poorer." - Hayes, 2002

It is important to emphasize that the yard sale model is a libertarian's dream. It models perfect freedom and voluntary economic activity, if on a very simplistic level. But its implications are profound: it describes why most people in a free economic system own little more than their labor. The authors supplement this model with three more parameters, to align it better with reality. First is a wealth advantage factor. Our free economic system is not, as a matter of fact, free or fair, and the wealthy have many economic advantages, from lower interest rates on loans and better returns on investments, to better education and more political power. Obviously, this is hardly conducive to greater equality, but rather to sharper and faster inequality. Second is a redistribution factor, in recognition that taxes and other costs have a redistributing effect, however small. And third is an allowance for negative wealth, which characterizes a fair portion of most societies, given our addiction to debt. Using these extra factors, the researchers can model wealth distributions that match reality very closely.
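
The sketch below bolts two of those extra ingredients onto the toy model above- a redistribution rate (chi) and a wealth-advantage bias (zeta). The functional forms and parameter values are simplified stand-ins rather than the authors' exact equations, and negative wealth is left out for brevity.

```python
import random

def extended_yard_sale(n_agents=100, n_sweeps=2000, stake=0.1,
                       chi=0.02, zeta=0.05, seed=1):
    """Yard-sale model with two extra ingredients, in simplified form:
      chi  - redistribution: once per sweep, every agent relaxes toward mean wealth
      zeta - wealth advantage: the coin flip is biased toward the richer party
    (Illustrative stand-ins, not the published model's exact equations.)"""
    random.seed(seed)
    wealth = [1.0] * n_agents
    mean = 1.0                              # total wealth is conserved at n_agents * mean
    for _ in range(n_sweeps):
        for _ in range(n_agents // 2):      # one "sweep" of pairwise transactions
            i, j = random.sample(range(n_agents), 2)
            pot = stake * min(wealth[i], wealth[j])
            bias = zeta * (wealth[i] - wealth[j]) / (n_agents * mean)
            if random.random() < 0.5 + bias:
                wealth[i] += pot; wealth[j] -= pot
            else:
                wealth[i] -= pot; wealth[j] += pot
        wealth = [w + chi * (mean - w) for w in wealth]   # flat tax, equal dividend
    return sorted(wealth, reverse=True)

if __name__ == "__main__":
    for chi in (0.0, 0.02):
        w = extended_yard_sale(chi=chi)
        print(f"chi = {chi:.2f}: richest agent holds {100 * w[0] / sum(w):.1f}% of total wealth")
```

Even a small chi keeps the top share bounded, while chi = 0 lets the ratchet run toward total concentration- the threshold behavior discussed below.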

Lorenz curves showing income inequality in the US, and its growth in recent decades. Higher-income families are at the bottom right, and their cumulative share of income is dramatically higher than that of lower-income families. This graph gives rise to the Gini coefficient. Since this graph is binned in quintiles, it hides the even more dramatic concentration of income at the highest 10%, 1%, and 0.1% levels.

An example of a model curve. The teal area (C) represents negative wealth, a fact of life for much of the population. The intersection of curve B with the right axis represents a result where one person or family has 40% of all wealth. We are not quite there in reality, but it is not an unrealistic outcome considering current trends. The Gini coefficient is generally defined as the ratio of areas A/(A+B).
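
For reference, the Lorenz curve and Gini coefficient in these figures can be computed directly from a list of wealth or income values, for instance the output of the simulations above. The sketch below assumes non-negative wealth; handling the negative-wealth region (area C) requires the more general treatment used by the authors.

```python
def lorenz_gini(wealth):
    """Return Lorenz curve points and the Gini coefficient (area ratio A/(A+B))
    for a list of non-negative wealth values."""
    w = sorted(wealth)
    total = sum(w)
    n = len(w)
    lorenz = [0.0]                      # cumulative wealth share vs. population share
    cum = 0.0
    for x in w:
        cum += x
        lorenz.append(cum / total)
    # area under the Lorenz curve (B) by the trapezoid rule; A + B = 0.5
    area_b = sum((lorenz[k] + lorenz[k + 1]) / (2 * n) for k in range(n))
    return lorenz, (0.5 - area_b) / 0.5

if __name__ == "__main__":
    equal  = [1.0] * 100
    skewed = [1.0] * 99 + [99.0]        # one agent holds half of everything
    print(f"perfectly equal society, Gini: {lorenz_gini(equal)[1]:.2f}")
    print(f"one agent holds half,    Gini: {lorenz_gini(skewed)[1]:.2f}")
```

A perfectly equal society scores 0, and complete concentration in one agent's hands approaches 1.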

The article, and other work from this group, finds that the redistribution factor is absolutely critical to the fate of society. Sufficiently high, it can perpetually forestall collapse into total inequality, or even into oligarchy, which is the common human condition. But if left below that threshold, it may delay, but cannot forestall, the inevitable.

What is that threshold? Obviously, it depends quite a bit on the nature of the society- on its settings of wealth advantage and redistribution. But rather small amounts of redistribution, on the order of 1 or 2%, prevent complete concentration in one person's or one oligarchy's hands. To make a just society, however, one that mitigates all this accidental unfairness of distribution, would take a great deal more.

There have traditionally been several social solutions to gross inequality, ever since humanity gained the capacity to account for and accumulate wealth. One is public works and the dole, which the Romans were partial to. In their heyday, the rich vied to attain high offices and fund great works which benefitted Roman society. Another is a debt jubilee, where debts were forgiven at some interval or on special occasions. Another, of course, is revolution and the forcible reform of land and other forms of wealth. Karl Marx, along with many others, clearly sensed that something was deeply wrong with the capitalist system when allowed to run unfettered. And despite all the ameliorating regulations and corrective programs since, we are back in a gilded age today, with all-time highs of gross inequality. To make matters worse, we have been backsliding on the principle of inheritance taxes, which should prevent the transgenerational and wholly undeserved accumulation of wealth and power.

Redistribution turns out, on this analysis, to be essential to a sustainable and just society. It is not a pipe dream or a violation of the natural order, or of "rights". Rather, it is the right of every member of a society to expect that society to function in a fair and sustainable way- to provide the foundation for a flourishing life by building each member's talents, and by building the social and material structures that put them to effective use. Capitalism and free exchange are only one ingredient in this social system, not its purpose or its overriding mechanism. That is why the wealth tax proposed by Elizabeth Warren is so significant and has generated such interest and support. It speaks directly and effectively to one of the central problems of our time- how to make a sustainable system out of capitalism.

Saturday, November 16, 2019

Gene Duplication and Ramification

Using yeast to study the implications of gene duplication.

Genes duplicate all the time. Much of our forensic DNA technology relies (or at least used to rely) on repetitive, duplicated DNA features that are not under much selection, and thus can vary rapidly in the human population through segment duplication and recombination/elimination. Indeed, whole genomes duplicate with some (rare) frequency. Many plants are polyploid, having duplicated their genomes one, two, three or more times, attaining prodigious genome sizes. What are the consequences when a gene suddenly finds itself making products in competition with, or collaboration with, another copy?

A recent paper explored this issue, to some small degree, in the case of proteins that form dimeric complexes in yeast cells. Saccharomyces cerevisiae is known to have undergone a whole-genome duplication in the distant past, which left behind a large set of related proteins called paralogs- homologs (similar genes) that originated by gene duplication and have subsequently diverged. Even more specifically, these are termed ohnologs, since they arose from a known genome duplication event (this special class is interesting because, in such organisms, it makes up a huge set of duplicates that all arose at the same time, making some aspects of evolutionary analysis easier). A question is whether their divergence is driven by neutral evolution, in which case their resemblance quickly degrades, or whether selection continues- for one homodimer, for both homodimers, or even for the complex between the two partners, which is termed a heterodimer.

The authors go through simulations of several different selection regimes, applied at atomic scale to known protein paralogs, to ask what effect selection on one feature has on the retention or degradation of other features of the system. Another term for this genetic relationship is pleiotropy- the effects that one gene can have on multiple functions, in this case heterodimeric complexes in addition to homodimeric complexes, which often have different, even opposite, roles.

One example of a homodimeric protein (GPD1, glycerol-3-phosphate dehydrogenase) that has an ohnolog (GPD2) with which it can heterodimerize. At top is the structure- yellow marks the binding interface, while blue and pink mark the rest of each individual protein monomer. The X axis of each graph is time, as the simulation proceeds, adding mutations to the respective genes and enforcing selection as the experimenters wish, based on the binding energy of the protein-protein interface, as calculated from the chemistry. That binding energy is the Y axis. Dark blue is one homodimer (GPD1-GPD1), pink is the other homodimer (GPD2-GPD2), and purple is the binding energy of the heterodimer (GPD1-GPD2).

In a neutral evolution regime lacking all selection (top graph), there is obviously no maintenance of any function, and the ability of the molecules to form complexes of any kind steadily degrades with time- the binding energy of the dimers drifts toward zero. But if selection is maintained for the ability of each gene product to form its own homodimer, then heterodimer formation is maintained as well, apparently for free (second graph). Similarly, if only selection for the heterodimer is maintained, the ability of each to form homodimers is also maintained for free. At bottom, if only one homodimer is under positive selection, then the formation of the other homodimer degrades most rapidly, and the heterodimer degrades a bit less rapidly.
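
As a rough stand-in for the paper's atomistic calculations, the toy Monte Carlo below captures the scheme of these experiments: two initially identical paralogs accumulate random mutations, a made-up position-specific contact energy scores the two homodimers and the heterodimer, and a chosen selection regime rejects mutations that weaken the selected complex(es) too much. All of the energies, lengths, and thresholds are arbitrary; only the qualitative coupling through the shared interface is meant to carry over.

```python
import random

ALPHABET = 20      # residue types
LENGTH   = 40      # interface positions
STEPS    = 3000    # proposed mutations per regime
TOL      = 0.8     # selected complexes must keep >= 80% of ancestral binding energy

def make_energy_model(seed=0):
    """Position-specific symmetric contact energies J[i][a][b]- a toy stand-in
    for the paper's structure-based binding-energy calculations."""
    rng = random.Random(seed)
    J = [[[0.0] * ALPHABET for _ in range(ALPHABET)] for _ in range(LENGTH)]
    for i in range(LENGTH):
        for a in range(ALPHABET):
            for b in range(a, ALPHABET):
                J[i][a][b] = J[i][b][a] = rng.gauss(0.0, 1.0)
    return J

def energy(J, x, y):
    """Binding energy of a dimer pairing position i of x with position i of y."""
    return sum(J[i][x[i]][y[i]] for i in range(LENGTH))

def evolve(select, seed=1):
    """Mutate two initially identical paralogs; accept a mutation only if every
    complex named in `select` ('hom1', 'hom2', 'het') keeps TOL of its ancestral
    binding energy (an empty list means neutral evolution)."""
    rng = random.Random(seed)
    J = make_energy_model()
    ancestral = [min(range(ALPHABET), key=lambda a: J[i][a][a]) for i in range(LENGTH)]
    p = {1: list(ancestral), 2: list(ancestral)}
    e0 = energy(J, ancestral, ancestral)          # strongly negative by construction
    def complexes():
        return {"hom1": energy(J, p[1], p[1]),
                "hom2": energy(J, p[2], p[2]),
                "het":  energy(J, p[1], p[2])}
    for _ in range(STEPS):
        gene, pos, res = rng.choice((1, 2)), rng.randrange(LENGTH), rng.randrange(ALPHABET)
        old = p[gene][pos]
        p[gene][pos] = res
        if any(complexes()[c] > TOL * e0 for c in select):
            p[gene][pos] = old                    # purifying selection: reject
    return {c: e / e0 for c, e in complexes().items()}   # fraction of binding retained

if __name__ == "__main__":
    for regime in ([], ["hom1"], ["hom1", "hom2"], ["het"]):
        kept = evolve(regime)
        label = "+".join(regime) or "neutral"
        print(f"{label:11s} retained: hom1 {kept['hom1']:.2f}  "
              f"hom2 {kept['hom2']:.2f}  het {kept['het']:.2f}")
```

Because all three complexes read off the same interface positions, selection that keeps both partners near their ancestral interface largely preserves the other complexes as well, while a complex with an unconstrained partner drifts- qualitatively the pattern in the paper's panels.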

All this follows rather directly from the fact that the binding interface (see the structure at the top of the figure, for the example of GPD1) is the same for both proteins. Maintaining that interface through positive selection on one interaction necessarily keeps it relatively unchanged, which keeps it functional for the other interactions that occur on exactly the same face- the heterodimeric interaction when a homodimeric interaction is selected for, and vice versa.

So why keep these kinds of duplicates around? One reason is that, while preserving their binding interface with each other, they may diverge elsewhere in their sequence, adopting new functions over time. This kind of thing can lead to the formation of ever more elaborate complexes, which are quite common. Having two genes coding for related functions can also insulate the organism from mutational defects in either one, which would otherwise impair the homodimeric complex more fully. By the same token, this insulation can allow variational space for the development of novel functions, as in the first point.

So, nothing earthshaking in this paper, (which incidentally included a good bit of experimental work which I did not mention, to validate their computational findings), but it is nice to see yeast still serving as a key model system for basic questions in molecular biology. Its genomic history, which includes a whole genome duplication, and its exquisite genetic and molecular tool chest, make it ideal for this kind of study.


Saturday, November 9, 2019

Power

And lack of power.

The recent power shutdowns in California were maddening and disruptive. They also showed how utterly dependent we are on the oceans of fossil fuels we burn. With every convenience, gadget, trip, comfort, appliance, and delivery we get more enmeshed in this dependence, and become zombies when the juice is suddenly cut off. Not only is our society manifestly not robust, but every drop of fuel burned makes the underlying problem still worse: the biosphere's decline into miserable uninhabitability. The children are right to be pissed off.

Do we have the power to kick this habit? This addiction makes opioids look like amateurs. It won't be a matter of checking into rehab and going through a few weeks of detox. No, it is going to take decades, maybe centuries, of global detox to kick this problem from hell. Living without our fix of CO2 is impossible on any level- personal, social, political, economic, military. And the pushers have been doing their part to lull us even further into complacency, peddling lies about the risks and hazards they deal with as an industry, about their own research into climate change, and about what our future looks like, not to mention our complicity in it.

Do we have the moral and political power to get off fossil fuels? Not when half of our political community is in denial, unwilling to take even one step along the 12-step path. I am studying the Civil War on the side, which exhibits a similar dynamic- one half of the US political system mired in, even reveling in, its moral turpitude. It took decades for the many compromises and denials to play themselves out, for the full horror to come clear enough that decent people had had enough, and were ready to stamp out the institution of slavery. Which was, somewhat like the fossil fuels of today, the muscular force behind the South's economy and wealth.

Do we have the technical and intellectual power to kick this habit? Absolutely. Solar and wind are already competitive with coal. The last remaining frontier is the storage problem- transforming intermittent and distributed forms of power into concentrated, dispatchable power. And that is largely a cost problem, with many possible solutions available, each at its price. So given a high enough price on fossil carbon, we could rapidly transition to other sources of power for the majority of uses.

A 300 MW solar power plant in the Mojave.

Does the US have the power to affect climate change policy around the world? We don't have all the power, but have a great deal. If we were to switch from a regressive laggard to a leader in decarbonization, we would have a strong effect globally, both by our example and influence, and by the technical means and standards we would propagate. We could amplify those powers by making some of our trade policy and other relations more integrated with decarbonization policy.

Do individuals have the power to address these issues? The simple answer is no- all the virtuous recycling, biking, and light-bulb changing has little effect, and mostly liberates the unused fossil fuels for someone else to use at the currently criminally low prices. Individuals also have little power over the carbon intensity of the many products, services, and infrastructure they use. Maybe it is possible to eat less meat, and avoid fruit from Chile. But we can not unplug fully from this system- we need to rewire the system. It is fundamental economics that dictates this situation, which is why a stiff carbon tax and related regulation, with the associated political and moral will are so important.

Finally, does the State of California have the power to take responsibility for the PG&E mess? Absolutely, but probably not the will. The power shutdowns led to a common observation that the state should just buy PG&E at its bankrupt price and run it in the public interest. But keen observers have noted that the state's politicians would much rather have someone else to blame than be saddled with a no-win institution that puts the blame on them. Power lines are going to cause fires in any case, unless we cough up the billions needed to put them underground. Customers will always complain about the price of utilities, so it is hard to see the state stepping up to this mess, or even reforming the public utilities commission, which has been so negligent as well.

  • Why did the GOP nominate, and the American people elect, a Russian asset to the White House?
  • Battle lines on health care.
  • Point to Bernie.
  • The church and psycho-social evolution.

Saturday, November 2, 2019

To Model is to Know

Getting signal out of the noise of living systems, by network modeling.

Biology is complex. That is as true on the molecular level as it is on the organismal and ecological levels. So despite all the physics envy, even something as elegant as the structure of DNA rapidly gets wound up in innumerable complexities as it meets the real world and needs reading, winding, cutting, packaging, synapsing, recombining, repairing, etc. This is particularly true of networks of interactions- the pathways of (typically) protein interactions that regulate what a cell is and does.

An article from several years ago discussed an interesting and influential way to learn about these interactions. The advent of "big data" in biology allowed us to do things like tabulate all the interactions of individual proteins in a cell, or sample the abundance of every transcript or protein in cells of a tissue. But it turned out that these alone did not lead directly to the elucidation of how things work. Where the genome was a part ordering list, offering one catalog number and description for each part, these experiments provided the actual parts list- how many screws, how many manifolds, how many fans, etc., and occasionally, what plugs into what else. These were all big steps ahead, but hardly enough to figure out how complicated machinery works. We still lack the blueprint, one that ideally is animated to show how the machine runs. But that is never going to happen unless we build it ourselves. We need to build a model, and we need more information to do so.

These authors added one more dimension to the equation- time, via engineered perturbations to the system. Geneticists and biochemists have been doing this (aka experiments- mutations, reconstitutions, and titrations) forever, including in gene expression panels and other omics data collections. But employing perturbation in a systematic and informative way at the big-data level in biology remains a significant advance. The problem is called network inference- figuring out how a complex system works (which is to say, making a model) when given some, but not all, of the important information about its composition and activities. And the problem is difficult because biological networks are frequently very large and can be modeled in an astronomical number of ways, given that we have scanty information about key internal aspects. For instance, even if many individual interactions are known from experimental data, not all are known, many key conditions (like tissue, cell type, phase of the cell cycle, local nutrient conditions, etc.) are unknown, and quantitative data is very rare.

One way to get around some of these unknowns is to poke the system somewhere and track what happens thereafter. It is a lot like epistasis analysis in genetics, where, if you mutate a set of genes acting in a linear pathway, mutants in the genes toward the end can not be cured by supplying chemical intermediates made upstream- the later genes are "epistatic" to those earlier in the process. Such logic needs to be expanded exponentially to address inference over realistic biological networks, and gets the authors into some abstruse statistics and mathematics. Their goal is to simplify the modeling and search problem enough to make it computationally tractable, while still exploring the most promising parts of the parameter space to come up with reasonably accurate models. They also seek to iterate- to bring in new perturbation information and use it to update the model. The first step is to discretize the parameters, rather than exploring a continuous space. The second is to use preliminary calculations to get near-optimal values for the model parameters, and thereafter explore those approximated local spaces, rather than all possible space.
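
As a miniature illustration of the inference problem (not the authors' belief-propagation method), the sketch below simulates a tiny three-node network as a set of differential equations, records its steady-state responses to single-node inhibitions, and then exhaustively scores candidate networks with discretized edge weights against that perturbation panel. The node count, dynamics, and weight levels are arbitrary choices; at realistic scales exhaustive search is hopeless, which is exactly why the authors need their statistical machinery.

```python
import itertools
import math

N = 3                                   # a miniature network of three nodes
TRUE_W = [[0, 1, 0],                    # TRUE_W[i][j]: signed effect of node j on node i
          [0, 0, -1],
          [1, 0, 0]]

def steady_state(W, u, steps=400, dt=0.1):
    """Relax dx_i/dt = -x_i + tanh(sum_j W[i][j]*x_j + u_i) to steady state (Euler)."""
    x = [0.0] * N
    for _ in range(steps):
        x = [x[i] + dt * (-x[i] + math.tanh(sum(W[i][j] * x[j] for j in range(N)) + u[i]))
             for i in range(N)]
    return x

def perturbation_panel(W):
    """Steady-state response of every node to inhibition of each node in turn."""
    panel = []
    for k in range(N):
        u = [0.0] * N
        u[k] = -1.0                     # crude stand-in for a drug inhibiting node k
        panel.append(steady_state(W, u))
    return panel

def fit_error(W, data):
    model = perturbation_panel(W)
    return sum((model[k][i] - data[k][i]) ** 2 for k in range(N) for i in range(N))

if __name__ == "__main__":
    data = perturbation_panel(TRUE_W)   # pretend this came from the wet-lab perturbations
    edges = [(i, j) for i in range(N) for j in range(N) if i != j]
    best_W, best_err = None, float("inf")
    # discretized edge weights (-1, 0, +1) make the candidate space finite; here it is
    # small enough to search exhaustively, which realistic networks never are
    for weights in itertools.product((-1, 0, 1), repeat=len(edges)):
        W = [[0] * N for _ in range(N)]
        for (i, j), w in zip(edges, weights):
            W[i][j] = w
        err = fit_error(W, data)
        if err < best_err:
            best_W, best_err = W, err
    print("best-fit network:", best_W, f"(error {best_err:.4f})")
    print("true network:    ", TRUE_W)
```

Discretizing the edge weights is what makes the candidate space finite at all; the authors' contribution is to search such spaces cleverly, and to keep refining the winning models as new perturbation data arrive.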

Speed of this article's method (teal) compared with a conventional method for the same task (green).

All this is done over again for experimental cases, where some perturbation has been introduced into their real system, generating new data for input to the model. Their improved speed of calculation is critical here, enabling as many iterations and alterations as needed, to update and refine the model. If the model is correct, it will respond accurately to the alteration, giving the output that is also observed in real life. Such a model then makes it possible to perform virtual perturbations, such as simulating the effect of a drug on the model, which then predicts entirely in silico what the effects will be on the biological network.
"It is also useful, as an exercise, to evaluate the overall performance of the BP algorithm on data sets engineered from completely known networks. With such toy datasets we achieve the following: (i) demonstrate that BP converges quickly and correctly; (ii) compare BP network models to a known data-generating network; and (iii) evaluate performance in biologically realistic conditions of noisy data from sparse perturbations."
...
"Each model is then a set of differential equations describing the behavior of the system in response to perturbations."

The upshot of all this is models that are roughly correct, and influential on later work. The figure below shows (A) the smattering of false positive and missing (false negative) interactions, but (B) accounts for most of this error as shortcuts of various kinds- the inference of regulation that is globally correct, but may be missing a step here or there. So they suggest that the scoring is actually better than the roughly 50-70% correct rate that they report.

An example pair of interconnecting pathways inferred from experimental protein abundance data and perturbed abundance data, with the protein molecules as nodes and their interactions as arrows. Where would a drug have the most effect if it inhibited one of these proteins?

They offer one pathway as an example, with an inferred pattern of activity (above), and a few predictions about which proteins would be good drug targets. For example, PLK1 in this diagram is a key node, and has dramatic effects if perturbed. This came up automatically from their analysis, but PLK1 happens to already be an anticancer drug target, with two drugs under development. Any biologist in the field could have told them about this target, but they went ahead with proof-of-principle experiments to show that yes, indeed, treating their RAF-inhibitor-resistant melanoma cells (the subject of the modeling) with an experimental anti-PLK1 drug results in dramatic cell killing at quite low concentrations. They had used other drugs as perturbation agents in the physical experiments that developed this model, but not this one, so at least in their terms this is a novel finding, arrived at through their modeling work.

Given that these authors were working from scratch, not starting with manually curated pathway models that incorporate a lot of known individual interactions, this is impressive work (and they note parenthetically that using such curated data would be a big help to their modeling). Having computationally tractable ways to generate and refine large molecular networks from typical experimentation is a recipe for advancement in the field- I hope these tools become more popular.



  • Ruminations on PG&E. Minimally, the PUC needs to be publicly elected. Maximally, the state needs to take over PG&E entirely and take responsibility.
  • Study on the effect of automation on labor power ... which is minor.
  • What has happened to the Supreme Court?
  • Will bribery help?