Saturday, January 5, 2019

To Re-engineer a Bacterium

Computational modeling of E. coli regulatory circuitry suggests that some bloat has crept in over the years.

Are we at the point of redesigning life? So far, studies of biology have relied on observation, and on mutations, generally for the worse. We have also tinkered around the edges by introducing new modular functions to some species, most notoriously pesticide and herbicide resistance in crop plants, and antibiotic resistance in bacteria. But what about redesigning whole organisms? A paper from a few years back (2012) took a stab at redesigning the genome of the model bacterium, E. coli, for simplicity. The quest was pretty elementary: ask whether the genes of the organism could be re-organized to function as well as the wild-type genome, but in fewer operons, with simpler regulation. Operons are sets of protein-coding genes lined up like a multi-car train, all induced transcriptionally by one promoter at the upstream end. The more similar functions one can stuff into one operon, the simpler the overall regulatory system can be. On the other hand, the joined genes are harnessed together in mRNA/transcription terms, so any regulatory flexibility that might be useful at that level is lost.

A schematic operon, with a promoter and other regulatory sites, which drive transcription of a set of coding genes, which are transcribed into one mRNA message, which is then translated into a series of distinct proteins. What is gained from chaining many related genes into one operon?

It was, admittedly, a rather academic exercise, with limited criteria for "normal function" of the genome: that it should produce all the products of the wild-type organism under ~100 different environmental conditions. And it was all computational, done by iterative, computer-based evolution but never translated into a lab test of actual organisms (though synthesizing a bacterial genome based on this data is probably quite practical at this point). What they did have was a set of differential equations expressing key regulatory activities of a normal cell, concerning its metabolism, environmental inducers, transcriptional regulators, and genome targets. This is not a full cell simulation, leaving out protein translational controls, degradation, cellular structure and other modes of regulation, but still covers a lot of territory.

Process of computational refinement of the target genome, making random variations, then assaying for modularity and transcriptional output, and then iterating again, many times. Top graph shows regulatory modularity, which increased almost monotonically due to the design of the genome manipulations. Bottom graph is the (computed/simulated) similarity of the transcriptional output vs the original genome, which takes a big hit at first, before climbing back to the original state.

They performed thousands of computerized steps of shuffling regulatory sites and genes around the imaginary genome, testing the result each time for its similarity to the wild-type case in terms of output, and for its modularity/simplicity. As one can see in the bottom graph above, the output was quite unlike the standard (wild-type) for a long time during this simulation, before regaining an approximation of the wild-type pattern later on. Clearly the first order of business was simplification, with accuracy of output secondary. The final results are impressive, given the limitations, reducing the total number of operon clusters to about 1/4 of the original.
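For readers who like to see such a loop spelled out, here is a toy sketch in Python of the general idea: shuffle, score, keep or discard, repeat. It is emphatically not the paper's method. The on/off expression profiles, the majority-vote rule for what a shared promoter does, the `operon_penalty` weight, and every function name are assumptions invented for illustration. It is also a strict hill-climb that never accepts a worse genome, so it will not reproduce the early dip in output similarity seen in the real simulation.

```python
import random

def operon_output(genome, profiles):
    """Toy assumption: all genes in an operon share one on/off pattern,
    taken as the per-condition majority vote of the members' wild-type profiles."""
    out = {}
    for operon in genome:
        n_cond = len(profiles[operon[0]])
        shared = [int(2 * sum(profiles[g][c] for g in operon) >= len(operon))
                  for c in range(n_cond)]
        for g in operon:
            out[g] = shared
    return out

def similarity(genome, profiles):
    """Fraction of (gene, condition) pairs whose output matches wild type."""
    out = operon_output(genome, profiles)
    total = sum(len(p) for p in profiles.values())
    match = sum(a == b for g, p in profiles.items() for a, b in zip(out[g], p))
    return match / total

def score(genome, profiles, operon_penalty=0.02):
    """Reward wild-type-like output, penalize each extra operon (hypothetical weight)."""
    return similarity(genome, profiles) - operon_penalty * len(genome)

def mutate(genome, rng):
    """Move one randomly chosen gene into another operon, or into a new one."""
    new = [list(op) for op in genome]
    src = rng.randrange(len(new))
    gene = new[src].pop(rng.randrange(len(new[src])))
    if not new[src]:
        del new[src]  # drop an operon that has lost its last gene
    dst = rng.randrange(len(new) + 1)
    if dst == len(new):
        new.append([gene])
    else:
        new[dst].append(gene)
    return new

def evolve(profiles, steps=5000, seed=0):
    """Hill-climb from one gene per operon, keeping changes that do not lower the score."""
    rng = random.Random(seed)
    genome = [[g] for g in profiles]
    best = score(genome, profiles)
    for _ in range(steps):
        candidate = mutate(genome, rng)
        s = score(candidate, profiles)
        if s >= best:
            genome, best = candidate, s
    return genome
```

Given a dictionary mapping gene names to lists of 0/1 expression values across conditions, `evolve(profiles)` returns a list of operons, each a list of gene names. Genes with identical profiles tend to end up sharing an operon, since merging them costs nothing in output and saves a promoter.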

Example loci from the work. After genome re-organization, several operons have many more genes (black arrows, text notes).

An example is given above. The arc operon is turned on in anaerobic (low oxygen) conditions, and encodes factors that repress aerobic loci, such as those involved in oxidative phosphorylation, the use of oxygen to generate ATP efficiently. In the rewired cell, this promoter drives not one gene, but 11 others as well, probably gaining time and parallel control for a lot of other functions that could benefit from coordinate regulation. But what is galS doing there? This is a key regulator that turns on galactose import, a function completely unconnected with anaerobic conditions. This is one instance (which the authors bring up themselves) where, due to the limited selective pressure these experimenters put on their models, they came up with an intuitively poor result. But overall, they document that, as expected, the functions of genes now coalesced into single operons are overwhelmingly similar as well.

This work is abstract, and unlikely to have resulted in a bacterium as fit in the wild as its founding strain. But it is a very small example of computational cell and molecular modeling, which, like artificial intelligence, has been the next big thing in biology for decades, but is becoming more powerful and may actually contribute something to biology and medicine in the coming decades.


  • An analogous simplification experiment in yeast cells.
  • A good diet is lots of activity.
  • The Fed is wrong.
  • It is hard for a fool not to look foolish.
  • How European banks (and the Euro) fostered the financial crisis: “Six European banks were pumping out “private label MBS” from their “US … affiliates.
  • Libraries are civic institutions full of wonder.
  • Just how dead is the Republican party?