Saturday, September 5, 2015

Orthologs and Homologs and Ohnologs, Oh My!

Long ago, vertebrate genomes underwent genome duplication, which unleashed great evolutionary opportunities.

Among the most ignorant arguments ever foisted on a gullible public by anti-evolutionary Creationists was that information cannot increase in living systems by natural means. Put in its sophistical garb, with an intimation of high-power mathematics and quantum theory, it was hard to resist the idea that this PhD from the finest schools really knew what he was talking about. But, obviously not, since information is interchangeable with energy, an exchange we make every day with our keyboards, and one that life forms spend their time making internally as well.

The clearest refutation of this appalling hypothesis is the phenomenon of gene duplication, when by some accident of replication or recombination, stretches of DNA can double, and an organism ends up with two genes where only one was before. As time goes on, the extra copy may disappear, recombining back out of existence, but on the other hand it may share functions with its brother (or homolog) and gain variations in its regulation or expressed sequence that lead to some new function and thus a selective reason to remain in the genome.

DNA duplications are simply accidents, no more mysterious or transgressive than the existence of DNA in the first place. And they can happen at any scale, from single nucleotides up to whole genomes. In fact, the duplication of whole genomes has been tremendously important in evolution and has been recorded in the histories and genomes of many lineages. A recent paper delved into two rounds of genome duplication that happened to the early vertebrate lineage, whose traces are peppered throughout the genomes of ourselves as well as all other vertebrates.

Incidentally, plants have experienced lots of gene duplication events, even triplications. This is the main reason why many plants have more genes than we do, despite not (apparently) being significantly more complex.
Genome duplications and triplications in the history of angiosperms.

When a gene duplicates, a race begins between the forces of recombination, which tend to splice it back out due to the pair's identical sequences, and the forces of evolution (i.e. natural selection), which keep it around if some kind of diversification happens or if there is some other rationale for keeping the extra copy. The rationale may simply be higher production of the same product from two genes rather than one. But over time, the reasons tend to get more interesting, with each gene diverging, and thus specializing, in the regulation of its own expression, or the activity of its expressed protein. Once both copies have specialized for different functions, the race is done and the gene is there to stay.

Since evolutionary biology is all about relatives and lineages, a complicated terminology has developed for describing related gene sequences, which are so very useful in finding interesting genes and deducing their function. Two genes with similar sequences are called homologs, which also means that the two sequences are related by descent. There are many families of genes, some with hundreds of members in a single species which are related in such a way, a testament to the profuseness and usefulness of gene duplication over the history of life.

If a gene is related 1:1 to a homolog in another species, such that they are fulfilling a similar role and came from the common ancestor by lineal descent, they are called orthologs. While many orthologs are relatively easy to recognize, the ability of genes to keep duplicating and diversifying in some lineages but not others can sometimes make it difficult to decide on orthologous relationships. Indeed, homologous family members present in one species are called paralogs, and whether each paralog has its separate ortholog in another species, or whether only one of them is the true ortholog and the rest are later sproutlings, depends on the precise lineage history, which one has to figure out by lining up the relevant genes from several different species.
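To make the bookkeeping concrete, here is a minimal sketch of how one might label gene pairs once the lineage history is in hand. It assumes a common formalization that the discussion above does not spell out: two genes are orthologs if their last common ancestor in the gene tree is a speciation event, and paralogs if it is a duplication event. The tree, the gene names, and the function names are all hypothetical.

```python
# Toy gene tree (hypothetical). Internal nodes are (event, left, right);
# leaves are "species|gene" strings. The event labels are assumed to be
# known already, which in practice is the hard part.
TREE = ("duplication",
        ("speciation", "human|A1", "mouse|A1"),
        ("speciation", "human|A2", "mouse|A2"))


def leaves(node):
    """Return all leaf names below a node."""
    if isinstance(node, str):
        return [node]
    _, left, right = node
    return leaves(left) + leaves(right)


def classify_pairs(node, out=None):
    """Label every pair of leaves by the event at their last common ancestor."""
    if out is None:
        out = {}
    if isinstance(node, str):
        return out
    event, left, right = node
    label = "orthologs" if event == "speciation" else "paralogs"
    # Pairs with one member on each side have this node as their
    # last common ancestor.
    for a in leaves(left):
        for b in leaves(right):
            out[frozenset((a, b))] = label
    classify_pairs(left, out)
    classify_pairs(right, out)
    return out


relations = classify_pairs(TREE)
for pair, label in sorted((sorted(p), l) for p, l in relations.items()):
    print(pair, label)
```

In this toy tree the duplication predates the human/mouse split, so human A1 and mouse A1 come out as orthologs, while human A1 and mouse A2 come out as paralogs even though they sit in different species. That is exactly the kind of case where a quick sequence comparison can mislead.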

Lastly, the current paper studies homologs which descend not from single gene duplications, but from whole genome duplications, and are called Ohnologs, in honor of Susumu Ohno, who first proposed their existence. In some respects, whole genome duplications happen relatively easily. A cell just forgets the chromosome division process during mitosis, and voilà, a duplicated genome. There are no dosage effects of one gene present at unusual amounts, since all are duplicated equally. There are problems mating with other organisms, however, as the chromosomes no longer match. But if parthenogenesis is an option, even that issue can be overcome and a new species is founded into the bargain.

Whatever the short-term difficulties, the long-term implications of a successful genome duplication event are momentous. The new species has extra genes coming out of its ears, which lend themselves to countless opportunities for specialization and diversification, and to new features that would not have been possible through the painstakingly slow accumulation of alterations in one gene. The current paper documents in the greatest detail yet the back-to-back duplication events that happened at the base of the vertebrate lineage, about 600 million years ago. The authors labor to detect as many of the remaining genes as possible, despite the ravages of time through which genes get shuffled, lost, reduplicated, garbled, etc.

They use whole genomes from several species, both inside and outside vertebrates, to look for traces of related duplicates or quadruplicates. The main piece of evidence is synteny, which is the similar ordered location of genes along the genome, comparing two species, or in the case of ohnologs, homologous genes within one species. The problem is that smaller gene duplication events can show very similar signatures to ancient whole-genome duplications that were later shuffled around by recombination. So the authors need to use a collection of species that is broad enough to place the hypothesized duplication back in time at the putative vertebrate duplication events. At the highest stringency of analysis, they come up with 1,381 ohnolog sets, i.e. two or more genes traceable to a single originating gene. At lower stringency, they find up to 2,642 sets, comprising almost 8,000 genes. This is over a quarter of the human genome complement, so is not bad work for events that were so long ago. It is remarkable to see so many duplicate genes preserved at a detectable level.
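The flavor of a within-genome synteny search can be shown with a toy sketch. This is not the authors' method, only an illustration of the idea: slide a window along each chromosome and flag pairs of windows on different chromosomes that share several gene families. The genome, the family labels, the window size, and the threshold are all made up, and the sketch ignores gene order inside a window as well as the cross-species dating step described above.

```python
from itertools import combinations

# Hypothetical toy genome: chromosome -> gene families in positional order.
GENOME = {
    "chrA": ["f1", "f2", "x1", "f3", "f4"],
    "chrB": ["f1", "y1", "f2", "f3", "y2"],
    "chrC": ["z1", "f4", "z2", "z3", "z4"],
}


def windows(genes, size):
    """Yield (start_index, set_of_families) for each sliding window."""
    for start in range(max(1, len(genes) - size + 1)):
        yield start, set(genes[start:start + size])


def candidate_blocks(genome, size=4, min_shared=3):
    """Return window pairs on different chromosomes sharing enough families."""
    hits = []
    for (c1, g1), (c2, g2) in combinations(sorted(genome.items()), 2):
        for s1, w1 in windows(g1, size):
            for s2, w2 in windows(g2, size):
                shared = w1 & w2
                if len(shared) >= min_shared:
                    hits.append((c1, s1, c2, s2, sorted(shared)))
    return hits


for hit in candidate_blocks(GENOME):
    print(hit)
```

Raising `min_shared` plays the role of a stringency setting: stricter thresholds give fewer candidate regions that are more trustworthy, looser ones give more candidates along with more chance matches.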

A pre-eminent example of such gene duplication is the Hox genes, which control broad tissue identity, especially by segment, of the body plan. The diagram below shows a classic case of ohnolog inheritance, where, using alignments of the relevant genome regions of several species, four versions of the Hox cluster are present in humans where only one is present in flies and in the chordate (but not vertebrate) lancelet. Individual copies of the Hox genes were lost in the vertebrate lineage, presumably before their diversification had established new and essential functions, but most stuck around in some form. The diversification of Hox genes allowed greater body plan complexity, as extra copies contribute to identity of novel tissues all over the body, such as limbs, digits, and parts of the head, going substantially beyond the strictly segmental identity system used in flies.

Genes controlling segmental identity in the body plan of animal species, the Hox genes, show clear evidence of the ancient genome duplications that happened at the base of the vertebrate lineage. Though some of the Hox genes were subsequently lost, the rest comprise sets of Ohnologs.


  • Bonus paper on the ancient genome duplication in yeast.
  • The Fed is getting ready to make a mistake.
  • Rank Christianity... the sacred liberty to cram my religion down your throat.
  • Annals of the class war.. who wins from QE? Mostly the financial sector, which wouldn't be the case if we had used fiscal policy instead. And more on public debt.
  • The Stepford mistress-bots.
  • Excess global savings is the new normal, and doesn't mix well with a free capital and investment market.
  • Wag the dog.. markets have very long time horizons, which can whipsaw current valuations.
  • Socialism and worker power.. scaring capitalists straight.
  • Charity is closely related to feudalism. Governments do it better and fairer.
  • Annals of denial.. ISIS and its brand of Islam.
  • The garbage patch is worse than ever.
  • How MMT differs from Keynes.
  • Image of the week.. it's full of stars! A Hubble field of numerous lensed galaxy images.