Saturday, March 27, 2010

Driven to speciate- meiosis and PRDM9

Many crossover sites in meiosis are selected by the protein PRDM9, which has been evolving extraordinarily rapidly, contributing to the speciation of humans. (Warning... this post is unusually technical.)


Analysis of the human genome is picking up speed. A recent issue of Science provided two examples. One paper describes dramatic advances in the search for human genes with variants whose prevalence in the population has risen recently due to strong selection. A separate set of papers each found the same gene, PRDM9, to act in crossover hotspot selection, converging from two separate mouse genetics studies and one computational study, which I will focus on.

One of the more subtle processes of reproduction among sexual organisms is chromosome crossing-over at meiosis. Typically each chromosome arm, as it goes through the meiotic division that reduces the diploid genome inherited from the two parents into one haploid genome found in gametes, ends up not as a pure copy of either parent's chromosome, but as a patchwork, with some portions from one parent and the rest from the other. Sort of a contra dance, with a choreographed do-si-do of DNA recombination.

Getting there requires one or more crossover events per chromosome, whose core is direct recombination between DNA from the two parents somewhere along the chromosome arm, and involves complicated chromosomal dynamics which can also influence cell division. To wit, if any single chromosome fails to have such a connecting crossover, it also fails to align at the center of the cell during metaphase I and floats off in the cytoplasm, stopping cell division at a point called the pachytene checkpoint, usually fatally. You don't want those gametes!

One might imagine that these crossovers occur randomly across the genome, but they don't. They occur primarily at what are called "hotspots", distributed unevenly over each chromosome. Hotspot locations change dramatically from species to species. We have almost completely different ones from chimpanzees, even though most of our DNA is the same. Why is that?

The reason seems to be an even more subtle process called "biased gene conversion", which is the engine behind all the phenomena of this article, causing in this case something called "meiotic drive". Crossover events begin with breaks in the DNA: double-strand breaks that are repaired using the information from the undamaged, homologous strand of DNA. The sound DNA physically invades the duplex of the damaged DNA, allowing both duplexes to be filled out by polymerases, i.e. repaired (as diagrammed below), based on the sequence of the undamaged DNA.

Diagram of gene conversion- the repair of one homolog suffering a break with information from the other DNA/chromosome homolog.
But some of the time, this repair resolves not as a clean invasion and retreat of the "good" DNA (the process called gene conversion), but by the crossed DNA getting cut the other way, with one DNA strand of the "good" chromosome staying with the opposite arm/strand of the damaged chromosome. In the diagram, this takes place after the "New DNA Synthesis" stage, where, if you just swing the arms around and cut the products a little differently (up and down, in line with the gray arrows), you end up with a crossover rather than two cleanly repaired original chromosomes. These crossover events are the ones that meiosis relies on (and regulates) to end up with properly inter-joined homologous chromosomes in meiosis I, where one homolog comes from each parent. However, meiosis involves plenty of gene conversion as well- only a minority of induced double-strand breaks end up as crossovers.

If an organism has a mechanism to cause double-strand breaks during meiosis (as organisms do), and that mechanism relies on signals on the local DNA as targets of these breaks (as it does- those are the hotspots), then it follows that the activating signal will tend to get erased when it is heterozygous and attracts repair by undamaged DNA lacking that particular local hotspot sequence. This process of eliminating the very markers that start the events of DNA repair / gene conversion / crossing over represents a bias, since one sequence variant tends to lose out over time and get erased- not just from the gametes, but from the population as a whole. Thus the term "biased gene conversion". Thus also the term "meiotic drive", since it is an example of selection based on the peculiarities of the molecular system, which "drives" the genetic composition of the population in a non-random direction.
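To make the erasure argument concrete, here is a minimal sketch of a deterministic model, not anything taken from the papers. It assumes random mating, no other selection, and a hypothetical bias parameter `delta`: a heterozygote passes on the hotspot ("hot") allele with probability 0.5 - delta rather than the Mendelian 0.5, because the hot copy is the one that gets broken and overwritten. The function names and the numbers in the example run are made up for illustration.

```python
def next_frequency(p, delta):
    """One generation of random mating with biased transmission from heterozygotes.

    p     -- current frequency of the hotspot ("hot") allele
    delta -- hypothetical conversion bias: heterozygotes transmit the hot
             allele with probability 0.5 - delta instead of 0.5
    """
    hot_hot = p * p          # hot/hot homozygotes always transmit a hot allele
    het = 2 * p * (1 - p)    # heterozygotes transmit it a bit less than half the time
    return hot_hot + het * (0.5 - delta)


def simulate(p0, delta, generations):
    """Return the hot-allele frequency at generations 0..generations."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], delta))
    return freqs


if __name__ == "__main__":
    # Illustrative values only: start at 50% and apply a 1% bias.
    trajectory = simulate(p0=0.5, delta=0.01, generations=1000)
    for gen in (0, 100, 500, 1000):
        print(f"generation {gen:4d}: hot allele frequency {trajectory[gen]:.4f}")
```

With delta set to zero the frequency stays put, as Mendel would expect. Any positive bias, however small, pushes the hot allele steadily toward loss, which is the "drive" in meiotic drive.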

In this paper (with review), the hotspot of interest contains the sequence CCTCCCTNNCCAC, where N stands for any nucleotide. They claim that this sequence is at the core of 40% of human meiotic recombination hotspots, while not being involved in chimpanzee hotspots at all. While other papers in the issue arrive at the protein that binds this sequence from painstaking genetic mapping of loci that affect hotspot usage in mice, the current paper gets there from computational analysis of zinc finger proteins.
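For readers who want to poke at a sequence themselves, a motif with N wildcards is easy to search for. The sketch below is my own illustration, not the authors' pipeline: it turns the 13-mer into a regular expression and scans both strands. The helper names (`find_motif` and friends) and the toy sequence in the usage note are hypothetical.

```python
import re

MOTIF = "CCTCCCTNNCCAC"  # the 13-mer quoted above; N = any nucleotide

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (N stays N)."""
    return seq.translate(COMPLEMENT)[::-1]


def motif_to_regex(motif):
    """Compile a motif into a regex, expanding N to any of A, C, G, T."""
    body = "".join("[ACGT]" if base == "N" else base for base in motif)
    # The lookahead lets overlapping occurrences be reported too.
    return re.compile(f"(?=({body}))")


def find_motif(sequence, motif=MOTIF):
    """Return sorted (start, strand) hits; starts are 0-based on the given strand."""
    sequence = sequence.upper()
    forward = motif_to_regex(motif)
    reverse = motif_to_regex(reverse_complement(motif))
    hits = [(m.start(), "+") for m in forward.finditer(sequence)]
    hits += [(m.start(), "-") for m in reverse.finditer(sequence)]
    return sorted(hits)
```

Calling `find_motif("AAACCTCCCTGACCACTTT")` on that made-up sequence returns `[(3, "+")]`. A match only says the motif is present; whether a site actually behaves as a hotspot depends on the surrounding context, which this sketch ignores.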

Humans have about 691 zinc-finger proteins, which typically bind DNA with their zinc-finger domains, and do something else with the rest of the protein, like regulate transcription, or as in this case, direct the double-strand break apparatus. One reason the family is so large is that its members have a modular design where each finger, which is a protein loop that coordinates one zinc ion inside while its outward-facing amino acid residues touch the DNA, touches about four nucleotides in the major groove. Zinc-finger proteins typically have multiple fingers, up to thirty in some cases, enabling them to recognize lengthy and specific sequences. This modularity allows them to evolve easily by shuffling around pieces of their genes. They are also interesting from a biotech standpoint, in the quest to develop technical ways to regulate arbitrary DNA sequences with injectable or gene therapeutic agents.

The modularity is also a boon to bioinformaticists, who, as this paper demonstrates, can predict from a target sequence what zinc-finger protein might bind to it. Given the target sequence mentioned above (which is set within an additional 30 base pairs of influential context), these authors estimated that the binding protein had about 12 fingers, and could also estimate what the key residues of these fingers probably were. Scanning the human genome data, they came up with one protein that closely fit the bill- PRDM9. Below is their diagram of the critical fingers/residues of this protein, aligned along the human target DNA sequence to which it binds, also aligned with its homologs from other species.
"In silico prediction of the binding consensus for PRDM9 ... Below the text is the sequence of four predicted DNA-contacting amino acids for the 13 successive human PRDM9 zinc fingers (one oval per finger, differing colors for differing fingers, and the separated finger is gapped N-terminal from others) and their predicted base contacts within the motif. (C) Sequence of four predicted DNA-contacting amino acids for the PRDM9 zinc fingers in seven mammalian species, presented as in (B). Distinct fingers are given different colors; fingers present in at least two species have a black border."
Note that the same protein in different species has almost completely different DNA-binding residues and thus target specificity. This rate of change far surpasses that in the rest of the genome, where very little change typically separates us from any of these other mammalian species, and most of that change is random. These data (combined with other work that confirms the connection between PRDM9 and crossover hotspots) account for why hotspot locations differ so dramatically between humans and chimpanzees, at least those that are directed by this protein.

So here we have it- an inexorable genetic process by which the targets of meiotic recombination continually change under the pressure of biased gene conversion, matched with a targeting protein which seems to evolve rapidly in response, as though the actual sequence it binds to is of minor significance, just as long as it has something to bind to as the rug is continually pulled out from under it.

How does this relate to speciation? Speciation depends on mating/fertility barriers between populations, whether geographic, behavioral, or biological. A typical example is hybrid infertility- the inability of individuals from different protospecies to mate and have offspring, or the infertility of those offspring. Mules come to mind, for instance.

Problems in meiosis result in infertility. Specifically, reduction of meiotic crossovers below the one-per-chromosome level is fatal to the resulting gametes, as outlined above. As the PRDM9 gene races to keep up with gene conversion that erases its targets, it will diverge between populations, leading to changes in hotspot locations and sequence. The arrival of two incompatible parental genomes, one of which lacks the ability to recognize the crossover hotspots of the other, is a recipe for catastrophe- specifically, for meiosis I non-synapsis. This indeed is how one of the other papers in this issue found PRDM9- by locating a genetic variant in mice directly responsible for the hybrid infertility between Mus domesticus and Mus musculus.
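As a back-of-envelope illustration of why dipping below one crossover per chromosome is so costly, here is a toy calculation of my own. It assumes, unrealistically, that crossovers land on each chromosome independently with a Poisson-distributed count. Real meiosis actively regulates crossover numbers, so treat the output as a cartoon rather than a prediction. The function name and the mean values tried are hypothetical; 23 is simply the number of human chromosome pairs.

```python
import math


def fraction_viable(mean_crossovers, n_chromosomes):
    """Chance that every chromosome gets at least one crossover.

    Toy assumption: crossover counts are independent and Poisson-distributed
    with the same mean on every chromosome.
    """
    p_at_least_one = 1.0 - math.exp(-mean_crossovers)
    return p_at_least_one ** n_chromosomes


if __name__ == "__main__":
    # Illustrative means only: see how quickly the all-chromosomes-covered
    # fraction changes as the average number of crossovers drops.
    for mean in (3.0, 2.0, 1.0, 0.5):
        print(f"mean {mean:.1f} crossovers: {fraction_viable(mean, 23):.4f}")
```

The point of the cartoon is that the requirement applies to every chromosome at once, so even a modest drop in the average crossover rate compounds across the whole set.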

These findings considerably advance knowledge and theory about speciation in animals whose genetic variation can be quite low (on average), whose generation times may be quite long, and whose populations can be quite small. It is very exciting to be able to synthesize, using the modern toolchest especially including the human genome, these strands of cell, molecular, and evolutionary biology.

  • Another discussion of the PRDM9 gene, authored before its molecular function was understood, and focusing on its role in speciation.
  • On girls, real girls, and equality.
  • Great news from Iraq.
  • Nice quote from Gregor.
"We have lost touch with the hurdles faced by our not-too-distant forbears who, in a world of wood and coal, found waterway transport a kind of miracle. What kind of a nut, for example, would blast through all that granite in upstate NY to build a canal? A nut who did not have oil, that’s who."
"This insight allows us to see another dimension of taxation which is lost in orthodox analysis. Given that the non-government sector requires fiat currency to pay its taxation liabilities, in the first instance, the imposition of taxes (without a concomitant injection of spending) by design creates unemployment (people seeking paid work) in the non-government sector. The unemployed or idle non-government resources can then be utilised through demand injections via government spending which amounts to a transfer of real goods and services from the non-government to the government sector.
...
So it is now possible to see why mass unemployment arises. It is the introduction of State Money (which we define as government taxing and spending) into a non-monetary economics that raises the spectre of involuntary unemployment."