Saturday, June 12, 2010

Four genomes

A nuclear family gets its nuclear DNA sequenced. What can it tell us?

With sequencing technology dropping dramatically in price and rising in speed, and individual genomes within sight as a matter of routine medical practice, one group of researchers decided to sequence the DNA of two parents and their two children to see what we can glean from such ultimate-detail data.

One thing to note is that the sequence of an individual's genome is static data. Once done, for whatever price, that is it: it does not have to be redone unless the original error rate was excessive. This work claims accuracy of 99.999%, which seems sufficient for most needs. At three billion base pairs, this amounts to roughly 30,000 errors, which isn't nothing, but is very small.
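As a quick check on that arithmetic, here is a minimal sketch. The three-billion-base genome size and the 99.999% accuracy come from the text above; the function name is just an illustrative choice.

```python
# Expected number of wrong bases in one genome at a given per-base accuracy.
GENOME_SIZE_BP = 3_000_000_000  # roughly three billion base pairs
ACCURACY = 0.99999              # 99.999%, as claimed in the paper


def expected_errors(genome_size_bp: int, accuracy: float) -> int:
    """Return the expected error count, rounded to the nearest base."""
    error_rate = 1.0 - accuracy  # about 1 wrong base in 100,000
    return round(genome_size_bp * error_rate)


print(expected_errors(GENOME_SIZE_BP, ACCURACY))
```

An error rate of one in 100,000 over three billion bases works out to about 30,000 miscalled positions.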

On the other hand, the interpretation of this data is highly dynamic. As we learn more about how human biology works, how genes turn into cells, organs, behaviors, and about how the astronomical numbers of human genetic variants affect these traits, we can keep going back to our genomes to learn more about ourselves.

The paper makes several basic observations, improving data that had been approximated by other means. For instance, the human mutation rate turns out to be ~70 mutations per generation. Considering that roughly 2% of the genome codes for proteins, and no more than 30% of the genome involves gene-related DNA at all (introns, promoters, etc.), this is a pretty modest mutation rate. We are all mutants, but the chances of something disastrous occurring are relatively small, as is seen in the phenotypic real world as well.
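To put rough numbers on "modest", here is a back-of-the-envelope sketch. It assumes, purely for illustration, that new mutations land uniformly at random across the genome, which real mutations do not quite do. The 70, 2%, and 30% figures are the ones quoted above.

```python
# Expected new mutations per generation falling in a given genome fraction,
# ASSUMING mutations are spread uniformly (a simplification for illustration).
MUTATIONS_PER_GENERATION = 70
CODING_FRACTION = 0.02        # protein-coding share of the genome
GENE_RELATED_FRACTION = 0.30  # upper bound: introns, promoters, etc.


def expected_hits(mutations: int, fraction: float) -> float:
    """Expected number of mutations landing in the given fraction."""
    return mutations * fraction


coding = expected_hits(MUTATIONS_PER_GENERATION, CODING_FRACTION)
gene_related = expected_hits(MUTATIONS_PER_GENERATION, GENE_RELATED_FRACTION)
print(f"coding: ~{coding:.1f}, gene-related: at most ~{gene_related:.0f}")
```

Under that assumption, each of us carries on the order of one or two new mutations in protein-coding sequence, and most of those will not be disastrous.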

The researchers are able to map, in detail, the recombinations that happened on each chromosome in each gamete that produced each child. Just over half occurred at what are called recombination hotspots, sprinkled along each chromosome. This remarkable process of recombination mixes things up genetically, ensuring that what each child gets is not just a roulette selection of whole chromosomes from the respective grandparents (which is what would happen in the absence of recombination), but a unique patchwork assembled out of both grandparents' chromosomes. That makes each human genetically unique and means that, over long periods of time, genes get selected on their own merits rather than living and dying based on the many other gene variants with which they share a chromosome.
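For a sense of how limited the "roulette" alternative would be, here is a small sketch of the no-recombination case. It relies only on the fact that humans have 23 chromosome pairs; the function names are illustrative.

```python
# Without recombination, a gamete gets one whole chromosome from each of the
# 23 pairs, so the number of distinct gametes one parent can make is 2**23.
CHROMOSOME_PAIRS = 23


def gametes_without_recombination(pairs: int = CHROMOSOME_PAIRS) -> int:
    """Distinct whole-chromosome combinations one parent could pass on."""
    return 2 ** pairs


def children_without_recombination(pairs: int = CHROMOSOME_PAIRS) -> int:
    """Distinct combinations from two parents (one gamete from each)."""
    return gametes_without_recombination(pairs) ** 2


print(gametes_without_recombination())   # 8388608
print(children_without_recombination())  # 70368744177664
```

That is a large but finite menu of whole-chromosome shuffles. Recombination breaks chromosomes at varying positions, so the real number of possible patchworks is vastly larger.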

The putative reason to do all this sequencing was to track down two rare and severe genetic diseases afflicting both children, and this complete sequencing offers several technical advantages for such a hunt. Firstly, the accuracy of each of the full sequences is increased by having the others to compare with. Shared variants are likely to be real, while unique ones can be explicitly resequenced to double-check. Assembling these sequences from the small individual reads that form the basis of all DNA sequencing is a huge computational effort, always helped by having a comparison base of maximally similar data.

Secondly, the density of this data makes finding disease-causing genetic variants much easier. In typical studies of genetic associations, a population of affected patients is studied, with comparison to a control group, randomly selected from the population. There will be a very large number of genetic variants among all these people. Ferreting out the variants or regions of variation that happen more frequently in the affected group is thus very difficult. In a family group, on the other hand, while the parents presumably have no special genetic relationship to each other, the group as a whole will have far lower numbers of stray variants, decreasing the search space substantially.

In this case, the researchers find that since the two diseases they are interested in, present in both children but absent in the parents, appear to show recessive genetics, they can throw out 78% of the sequence and, from the recombination map, focus on those parts of the children's genomes that are identical between them and heterozygous in the parents (22%). The variants also must be rare, based on the rarity of the syndromes (primary ciliary dyskinesia and Miller syndrome), allowing them to throw out variants known to be common from other population genetic variant studies. This allows them to winnow down the disease candidates to four variant genes, as opposed to the dozens that would have resulted from a more typical genetic disease association study.
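The winnowing logic can be sketched as a simple filter. This is not the paper's actual pipeline: the record fields, the 1% rarity cutoff, and the gene names are all hypothetical, the toy data is made up for illustration, and only the simplest recessive case (carrier parents, both children homozygous) is covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str                # hypothetical gene label
    mother_alt: int          # copies of the variant allele: 0, 1, or 2
    father_alt: int
    child1_alt: int
    child2_alt: int
    in_shared_region: bool   # children inherited identical segments here
    population_freq: float   # frequency in other population studies


MAX_POPULATION_FREQ = 0.01   # assumed rarity cutoff, not from the paper


def is_recessive_candidate(v: Variant, max_freq: float = MAX_POPULATION_FREQ) -> bool:
    """Keep rare variants where both parents are carriers and both children
    are homozygous, inside a region the children share."""
    parents_are_carriers = v.mother_alt == 1 and v.father_alt == 1
    children_homozygous = v.child1_alt == 2 and v.child2_alt == 2
    is_rare = v.population_freq <= max_freq
    return v.in_shared_region and parents_are_carriers and children_homozygous and is_rare


# Made-up toy records, one per way of passing or failing the filter.
TOY_VARIANTS = [
    Variant("GENE_A", 1, 1, 2, 2, True, 0.001),   # passes every test
    Variant("GENE_B", 1, 1, 2, 2, True, 0.20),    # too common
    Variant("GENE_C", 1, 0, 1, 1, True, 0.001),   # not a recessive pattern
    Variant("GENE_D", 1, 1, 2, 1, False, 0.001),  # children differ here
]

candidates = [v.gene for v in TOY_VARIANTS if is_recessive_candidate(v)]
print(candidates)  # ['GENE_A']
```

Each condition removes a different class of stray variant, which is why stacking them shrinks the candidate list so quickly.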

Personalized medicine is coming our way. It will combine sequence analysis of our genomes (in addition to molecular analysis of cancer biopsies and the like) with the developing knowledge of just what the sequence means, knowledge that is only in its infancy right now.

"If I was in charge of one of these nations I would announce next weekend (when the banks are closed) that I was introducing a new currency, defining all Euro debt liabilities in the new currency and let the foreign exchange markets value that currency on the Monday morning.
I would withdraw any semblance of central bank independence (that is just another democratic insult) and I would expand the budget deficit immediately by introducing a Job Guarantee.
Then by Tuesday, I would start repairing the confidence of private spenders to get the economy rolling again.
And after a relatively short time I would notice that economic growth was gaining speed, private spending was returning, unemployment was low and … the budget deficit was falling.
Then I would send an open invitation to the citizens of Germany to abandon the teutonic ship and head south (or west)."
