Saturday, December 16, 2017

Structure of the Polytene Chromosome

Fly researchers have had a special microscope on their genetic subject for a century. Now we know why.

The DNA of our cells is enormous, and at the same time it is microscopic. Our 3 billion basepair genome is six feet long. Yet each of our cells contains a copy that is exquisitely wrapped up and so difficult to observe that it took X-ray crystallography and decades of experiment and inference to divine its true nature. Fruit fly researchers had a significant head start, however, with their recognition that a few fly cells make a form of their genome that can be observed with relative ease.

A "squash" of Drosophila polytene chromosomes. The upper left inset shows condensed normal mitotic chromosomes for comparison; they are very small. What do the bands signify?

Polytene chromosomes are made by larval insects in their salivary glands, apparently for the purpose of amplifying gene expression. Rather than develop ways to super-express the salivary protein products they need so much of from the usual single DNA copy, these cells re-duplicate their DNA many times (about 1000-fold) while keeping the copies joined in a sort of synaptic alignment, which would normally be apparent only during cell division. They can then express selected genes to high levels with less regulatory effort. These "on" genes are apparent as puffs in the chromosomes when they are prepared with special stains and visualized under a regular microscope.

A "puff" of opened and expressing DNA is visible at upper left. The three panels are: 1, staining with a general fluorescent DNA dye (green) along with a specific red dye targeted to a gene of interest (see arrows; this was a DNA-primer-based detection, so quite direct); 2, the customary orcein / Giemsa stain for visible light microscopy; 3, a light filter specific for the fluorescent red dye in the first panel, which brings out its locations.

The great Red Book of Drosophila genetics offers a comprehensive mapping of genes with respect to the "cytology" of these polytene chromosomes. The figure above was prepared by modern molecular biology reagents, which makes locating a gene of interest easy. But before all that, genetic mapping was done by tracking visual changes in the polytene chromosomes and mutations that befell gross genetic loci and had phenotypic consequences, such as re-arrangements of DNA that cut a gene in half. Decades of such work, correlated with more standard genetic mapping by recombination, resulted in a dense roadmap of genetic markers distributed along the characteristic banded pattern of the Drosophila polytene chromosomes.

But why were the polytene chromosomes banded in the first place? What were these landmarks that everyone relied upon? A recent paper re-opens this issue and links the interband zones (the lighter areas in both the green-stained and the orcein-stained preparations above) to the specific molecular function of transcriptional insulation. "Bands" are richer in DNA, while the interbands have less DNA. What do they have instead? Evidently proteins, of a particular sort.

Genes are packed pretty tightly in the DNA of our genomes, and are regulated by sites in the DNA that are typically nearby and upstream with respect to the direction of transcription. But "nearby" could mean tens of thousands of basepairs away. Thus it has been found that a great deal of looping goes on to bring such distant regulatory sites close to the gene start site where they have their effect. This naturally raises the question of why such regulatory sites don't just loop over to some other gene distant on the chromosome, or even on a different chromosome. In the tight confines of the nucleus, such things are doubtless quite possible.

Enter the "insulator". These appear to be special proteins that bind DNA situated between gene regions, keeping the regulatory apparatus of each region functionally distinct. The authors of this work reviewed the field and then carried out new crosslinking experiments that track, genome-wide, which DNA segments are close to others, or to specific proteins. That means that they essentially "froze" the 3-D structure of the DNA relationships with a chemical that promiscuously crosslinks any DNA or proteins close to each other. They then cut the DNA, joined the proximate segments to each other with ligase, and sequenced the resulting junctions at large scale to determine all the junction points.
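To make the last step concrete, here is a minimal sketch of how sequenced junctions can be turned into a map of who touches whom. It assumes each junction has already been reduced to a pair of coordinates on one chromosome region; the function name, the bin size, and the junction positions are all made up for illustration and are not taken from the paper.

```python
def contact_matrix(junctions, region_length, bin_size):
    """Count ligation junctions falling into each pair of fixed-size bins."""
    n_bins = (region_length + bin_size - 1) // bin_size  # round up
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos_a, pos_b in junctions:
        i, j = pos_a // bin_size, pos_b // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # contacts are unordered, so keep the matrix symmetric
    return matrix


# Hypothetical junctions: each tuple is the pair of positions joined by ligase.
junctions = [(100, 900), (1200, 1800), (150, 1900), (1100, 1950), (300, 700)]
matrix = contact_matrix(junctions, region_length=2000, bin_size=1000)
for row in matrix:
    print(row)
```

Real pipelines also have to map reads to the genome, filter artifacts, and normalize for coverage, all of which this sketch leaves out.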

The fact is that polytene chromosomes are sort of blown-up representations of normal DNA in the nucleus, which is normally not just randomly jumbled about, but is arranged in loci and loops organized around gene regions. That means that this kind of experiment, which was conducted on early Drosophila embryo cells, and not on the salivary cells that generate polytene chromosomes, is looking at the native looping and regional structure in normal cells, at least at that particular developmental stage, which will have its unique pattern of genes turned on and off.

They found it was easy to discern topological structure in these nuclei, just as others have before. Specific regions of DNA are in close contact, while others are not. It is not a random jumble. More importantly, there is a set of about 5,000 zones that have high local interaction, but less interaction with others; the authors term these topologically associated domains (TADs). The boundaries between these correlate with other work finding high accessibility to DNase, and finding bound proteins of the kind called insulators. Yet other work on the polytene chromosomes found that they exhibit about 5,000-6,000 visible bands, the borders of which are again highly accessible to DNase, and are the sites of insulator protein binding.
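One common way to find such domain boundaries in a contact map is an insulation-style score: for each position, add up the contacts that cross it within a small window, and look for local dips. The sketch below illustrates that general idea on a toy matrix; it is not necessarily the method the authors used, and the function names, window size, and contact values are invented for the example.

```python
def toy_matrix(domain_sizes, inside=10, outside=1):
    """Build a toy contact matrix: many contacts within a domain, few between."""
    labels = [d for d, size in enumerate(domain_sizes) for _ in range(size)]
    return [[inside if a == b else outside for b in labels] for a in labels]


def insulation_scores(matrix, window):
    """For each position k (the gap before bin k), sum contacts crossing it."""
    n = len(matrix)
    scores = {}
    for k in range(window, n - window + 1):
        scores[k] = sum(
            matrix[a][b]
            for a in range(k - window, k)   # bins just left of the gap
            for b in range(k, k + window)   # bins just right of the gap
        )
    return scores


def call_boundaries(scores):
    """Report positions whose score is a strict local minimum."""
    ks = sorted(scores)
    return [
        k for prev, k, nxt in zip(ks, ks[1:], ks[2:])
        if scores[k] < scores[prev] and scores[k] < scores[nxt]
    ]


matrix = toy_matrix([4, 4])            # two hypothetical domains of four bins each
scores = insulation_scores(matrix, window=2)
boundaries = call_boundaries(scores)
print(scores)
print(boundaries)
```

In the toy case the only dip sits at the gap between the two domains, which is where an insulator would be expected to bind.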

The hallmark of insulator proteins is that when DNA sites for their binding are engineered between enhancers and the gene they typically regulate, they tend to cut that communication. The mechanism behind this is not clear yet. It could be because looping is a progressive process, starting locally and scanning out to the nearest gene, in a kind of sewing machine model.

In any case, the authors draw on all this work to put the pieces together, and claim that polytene banding, as detected by DNA and other stains, shows this structure at the visible level, with topological gene units housed in the dark bands, and the light interbands housing intergenic segments, with insulators in between. Since Drosophila has almost 16,000 genes, this indicates that many of these topological units house multiple genes, an example of which is the homeobox complex, where complex and coordinated regulation extends over several nearby genes.
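The arithmetic behind that last inference is simple enough to spell out. This is only a back-of-the-envelope average using the round numbers quoted above (about 16,000 genes, and roughly 5,000 to 6,000 domains or bands); the real distribution is surely uneven, and the function name is just for illustration.

```python
def genes_per_domain(gene_count, domain_count):
    """Average number of genes per domain, rounded to one decimal place."""
    return round(gene_count / domain_count, 1)


GENES = 16_000  # approximate Drosophila gene count quoted in the text
for domains in (5_000, 6_000):
    print(domains, "domains ->", genes_per_domain(GENES, domains), "genes each, on average")
```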

Correlation between polytene banding and cross-link contacts, in one segment of the genome. Banding is on top, and cross-link contacts are shown in color. The crosses form out of topological localities with high internal contact rates, and lower external contact. At bottom is shown a gene track of the region from FlyBase. Blue vertical bits are exons.

This is probably not news to the field, which is why this paper was buried in an obscure journal, but it is nice to see new methods make sense of quite old historical problems, and to recognize that we were looking at significant and functional genomic features all the time, from the first staining of these giant chromosomes in 1881.