By sequencing extremely large numbers of people, two new studies add to the growing realization that the genetic component of common diseases comes in the form of many rare genetic variants, rather than a few common ones.
First author Matthew Nelson, who is director of the statistical genetics group at GlaxoSmithKline plc, told BioWorld Today that the data are, in a sense, the inverse of the variants that have been identified in the past few years through genomewide-association studies, which built on resources such as the Human Genome Project and the HapMap.
In that sense, they uncover one source of the "missing heritability" of diseases that cannot be explained by current understanding of the human genome. (See BioWorld Today, Oct. 28, 2010, and Nov. 5, 2010.)
Those studies, he told BioWorld Today, showed there are many common variants that affect common diseases, but the effects of such common variants tend to be small. Rare variants, on the other hand, are more likely to have a large effect than common ones, "but they are not going to be important for very many people overall."
Still, coauthor Stephanie Chissoe, who is the head of genetics at GSK, told BioWorld Today, the fact that variants are rare does not mean they are not useful for drug discovery. Cholesterol-lowering statins, for example, were identified because of rare variants that lead to extremely low cholesterol levels in their carriers.
In their paper, Chissoe, Nelson, co-corresponding author John Novembre of the University of California at Los Angeles, and their colleagues took an in-depth look at the sequences of 202 genes that they considered likely drug targets. They sequenced those genes in roughly 14,000 individuals, a "uniquely large" sample, Novembre told BioWorld Today.
The authors identified, on average, one variant every 17 base pairs. But even in such a large sample, three-quarters of the variants the team identified were specific to one or two individuals. The authors of the second paper, which was broadly similar in its conclusions about the overall frequency and geographic distribution of rare variants, sequenced the exomes of roughly 2,500 individuals. Both studies were published back-to-back in the May 24, 2012, advance online issue of Science.
The authors also looked at the ratio of synonymous SNPs, which do not change the amino acid sequence of the protein and so usually do not affect protein function, to variants that do change the amino acid sequence and so are likely to alter function. Comparing that ratio for rare and frequent variants, the authors estimated that about 70% of those rare variants would have a negative effect on gene function.
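The logic of that comparison can be sketched in a few lines of Python. This is a simplified illustration, not the authors' method, which the article does not describe: it assumes that common variants have already been filtered by selection, so that any surplus of protein-changing variants among the rare ones reflects variants that harm gene function. The function names and the counts are invented for the example.

```python
def nonsyn_to_syn_ratio(nonsyn: int, syn: int) -> float:
    """Ratio of protein-changing to synonymous variants in one frequency class."""
    if syn <= 0:
        raise ValueError("need at least one synonymous variant")
    return nonsyn / syn


def excess_nonsyn_fraction(rare_nonsyn: int, rare_syn: int,
                           common_nonsyn: int, common_syn: int) -> float:
    """Share of rare protein-changing variants beyond the common-variant baseline.

    Assumption (not from the article): common variants show the ratio
    expected for tolerated changes, so the surplus among rare variants
    is attributed to harmful ones.
    """
    if rare_nonsyn <= 0:
        raise ValueError("need at least one rare protein-changing variant")
    rare_ratio = nonsyn_to_syn_ratio(rare_nonsyn, rare_syn)
    common_ratio = nonsyn_to_syn_ratio(common_nonsyn, common_syn)
    return 1.0 - common_ratio / rare_ratio


# Invented counts, chosen only to make the arithmetic easy to follow.
fraction = excess_nonsyn_fraction(rare_nonsyn=200, rare_syn=100,
                                  common_nonsyn=50, common_syn=100)
print(f"Estimated harmful share of rare protein-changing variants: {fraction:.0%}")
```

With these made-up counts, rare variants carry two protein-changing variants per synonymous one, against one for every two among common variants, so three-quarters of the rare protein-changing variants would be counted as surplus.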
The results are "consistent with the idea that there are a lot of things that affect our biology," Novembre said. "But we don't see them when we use small sample sizes."
Nelson, Novembre and their team also found that those variants that were rare, but still present in more than one or two people, were not evenly distributed in the populations they looked at, but instead tended to cluster geographically.
Nor was the overall rate of mutations evenly distributed between populations. Nelson, Novembre and their colleagues found that European populations had a south-to-north gradient, with mutations being less frequent in northern European populations.
Novembre said that particular finding is important for another type of study that tries to identify disease genes: the case-control study.
Genomewide-association studies sequence many people and correlate variants with disease frequencies, but do not have controls per se. A case-control study, on the other hand, compares the genetic sequences of people with a disease to those without it to see whether there are more rare variants in the disease population.
A so-called excess of variants indicates that mutations in the gene under study affect the disease. But the new findings, Novembre said, indicate that cases and controls need to be geographically matched: Comparing cases from southern Europe to controls from farther north will lead to an apparent excess of variants in any gene, simply because the southern population has more variants overall.
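A small Python sketch shows how such a mismatch could produce a spurious signal. It is an illustration under stated assumptions, not an analysis from either paper: the region labels, the background rates and the cohort sizes are hypothetical, and the gene is assumed to have no effect on the disease at all.

```python
from dataclasses import dataclass

# Hypothetical rare-variant rates per person for one gene, by region.
# The gene is assumed NOT to influence disease in this example.
BACKGROUND_RATE = {"high_variant_region": 0.02, "low_variant_region": 0.01}


@dataclass(frozen=True)
class Cohort:
    region: str
    n_people: int


def expected_variants(cohort: Cohort) -> float:
    """Expected number of rare variants seen in the cohort for this gene."""
    return cohort.n_people * BACKGROUND_RATE[cohort.region]


def burden_ratio(cases: Cohort, controls: Cohort) -> float:
    """Per-person variant rate in cases divided by the rate in controls."""
    case_rate = expected_variants(cases) / cases.n_people
    control_rate = expected_variants(controls) / controls.n_people
    return case_rate / control_rate


cases = Cohort("high_variant_region", 1000)
mismatched = burden_ratio(cases, Cohort("low_variant_region", 1000))
matched = burden_ratio(cases, Cohort("high_variant_region", 1000))
print(f"Mismatched controls: ratio {mismatched:.2f}")
print(f"Matched controls:    ratio {matched:.2f}")
```

Under these assumptions, the mismatched comparison reports about twice as many variants in cases as in controls even though the gene is neutral, while the geographically matched comparison reports no excess.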