Evolution, Gene Number, and Disease
Slight variations in the number of copies of genes influence human disease and other characters. Variants can be hard to detect when they lie in heavily duplicated, highly similar regions of sequence known as “dark matter.” Sudmant et al. (p. 641) developed methods to tease apart the duplicated regions and reveal singly unique nucleotide identifiers. These regions have turned out to be among the most variable across human population groups, most notably among genes for neurodevelopment and neurological diseases. Such polymorphisms can be genotyped with specificity and may help us understand how variation in copy number affects human evolution and disease.
Copy number variants affect both disease and normal phenotypic variation, but those lying within heavily duplicated, highly identical sequence have been difficult to assay. By analyzing short-read mapping depth for 159 human genomes, we demonstrated accurate estimation of absolute copy number for duplications as small as 1.9 kilobase pairs, ranging from 0 to 48 copies. We identified 4.1 million “singly unique nucleotide” positions informative in distinguishing specific copies and used them to genotype the copy and content of specific paralogs within highly duplicated gene families. These data identify human-specific expansions in genes associated with brain development, reveal extensive population genetic diversity, and detect signatures consistent with gene conversion in the human species. Our approach makes ~1000 genes accessible to genetic studies of disease association.
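The read-depth idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the use of a single known-diploid control region, and the simple rounding step are all assumptions made for illustration, and it ignores corrections (for example, for sequence-composition bias) that a real analysis would need.

```python
def estimate_copy_number(region_depth, control_depth, control_copies=2):
    """Scale a region's mean read depth by the depth of a control region.

    Assumption: the control region is known to be present in
    `control_copies` copies (2 for a diploid, unique region).
    """
    if control_depth <= 0:
        raise ValueError("control_depth must be positive")
    return control_copies * region_depth / control_depth


def paralog_copy_number(sun_depths, control_depth, control_copies=2):
    """Estimate the copy number of one specific paralog.

    Only reads covering that paralog's singly unique nucleotide (SUN)
    positions are counted, so the depth reflects that copy alone rather
    than the whole duplicated family. Rounding to an integer is a
    simplification assumed here.
    """
    if not sun_depths:
        raise ValueError("need at least one SUN position")
    mean_sun_depth = sum(sun_depths) / len(sun_depths)
    return round(estimate_copy_number(mean_sun_depth, control_depth, control_copies))


# Hypothetical depths: a duplicated family with three times the control
# depth, and one paralog whose SUN positions show half the control depth.
family_total = estimate_copy_number(region_depth=90, control_depth=30)
one_paralog = paralog_copy_number([14, 16, 15], control_depth=30)
```

In this toy example the aggregate depth describes the whole gene family, while the SUN-restricted depth attributes copies to a single paralog, which is the distinction the abstract draws between absolute copy number and paralog-specific genotyping.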
† A full list of participants and institutions is available in the SOM online.