A Common Genetic Variant Is Associated with Adult and Childhood Obesity


Science  14 Apr 2006:
Vol. 312, Issue 5771, pp. 279-283
DOI: 10.1126/science.1124779


Obesity is a heritable trait and a risk factor for many common diseases such as type 2 diabetes, heart disease, and hypertension. We used a dense whole-genome scan of DNA samples from the Framingham Heart Study participants to identify a common genetic variant near the INSIG2 gene associated with obesity. We have replicated the finding in four separate samples composed of individuals of Western European ancestry, African Americans, and children. The obesity-predisposing genotype is present in 10% of individuals. Our study suggests that common genetic polymorphisms are important determinants of obesity.

Obesity is associated with an increased risk of type 2 diabetes mellitus, heart disease, metabolic syndrome, hypertension, stroke, and some forms of cancer (1). It is commonly assessed by calculating an individual's body mass index (BMI) [weight/(height)² in kg/m²] as a surrogate measurement. Individuals with a BMI ≥ 25 kg/m² are classified as overweight, and those with a BMI ≥ 30 kg/m² are considered obese. Having a BMI over 25 kg/m² increases the risk of death (2). Presently, 65% of Americans are overweight and 30% are obese (3). Genetic factors contribute significantly to the etiology of obesity (4, 5), with estimates of the heritability of BMI ranging from 30 to 70% (6–9).
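
As an illustration of the BMI definition and cutoffs quoted above, the following minimal sketch computes BMI and assigns a category. The function names and the example person are hypothetical and are not part of the study.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2: weight divided by height squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Classify a BMI with the cutoffs quoted in the text (25 and 30 kg/m^2).

    Obese individuals also meet the overweight cutoff; this helper
    returns the more specific label.
    """
    if value >= 30:
        return "obese"
    if value >= 25:
        return "overweight"
    return "not overweight"


# Hypothetical person, not study data.
example = bmi(95.0, 1.75)
print(round(example, 1), bmi_category(example))
```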

To identify common genetic variants associated with elevated BMI, we have studied individuals from the National Heart, Lung, and Blood Institute (NHLBI)–Framingham Heart Study (FHS) (10). The participants were enrolled from the community without being selected for a particular trait or disease and were followed over 24 years (table S1). In this population, heritability estimates for BMI range between 37 and 54% (11, 12).

Using families from this sample, we performed a genome-wide association analysis with a testing strategy for quantitative traits in family-based designs (13, 14). To bypass the multiple-comparisons problem, we used a two-stage testing strategy implemented in the software package PBAT (15, 16). In the screening step, parental genotypes are used to select single-nucleotide polymorphisms (SNPs) and genetic models that best predict offspring phenotypes. Formally, the screening step estimates the power to detect an association in the second step (Fig. 1 and fig. S1). The second step uses a family-based association test (FBAT) (17) to test the selected SNPs for association with BMI levels using measured offspring genotypes. The FBAT is a generalization of the transmission disequilibrium test (18) and assesses whether the over- or undertransmission of an allele is correlated with offspring phenotypes. Because the allele that is transmitted from the parent to the offspring is selected stochastically, the test step is statistically independent of the screening step, which conditions only on parental genotypes (19, 20). Thus, the power estimates from the screening step do not bias the significance level of any subsequently computed FBAT statistic in the test step (13, 14, 19, 20). Hence, the FBAT results need to be adjusted only for the number of comparisons performed during the test step. A SNP that reaches significance after adjustment for the number of tests performed in the test step is considered significant at a genome-wide level (13). Following the guidelines of Van Steen et al. (13), we tested the 10 SNPs with the highest power estimates in the screening step for association, using the FBAT (21, 22).

Fig. 1.

Family-based design for the whole-genome SNP scan. The same data set was used to both screen and test SNP-phenotype associations (13). (A) In the screening step, the difference in offspring trait values due to a particular SNP allele is estimated based on the expected offspring genotypes, computed assuming Mendelian transmission of the parental genotypes. This information, along with the minor allele frequency, is used to compute the power of the FBAT to detect an association. (B) In the test step, the top 10–ranked SNPs are evaluated and the FBAT statistic is adjusted for 10 tests. SNPs that are significant after adjustment are also significant genome-wide. The screening and test steps use statistically independent components of the data; hence, the P values computed at the test step are not biased by the screening step (20).

We genotyped 116,204 SNPs in 694 participants from the FHS offspring cohort (20). After exclusions, 86,604 SNPs were tested for association with BMI. We used FBAT-PC, which incorporates BMI data across multiple exams and increases our power to detect a genetic effect (14). Of the top 10 SNPs tested under a recessive model [which we determined during the screening procedure had greatest power (Table 1)], only SNP rs7566605 reached overall significance (unadjusted FBAT-PC P value, 0.0026). Other genetic models that did not capture the underlying biology of the SNP were less robust. In such cases, rs7566605 was not ranked in the top 10 for power. The frequency of the rs7566605 C allele is 0.37, and the SNP is in Hardy-Weinberg equilibrium (Table 2). We found no association with BMI for Affymetrix SNPs close to rs7566605 (Fig. 2A and Table 2). This result was expected because none of these SNPs are good proxies for rs7566605 at an r² threshold of 0.8 (Fig. 2B) (23).
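
The adjustment for 10 tests can be sketched as follows, using the FBAT P values listed in Table 1. The text does not name the correction method at this point, so a Bonferroni correction and a significance level of 0.05 are assumptions made for illustration only.

```python
# FBAT P values for the 10 top-ranked SNPs, copied from Table 1.
table1_p = {
    "rs3897510": 0.2934,
    "rs722385": 0.1520,
    "rs3852352": 0.7970,
    "rs7566605": 0.0026,
    "rs4141822": 0.0526,
    "rs7149994": 0.0695,
    "rs1909459": 0.2231,
    "rs10520154": 0.9256,
    "rs440383": 0.8860,
    "rs9296117": 0.3652,
}

ALPHA = 0.05  # assumed significance level, not stated in the text


def bonferroni(p_values: dict) -> dict:
    """Multiply each P value by the number of tests, capped at 1 (assumed method)."""
    n_tests = len(p_values)
    return {snp: min(1.0, p * n_tests) for snp, p in p_values.items()}


adjusted = bonferroni(table1_p)
significant = [snp for snp, p in adjusted.items() if p < ALPHA]
print(significant)
```

Under these assumptions, only the test step contributes to the multiple-testing burden, which is why the correction factor is 10 rather than the 86,604 SNPs screened.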

Fig. 2.

SNPs present on the Affymetrix 100K chip near 2q14.1. Mapping of 100K SNPs in the chromosomal region from 2q14.1 to 2q14.2 is shown. (A) Known genes in the region of association. (B) A plot showing the pairwise r² values indicating the correlation between the SNP genotypes in our sample over this region, prepared in Haploview (23). The rs7566605 SNP is shown in red. Black squares, r² = 1; white squares, r² = 0; squares in shades of gray, 0 < r² < 1 (the intensity of the gray is proportional to r²); for example, r² between rs7566605 and rs3771935 equals 0.286.
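
For readers unfamiliar with the r² measure of linkage disequilibrium, the sketch below computes it from haplotype and allele frequencies. It assumes phased haplotype frequencies are already known, whereas tools such as Haploview estimate them from genotype data. The function name and example frequencies are hypothetical.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Pairwise r^2 between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    for name, value in (("p_a", p_a), ("p_b", p_b)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical frequencies, not taken from the study.
print(round(ld_r_squared(0.3, 0.4, 0.5), 4))
```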

Table 1.

Screening and testing of SNPs for association with BMI. Genome-wide SNPs (86,604) were screened using parental genotypes to find those likely to affect offspring BMI. The top 10 SNPs from the screening step (ranked by power from most likely to least likely) are shown. These SNPs were tested using offspring genotypes for association with BMI using the FBAT. The rs7566605 SNP is highlighted in bold.

| Ranking from screen | SNP | Chromosome | Frequency | Informative families | P value FBAT |
|---|---|---|---|---|---|
| 1 | rs3897510 | 20p12.3 | 0.36 | 30 | 0.2934 |
| 2 | rs722385 | 2q32.1 | 0.16 | 15 | 0.1520 |
| 3 | rs3852352 | 8p12 | 0.33 | 34 | 0.7970 |
| 4 | rs7566605 | 2q14.1 | 0.37 | 39 | 0.0026 |
| 5 | rs4141822 | 13q33.3 | 0.29 | 27 | 0.0526 |
| 6 | rs7149994 | 14q21.1 | 0.35 | 31 | 0.0695 |
| 7 | rs1909459 | 14q21.1 | 0.39 | 38 | 0.2231 |
| 8 | rs10520154 | 15q15.1 | 0.36 | 38 | 0.9256 |
| 9 | rs440383 | 15q15.1 | 0.36 | 38 | 0.8860 |
| 10 | rs9296117 | 6p24.1 | 0.40 | 44 | 0.3652 |

Table 2.

SNPs genotyped in the region of INSIG2 along with their minor allele frequency (MAF). Results for the FBAT are given for a recessive model without covariate adjustment. SNPs were also tested for Hardy-Weinberg equilibrium (HWE). The coordinates for INSIG2a are 118,562,280 to 118,583,824 and for INSIG2b are 118,570,224 to 118,582,624. The rs7566605 SNP is highlighted in bold.

| SNP | FBAT (P value) | Physical position | Distance from rs7566605 | HWE (P value) | MAF |
|---|---|---|---|---|---|
| rs1385923 | 0.33562 | 118550089 | 2166 | 0.107 | 0.057 |
| rs10490626 | 0.07777 | 118552071 | 184 | 0.118 | 0.087 |
| rs7566605 | 0.00258 | 118552255 | 0 | 0.197 | 0.373 |
| rs10490625 | 0.34606 | 118575786 | 23531 | 0.251 | 0.063 |
| rs10490624 | 0.20954 | 118578722 | 26467 | 0.02 | 0.082 |
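
Table 2 reports a Hardy-Weinberg equilibrium (HWE) P value for each SNP. The text does not say which HWE test was used, so the sketch below assumes a 1-degree-of-freedom chi-square goodness-of-fit test on genotype counts. The function name and the genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int):
    """Chi-square test (1 df) of Hardy-Weinberg proportions from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p in (0.0, 1.0):
        raise ValueError("SNP is monomorphic; HWE test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts, not study data.
chi2, p_value = hwe_chi_square(40, 40, 20)
print(round(chi2, 3), round(p_value, 3))
```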

Another analysis based on a larger sample of 923 FHS individuals, including those already tested, indicated that the SNP is a significant predictor of BMI. For all exams, rs7566605 CC homozygotes are about 1 BMI unit heavier than individuals with GC or GG genotypes (P < 0.0001), regardless of sex or age (Fig. 3). CC homozygotes were also more likely to be obese than non-obese [for example, at exam 5, odds ratio (OR) = 1.33, 95% confidence interval (CI) (1.20 to 1.48)].

Fig. 3.

BMI as a function of age, sex, and rs7566605 genotype. Unadjusted mean values for BMI comparing individuals homozygous for the minor allele (CC) with those that are either heterozygous or homozygous for the major allele (CG or GG) are shown. Data are for 923 related offspring and combine all the measurements from the first five offspring exams. Panels are for the pooled sample and for sex-specific analysis. Data are shown as mean ± SEM for each age category. Data from individuals over age 60 are omitted because few of those individuals have the CC genotype.

Many reported associations with common gene polymorphisms have not been replicated, presumably because of factors such as population stratification, inadequate statistical power, and genotyping errors (24). Therefore, we sought to confirm the association between BMI and rs7566605 in five additional unrelated samples of varying ethnicity and age.

We examined the association of rs7566605 with obesity in 3996 participants of the KORA S4 cohort, a population of Western European ancestry in a town near Munich, Germany (table S2) (25). Genetic determinants of BMI have been studied in this cohort using candidate gene–based approaches (26). The mean BMI of rs7566605 CC homozygotes under a recessive model was 0.60 kg/m² higher than for GC and GG genotypes combined (P = 0.008). The CC genotype was also significantly more frequent in obese individuals [BMI ≥ 30 kg/m², OR = 1.32, 95% CI (1.05 to 1.66), P = 0.0167] than in non-obese individuals, again with a recessive model. These results confirmed the association that we found in the original genome scan of the FHS sample.
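
To make the odds ratio and confidence interval concrete, the sketch below computes an unadjusted odds ratio for a recessive model (CC versus GC or GG) from a 2×2 table, with a Wald interval on the log scale. This is an assumption for illustration: the study used logistic regression for KORA, and genotype counts by obesity status are not given in the text, so the counts here are hypothetical.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table.

    a: obese with CC genotype        b: non-obese with CC genotype
    c: obese with GC or GG genotype  d: non-obese with GC or GG genotype
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not study data.
or_, lower, upper = odds_ratio_ci(30, 70, 170, 730)
print(round(or_, 2), round(lower, 2), round(upper, 2))
```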

We also tested for an association in a case-control study consisting of self-described white or Caucasian subjects from Poland and the United States (1775 cases and 926 controls, table S3). Cases were drawn from the 90th to 97th percentile of the BMI distribution and controls were drawn from the 5th to 12th percentile, with distributions determined for each combination of gender, country of origin, and age by decade. The results for rs7566605 showed no evidence of heterogeneity between the Polish and U.S. subsamples, so we performed a combined analysis using a Mantel-Haenszel procedure, as described (27). The CC genotype was significantly associated with obesity under a recessive model [OR = 1.37, 95% CI (1.05 to 1.78), P = 0.02], with an odds ratio similar to that estimated from the KORA data.
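
The Mantel-Haenszel procedure combines 2×2 tables from several strata (here, subsamples) into one pooled odds ratio. The sketch below shows the standard pooled estimator only. The details of the analysis in (27) are not reproduced, and the stratum counts are hypothetical.

```python
def mantel_haenszel_or(strata) -> float:
    """Pooled odds ratio across strata.

    Each stratum is a tuple (a, b, c, d):
    a: cases with CC      b: controls with CC
    c: cases without CC   d: controls without CC
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("pooled odds ratio is undefined for these counts")
    return numerator / denominator


# Two hypothetical strata, not study data.
pooled = mantel_haenszel_or([(30, 70, 170, 730), (20, 40, 180, 560)])
print(round(pooled, 3))
```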

The SNP was also genotyped in the Nurses Health Study cohort, using the 2726 control DNAs from two nested case-control studies examining diabetes (28) and breast cancer cases (29) (table S4). The participants are, by self-report, >95% Caucasian. No significant association or trend was observed between rs7566605 and BMI levels. There are fewer individuals with a high BMI in this sample as compared to the KORA and FHS samples (20). The lack of replication thus may reflect a different BMI distribution in the Nurses Health Study or differences in environment and lifestyle.

The rs7566605 SNP was also typed in 368 Western European parent-child trios in which either a child or adolescent offspring was obese (mean BMI percentile 98.4 ± 1.93) (30) (table S5). Analysis using the transmission disequilibrium test (18) revealed an overtransmission of the C allele to the obese offspring (P = 0.0017), indicating that rs7566605 is associated with BMI from an early age.
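
The basic allelic transmission disequilibrium test compares how often heterozygous parents transmit versus do not transmit a given allele to affected offspring. The sketch below shows that basic form with a 1-degree-of-freedom chi-square statistic. The exact variant used for the trio analysis is not specified here, and the transmission counts are hypothetical.

```python
import math


def tdt(transmitted: int, not_transmitted: int):
    """Allelic TDT statistic and P value.

    transmitted:     times heterozygous parents passed the C allele to the child
    not_transmitted: times heterozygous parents passed the other allele instead
    """
    total = transmitted + not_transmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted - not_transmitted) ** 2 / total
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical transmission counts, not study data.
chi2_tdt, p_tdt = tdt(120, 80)
print(chi2_tdt, round(p_tdt, 4))
```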

The finding was also replicated in two samples from a self-reported African American population from Maywood, Illinois (table S6). The samples consisted of 866 individuals from a family-based sample consisting of nuclear families and sibships and 402 unrelated individuals (186 chosen from the top and bottom quartiles of the BMI distribution, plus the parents from the family-based sample, whose phenotypic information is not used in the family-based analysis). The samples were dichotomized into obese (BMI ≥ 30 kg/m²) and lean (BMI < 30 kg/m²) individuals. The CC genotype was associated with obesity under recessive models in both the family-based (P = 0.009) and unrelated samples (OR = 2.36, P = 0.04 by Fisher's exact test), adding further confirmation to our finding.

The rs7566605 CC genotype was associated with obesity in three different family-based samples and three studies of unrelated individuals. A meta-analysis of all the case-control samples showed that the CC genotype was significantly associated with obesity under a recessive model, with an OR of 1.22 [95% CI (1.05 to 1.42), P = 0.008; Table 3].

Table 3.

Summary of study results and meta-analysis. All values given are for a recessive model. NHS, Nurses Health Study; TDT, transmission disequilibrium test; PBAT, tools for FBATs.

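| Study | Design | Total genotyped | Obese | Non-obese | Test | Total number of families or covariates | P value |
|---|---|---|---|---|---|---|---|
| FHS | Family | 694 | | | PBAT | 288 | 0.0030 |
| Maywood | Family | 866 | 361 | 505 | PBAT dichotomous | 342 | 0.0090 |
| Maywood | Family | 866 | | | PBAT quantitative | 342 | 0.0700 |
| Essen children/adolescents | Trios | 1104 | 368 | | TDT | 368 | 0.0020 |
| KORA | Cohort | 3996 | | | Linear regression | Sex, age | 0.0080 |
| NHS | Cohort | 2726 | | | Linear regression | Age | |

| Study | Design | Total genotyped | Obese | Non-obese | Test | OR | 95% CI | P value |
|---|---|---|---|---|---|---|---|---|
| KORA | Cohort | 3996 | 935 | 3061 | Logistic regression | 1.32 | 1.06-1.65 | 0.0167 |
| NHS | Cohort | 2726 | 503 | 2223 | Chi-squared test | 0.81 | 0.58-1.13 | |
| American/Polish Caucasian | Case-control | 2761 | 1835 | 926 | Chi-squared test | 1.40 | 1.08-1.78 | 0.0200 |
| Maywood | Case-control | 398 | 216 | 182 | Fisher's exact test | 2.36 | | 0.0400 |
| Pooled OR (all) | | 9881 | 3445 | 6426 | 2-tailed Mantel-Haenszel | 1.22 | 1.05-1.42 | 0.0080 |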

The result from both Western Europeans and African Americans suggests that the risk allele predates human migration out of Africa. By comparing chromosomes of individuals with different ancestry from the HapMap project (31) in a 750–kilobase (kb) region near INSIG2 (insulin-induced gene 2), it is apparent that rs7566605 is present on a single haplotype in the European samples (haplotype 2) and on two haplotypes in the Yoruba samples (haplotypes 2 and 8) (Fig. 4; the rs7566605 C allele is shown in red). All of these haplotypes extend to include the intergenic region near INSIG2 and the nearby FLJ10996 gene. We also typed the rs11684454 SNP on European haplotype 2, located within an intron of FLJ10996. It too was associated with BMI in KORA samples (P = 0.0093) and in Western European parent-child trio samples (P = 0.0015), confirming that the associated region in Europeans extends at least 70 kb. Nonsynonymous SNPs (rs2229616 and rs17512204) in FLJ10996 were not associated with BMI in Western European parent-child trio samples. We are not aware of any nonsynonymous SNPs in INSIG2. The region harboring the causal variant (Fig. 4) may be narrowed by detailed analysis of African haplotypes 2 and 8, because they are only similar close to rs7566605.

Fig. 4.

Linkage disequilibrium around rs7566605. Alignment of the phased HapMap genotypes (release 16c.1) for 120 chromosomes from Utah residents of Northern and Western European ancestry (CEU) and 120 chromosomes from Yoruba in Ibadan, Nigeria (YRI) over a 750-kb region surrounding rs7566605. A core haplotype block containing rs7566605 was identified using Haploview (23) and was used to organize the chromosomes from left to right. The region containing this core haplotype block is highlighted in blue, and each haplotype with a frequency greater than 1% is designated with a number at the bottom. Haplotypes shared between the CEU and YRI samples are circled. The SNP rs7566605 minor allele is present on one major haplotype in Utah residents (haplotype 2) and on two in Yoruba samples (haplotypes 2 and 8). Minor alleles are represented with dark gray boxes and major alleles with light gray boxes, with the exception of the rs7566605 minor allele (shown in red). SNPs not found to be polymorphic in one of the samples are shaded yellow. SNPs present on the Affymetrix Mapping 100K set are indicated in black between the CEU and YRI panels.

rs7566605 is 10 kb upstream of the transcription start site of INSIG2. INSIG2 is an attractive candidate gene for the quantitative trait locus affecting BMI because its protein product inhibits the synthesis of fatty acid and cholesterol (32). For example, in the ZDF (fa/fa) rat model, overexpression of insig2 in the liver reduces plasma triglyceride levels (33). A model in which altered insig2 activity leads to obesity by elevating plasma triglyceride levels with subsequent storage in adipose tissue is certainly plausible. Indeed, the INSIG2 region has been implicated as a factor in obesity by linkage studies in mice (34) and humans (35).

By studying a population not selected for a particular phenotype (the FHS Cohort), we hoped to identify common genetic variants affecting BMI levels. The high frequency of the rs7566605 C allele suggests that it is ancient and is consistent with the hypothesis that such alleles have become deleterious only in modern times (36). Although these variants are likely to carry low relative risk, their impact on health is substantial because of their prevalence in the population.

We expect that other common variants affecting BMI were not detected in our screen because of either lack of power in the screening step or insufficient coverage of the genome at the density provided by the Affymetrix 100K SNP chip. Taken together, the findings in four of the five samples from different populations using two different study designs confirm a consistent association between the rs7566605 polymorphism and obesity.

Supporting Online Material

Materials and Methods

Fig. S1

Tables S1 to S6

