Rare Variants of IFIH1, a Gene Implicated in Antiviral Responses, Protect Against Type 1 Diabetes

Science  17 Apr 2009:
Vol. 324, Issue 5925, pp. 387-389
DOI: 10.1126/science.1167728


Genome-wide association studies (GWASs) are regularly used to map genomic regions contributing to common human diseases, but they often do not identify the precise causative genes and sequence variants. To identify causative type 1 diabetes (T1D) variants, we resequenced exons and splice sites of 10 candidate genes in pools of DNA from 480 patients and 480 controls and tested their disease association in over 30,000 participants. We discovered four rare variants that lowered T1D risk independently of each other (odds ratio = 0.51 to 0.74; P = 1.3 × 10⁻³ to 2.1 × 10⁻¹⁶) in IFIH1 (interferon induced with helicase C domain 1), a gene located in a region previously associated with T1D by GWASs. These variants are predicted to alter the expression and structure of IFIH1 [MDA5 (melanoma differentiation-associated protein 5)], a cytoplasmic helicase that mediates induction of interferon response to viral RNA. This finding firmly establishes the role of IFIH1 in T1D and demonstrates that resequencing studies can pinpoint disease-causing genes in genomic regions initially identified by GWASs.

Genome-wide association studies (GWASs) of common multifactorial diseases have identified dozens of loci harboring disease-causing sequence variants (1, 2). However, because the human genome contains regions of strong linkage disequilibrium, a disease-associated locus sometimes encompasses several genes and multiple tightly associated polymorphisms, making it difficult to pinpoint the causal variant by association mapping. Moreover, in many instances, the single nucleotide polymorphisms (SNPs) showing the most significant disease association map to genomic regions with no obvious function, thus providing few clues as to how causal variants affect the disease gene.

One way to overcome this limitation is to search for sequence variants that are rare in the population (frequency < 3%) but that reside in exons and other genomic regions of known function to identify polymorphisms that alter expression of the gene and/or function of the protein product (3). If rare disease-associated variants with obvious functional effects are found in a candidate gene that harbors a common disease-associated variant, then the gene is likely to be causal. Recent technological advances in high-throughput sequencing (4) provide an opportunity to resequence multiple genetic regions in hundreds of participants and discover rare sequence variants (5–7). We used 454 sequencing (8) to search for rare variants in 10 candidate genes and to study their association with type 1 diabetes (T1D), previously known as insulin-dependent diabetes mellitus (IDDM). T1D is a common disorder that develops as a result of a complex interaction of genetic and environmental factors leading to the immune-mediated destruction of the insulin-producing pancreatic β cells. To date, 15 loci associated with T1D have been identified in the human genome (9–13).

Of the 10 genes that we selected, 6 contain common T1D-associated polymorphisms: PTPN22, PTPN2, IFIH1, SH2B3, CLEC16A, and IL2RA (10, 11, 14–16). We also studied two genes that contain rare mutations causing monogenic syndromes that may include immune-mediated diabetes: (i) FOXP3, which is responsible for X-linked syndrome of immunodysregulation-polyendocrinopathy-enteropathy [Online Mendelian Inheritance in Man (OMIM) 304790], and (ii) AIRE, which is responsible for the autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy syndrome (OMIM 240300). In addition, we studied KCNJ11 because mutations in this gene cause permanent neonatal diabetes, an insulin-dependent diabetes of nonimmune etiology that can be misdiagnosed as T1D in young children (17). Finally, we studied IAN4L1 because the ortholog of this gene is associated with immune-mediated diabetes in the rat model of T1D (18, 19).

We resequenced 144 target regions that covered exons and regulatory sequences of the 10 genes, 31 kb in total (table S1 and T1DBase), in DNA of 480 T1D patients and 480 healthy controls from Great Britain arranged in 20 DNA pools (20). We generated 9.4 million reads with an average length of 250 bases and identified a total of 212 SNPs (20). We classified 33 of them as common because their estimated minor allele frequency (MAF) was >3% (table S2), and we categorized 179 as rare because their estimated MAF was <3%. Of the 179 rare SNPs, 156 were previously unseen (table S3). In the pooled samples, it was impossible to distinguish rare insertion/deletion polymorphisms from sequencing errors; thus, we studied nucleotide substitutions only.

Our goal was not only to discover previously unseen rare variants but also to test their association with T1D in the same experiment, comparing allele frequency in DNA pools of patients and controls. Therefore, it was important that sequence reads generated from the DNA pools accurately estimated the allele frequency among the individuals who contributed DNA to these pools. To test this, we analyzed eight SNPs from the sequenced regions that had been genotyped previously. We found good correlation between allele frequency in the individual samples and its estimate in the DNA pools (correlation coefficient r = 0.99) (fig. S1), demonstrating that high-throughput sequencing of the DNA pools can be used to accurately measure allele frequencies. We then tested association of all 212 SNPs with T1D, comparing pooled samples of cases and controls. As expected, we confirmed the previously known association of the common SNPs with T1D (P = 0.02 to 5 × 10⁻⁷, χ² test) (table S2). Among rarer SNPs that had not been previously studied for association with T1D, we noted that the two most associated variants, rs35667974 and rs35337543 (P = 0.0049 and 0.000044, exact test) (Table 1), reside within the interferon induced with helicase C domain 1 (IFIH1) gene. We did not find evidence of association for rare variants in other genes, except for potential associations of the two SNPs located in introns of the CLEC16A gene (Table 1 and table S3).
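The pooled case-control comparison can be illustrated with a minimal sketch of a two-sided Fisher exact test on a 2 × 2 table of allele counts. This is not the authors' analysis pipeline, which is described in the Materials and Methods. The function name `fisher_exact_two_sided` and the counts are hypothetical, and the sketch treats allele counts estimated from pooled reads as if they had been observed directly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. cases, controls); columns are allele classes
    (e.g. minor, major).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x column-1 alleles falling in row 1
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small relative tolerance guards against floating-point ties)
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))


# Hypothetical counts: 480 cases and 480 controls give 960 chromosomes each
p_value = fisher_exact_two_sided(6, 954, 24, 936)
print(f"P = {p_value:.2g}")
```

An exact test is the natural choice here because the minor-allele counts for rare variants are too small for the χ² approximation to be reliable.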

Table 1.

Association analysis of rare variants in sequenced pools of DNA from T1D patients and controls. Rare SNPs (MAF < 3%) associated with T1D with P < 0.05 are shown in this table. Results for all rare SNPs are shown in table S3. n, number.

We next studied two IFIH1 and two CLEC16A SNPs in individual DNA samples from 8379 T1D patients and 10,575 controls from Great Britain. We also studied IFIH1 SNPs in 3165 families from Europe and the USA that include one or more offspring with T1D and their parents. The two rare intronic CLEC16A SNPs were not associated (table S4), whereas both rare IFIH1 SNPs demonstrated strong statistical evidence of association with T1D, showing consistent effect in the case-control and family collections (combined P = 2.1 × 10⁻¹⁶ for rs35667974 and 1.4 × 10⁻⁴ for rs35337543, score test) (Table 2). SNP rs35667974 in exon 14 changes a conserved amino acid from Ile923 to Val923 (I923V) (fig. S2), whereas SNP rs35337543 resides within a conserved splice donor site at position +1 in intron 8. Apart from these two SNPs, our sequencing study identified other rare IFIH1 SNPs, including three nonsynonymous SNPs (nsSNPs) (ss107794691/K349R, ss107794690/T702I, and rs10930046/H460R), another SNP in a conserved splice donor site at position +1 in intron 14 (rs35732034), and a nonsense mutation in exon 10 (rs35744605). We genotyped these rare SNPs and found evidence of T1D association for the nonsense mutation rs35744605 and SNP rs35732034 located in the conserved splice site (Table 2), but not for nsSNPs K349R, T702I, or H460R (table S5). We did not genotype IFIH1 intronic and synonymous SNPs or very rare nsSNPs (MAF ≤ 0.2%).
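For the family collections, association evidence comes from how often heterozygous parents transmit an allele to affected offspring. The combined P values above come from a score test; as a simpler illustration of the same idea, here is a sketch of the classical transmission disequilibrium test (TDT) statistic, which is swapped in for illustration and is not the method used in the study. The function name `tdt` and the counts are hypothetical.

```python
from math import erfc, sqrt


def tdt(transmitted, nontransmitted):
    """Return (chi-square, P) for the transmission disequilibrium test.

    transmitted / nontransmitted: number of times heterozygous parents did or
    did not pass the allele of interest to an affected child.
    """
    total = transmitted + nontransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted - nontransmitted) ** 2 / total
    # Upper-tail probability of a chi-square distribution with 1 df
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: a protective allele is transmitted less often than expected
chi2, p_value = tdt(30, 60)
print(f"chi2 = {chi2:.2f}, P = {p_value:.2g}")
```

Under no association, each allele of a heterozygous parent is transmitted with probability one half, so a deficit of transmissions points to a protective allele.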

Table 2.

Association analysis of the four rare IFIH1 polymorphisms in T1D patients and controls and in families that have one or more offspring with T1D and their parents. Results for additional IFIH1 SNPs are shown in table S5. CI, confidence interval; T/NT, number of alleles transmitted and nontransmitted to the affected offspring.

We calculated linkage disequilibrium and found that it is low (r² < 0.04) between all four associated rare variants, indicating that association of one SNP cannot be explained by any of the other SNPs. We also genotyped two common nsSNPs, rs3747517/R843H and rs1990760/T946A (MAF > 25%), that had been found to be associated with T1D by GWASs (10, 12, 21), and we confirmed their association (table S5). We also used logistic regression analyses (22) and found that all four rare variants (rs35667974, rs35337543, rs35732034, and rs35744605) were associated with T1D, independently of each other and of the common nsSNP rs1990760/T946A (table S6), so these rare variants do not account for association of rs1990760/T946A detected previously by GWASs. The two common nsSNPs were in strong linkage disequilibrium with each other (r² = 0.60), and association of rs1990760/T946A explained the effect of rs3747517/R843H. Thus, in the IFIH1 gene, four rare polymorphisms and one common nsSNP (rs1990760/T946A) show independent association with T1D (fig. S3), although we cannot exclude the possibility that additional variants with weaker effects also exist in this gene. We therefore demonstrated T1D association and measured effects of each of the newly discovered rare variants separately, without grouping them (5, 6).
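The linkage disequilibrium measure r² used above has a simple definition in terms of allele and haplotype frequencies. The following is a minimal sketch of that standard formula, not the authors' code; the function name `ld_r2` and the frequencies are hypothetical.

```python
def ld_r2(p_a, p_b, p_ab):
    """Squared correlation (r^2) between alleles at two biallelic sites.

    p_a, p_b: frequencies of allele A at site 1 and allele B at site 2.
    p_ab:     frequency of the haplotype carrying both A and B.
    """
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom <= 0:
        raise ValueError("both sites must be polymorphic")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / denom


# Hypothetical example: two rare alleles that never occur on the same haplotype
print(ld_r2(0.01, 0.02, 0.0))
```

Two rare alleles that sit on different haplotypes necessarily have a very small r², which is why one such variant cannot account for the association signal of another.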

In a previous GWAS of 12,000 common nsSNPs, we identified a T1D-associated locus on chromosome 2q24 that included IFIH1, along with the FAP and GCA genes and part of the KCNH7 gene (fig. S4) (10). Although IFIH1 is a biologically plausible candidate gene, there was no evidence to indicate which of these genes is causative for T1D. Discovery of multiple rare T1D-associated variants in IFIH1 now points to its etiological role in T1D, because it is highly unlikely that multiple untested variants elsewhere in the region could explain association of the rare IFIH1 variants via linkage disequilibrium. We did not resequence the FAP, GCA, and KCNH7 genes, so we cannot formally exclude that they might also contain rare T1D-associated variants. This possibility is unlikely, but if true, it would not negate the role of IFIH1; instead, it would imply that IFIH1 is not the only T1D gene in this region.

All four associated rare IFIH1 variants have predicted biological effects, either truncating the protein (nonsense mutation rs35744605) or affecting essential splicing positions (rs35337543 and rs35732034) or a highly conserved amino acid (rs35667974/I923V) (fig. S2). These rare IFIH1 variants have stronger protective effects on T1D risk [odds ratio (OR) = 0.51 to 0.74] than does the common nsSNP rs1990760/T946A (OR = 0.86) (table S5). For example, the few individuals who carry valine at position 923 of the IFIH1 protein have only about half the risk of developing T1D of those who carry isoleucine. Our results suggest that, in complex diseases such as T1D, there may be no (or very few) low-frequency variants with very strong effects (allele OR > 3), even if such variants have large impacts on a certain molecule's function. This is possibly because, in complex multifactorial diseases, such a molecule and its biological pathway are just one of many contributing to the pathogenesis. Nevertheless, the discovery of such rare variants with the use of high-throughput sequencing will help to pinpoint disease genes in the associated loci found by GWASs in various complex diseases.
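The odds ratios quoted above compare the odds of carrying an allele in patients with the odds in controls. As a minimal sketch, assuming a plain 2 × 2 table of allele counts and the standard log-scale (Woolf) confidence interval rather than the regression models used in the study, the calculation looks like this. The function name `odds_ratio_ci` and the counts are hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio_ci(case_minor, case_major, ctrl_minor, ctrl_major, z=1.96):
    """Allelic odds ratio with an approximate 95% CI (Woolf's log method).

    All four counts must be positive.
    """
    odds_ratio = (case_minor * ctrl_major) / (case_major * ctrl_minor)
    # Standard error of log(OR)
    se = sqrt(1 / case_minor + 1 / case_major + 1 / ctrl_minor + 1 / ctrl_major)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical allele counts for a rare protective variant
or_, lower, upper = odds_ratio_ci(150, 16600, 370, 20780)
print(f"OR = {or_:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

When a disease is uncommon in the population, the odds ratio approximates the relative risk, which is why an OR near 0.5 can be read as roughly half the risk.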

IFIH1 (interferon induced with helicase C domain 1), also known as MDA5 (melanoma differentiation-associated protein 5), is a 1025–amino acid cytoplasmic protein that recognizes RNA of picornaviruses and mediates immune activation (23). Infection with enteroviruses, which belong to the picornavirus family, is more common among newly diagnosed T1D patients and prediabetic subjects than in the general population, and it precedes the appearance of autoantibodies (markers of prediabetes) (24). Enteroviruses are small RNA viruses that include coxsackie A and B, polioviruses, and echoviruses and cause common and often asymptomatic infections. Upon infection, IFIH1 senses the presence of viral RNA in the cytoplasm, triggers activation of NF-κB and interferon regulatory factor pathways, and induces antiviral interferon-β response (25). Although the mechanisms by which IFIH1 polymorphisms contribute to T1D pathogenesis remain to be explored, we note that one of the protective variants is a nonsense mutation leading to a truncated 626–amino acid protein lacking the C-terminal helicase domain (fig. S3), whereas two other protective variants localize to the conserved splice donor sites and probably disrupt normal splicing of the IFIH1 transcript. This suggests that variants predicted to reduce function of the IFIH1 protein decrease the risk of T1D, whereas normal IFIH1 function is associated with T1D. To elucidate a biological mechanism linking enterovirus infection with T1D, future functional experiments should test whether normal immune activation caused by enterovirus infection and mediated by IFIH1 protein may stimulate autoreactive T cells leading to T1D and whether blocking IFIH1 can disrupt this pathogenic mechanism.

We have found that rare alleles of all associated IFIH1 polymorphisms consistently protect from T1D, whereas IFIH1 alleles carried by the majority of the population predispose to the disease. This observation suggests that variants that disrupt IFIH1 function in the host antiviral response have been negatively selected, rather than positively selected because they confer protection from T1D.

Supporting Online Material

Materials and Methods

SOM Text

Figs. S1 to S5

Tables S1 to S6


References and Notes
