An Erythroid Enhancer of BCL11A Subject to Genetic Variation Determines Fetal Hemoglobin Level


Science  11 Oct 2013:
Vol. 342, Issue 6155, pp. 253-257
DOI: 10.1126/science.1242088

BCL11A Variants

Recent chromatin mapping data have suggested that trait-associated variants often mark regulatory DNA. However, there has been little rigorous experimental investigation of regulatory variation. Bauer et al. (p. 253; see the Perspective by Hardison and Blobel) performed an in-depth study of the BCL11A fetal hemoglobin-associated locus. The trait-associated variants revealed a chromatin signature that enhanced erythroid development. The enhancer was required for erythroid expression of BCL11A and thus for globin gene expression.


Genome-wide association studies (GWASs) have ascertained numerous trait-associated common genetic variants, frequently localized to regulatory DNA. We found that common genetic variation at BCL11A associated with fetal hemoglobin (HbF) level lies in noncoding sequences decorated by an erythroid enhancer chromatin signature. Fine-mapping uncovers a motif-disrupting common variant associated with reduced transcription factor (TF) binding, modestly diminished BCL11A expression, and elevated HbF. The surrounding sequences function in vivo as a developmental stage–specific, lineage-restricted enhancer. Genome engineering reveals the enhancer is required in erythroid but not B-lymphoid cells for BCL11A expression. These findings illustrate how GWASs may expose functional variants of modest impact within causal elements essential for appropriate gene expression. We propose the GWAS-marked BCL11A enhancer represents an attractive target for therapeutic genome engineering for the β-hemoglobinopathies.

Genome-wide association studies (GWASs) have identified numerous common single-nucleotide polymorphisms (SNPs) associated with human traits and diseases. However, advancing from genetic association to causal biologic process has been challenging (1). Recent genome-scale chromatin mapping studies have highlighted the enrichment of GWAS variants in regulatory DNA elements, suggesting many causal variants may affect gene regulation (2–6). GWASs of HbF level have identified trait-associated variants at BCL11A (supplementary text) (7–12). The transcriptional repressor BCL11A has been validated as a direct regulator of HbF level (13–18). Although constitutive BCL11A deficiency results in embryonic lethality and impaired lymphocyte development (19, 20), erythroid-specific deficiency of BCL11A counteracts developmental silencing of embryonic and fetal globin genes and rescues the hematologic and pathologic features of sickle cell disease (SCD) in mouse models (17).

To further understand how common genetic variation affects BCL11A, HbF level, and β-globin disorder severity, we compared the distribution of the HbF-associated SNPs at BCL11A with deoxyribonuclease I (DNase I) sensitivity, which is an indicator of chromatin state suggestive of regulatory potential. In primary human erythroblasts, three peaks of DNase I hypersensitivity were observed in intron-2, adjacent to and overlying the HbF-associated variants (Fig. 1A). We term these DNase I hypersensitive sites (DHSs) +62, +58, and +55 based on distance in kilobases from the transcription start site (TSS) of BCL11A. Brain and B-lymphocytes, two tissues that express high levels of BCL11A, and T-lymphocytes, which do not express BCL11A, showed distinct patterns of DNase I sensitivity at the BCL11A locus, with a paucity of hypersensitivity overlying the trait-associated SNPs (Fig. 1A and fig. S1).
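The naming convention for the DHSs can be illustrated with a minimal sketch. The function name `dhs_label` and the coordinates below are hypothetical and chosen only to show the arithmetic; they are not the real genomic positions. The minus-strand default reflects the statement in Fig. 1 that BCL11A transcription runs from right to left.

```python
def dhs_label(tss: int, site: int, strand: str = "-") -> str:
    """Return a label such as '+58' for a site's offset from the TSS, in kb."""
    # Downstream offsets are positive. On the minus strand, downstream
    # corresponds to smaller genomic coordinates.
    offset_bp = (tss - site) if strand == "-" else (site - tss)
    return f"{round(offset_bp / 1000):+d}"


# Hypothetical coordinates, for illustration only.
TSS = 1_000_000
for site in (938_000, 942_000, 945_000):
    print(dhs_label(TSS, site))
```

With these illustrative inputs the three sites are labeled +62, +58, and +55, matching the convention used in the text.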

Fig. 1 Chromatin state and TF occupancy at BCL11A.

(A) ChIP-seq from human erythroblasts with indicated antibodies. DNase I cleavage densities are from indicated human tissues. Three erythroid DHSs termed +62, +58, and +55 are based on distance in kilobases from BCL11A TSS. BCL11A transcription is from right to left. (B) ChIP–quantitative PCR from human erythroblasts at BCL11A intron-2. DHSs +62, +58, and +55 are boxed. Enrichment at negative (GAPDH and OCT4) and positive control (β-globin LCR HS3 and α-globin HS-40) loci are displayed. (C) Chromosome conformation capture in human erythroblasts using BCL11A promoter as anchor. Error bars indicate SD.

Chromatin immunoprecipitation sequencing (ChIP-seq) demonstrated histone modifications with an enhancer signature overlying the trait-associated SNPs at BCL11A intron-2, including the presence of H3K4me1 and H3K27ac and absence of H3K4me3 and H3K27me3 marks (Fig. 1A and fig. S1). The major erythroid TFs GATA1 and TAL1 also occupy this enhancer region. ChIP–quantitative polymerase chain reaction (PCR) confirmed three discrete peaks of GATA1 and TAL1 binding within BCL11A intron-2, each falling within an erythroid DHS (Fig. 1B). A common feature of distal regulatory elements is long-range interaction with cognate promoters. We evaluated the interactions between the BCL11A promoter and fragments across 250 kb of the BCL11A locus using a chromosome conformation capture assay. The greatest promoter interaction was observed within the region of intron-2 containing the trait-associated SNPs (Fig. 1C).

We hypothesized that the causal trait-associated SNPs could function by modulating critical cis-regulatory elements. Therefore, we performed extensive genotyping of SNPs within the three erythroid DHSs +62, +58, and +55 in 1263 DNA samples from the Cooperative Study of SCD (CSSCD) (21). We used 1178 individuals and 38 SNPs for association testing (fig. S2A). Analysis of common variants [minor allele frequency (MAF) > 1%] revealed that rs1427407 in DHS +62 had the strongest association to HbF level (P = 7.23 × 10⁻⁵⁰) (Fig. 2A, fig. S2B, and supplementary text). We identified associations to HbF level within the three DHSs that remained after conditioning on rs1427407 (Fig. 2A and fig. S2B), which is consistent with the hypothesis that multiple functional SNPs within the composite enhancer act combinatorially to influence BCL11A regulation. The most significant residual association was for rs7606173 in DHS +55 (P = 9.66 × 10⁻¹¹).
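The idea of conditioning on a lead SNP can be sketched in a few lines. This is an illustration under stated assumptions, not the study's analysis: genotypes are coded as minor-allele counts (0, 1, 2), the phenotype is a toy quantity built to have known effects, no covariates or P values are computed, and all function names are hypothetical. The sketch residualizes both the phenotype and the second SNP on the lead SNP and then regresses one set of residuals on the other, which gives the same coefficient as fitting both SNPs jointly.

```python
from statistics import mean


def slope(x, y):
    """Least-squares slope of y on x, with an intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx


def residuals(x, y):
    """Residuals of y after regressing it on x, with an intercept."""
    b = slope(x, y)
    mx, my = mean(x), mean(y)
    return [yi - (my + b * (xi - mx)) for xi, yi in zip(x, y)]


def conditional_slope(lead, snp, pheno):
    """Effect of snp on pheno after adjusting both for the lead SNP."""
    return slope(residuals(lead, snp), residuals(lead, pheno))


# Toy data (hypothetical): the phenotype is built as 1.0*lead + 0.5*snp.
lead = [0, 0, 1, 1, 2, 2, 0, 1]
snp = [0, 1, 0, 1, 1, 2, 2, 0]
pheno = [l + 0.5 * s for l, s in zip(lead, snp)]

print(slope(snp, pheno))                    # marginal effect of the second SNP
print(conditional_slope(lead, snp, pheno))  # effect after conditioning on lead
```

In this toy setup the conditional estimate recovers the built-in effect of 0.5 for the second SNP, whereas the marginal estimate also absorbs part of the lead SNP's effect because the two genotype vectors are correlated.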

Fig. 2 Regulatory variants at BCL11A.

(A) Genotype data obtained in 1178 individuals from CSSCD for 38 variants within BCL11A +62, +58, or +55 DHSs. Shown are most highly significant associations to HbF level among common (MAF > 1%) SNPs (n = 10 variants) before (rs1427407) or after (rs7606173) conditional analysis on rs1427407. SNP coordinates are chromosome 2, build hg19. (B) Chromatin from erythroblasts of individuals heterozygous for rs1427407, immunoprecipitated by GATA1 or TAL1 and pyrosequenced to quantify the relative abundance of the rs1427407-G allele. Composite half E-box–GATA motif previously identified (23) is shown. (C) gDNA and cDNA from erythroblasts of individuals heterozygous for rs1427407, rs7606173, and rs7569946. Haplotyping demonstrated rs7569946-G, rs1427407-G, and rs7606173-C on the same chromosome in each. Pyrosequencing was performed to quantify the relative abundance of the rs7569946-G allele.

The SNP rs1427407 falls within a peak of GATA1 and TAL1 binding (Fig. 1, A and B). The minor T-allele disrupts the G-nucleotide of a sequence element resembling a half E-box/GATA composite motif [CTG(n9)GATA], a consensus sequence enriched for chromatin bound by GATA1 and TAL1 complexes in erythroid cells (22, 23). We identified five primary erythroblast samples from individuals heterozygous for the major G-allele and minor T-allele at rs1427407 and subjected these samples to ChIP followed by pyrosequencing. As anticipated, we observed an even balance of alleles in the input DNA. However, we detected more frequent binding to the G-allele as compared with the T-allele in both the GATA1 and TAL1 immunoprecipitated chromatin samples (Fig. 2B).
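A short sketch shows how a single substitution can abolish a match to the composite motif CTG(n9)GATA. The two sequences below are hypothetical and are not the actual sequence flanking rs1427407. Placing the variant at the G of the CTG half E-box is an assumption made for illustration, and only the forward strand is scanned.

```python
import re

# Half E-box (CTG), nine arbitrary bases, then GATA, as written in the text.
MOTIF = re.compile(r"CTG[ACGT]{9}GATA")


def has_motif(seq: str) -> bool:
    """Return True if the forward strand of seq contains the composite motif."""
    return MOTIF.search(seq.upper()) is not None


# Hypothetical sequences differing at one position (G versus T).
major_allele = "AACTGACGTACGTAGATAAC"
minor_allele = "AACTTACGTACGTAGATAAC"

print(has_motif(major_allele))  # True: CTG, nine bases, GATA
print(has_motif(minor_allele))  # False: the CTG half E-box is lost
```

A pattern match of this kind only indicates that a motif is present or absent. Whether TF binding actually changes is an experimental question, which the allele-specific ChIP described above addresses.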

Because the common synonymous SNP rs7569946 lies within exon-4 of BCL11A, it can be used to discriminate expression of alleles. We identified three primary erythroblast samples doubly heterozygous for the rs1427407–rs7606173 haplotype and rs7569946. For each sample, we determined by means of molecular haplotyping that the major rs7569946 G-allele was in phase with the low-HbF–associated rs1427407–rs7606173 G–C haplotype (table S4) (24, 25). Pyrosequencing revealed that whereas the alleles were balanced in genomic DNA (gDNA), significant imbalance was observed in complementary DNA (cDNA) with 1.7-fold increased expression of the low-HbF–linked G-allele of rs7569946 (Fig. 2C and supplementary text).
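The arithmetic behind an allelic-imbalance estimate can be sketched as follows. The allele fractions are hypothetical: 0.63 was chosen only because it corresponds to roughly 1.7-fold. Normalizing the cDNA ratio to the gDNA ratio is an assumption about how such data are commonly handled, not a description of the authors' exact procedure, and the function names are illustrative.

```python
def allelic_ratio(fraction: float) -> float:
    """Convert one allele's fraction (between 0 and 1) to a ratio against the other."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("allele fraction must lie strictly between 0 and 1")
    return fraction / (1.0 - fraction)


def normalized_imbalance(cdna_fraction: float, gdna_fraction: float) -> float:
    """Allelic ratio in cDNA divided by the ratio in gDNA from the same sample."""
    return allelic_ratio(cdna_fraction) / allelic_ratio(gdna_fraction)


# Hypothetical pyrosequencing fractions of the G-allele.
gdna_g = 0.50  # balanced alleles in genomic DNA
cdna_g = 0.63  # excess of G-allele transcripts

print(round(normalized_imbalance(cdna_g, gdna_g), 2))
```

Dividing by the gDNA ratio corrects for any assay bias toward one allele, so that a balanced sample yields a value of 1.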

To understand the context within which these apparent regulatory trait-associated SNPs play their role, we explored the function of the harboring composite element. We cloned a 12.4-kb (+52.0 to 64.4 kb from TSS) human gDNA fragment containing the three erythroid DHSs in order to assay enhancer potential in a mouse transgenic lacZ reporter assay (fig. S4). Endogenous BCL11A shows abundant expression throughout the developing central nervous system, with much lower expression observed in the fetal liver (26). In contrast, we observed in the transgenic embryos reporter gene expression largely confined to the fetal liver, the site of definitive erythropoiesis, with weaker expression noted in the central nervous system (Fig. 3A).

Fig. 3 The GWAS-marked BCL11A enhancer is sufficient for adult-stage erythroid expression.

(A) A 12.4-kb fragment of BCL11A intron-2 (+52.0 to 64.4 kb from TSS) was cloned to a lacZ reporter construct. Shown is a transient transgenic mouse embryo from 12.5 dpc X-gal stained. Arrowhead indicates liver. (B) Cell suspensions isolated from peripheral blood (PB) and fetal liver (FL) of stable transgenic embryos at 12.5 dpc X-gal stained. (C) Sorted erythroblasts and B-lymphocytes from young adult stable transgenic mice subject to X-gal staining or RNA isolation followed by quantitative reverse transcription (RT)–PCR. Gene expression was normalized to glyceraldehyde-3-phosphate dehydrogenase and expressed relative to T-lymphocytes. Error bars indicate SD.

A characteristic feature of globin gene and BCL11A expression is developmental regulation (supplementary text). In stable transgenic BCL11A +52.0- to 64.4-kb reporter lines at 12.5 days post coitum (dpc), circulating primitive erythrocytes failed to stain for X-gal, whereas definitive erythroblasts in fetal liver robustly stained positive (Fig. 3B). Endogenous BCL11A was expressed at 10.4-fold–higher levels in B-lymphocytes as compared with erythroblasts. LacZ expression was restricted to erythroblasts and not observed in B-lymphocytes (Fig. 3C). These results indicate that the GWAS-marked BCL11A intron-2 regulatory sequences are sufficient to specify developmentally restricted, erythroid-specific gene expression.

We aimed to disrupt the enhancer to investigate its requirement for BCL11A expression. Because there are no suitable adult-stage human erythroid cell lines, we turned to the mouse erythroleukemia (MEL) cell line. We observed an orthologous enhancer signature at intron-2 of mouse Bcl11a indicated by sequence homology, erythroid-specific DNase I hypersensitivity, characteristic histone marks, and GATA1/TAL1 occupancy (fig. S6) (22, 27). Sequence-specific nucleases can produce small chromosomal deletions via nonhomologous end joining (NHEJ)–mediated repair (28). We engineered transcription activator-like effector nucleases (TALENs) to introduce double-strand breaks to flank the orthologous 10-kb Bcl11a intron-2 sequences carrying the erythroid enhancer chromatin signature (fig. S7A). Three different clones were isolated that had undergone biallelic excision of the intronic segment (figs. S7 and S8 and supplementary text). BCL11A transcript was profoundly reduced in the absence of the orthologous erythroid composite enhancer (Fig. 4A). BCL11A protein expression was not detectable in the enhancer-deleted clones (Fig. 4B). In the absence of the BCL11A enhancer, embryonic globin gene derepression was pronounced, with the ratio of embryonic εy to adult β1/2 globin increased by a mean of 364-fold (fig. S9).

Fig. 4 The GWAS-marked BCL11A enhancer is necessary for erythroid but dispensable for nonerythroid expression.

(A) Three MEL and two pre-B lymphocyte clones with biallelic deletion of the orthologous Bcl11a erythroid enhancer (Δ50.4 to 60.4) subject to quantitative RT-PCR. (B) Immunoblot of Δ50.4 to 60.4 MEL and pre-B lymphocyte clones.

To examine potential lineage-restriction of the requirement for the +50.4- to 60.4-kb intronic sequences for BCL11A expression, we evaluated their loss in a nonerythroid context. The same strategy of introduction of two pairs of TALENs to obtain clones with NHEJ-mediated deletion was used in a pre-B lymphocyte cell line. In contrast to the erythroid cells, BCL11A expression was retained in the Δ50.4- to 60.4-kb enhancer deleted pre-B cell clones at both the RNA and protein levels (Fig. 4, A and B). These results indicate that the orthologous erythroid enhancer sequences are essential for erythroid gene expression but are not required in B-lymphoid cells for integrity of transcription from the Bcl11a locus.

The prior identification of BCL11A as a critical repressor of HbF levels has raised new hope for mechanism-based therapeutic approaches to the β-hemoglobinopathies (29). However, the paradox that genetic variation at BCL11A is common, well-tolerated, and disease-protective despite the critical roles of BCL11A in neurogenesis and lymphopoiesis (19, 20, 30) has remained unresolved. We have demonstrated that the HbF-associated variants localize to an erythroid enhancer of BCL11A. Through allele-specific analyses, we show that genetic variation within this enhancer is associated with modest impact on TF binding, BCL11A expression, and HbF level. Relatively small effect sizes associated with individual variants may not be surprising given that most single-nucleotide substitutions, even within critical motifs, result in only modest loss of enhancer activity (31, 32). In contrast, loss of the BCL11A enhancer results in the absence of BCL11A expression in the erythroid lineage. Most trait-associated SNPs identified by GWAS are noncoding and have small effect sizes (1, 33). The impact of GWAS-identified SNPs on biological processes is often uncertain. Our findings underscore how a modest influence engendered by an individual noncoding variant neither predicts nor precludes a profound contribution of an underlying regulatory element.

Challenges to inhibiting BCL11A for mechanism-based reactivation of HbF include the supposedly “undruggable” nature of transcription factors (34) and its important nonerythroid functions (20, 30). With recent developments in their efficiency and precision, sequence-specific nucleases can be designed to exquisitely target genomic sequences of interest (35–37). We propose the GWAS-identified enhancer of BCL11A as a particularly promising therapeutic target for genome engineering in the β-hemoglobinopathies. Disruption of this enhancer would impair BCL11A expression in erythroid precursors with resultant HbF derepression while sparing BCL11A expression in nonerythroid lineages. Rational intervention might mimic common protective genetic variation.

Supplementary Materials

Materials and Methods

Supplementary Text

Figs. S1 to S9

Tables S1 to S6

References (38–54)

References and Notes

  1. Acknowledgments: We thank A. Woo, A. Cantor, M. Kowalczyk, S. Burns, J. Wright, J. Snow, J. Trowbridge, and members of the Orkin laboratory—particularly C. Peng, P. Das, G. Guo, M. Kerenyi, and E. Baena—for discussions. C. Guo and F. Alt provided the pre-B cell line; A. He and W. Pu provided the pWHERE lacZ reporter construct; C. Currie and M. Nguyen provided technical assistance; D. Bates and T. Kutyavin provided expertise with sequence analysis; R. Sandstrom provided help with data management; G. Losyev and J. Daley provided aid with flow cytometry; and J. Desimini provided graphical assistance. L. Yan at EpigenDx (Hopkinton, Massachusetts) conducted the custom pyrosequencing reactions. This work was funded by grants from the Doris Duke Charitable Foundation (2009089) and Canadian Institute of Health Research (123382) to G.L.; Amon Carter Foundation, Hyundai Hope on Wheels, NIH, Lucille Packard Foundation to M.H.P.; NIH grants U54HG004594 and U54HG007010 to J.A.S.; and NIH R01HL032259, P01HL032262, and P30DK049216 (Center of Excellence in Molecular Hematology) to S.H.O. D.E.B. is supported by National Institute of Diabetes and Digestive and Kidney Diseases Career Development Award K08DK093705. D.E.B., J.X., and S.H.O. are inventors on a patent application related to this work, filed by Boston Children’s Hospital. The CSSCD samples with DNA and associated phenotype information are available from the National Heart, Lung, and Blood Institute to researchers that have appropriate institutional review board approval to use the materials.