A Direct Estimate of the Human αβ T Cell Receptor Diversity

Science  29 Oct 1999:
Vol. 286, Issue 5441, pp. 958-961
DOI: 10.1126/science.286.5441.958


Generation and maintenance of an effective repertoire of T cell antigen receptors are essential to the immune system, yet the number of distinct T cell receptors (TCRs) expressed by the estimated 10¹² T cells in the human body is not known. In this study, TCR gene amplification and sequencing showed that there are about 10⁶ different β chains in the blood, each pairing, on the average, with at least 25 different α chains. In the memory subset, the diversity decreased to 1 × 10⁵ to 2 × 10⁵ different β chains, each pairing with only a single α chain. Thus, the naïve repertoire is highly diverse, whereas the memory compartment, here one-third of the T cell population, contributes less than 1 percent of the total diversity.

Adaptive immunity is dependent on a genetic recombination machinery that assembles a diverse set of functional immunoglobulin or TCR genes from a pool of discontinuous gene segments. The available pool, for the human αβ TCR, consists of 42 variable (V) and 61 joining (J) segments in the α locus and 47 V, two diversity (D), and 13 J segments in the β locus. During the Vα-Jα or Vβ-Dβ-Jβ rearrangement, nucleotide additions or deletions at the junctions add to the diversity (1, 2). As a result, most of the variation in each chain lies in the complementarity-determining region 3 (CDR3), which is encoded by the V(D)J junction and interacts with the antigenic peptide presented by the major histocompatibility complex molecule (3). A further diversifying factor is the pairing of an α chain to a β chain to form the TCR heterodimer. The potential diversity thus created has been calculated to be perhaps 10¹⁵ (2). Obviously, not all of it is used, but the T cell repertoire remains complex enough to have precluded any attempts to directly measure it. Here, we have analyzed a fraction of the repertoire in a way that allowed us to extrapolate the results to the whole repertoire (4).

To determine the diversity of the TCR β chain, we selected a V-gene family with a single member, Vβ18, and studied its rearrangement to the Jβ1.4 segment. Complementary DNA from 10⁸ peripheral blood T cells from a healthy donor (male, between 20 and 30 years old) was amplified with Vβ18- and Jβ1.4-specific primers and separated on an acrylamide gel, producing a pattern of eight bands spaced by three nucleotides that correspond to in-frame TCR transcripts with different CDR3 lengths (Fig. 1A). The band corresponding to a CDR3 with a length of 12 amino acids was purified and cloned, and all of the different sequences within it were then identified by sequencing (Fig. 1B) (5). The number of different sequences, 17, was used to calculate the β-chain diversity in the sample (Table 1). The intensity of the bands in a V-J profile is proportional to the diversity within them (6), so the 12–amino acid band contained 9.3% of the total Vβ18-Jβ1.4 diversity. The Jβ1.4 gene is used, on the average, by 3.0% of T cells (7), and the frequency of Vβ18+ cells in our donor was 0.8%. Together, these give an average frequency of (0.093 × 0.03 × 0.008)/17 = 1.3 × 10⁻⁶ for any one β sequence; that is, the sample contains 0.8 × 10⁶ different β chains. This was confirmed by an analysis of the Vβ16-Jβ2.2 rearrangement in the same sample. The Vβ16 family also consists of only one member, but unlike the Jβ1.4 gene, the Jβ2.2 gene may be rearranged to both Dβ genes and thus has, in principle, more potential to generate diversity. We found 15 different sequences with a 13–amino acid CDR3 and a total diversity of 0.8 × 10⁶ different β chains. In a second donor (female, 26 years old), the Vβ16-Jβ2.2, 13–amino acid band contained 14 different sequences, with the total diversity in this sample of 10⁸ T cells being 1.2 × 10⁶.
Thus, analysis of three rearrangements, from two donors, produces estimates that are in close agreement and indicates that the repertoire in the blood consists of at least 10⁶ different β chains.
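The arithmetic behind the Vβ18-Jβ1.4 estimate can be retraced in a few lines. The following is a minimal sketch, not the authors' code: the function and variable names are our own, and it assumes, as the extrapolation in the text does, that every sequence within the band has the same average frequency.

```python
def beta_chain_diversity(n_sequences, band_fraction, j_usage, v_frequency):
    """Return (mean frequency of one beta sequence, estimated number of distinct beta chains).

    Hypothetical helper: all names are illustrative, the inputs come from the text.
    """
    # Share of all T cells that carry this V-J pair at this CDR3 length.
    band_share = band_fraction * j_usage * v_frequency
    # Average frequency of a single sequence, assuming equal sizes within the band.
    mean_frequency = band_share / n_sequences
    # The reciprocal of the mean frequency is the extrapolated diversity.
    return mean_frequency, 1.0 / mean_frequency


# Values quoted for the first donor: 17 sequences, 9.3% band, 3.0% Jβ1.4, 0.8% Vβ18.
freq, diversity = beta_chain_diversity(17, 0.093, 0.03, 0.008)
print(f"{freq:.1e}")       # 1.3e-06
print(f"{diversity:.1e}")  # 7.6e+05, which the text rounds to 0.8 × 10⁶
```

The inputs for the Vβ16-Jβ2.2 estimates are listed in Table 1 rather than in the text, so only the first rearrangement is reproduced here.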

Figure 1

Analysis of the Vβ18-Jβ1.4 rearrangement. Complementary DNA from 10⁸ T cells was amplified by Vβ18- and Cβ-specific primers for 40 cycles, followed by a run-off reaction with a fluorescent Jβ1.4-specific primer. (A) Electrophoresis of the PCR products. The denatured amplification products were separated on a 6% acrylamide gel, and the bands are shown as converted to peaks (4). The band corresponding to CDR3 with a length of 12 amino acids (aa) (arrow) was excised after silver staining and cloned, after 20 PCR cycles, to pCR2.1 vector with the TOPO-TA cloning system (Invitrogen). (B) Sequencing of the clones. The inserts were amplified with M13-40 and reverse primer, and the product was treated with exonuclease I and shrimp alkaline phosphatase. The sequencing reaction was done with M13-20 primer and Big Dye Terminator mix and analyzed with an Applied Biosystems 377 sequencer. Sequencing was continued until it was obvious that no new distinct sequences would be found. The sequences are plotted in the order of appearance.

Table 1

Determination of TCR diversity. The average frequency of a given β or α sequence was calculated from these values and the number of different sequences found, as described in the text.

To determine how well our sample reflected the whole diversity in blood, we analyzed the Vβ18-Jβ1.4 12–amino acid rearrangement in two more samples of 10⁸ T cells from the first donor. Overall, 69% of the distinct sequences were shared between the three samples. The second sample contained four previously unidentified sequences, and the third sample contained only one. This suggests that together these three samples contain most of the total β-chain diversity in the blood. Thus, there are probably fewer than 30 different 12–amino acid Vβ18-Jβ1.4 sequences present in the blood of this donor, or, calculated as above, no more than 1.3 × 10⁶ different β chains.
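The ceiling of 1.3 × 10⁶ follows from the same band frequencies used for the first sample. The sketch below is illustrative only: the variable names are ours, and the per-sample counts (17 sequences in the first sample, then four and one new ones) are taken from the text.

```python
# Newly identified 12-amino acid Vβ18-Jβ1.4 sequences in samples 1, 2 and 3.
new_per_sample = [17, 4, 1]

# Running total of distinct sequences as samples are added.
cumulative = []
total = 0
for n in new_per_sample:
    total += n
    cumulative.append(total)

# Share of T cells in this band: 9.3% band x 3.0% Jβ1.4 usage x 0.8% Vβ18 frequency.
band_share = 0.093 * 0.03 * 0.008

# Ceiling on distinct sequences in the band, as stated in the text.
ceiling_sequences = 30
beta_upper_bound = ceiling_sequences / band_share

print(cumulative)                 # [17, 21, 22]
print(f"{beta_upper_bound:.2e}")  # 1.34e+06
```

The falling number of new sequences per sample (17, 4, 1) is what motivates treating 30 as a generous ceiling.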

Nevertheless, it could still be argued that by random sequencing we might, despite the apparent plateau in the appearance of new sequences, miss less frequent clones. Indeed, a recent estimate of the human β-chain diversity in the CD4+ population, based on limiting dilution analysis of the frequency of arbitrarily selected β-chain sequences, suggests a diversity that is an order of magnitude higher than that observed by sequencing (8). To address this possibility, we analyzed the second sample from the first donor by colony hybridization. We designed a panel of oligonucleotides, specific to the Vβ18-Jβ1.4 12–amino acid CDR3 sequences already found by random sequencing, to look for colonies that did not give a positive signal and might thus contain new CDR3 sequences. Among >3000 colonies, 150 did not give a signal and were sequenced, but no previously unidentified sequences were found (9). Another panel of oligonucleotides was used, specific to CDR3 sequences apparently absent from the second sample but present in the other samples from the same donor. By using this panel, we tried to determine whether these sequences were truly absent or simply not yet found. Among >2500 colonies, 16 were positive and were sequenced, but none contained new CDR3 sequences. Thus, even by increasing the number of analyzed clones by 10-fold, no further distinct sequences were found, including those present in other samples from the same donor. A further indication that our method detected practically all of the sequences present in the sample was that, within a sample of 750,000 cells, we measured a diversity of 500,000 different β chains (10).

Another layer of diversity is introduced by the heterodimerization of TCR chains. To estimate how freely a given β chain may associate with different α chains, we isolated 4 × 10⁶ Vα12+ T cells (11) and determined the β-chain diversity within this sample. Two β rearrangements were sequenced, indicating a diversity of 0.5 × 10⁶ to 0.7 × 10⁶ different β chains in the Vα12+ sample (Table 1). The frequency of the Vα12 gene in this donor was 2.5%, or one-fortieth of the total α-chain repertoire. Therefore, the total αβ TCR diversity must be at least 40 times the β-chain diversity observed in the Vα12+ fraction alone, that is, 25 × 10⁶. Comparison with total β-chain diversity, 10⁶, shows that each β chain must, on the average, pair with at least 25 different α chains. This is probably an underestimate, if only because it assumes that every β chain within the Vα12+ population represents a clone, which is unlikely to be the case. We also determined the α-chain diversity itself, although it is not needed for the estimate of αβ TCR diversity. Analysis of two CDR3 lengths of the Vα12-Jα20 rearrangement, performed as described for the β chain, indicated a repertoire of ∼0.5 × 10⁶ different α chains (Table 1).
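The pairing estimate is a scaling argument: the β-chain diversity found within the Vα12+ fraction is multiplied by the inverse of the Vα12 frequency. A minimal sketch of that step follows, with variable names of our own choosing and the quoted range carried through explicitly.

```python
# Frequency of Vα12 in this donor (2.5%, i.e. one-fortieth of the α repertoire).
valpha12_frequency = 0.025

# Range of β-chain diversity measured in the sorted Vα12+ sample.
beta_in_valpha12 = (0.5e6, 0.7e6)

# Total β-chain diversity in blood, from the unsorted samples.
total_beta = 1e6

# Scale the Vα12+ fraction up to the whole α repertoire.
scale = 1.0 / valpha12_frequency
tcr_low, tcr_high = (scale * b for b in beta_in_valpha12)

# Average number of different α chains per β chain.
pairs_low, pairs_high = tcr_low / total_beta, tcr_high / total_beta

print(f"{tcr_low:.1e} to {tcr_high:.1e} TCRs")
print(f"{pairs_low:.0f} to {pairs_high:.0f} alpha chains per beta chain")
```

The range comes out at roughly 20 × 10⁶ to 28 × 10⁶ TCRs, or 20 to 28 α chains per β chain; the figure of 25 quoted in the text lies within it.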

Finally, we analyzed 15 × 10⁶ CD45RO+ memory T cells (11) from a healthy 38-year-old female in whom this subset comprised 37% of T cells. Sequencing of two rearrangements (12) indicated a β-chain diversity of 0.1 × 10⁶ to 0.2 × 10⁶, or 10 to 20% of that in the total population (Table 1). Moreover, unlike in the total population, the α to β pairing was highly restricted. Within the CD45RO+ population, sorting of Vα12+ cells drastically decreased the β-chain diversity. In this donor, Vα12 was expressed by 12%, or one-eighth, of the memory cells. At least 75% of Vβ-Jβ rearrangements observed in the Vα12− population were absent in the Vα12+ cells (Fig. 2), and sequencing of the nine–amino acid peak of the Vβ17-Jβ2.7 rearrangement showed that, within a given rearrangement, the diversity was further decreased fourfold. Thus, within the memory compartment, each β sequence is probably paired with only one α chain, and the memory αβ TCR repertoire of this donor is not larger than 2 × 10⁵. Although factors such as age and antigenic exposure may cause individual variation, they are unlikely to change this difference between naïve and memory repertoire.
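Setting the memory repertoire against the total makes the imbalance concrete. The sketch below is our own illustration: it combines the upper bound for memory cells with the lower bound of 25 × 10⁶ for the total repertoire, so the resulting share is itself an upper bound.

```python
# Upper end of the measured memory β-chain diversity.
memory_beta_max = 0.2e6

# Each memory β chain is taken to pair with a single α chain, as the text concludes.
alpha_per_beta_memory = 1
memory_tcr_max = memory_beta_max * alpha_per_beta_memory

# Lower bound on total αβ TCR diversity in blood.
total_tcr_min = 25e6

# Fraction of this donor's T cells that are CD45RO+ memory cells.
memory_cell_fraction = 0.37

memory_diversity_share = memory_tcr_max / total_tcr_min

print(f"memory cells:     {memory_cell_fraction:.0%} of T cells")         # 37%
print(f"memory diversity: at most {memory_diversity_share:.1%} of TCRs")  # 0.8%
```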

Figure 2

The β-chain diversity in Vα12+ and Vα12− memory T cells. Peripheral blood lymphocytes were enriched for CD45RO+ cells by immunoaffinity chromatography, and then the CD45RO+ Vα12+ population was isolated to a purity of 95% by cell sorting. Complementary DNA from 150,000 sorted Vα12+ or Vα12− memory cells was amplified with Vβ17- and Cβ-specific primers, followed by a run-off reaction with fluorescent primers that were specific for each of the 13 Jβ genes. Detectable rearrangements are shown as black squares if shared by the two populations and as gray squares if otherwise. Three-quarters of the rearrangements found in the Vα12− population are absent from the Vα12+ population. The Vβ16-Jβ and Vβ18-Jβ rearrangements were also analyzed, with results similar to those shown here.

Our findings firmly place a lower limit to the total αβ TCR diversity in the blood at 25 × 10⁶ different TCRs. The upper limit is set by the number of different α chains that each of the 10⁶ β chains can pair with. In both humans and mice, during thymic development, the β chain is rearranged first, after which the cells proliferate until the α rearrangement (13). In mice, this proliferation results in a 20- to 50-fold expansion in thymocyte numbers, with an interval between β and α rearrangement of ∼2 days (14). Not all of the potential pairs will be functional or survive selection, and in the mature repertoire, each β chain pairs, on the average, with only three or four different α chains (15). The scanty existing data for humans suggest a longer period of perhaps 5 days between the β and α rearrangement (16), which could allow an ∼1000-fold expansion. In the mature repertoire, the upper limit of average pairing may therefore be ∼100 different α chains per β chain. Other mechanisms that might modify the α to β pairing and contribute to the difference between humans and mice include less stringent allelic exclusion of the α chain or more extensive editing of the α chain in humans than in mice (17).
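The two limits discussed here bracket the repertoire size. In the sketch below, the lower bound is the one stated in the text; the upper bound of 10⁸ is not stated explicitly but is our own product of the 10⁶ β chains and the ∼100 α chains per β chain, so it should be read as an implied figure only.

```python
# Total β-chain diversity in blood.
beta_chains = 1e6

# Measured lower bound and suggested upper bound on α chains per β chain.
alpha_per_beta_lower = 25
alpha_per_beta_upper = 100  # approximate figure from the text

# Calculated potential diversity of the αβ TCR.
potential_diversity = 1e15

tcr_lower = beta_chains * alpha_per_beta_lower
tcr_upper = beta_chains * alpha_per_beta_upper  # implied, not quoted in the text

print(f"lower bound: {tcr_lower:.1e}")
print(f"upper bound: {tcr_upper:.1e}")
print(f"share of potential diversity used: at most {tcr_upper / potential_diversity:.0e}")
```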

The naïve repertoire consists of cells that have survived the negative and positive selection in the thymus, perhaps only 5% of the thymocyte population (18). From this pool, some cells are selected by antigen to proliferate. At the height of the response, the antigen-specific T cells may account for 25% of the population in the spleen and even more locally (19). Although most have a life-span of only a few days, some will survive as long-lived memory cells. Another feature of the response is its relative oligoclonality, shown, for example, for influenza and Epstein-Barr viruses (20). Our results reflect these facts in that although in our donor the memory compartment comprised one-third of the total T cell population, it contributed <1% to the total diversity. The overall diversity may be preserved by independent homeostatic regulation of the naïve and memory populations (21) and, perhaps, by preferred survival of recent thymic migrants (22). Indeed, it has been reported that the thymus retains its function as a primary lymphoid organ well into adulthood, albeit with a decreasing output (23). The maintenance of a stable level of total diversity is also indirectly supported by the convergent values we obtained from two different donors. Thus, by maintaining two pools of T cells with a very different repertoire, the immune system combines two conflicting needs: a recognition of a wide array of antigens and an efficient and timely response.

  • * Present address: Department of Virology, Haartman Institute, University of Helsinki, Haartmaninkatu 3, FIN-00014 Helsinki, Finland.

  • To whom correspondence should be addressed. E-mail: petteri.arstila{at}

