Clonal analysis of immunodominance and cross-reactivity of the CD4 T cell response to SARS-CoV-2


Science  18 Jun 2021:
Vol. 372, Issue 6548, pp. 1336-1341
DOI: 10.1126/science.abg8985

Probing CD4 T cell immunity to SARS-CoV-2

A better understanding of CD4+ T cell responses to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is crucial to the design of effective next-generation vaccines. Low et al. defined and estimated the CD4+ T cell repertoire of convalescent COVID-19 patients. After sorting various CD4+ T cell subsets, they generated numerous T cell clones that reacted to the SARS-CoV-2 spike protein. A large number of T cell clones from almost all individuals recognized a small conserved immunodominant region within the spike protein receptor-binding domain (RBD). The researchers isolated T cell clones that broadly reacted to the spike protein of other coronaviruses, providing evidence for the recall of preexisting cross-reactive memory T cells after SARS-CoV-2 infection.

Science, abg8985, this issue p. 1336


The identification of CD4+ T cell epitopes is instrumental for the design of subunit vaccines for broad protection against coronaviruses. Here, we demonstrate in COVID-19–recovered individuals a robust CD4+ T cell response to naturally processed severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike (S) protein and nucleoprotein (N), including effector, helper, and memory T cells. By characterizing 2943 S-reactive T cell clones from 34 individuals, we found that the receptor-binding domain (RBD) is highly immunogenic and that 33% of RBD-reactive clones and 94% of individuals recognized a conserved immunodominant S346–S365 region comprising nested human leukocyte antigen DR (HLA-DR)– and HLA-DP–restricted epitopes. Using pre– and post–COVID-19 samples and S proteins from endemic coronaviruses, we identified cross-reactive T cells targeting multiple S protein sites. The immunodominant and cross-reactive epitopes identified can inform vaccination strategies to counteract emerging SARS-CoV-2 variants.

The identification of T cell epitopes in disease-causing organisms is challenging in view of the polymorphism of human leukocyte antigen (HLA) molecules and the variability of rapidly mutating pathogens. In the context of the COVID-19 pandemic, bioinformatic analysis (1) has been used to predict T cell epitopes in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) proteins (2, 3) and to produce peptide pools to stimulate peripheral blood mononuclear cells (PBMCs) and enumerate antigen-specific T cells. These studies revealed a robust CD4+ and CD8+ T cell response against SARS-CoV-2 proteins in recovered patients (2–6) and a level of cross-reactivity with endemic coronaviruses in pre-pandemic samples (7–9).

A limitation of bioinformatics predictions is the difficulty in identifying immunodominant epitopes, because immunodominance is determined by multiple factors such as antigen processing, T cell repertoire, HLA alleles, and preexisting cross-reactive immunity (10–12). To identify naturally processed immunodominant CD4+ T cell epitopes, we took the unbiased approach of stimulating T memory (Tm) cells with protein-pulsed antigen-presenting cells (APCs), followed by the isolation of T cell clones to precisely map the epitope recognized (13).

PBMCs from a first cohort of 14 patients who had recovered from mild to severe COVID-19 (table S1) were used to isolate total CD4+ Tm cells or T central memory (Tcm), T effector memory (Tem), and circulating T follicular helper (cTfh) cells (fig. S1A). The cells were labeled with carboxyfluorescein diacetate succinimidyl ester (CFSE) and stimulated with autologous monocytes in the presence of recombinant SARS-CoV-2 spike (S) protein or nucleoprotein (N). In all individuals, we observed a strong response to both antigens in terms of proliferation and interferon-γ (IFN-γ) production (Fig. 1, A and B, and fig. S1, B and C). Proliferating cells were detected at different levels in Tcm, Tem, and cTfh cells, consistent with a recent report (14), and over a 1-year period (fig. S1D). By contrast, the CD4+ Tm cell response to SARS-CoV-2 proteins in unexposed individuals was low or undetectable (Fig. 1B and fig. S1C), consistent with the presence of a few cross-reactive T cells primed by endemic coronaviruses (4, 5, 9).

Fig. 1 Robust T cell response to SARS-CoV-2 S and N proteins in CD4+ Tm cell subsets.

Total CD4+ memory T cells from seven COVID-19–recovered patients and six unexposed (pre–COVID-19) healthy donors (HD) or CD4+ Tcm, Tem, and cTfh cells from seven COVID-19–recovered patients were labeled with CFSE and cultured with autologous monocytes in the presence or absence of recombinant SARS-CoV-2 S or N protein. (A) CFSE profiles on day 7 and percentage of CFSElo proliferating Tcm, Tem, and cTfh cells in a representative recovered patient. Negative controls of T cells cultured with monocytes in the absence of antigen are shown as red lines. (B) Individual values and median and quartile values of the percentage of CFSEloCD25+ICOS+ cells in total CD4+ Tm cells and CD4+ Tcm, Tem, and cTfh cell subsets in recovered patients and healthy donors. Also shown are IFN-γ concentrations in culture supernatants of Tm cell subsets from recovered patients at day 7 post stimulation with SARS-CoV-2 S or N protein. IFN-γ concentrations were below the detection limit in HD and in negative control cultures. ****P < 0.0001, ***P < 0.001, and **P < 0.01 as determined by two-tailed unpaired t test (total CD4+ Tm and IFN-γ) or by two-tailed paired t test (CD4+ Tcm, Tem, and cTfh cells). (C) Pairwise comparison of TCR Vβ clonotype frequency distribution in samples of T cells isolated from S protein–stimulated Tcm, Tem, or cTfh cell subsets (initial input, 5 × 10^5 cells per subset) from P33. Frequencies are shown as a percentage of productive templates. The total number of clonotypes is indicated on the x- and y-axes. Values in the upper right corner represent the number of clonotypes shared between the two samples. The Venn diagrams show the number of unique and shared clonotypes between the Tcm, Tem, and cTfh cell subsets. (D) Bar histograms showing the Chao–Jaccard similarity index between pairs of TCR Vβ repertoires in three donors.

The clonal composition of SARS-CoV-2–reactive T cells and the relationship between different memory subsets was studied in three individuals (P28, P31, and P33) by T cell receptor (TCR) Vβ sequencing. The Tcm, Tem, and cTfh cell lines comprised, on average, 908, 480, and 697 S-reactive clonotypes and 1452, 623, and 908 N-reactive clonotypes, respectively (Fig. 1C and fig. S2). Unexpectedly, several of the most expanded clonotypes were shared between two subsets, and even among all three subsets (Fig. 1, C and D), indicating a polyfunctional response consistent with previous studies on intraclonal diversification of antigen-primed CD4+ T cells (15, 16).
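The sharing of expanded clonotypes between subsets can be made concrete with a small calculation. The following is a minimal sketch of an abundance-based Jaccard index between two clonotype count tables, in the form U·V / (U + V − U·V), where U and V are the fractions of each sample's templates that belong to shared clonotypes. It omits the correction for unseen shared clonotypes that the full Chao estimator applies, so it should not be read as the exact computation behind Fig. 1D. The function name and the CDR3 strings are made up for illustration.

```python
def abundance_jaccard(sample_a, sample_b):
    """Abundance-based Jaccard similarity between two repertoires.

    Each sample maps a clonotype identifier (e.g. a CDR3 sequence) to its
    template count. This is the uncorrected form; the unseen-shared-clonotype
    adjustment of the full Chao estimator is deliberately left out.
    """
    total_a = sum(sample_a.values())
    total_b = sum(sample_b.values())
    shared = set(sample_a) & set(sample_b)
    if not shared or total_a == 0 or total_b == 0:
        return 0.0
    # Fraction of each sample's templates carried by shared clonotypes
    u = sum(sample_a[c] for c in shared) / total_a
    v = sum(sample_b[c] for c in shared) / total_b
    return u * v / (u + v - u * v)


# Hypothetical counts for two memory subsets (sequences are invented)
tcm = {"CASSLGQ": 50, "CASSPDR": 30, "CASRTGE": 20}
tem = {"CASSLGQ": 40, "CASSQET": 60}

similarity = abundance_jaccard(tcm, tem)
print(round(similarity, 3))
```

Because the index weights clonotypes by abundance, a single highly expanded shared clonotype can raise the similarity substantially even when most clonotypes are unique to one subset.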

In view of the interest in the design of a subunit vaccine, we analyzed in depth the CD4+ T cell response to the S protein, in particular to the receptor-binding domain (RBD), which is the main target of neutralizing antibodies (17, 18). CD4+ T cells from a larger cohort of 34 COVID-19-recovered individuals (table S1) were stimulated with S protein–pulsed monocytes, and proliferating T cells were cloned by limiting dilution. We obtained 2943 T cell clones and mapped their specificity using three pools of peptides spanning S1ΔRBD, RBD, and S2 (Fig. 2, A and B). RBD-specific T cell clones were found in 32 out of 34 donors, accounting for, on average, 20% of the response to the S protein (Fig. 2B). Using a matrix-based approach, we mapped the epitope specificity of 1254 RBD-reactive CD4+ T cell clones (Fig. 2C) and found that, in each individual, the clones recognized multiple sites that collectively spanned almost all of the RBD sequence. However, certain regions emerged as immunodominant, such as those spanning residues S346–S385 and S446–S485. A 20–amino acid region (S346–S365) was recognized by 94% of the individuals (30 out of 32) and by 33% of the clones (408 out of 1254) (Fig. 2D). This region is highly conserved among human sarbecoviruses, including the recently emerged variants of concern and zoonotic sarbecoviruses (Fig. 2E) (19). RBD- and S346–S365-specific T cell clones were found in different memory subsets of COVID-19-recovered individuals and were also isolated from individuals after SARS-CoV-2 mRNA vaccination (fig. S3). Thus, RBD is highly immunogenic in vivo and contains a large number of naturally processed T cell epitopes, including a conserved immunodominant region.

Fig. 2 CD4+ T cell clones target multiple sites on the S protein.

(A and B) CD4+ T cell clones (n = 2943) were isolated from S-reactive T cell cultures from 34 COVID-19-recovered individuals, and their specificity was mapped by stimulation with autologous B cells and three pools of 15-mer peptides overlapping by 10 amino acids spanning the S1–S325 and S536–S685 sequences (S1ΔRBD, 91 peptides), the S316–S545 sequence (RBD, 44 peptides), and the S676–S1273 sequence (S2, 118 peptides), using 3H-thymidine incorporation as the readout. (A) Characterization of representative T cell clones (n = 72) from P20. Proliferation was assessed on day 3 after a 16-hour pulse with 3H-thymidine and is expressed as counts per minute after subtraction of the unstimulated control value (Δcpm). (B) Percentage of T cell clones specific for S1ΔRBD (white), RBD (black), and S2 (gray) in the 34 individuals tested. The number of clones tested is indicated on the right. The distribution of all S protein–reactive T cell clones isolated from all 34 individuals (ALL, n = 2943) is also indicated. (C) RBD-specific T cell clones (n = 1254) isolated from 32 individuals were further characterized for their epitope specificity using 15-mer peptides overlapping by 10 amino acids spanning the S316–S545 RBD sequence. The 20-mer specificity of each clone is represented by a horizontal line, and the total number of clones mapped for each individual is indicated on the right. (D) Percentage of clones specific and percentage of individuals carrying T cells specific for different 20-mer segments of the RBD. Data for the immunodominant region S346–S365 are shown in black. (E) Sequence alignments of the SARS-CoV-2–immunodominant region S346–S365 with homologous sequences in different sarbecoviruses, human and animal SARS-related coronaviruses, and alpha and beta coronaviruses. Dots indicate amino acid residues identical to SARS-CoV-2 reference strain; dashes indicate deletions.
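The peptide counts in this legend follow from the tiling scheme: 15-mers that overlap by 10 residues advance 5 residues at a time. The sketch below reproduces the pool sizes from the residue ranges given above. The function name is hypothetical, and the handling of the last S2 peptide (one extra 15-mer anchored at the final residue when the 5-residue step does not land exactly on the end) is an assumption chosen because it matches the reported count of 118; the source does not state how the C-terminal peptide was designed.

```python
def tile_peptides(first, last, length=15, step=5):
    """Return (start, end) residue ranges, 1-based and inclusive, of
    overlapping peptides covering residues first..last."""
    windows = []
    start = first
    while start + length - 1 <= last:
        windows.append((start, start + length - 1))
        start += step
    # Assumption: if the regular tiling leaves C-terminal residues
    # uncovered, add one final peptide ending at the last residue.
    if windows and windows[-1][1] < last:
        windows.append((last - length + 1, last))
    return windows


rbd = tile_peptides(316, 545)
s1_delta_rbd = tile_peptides(1, 325) + tile_peptides(536, 685)
s2 = tile_peptides(676, 1273)

print(len(s1_delta_rbd), len(rbd), len(s2))  # 91 44 118
```

Two adjacent RBD peptides in this tiling, S346–S360 and S351–S365, together span the 20-residue segment S346–S365, which is why clone specificities in panel C are reported as 20-mers.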

To study the CD4+ T cell response to the immunodominant S346–S365 region, we sequenced TCR Vβ chains of 329 specific T cell clones. The 206 clonotypes identified used a broad spectrum of TCR Vβ genes and, even in the same individual, carried different CDR3 sequences (Fig. 3, A and B, and table S2). In P31 and P33, certain S346–S365 clonotypes were detected among the top 5% expanded Tm cells ex vivo (Fig. 3C). Using blocking antibodies, we determined that most of the T cell clones analyzed (n = 247 from 22 individuals) were HLA-DR restricted, whereas the remaining clones (n = 50 from five individuals) were HLA-DP restricted and one was HLA-DQ restricted (Fig. 3, D and E). Using truncated peptides and T cell clones from individuals with different HLA types (table S3), we defined two HLA-DR–restricted epitopes (VYAWNRKRIS and RFASVYAWNRKR) and one HLA-DP–restricted epitope (NRKRISNCVAD) (Fig. 3F). Thus, the S346–S365 region comprises at least three nested epitopes recognized in association with different allelic forms of HLA-DR or HLA-DP by T cell clones that use a large set of TCR Vβ genes and CDR3 of different sequence and length.
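To visualize how the three epitopes nest within the region, the following sketch locates each minimal peptide inside the S346–S365 segment and reports the residues common to all three. The 20-residue sequence used here is an assumption taken from the SARS-CoV-2 reference spike sequence; it is not spelled out in the text, and the variable and function names are illustrative only.

```python
REGION_START = 346
# Assumed S346-S365 sequence from the reference spike protein
REGION = "RFASVYAWNRKRISNCVADY"

# Minimal epitopes as reported in the text
EPITOPES = {
    "HLA-DR epitope 1": "VYAWNRKRIS",
    "HLA-DR epitope 2": "RFASVYAWNRKR",
    "HLA-DP epitope": "NRKRISNCVAD",
}


def locate(epitope, region=REGION, region_start=REGION_START):
    """Return the (first, last) spike residue numbers of an epitope."""
    idx = region.find(epitope)
    if idx < 0:
        raise ValueError(f"{epitope} not found in region")
    return region_start + idx, region_start + idx + len(epitope) - 1


spans = {name: locate(seq) for name, seq in EPITOPES.items()}

# Residues shared by all three epitopes (intersection of the spans)
core_start = max(first for first, _ in spans.values())
core_end = min(last for _, last in spans.values())
core = REGION[core_start - REGION_START: core_end - REGION_START + 1]

for name, (first, last) in spans.items():
    print(f"{name}: S{first}-S{last}")
print(f"shared residues: S{core_start}-S{core_end} ({core})")
```

Under this assumed sequence the three epitopes occupy S350–S359, S346–S357, and S354–S364, and overlap on the four residues S354–S357. The overlap is a positional observation only; it does not by itself indicate which residues contact HLA or the TCR.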

Fig. 3 The immunodominant S346–S365 RBD region contains nested epitopes targeted by a diverse repertoire of T cells restricted by HLA-DR and HLA-DP.

(A and B) Rearranged TCR Vβ sequences of S346–S365-reactive CD4+ T cell clones (n = 329) isolated from 25 COVID-19–recovered individuals as determined by reverse transcription polymerase chain reaction and Sanger sequencing. (A) TCR Vβ gene usage of the 206 unique clonotypes. Slices in the chart represent different Vβ genes, and their size is proportional to the number of clonotypes using that particular gene. Color-coded legend is reported for the top 18 Vβ genes (used by at least five different TCR Vβ clonotypes). (B) Number of S346–S365-reactive T cell clones and clonotypes identified in the 25 individuals. Slices in the charts represent different TCR Vβ clonotypes, and their size is proportional to the number of sister clones bearing the same sequence. The number of clones is reported on the top and the number of clonotypes at the center of the pie chart. (C) Frequency distribution of TCR Vβ clonotypes from CD4+ Tcm, Tem, and cTfh cell subsets sequenced directly after ex vivo isolation from P31 and P33. Colored circles mark the TCR Vβ clonotypes found among the S346–S365-specific T cell clones isolated from the same individual. Dotted lines in the graphs indicate the frequency threshold of the top 5% expanded clonotypes. (D) HLA class II isotype restriction of S346–S365-specific T cell clones (n = 10) isolated from P33 as determined by stimulation with peptide-pulsed autologous APCs in the absence (control) or presence of blocking antibodies to HLA-DR, HLA-DP, HLA-DQ, or pan–HLA class II. Proliferation was assessed on day 3 after a 16-hour pulse with 3H-thymidine. Data are expressed as a percentage of control counts per minute (cpm). (E) HLA class II isotype usage by S346–S365-reactive CD4+ T cell clones (n = 298) from 24 individuals as determined by >80% inhibition of proliferation. 
(F) Identification of the minimal peptide recognized by HLA-DR– or HLA-DP–restricted S346–S365-reactive CD4+ T cell clones (n = 23) isolated from seven individuals, as determined by stimulation with autologous APCs pulsed with a panel of truncated peptides. Proliferation was assessed on day 3 and is expressed as counts per minute (cpm). Bars indicate mean ± SD; circles indicate individual clones. The minimal amino acid sequences recognized by T cell clones are highlighted with colored shading.
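The restriction assignment in panels D and E can be sketched as a simple threshold rule: proliferation in the presence of each blocking antibody is compared with the antibody-free control, and an isotype is called when inhibition exceeds 80%. The function names and the cpm values below are hypothetical and serve only to illustrate the calculation.

```python
def percent_inhibition(control_cpm, blocked_cpm):
    """Inhibition of proliferation relative to the antibody-free control."""
    if control_cpm <= 0:
        raise ValueError("control cpm must be positive")
    return 100.0 * (1.0 - blocked_cpm / control_cpm)


def restricting_isotypes(control_cpm, blocked_cpm_by_isotype, threshold=80.0):
    """Return the isotypes whose blocking antibody inhibits proliferation
    by more than `threshold` percent."""
    return sorted(
        isotype
        for isotype, cpm in blocked_cpm_by_isotype.items()
        if percent_inhibition(control_cpm, cpm) > threshold
    )


# Invented readings for one clone (counts per minute)
control = 20000
blocked = {"HLA-DR": 1500, "HLA-DP": 19000, "HLA-DQ": 21000}

print(restricting_isotypes(control, blocked))  # ['HLA-DR']
```

In this example the anti–HLA-DR antibody reduces proliferation by 92.5%, so the clone is scored as HLA-DR restricted; readings above the control, as for HLA-DQ here, give a negative inhibition and are not called.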

To address the extent of T cell cross-reactivity between different S proteins, SARS-CoV-2 S protein–specific T cell lines from P28 and P33 were relabeled with CFSE and stimulated with S proteins from endemic human coronaviruses. In these secondary cultures, a robust proliferation was observed in response to SARS-CoV and HKU1 (Fig. 4A). Unexpectedly, a sizeable fraction of clonotypes in SARS-CoV-2 primary cultures (ranging from 7 to 25%) were found in SARS-CoV and/or HKU1 secondary cultures, consistent with a substantial degree of T cell cross-reactivity (fig. S4). To corroborate this finding, we isolated from secondary cultures several T cell clones that proliferated in response to two or even three different naturally processed S proteins (Fig. 4B and table S4).

Fig. 4 Identification of coronavirus S protein cross-reactive T cell clones.

(A and B) CFSE-labeled CD4+ Tm cells from P28 and P33 were stimulated with autologous monocytes pulsed with recombinant SARS-CoV-2 S protein. CFSElo cells were expanded with interleukin-2 for 10 days, relabeled with CFSE, and restimulated with S proteins from human beta (SARS-CoV, HKU1, and OC43) or alpha (NL63 and 229E) coronaviruses. T cell clones from proliferating secondary cultures were isolated and tested for cross-reactivity against different S proteins. (A) CFSE profiles from primary and secondary stimulation in the absence or presence of the indicated antigens. (B) Proliferative response of representative T cell clones, isolated from secondary cultures, to autologous APCs pulsed with titrated doses of recombinant S proteins from SARS-CoV-2, SARS-CoV, or HKU1. Proliferation was assessed on day 3 after a 16-hour pulse with 3H-thymidine and is expressed as counts per minute (cpm). (C to G) Multiple blood samples were obtained from donor P34 several years before and 1.5 months after COVID-19 (disease onset Oct 2020) and characterized regarding Tm cells and serum antibody levels. (C) T cell proliferation measured by CFSE dilution in pre–COVID-19 (Dec 2018) and post–COVID-19 (Dec 2020) samples in response to autologous monocytes pulsed with different S proteins. (D) Time course of serum immunoglobulin G antibodies against different coronavirus S proteins as determined by enzyme-linked immunosorbent assay [half-maximal serum dilution (EC50) values]. These data demonstrate that, together with a strong induction of serum antibodies to SARS-CoV-2, antibody titers against HKU1 and OC43 also increased in the post–COVID-19 sample. (E) Proliferative response (day 3 cpm) to a pool of peptides spanning SARS-CoV-2 S protein of T cell clones obtained from post–COVID-19 CFSElo cultures stimulated by SARS-CoV-2, SARS-CoV, OC43, or NL63. Pie charts show the total number of clones tested and the fraction of responsive clones is shown in gray. 
(F) Specificity of SARS-CoV-2 S peptide pool-reactive T cell clones isolated from each culture (E) was further mapped by stimulation with pools of peptides spanning the S1ΔRBD, RBD, and S2 regions of the SARS-CoV-2 S protein. Histograms show the percentage of clones specific for each region. The total number of clones tested is indicated at the top. (G) Characterization of representative cross-reactive T cell clones isolated from P34 post–COVID-19 sample. Left panels report the proliferative response (day 3 cpm) of T cell clones stimulated with titrated doses of recombinant S proteins in the presence of autologous monocytes. The peptides recognized are indicated on the right panels. Shown are sequence alignments of the recognized SARS-CoV-2 epitopes (S816–S830 and S981–S1000) with homologous sequences of endemic alpha and beta coronaviruses. Dots indicate amino acid residues identical to the SARS-CoV-2 reference strain.

Cross-reactive T cells may derive from preexisting memory T cells or from the priming of naïve T cells. We therefore analyzed a COVID-19–recovered individual from whom we had previously cryopreserved PBMCs. A robust CD4+ Tm cell proliferation in the pre–COVID-19 sample was detected against NL63 and 229E S proteins, whereas the response to HKU1 and OC43 was limited and the response to SARS-CoV and SARS-CoV-2 undetectable (Fig. 4C). Conversely, in the post–COVID-19 sample, strong T cell proliferation was observed not only in response to SARS-CoV-2, but also in response to all other alpha and beta coronavirus S proteins (Fig. 4, C and D), and shared clonotypes were detected between SARS-CoV-2 and endemic coronavirus S protein–stimulated cultures (fig. S5A). Furthermore, T cell clones isolated from cultures stimulated by SARS-CoV, OC43, or NL63 proliferated in response to the SARS-CoV-2 S peptide pool, and their specificity was mapped primarily to the S2 region (Fig. 4, E and F), consistent with its high degree of sequence conservation (20–22). T cell clones that fully cross-reacted with all S proteins were mapped to the highly conserved fusion peptide (Fig. 4G).

To determine whether S-reactive T cells in the post–COVID-19 sample could be detected in pre-pandemic samples, we performed clonotypic analysis of total Tm cells on the post–COVID-19 sample and on samples collected in 2014 and 2017. Most of the SARS-CoV-2–specific clonotypes identified above were found only in the post–COVID-19 sample, consistent with priming of naïve T cells (fig. S5B). By contrast, clonotypes specific to endemic coronaviruses were found at a comparable number at all time points. Some T cell clonotypes against the highly conserved fusion peptide could be tracked back to the 2014 sample and were found to be expanded in the post–COVID-19 sample (fig. S5C). These findings demonstrate that preexisting cross-reactive Tm cells are recalled and expanded upon SARS-CoV-2 infection.

The robust CD4+ T cell response to the RBD and the identification of the S346–S365 immunodominant region conserved in the emerging SARS-CoV-2 variants of concern provide the rationale for the development of a subunit vaccine based on the RBD because it is the target of most neutralizing antibodies (17, 18). These findings were not anticipated in previous studies based on bioinformatics predictions (2, 3) and short-term peptide stimulation of PBMCs, highlighting the value of combining T cell stimulation with protein antigens with cloning and TCR sequencing for the analysis of antigen-specific T cell repertoires.

The immunodominance of RBD S346–S365 at the individual level and at the population level may be due to the presence of three nested T cell epitopes presented by HLA-DR and HLA-DP and to the relative abundance of naturally processed peptides, as recently reported in an immunopeptidomics study (23). The S346–S365 region is also a contact site for the broadly reactive neutralizing antibody S309 (24), providing a good example of convergence of B and T cells around a conserved epitope.

Our study also provides evidence for the recall of preexisting cross-reactive Tm cells upon SARS-CoV-2 infection. However, this phenomenon, reminiscent of the “original antigenic sin” (25), does not prevent a robust and persistent primary response to new epitopes of SARS-CoV-2 that is characterized by extensive intraclonal diversification into Tem, cTfh, and Tcm cells, which represent inflammatory, helper, and long-lived Tm cells, respectively (26, 27). The availability of a large number of cross-reactive T cell clones is instrumental not only for defining target sites in relevant pathogens but also for understanding whether cross-reactivity is due to epitope structural similarities or to TCR-binding degeneracy (11, 28).

The possibility of leveraging a robust, cross-reactive T helper cell function against conserved sites will be instrumental in driving neutralizing antibody responses to adaptive vaccines that incorporate escape mutations found in emerging SARS-CoV-2 variants.

Supplementary Materials

Materials and Methods

Tables S1 to S5

Figs. S1 to S5

References (30, 31)

MDAR Reproducibility Checklist

This is an open-access article distributed under the terms of the Creative Commons Attribution license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

References and Notes

Acknowledgments: We thank all participants in the study and the personnel at the hospitals for blood collection; G. Durini for cell isolation; W. Pertoldi, S. Tettamanti, and E. Picciocchi from the Istituti Sociali Chiasso for blood samples from vaccinated donors; and D. Lilleri and members of the Tipizzazione laboratory of the IRCCS San Matteo Hospital Foundation San Matteo Pavia, Italy, for HLA typing. Funding: This work was supported in part by the Henry Krenter Foundation, the Cariplo Foundation (grant 2020-1374 CoVIM), the Swiss National Science Foundation (grant 189331), and by EOC research funds. F.S. and the Institute for Research in Biomedicine are supported by the Helmut Horten Foundation. Author contributions: A.Ca. and F.S. contributed to study concept and design. J.S.L., D.V., F.M., and J.J. contributed to experimental work. M.P. and L.P. produced recombinant proteins. D.J. performed cell sorting. S.J. provided technical support. M.F. performed bioinformatics analysis. R.C. performed HLA typing. T.T., A.F.P., M.B., C.G., P.F., and A.Ce. contributed clinical samples. A.Ca., A.L., and F.S. analyzed the data and wrote the manuscript. All authors contributed to interpretation of data and critical revision of the manuscript. Competing interests: The authors declare no competing interests. Data and materials availability: TCR Vβ sequences have been deposited in the ImmuneACCESS database (29). All other data are available in the main text or the supplementary materials. This work is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. To view a copy of this license, visit This license does not apply to figures/photos/artwork or other content included in the article that is credited to a third party; obtain authorization from the rights holder before using such material.
