Research Article

Selective and cross-reactive SARS-CoV-2 T cell epitopes in unexposed humans


Science  04 Aug 2020:
DOI: 10.1126/science.abd3871


Many unknowns exist about human immune responses to the SARS-CoV-2 virus. SARS-CoV-2 reactive CD4+ T cells have been reported in unexposed individuals, suggesting pre-existing cross-reactive T cell memory in 20-50% of people. However, the source of those T cells has been speculative. Using human blood samples derived before the SARS-CoV-2 virus was discovered in 2019, we mapped 142 T cell epitopes across the SARS-CoV-2 genome to facilitate precise interrogation of the SARS-CoV-2-specific CD4+ T cell repertoire. We demonstrate a range of pre-existing memory CD4+ T cells that are cross-reactive with comparable affinity to SARS-CoV-2 and the common cold coronaviruses HCoV-OC43, HCoV-229E, HCoV-NL63, or HCoV-HKU1. Thus, variegated T cell memory to coronaviruses that cause the common cold may underlie at least some of the extensive heterogeneity observed in COVID-19 disease.

The emergence of SARS-CoV-2 in late 2019 and its subsequent global spread have led to millions of infections and substantial morbidity and mortality (1). Coronavirus disease 2019 (COVID-19), the clinical disease caused by SARS-CoV-2 infection, can range from mild self-limiting disease to acute respiratory distress syndrome and death (2). The mechanisms underlying the spectrum of COVID-19 disease severity states, and the nature of protective immunity against COVID-19, currently remain unclear.

Studies dissecting the human immune response against SARS-CoV-2 have begun to characterize SARS-CoV-2 antigen-specific T cell responses (3–8), and multiple studies have described marked activation of T cell subsets in acute COVID-19 patients (9–13). Surprisingly, antigen-specific T cell studies performed with five different cohorts reported that 20-50% of people who had not been exposed to SARS-CoV-2 had significant T cell reactivity directed against peptides corresponding to SARS-CoV-2 sequences (3–7). The studies were from geographically diverse cohorts (USA, Netherlands, Germany, Singapore, and UK), and the general pattern observed was that the T cell reactivity found in unexposed individuals was predominantly mediated by CD4+ T cells. It was speculated that this phenomenon might be due to preexisting memory responses against human “common cold” coronaviruses (HCoVs), such as HCoV-OC43, HCoV-HKU1, HCoV-NL63, or HCoV-229E. These HCoVs share partial sequence homology with SARS-CoV-2, are widely circulating in the general population, and are typically responsible for mild respiratory symptoms (14–16). However, the hypothesis of cross-reactive immunity between SARS-CoV-2 and common cold HCoVs still awaits experimental trials. This potential preexisting cross-reactive T cell immunity to SARS-CoV-2 has broad implications, as it could explain aspects of differential COVID-19 clinical outcomes, influence epidemiological models of herd immunity (17, 18), or affect the performance of COVID-19 candidate vaccines.

Epitope repertoire in SARS-CoV-2 unexposed individuals

To define the repertoire of CD4+ T cells recognizing SARS-CoV-2 epitopes in previously unexposed individuals, we used in vitro stimulation of PBMCs for 2 weeks with pools of 15-mer peptides. This method is known to be robust for detecting low frequency T cell responses to allergens, bacterial, or viral antigens (19, 20), including naive T cells (21). For screening SARS-CoV-2 epitopes, we utilized PBMC samples from unexposed subjects collected between March 2015 and March 2018, well before the global circulation of SARS-CoV-2 occurred. The unexposed subjects were confirmed seronegative for SARS-CoV-2 (fig. S1A).

SARS-CoV-2 reactive T cells were expanded with one pool of peptides spanning the entire sequence of the spike protein (CD4-S), or a non-spike “megapool” (CD4-R) of predicted epitopes from the non-spike regions (i.e., “remainder”) of the viral genome (4). In total, 474 15-mer SARS-CoV-2 peptides were screened. After 14 days of stimulation, T cell reactivity against intermediate “mesopools,” each encompassing approximately 10 peptides, was assayed using a FluoroSPOT assay (e.g., 22 CD4-R mesopools; fig. S2A). Positive mesopools were further deconvoluted to identify specific individual SARS-CoV-2 epitopes. Representative results from one donor show the deconvolution of mesopools P6 and P18 to identify seven different SARS-CoV-2 epitopes (fig. S2B). Intracellular cytokine staining assays (ICS) specific for IFN-γ determined whether antigen specific T cells responding to the SARS-CoV-2 mesopools were CD4+ or CD8+ T cells (fig. S2C). Results from the 44 donors/CD4-R mesopool and 44 donors/CD4-S mesopool combinations yielding a positive response are shown in fig. S2, D and E, respectively. In 82/88 cases (93.2%) the cells responding to SARS-CoV-2 mesopool stimulation were clearly CD4+ T cells, as judged by the ratio of CD4/CD8 responding cells. In four cases (4.5%), the responding cells were CD8+ T cells, and in two cases (2.3%) the responses were mediated by both CD4+ and CD8+ T cells. The fact that CD8+ T cells were rarely detected was not surprising, since the peptides used in CD4-R encompassed predicted class II epitopes, and the CD4-S is constituted of 15-mer peptides (9-10-mer peptides are optimal for CD8+ T cells). Furthermore, the 2-week restimulation protocol was originally designed to expand CD4+ T cells (20). Overall, these results indicated that the peptide screening strategy utilized mapped SARS-CoV-2 epitopes recognized by CD4+ T cells in unexposed individuals.
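The two-stage screen described above (assay mesopools first, then test individual peptides only from positive mesopools) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the peptide names, the `is_positive` callback standing in for a FluoroSPOT readout, and the consecutive grouping into pools of 10 are all hypothetical.

```python
from typing import Callable, Dict, List


def make_mesopools(peptides: List[str], size: int = 10) -> Dict[str, List[str]]:
    """Split a peptide list into consecutive mesopools of about `size` peptides."""
    return {
        f"P{i // size + 1}": peptides[i:i + size]
        for i in range(0, len(peptides), size)
    }


def deconvolute(
    mesopools: Dict[str, List[str]],
    is_positive: Callable[[List[str]], bool],
) -> List[str]:
    """Return individual peptides that score positive, testing single
    peptides only within mesopools that scored positive as a whole."""
    hits: List[str] = []
    for members in mesopools.values():
        if not is_positive(members):  # stage 1: whole-mesopool assay
            continue
        # stage 2: assay each peptide of a positive mesopool on its own
        hits.extend(p for p in members if is_positive([p]))
    return hits


# Hypothetical data: 40 placeholder peptides, two of which are "reactive".
peptides = [f"pep{i:03d}" for i in range(1, 41)]
reactive = {"pep007", "pep033"}


def toy_assay(pool: List[str]) -> bool:
    """Stand-in for an assay readout: positive if any reactive peptide is present."""
    return any(p in reactive for p in pool)


mesopools = make_mesopools(peptides)
epitopes = deconvolute(mesopools, toy_assay)
print(len(mesopools), epitopes)
```

The point of the design is economy: only the members of positive mesopools need individual testing, rather than every peptide in the megapool.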

A total of 142 SARS-CoV-2 epitopes were identified, 66 from the spike protein (CD4-S) and 76 from the remainder of the genome (CD4-R) (table S1). For each combination of epitope and responding donor, potential HLA restrictions were inferred based on the predicted HLA binding capacity of the particular epitope for the specific HLA alleles present in the responding donor (22). Each donor recognized an average of 11.4 epitopes (range 1 to 33, median 6.5; fig. S3A). Forty of the 142 epitopes were recognized by 2 or more donors (fig. S3B), accounting for 55% of the total response (fig. S3C). These 142 mapped SARS-CoV-2 epitopes may prove useful in future studies as reagents for tracking CD4+ T cells in SARS-CoV-2 infected individuals, and in COVID-19 vaccine trials.

Epitope distribution by ORF of origin

While a broad range of different SARS-CoV-2 antigens were recognized, it was striking that several of the epitopes yielding the most frequent (i.e., recognized in multiple donors) or most vigorous (i.e., most SFC/10^6 cells) responses were derived from the SARS-CoV-2 spike antigen (table S1). We therefore assessed the overall distribution of the 142 T cell epitopes mapped among all SARS-CoV-2 proteins, compared to the relative size of each SARS-CoV-2 antigen (Fig. 1, A and B). Notably, 54% of the total positive response was associated with spike-derived epitopes (Fig. 1A; 11% for RBD, and 44% for the non-RBD portion of spike). Of relevance for COVID-19 vaccine development, only 20% of the spike responses were derived from the receptor-binding domain (RBD) region (Fig. 1A; comparing 11% vs 44%, as described above), and the RBD region accounted for only 11% of the overall CD4+ T cell reactivity (Fig. 1A). Mapped epitopes were fairly evenly distributed across the SARS-CoV-2 genome in proportion to the size of each protein (Fig. 1B; p=0.038, r=0.42). In addition to the strong responses directed to spike, responses were also seen for ORF6, ORF3a, N, ORF8 and within Orf1a/b, where nsp3, nsp12, nsp4, nsp6, nsp2 and nsp14 were more prominently recognized. These mapped epitope results at the ORFeome level partially overlap with the ORFs targeted by CD4+ T cells in COVID-19 cases (4). Notably, no epitopes derived from the membrane protein (M) were identified in unexposed individuals (Fig. 1B), but M is robustly recognized by SARS-CoV-2-specific CD4+ T cell responses in COVID-19 cases (4). The lack of quality class II epitopes in M was unsurprising, based on M molecular biology; M is a small protein with three transmembrane domains.
Combined, the data indicate that class II epitopes are relatively broadly available across the SARS-CoV-2 genome, but that SARS-CoV-2 memory CD4+ T cells preferentially target proteins highly expressed during infection, exemplified by M and S (spike) epitope mapping results.
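The 20% figure quoted above for the RBD share of the spike response follows directly from the two reported percentages. A small worked check, using only the numbers in the text (the note on rounding is an assumption, since the underlying unrounded values are not given here):

```python
# Percentages of the total positive response, as reported in the text.
rbd_pct = 11      # receptor-binding domain of spike
non_rbd_pct = 44  # remainder of spike

# The parts sum to 55, whereas the text reports 54% for spike overall;
# presumably the difference comes from rounding of the underlying values.
spike_from_parts = rbd_pct + non_rbd_pct

# Fraction of the spike-directed response that targets the RBD.
rbd_share_of_spike = rbd_pct / spike_from_parts
print(f"{rbd_share_of_spike:.0%}")  # 20%
```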

Fig. 1 Characteristics of SARS-CoV-2 epitopes identified in unexposed donors.

Reactivity was determined by FluoroSPOT assay after 17 days of in vitro stimulation of unexposed donor PBMCs (n=18) with one pool of peptides spanning the entire sequence of the spike protein (CD4-S), or a non-spike “megapool” (CD4-R) of predicted epitopes from the non-spike (i.e., “remainder”) regions of the viral genome. (A) Summary of the responses as a function of the protein of origin. (B) Spearman correlation of positive responses per SARS-CoV-2 protein size. (C) Percent similarity of the identified epitopes with common cold coronavirus peptides as a function of the number of responding donors. (D) Each dot shows the reactivity of a donor/epitope combination derived from either non-spike (CD4-R) or spike (CD4-S). Black bars indicate the geometric mean and geometric SD. Red indicates donor/epitope combinations with sequence identity >67% with common cold coronaviruses, while blue indicates highly reactive donor/epitope combinations (>1000 SFC/10^6) with sequence identity ≤67%. In (C) and (D), statistical comparisons are performed by two-tailed Mann-Whitney test. *** p<0.001, **** p<0.0001.

Sequence homology of the identified SARS-CoV-2 epitopes to other common HCoVs

When this epitope mapping study was initiated, an assumption was that the in vitro T cell culture epitope mapping would reveal an epitope repertoire associated with de novo generation of responses from naïve T cells. However, while these epitope mapping studies were in progress, we and others detected significant ex vivo reactivity against bulk pools of SARS-CoV-2 peptides (3–7). We speculated that this might reflect the presence of memory T cells cross-reactive between common cold human coronaviruses (HCoVs) and SARS-CoV-2. These other HCoVs circulate widely in human populations and are typically responsible for mild, usually undiagnosed, respiratory illnesses, such as the common cold (14–16). However, there is currently a lack of experimental data addressing whether memory CD4+ T cells, cross-reactive between SARS-CoV-2 and other HCoVs, do indeed exist.

We therefore next determined the degree of homology to all four widely circulating HCoVs for all 142 SARS-CoV-2 epitopes identified herein. For the analysis, we split the peptides into three groups based on immunogenicity: 1) never immunogenic, 2) immunogenic in one individual, or 3) immunogenic in two or more individuals (Fig. 1C). There was significantly higher sequence similarity in peptides recognized by more than one individual compared to peptides recognized by a single individual or not at all (p<0.0001, two-tailed Mann-Whitney test). Additionally, almost all donors from the unexposed cohort used for the epitope screen were seropositive for three widely circulating common cold coronaviruses (HCoV-NL63, HCoV-OC43, HCoV-HKU1) (fig. S1B). Thus, epitope homology and seropositivity data suggest that T cell cross-reactivity was plausible between SARS-CoV-2 and HCoVs already established in the human population.

To select epitope subsets to be analyzed in more detail, we plotted the T cell response magnitude of each positive epitope per donor (Fig. 1D). This analysis confirms the dominance of the spike antigen over the epitopes derived from the remainder of the genome (p<0.001, two-tailed Mann Whitney test).

Next, we selected two categories of SARS-CoV-2 epitopes of interest. The first category was epitopes with potential cross-reactivity from HCoVs. We initially selected the arbitrary 67% cutoff, since we reasoned that a 9-mer is the epitope region involved in binding to class II (23), and that often one or two residues in addition to the 9-mer core region are required for optimal recognition (24) (Fig. 1D, red). Second, we independently filtered for any epitopes associated with high responses (top ~30%; Fig. 1D, blue). This resulted in selection of 31 epitopes from spike (6 with high homology, and 25 for dominant responses), organized in a new CD4-[S31] pool. Similarly, we generated a new CD4-[R30] pool, composed of 30 epitopes from the remainder of the genome (9 with high homology and 21 associated with strong responses; Fig. 1D). These epitope pools were then used for further CD4+ T cell studies.
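One way to read the 67% cutoff is as 10 matching residues out of a 15-mer (a 9-mer core plus one flanking residue; 10/15 ≈ 66.7%). The sketch below illustrates that reading. It is an assumption-laden illustration only: it uses plain position-wise identity on equal-length, pre-aligned peptides, which may differ from the homology measure the authors computed, and the sequences are made-up placeholders rather than real epitopes.

```python
def percent_identity(a: str, b: str) -> float:
    """Position-wise percent identity between two pre-aligned peptides."""
    if len(a) != len(b):
        raise ValueError("peptides must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Assumption: the ~67% cutoff corresponds to 10 of 15 identical residues.
MIN_MATCHES_15MER = 10


def passes_cutoff(a: str, b: str) -> bool:
    """True if two 15-mers share at least MIN_MATCHES_15MER identical positions."""
    if len(a) != len(b):
        raise ValueError("peptides must have the same length")
    return sum(x == y for x, y in zip(a, b)) >= MIN_MATCHES_15MER


# Hypothetical placeholder sequences (not real epitopes).
query = "ACDEFGHIKLMNPQR"
homolog_10 = "ACDEFGHIKLWWWWW"  # 10 of 15 positions identical
homolog_9 = "ACDEFGHIKWWWWWW"   # 9 of 15 positions identical

print(round(percent_identity(query, homolog_10), 1), passes_cutoff(query, homolog_10))
print(round(percent_identity(query, homolog_9), 1), passes_cutoff(query, homolog_9))
```

Counting matches as an integer, rather than comparing a float against 67.0, avoids 10/15 = 66.67% being rejected by rounding.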

Direct evidence of reactivity to HCoV epitopes homologous to SARS-CoV-2 epitopes

To directly address whether reactivity against SARS-CoV-2 in unexposed donors could be ascribed to cross-reactivity against other HCoVs, we designed a peptide pool encompassing peptides homologous to CD4-R30 epitopes, derived from HCoV-229E, HCoV-NL63, HCoV-OC43, HCoV-HKU1 and several other HCoVs (see Methods), for a total of 129 HCoV homologs (HCoV-R129; table S2). Similarly, we synthesized a pool that encompassed peptides homologous to the SARS-CoV-2 CD4-S31 epitope pool, consisting of potential epitopes derived from other HCoVs, for a total of 124 HCoV homologs (HCoV-S124; table S3).

Next, we utilized an activation induced marker assay (AIM assay; 25–27) to detect virus-specific T cells in a new set of unexposed donors, not used for the epitope identification studies (Fig. 2A and table S4), and a set of convalescent COVID-19 patients (table S5). We detected significant ex vivo CD4+ T cell responses against the SARS-CoV-2 non-spike (CD4-R) and spike (CD4-S) peptides compared to the negative control (DMSO) (Fig. 2, B and C; p<0.0001 and p<0.0001 respectively, two-tailed Mann-Whitney). These responses were increased in COVID-19 cases compared to unexposed subjects (Fig. 2D; p=0.0015 and p=0.0022, respectively, two-tailed Mann-Whitney), as previously reported (4). In the unexposed subjects, significant frequencies of CD4+ T cells were detected against the CD4-R30 and CD4-S31 SARS-CoV-2 epitope pools compared to the negative control (Fig. 2B; p=0.0063 and p=0.0012, respectively, two-tailed Mann-Whitney). Significant CD4+ T cell reactivity was also seen against the corresponding HCoV-R129 and HCoV-S124 pools of matching homologous peptides from other HCoVs (Fig. 2D; p<0.0001 and p<0.0001, two-tailed Mann-Whitney). Detection of CD4+ T cells with peptide pools selected on the basis of homology was consistent with the hypothesis that cross-reactive CD4+ T cells between SARS-CoV-2 and other HCoVs exist in many individuals.

Fig. 2 CD4+ T cells in SARS-CoV-2-unexposed and COVID-19 individuals against HCoV epitopes homologous to SARS-CoV-2 epitopes.

(A) Example of flow cytometry gating strategy for antigen-specific CD4+ T cells based on AIM (OX40+ and CD137+ double expression) after stimulation of PBMCs with HCoV or SARS-CoV-2 peptides. (B to D) Antigen-specific CD4+ T cells measured as a percentage of AIM+ (OX40+CD137+) CD4+ T cells after stimulation of PBMCs with HCoV epitopes homologous to SARS-CoV-2 epitopes. Samples were derived from SARS-CoV-2-unexposed donors (unexposed, n = 25) and recovered COVID-19 patients (COVID-19, n = 20). Black bars indicate the geometric mean and geometric SD. Each dot is representative of an individual subject. Statistical pairwise comparisons [(B) and (C)] were performed with the Wilcoxon test. P-values related to comparisons to the DMSO controls are listed at the bottom of the graphs, while any significant p-values related to inter-group comparisons are listed on top of the graphs. Statistical comparisons across cohorts were performed with the Mann-Whitney test (D). See also figs. S5 and S6.

Reactivity against CD4-R30 and CD4-S31 (Fig. 2D; p=0.0008 and p=0.0026, respectively), but not against HCoV-R129 and HCoV-S124, was increased in COVID-19 cases compared to unexposed individuals (Fig. 2C). Thus, pre-existing CD4+ T cell reactivity to HCoV epitopes is modulated by COVID-19 and exposure to cross-reactive SARS-CoV-2 epitopes in COVID-19. These data from COVID-19 cases do not support the hypothesis that the HCoV exposure might induce an original antigenic sin phenomenon, impairing subsequent T cell responses to SARS-CoV-2 epitopes (28, 29), at least for COVID-19 cases of average disease severity.

Next, we examined the ex vivo memory phenotype of the T cells responding to the various epitope megapools. Results from one representative unexposed donor are shown in Fig. 3A. Responding cells in unexposed donors were predominantly found in the effector memory CD4+ T cell population (CD45RAnegCCR7neg), followed by the central memory T cells (CD45RAnegCCR7pos) (30) (Fig. 3, A, B, and D). Comparable patterns of effector and central memory cells were observed among the antigen-specific CD4+ T cells detected in the COVID-19 cases (Fig. 3, C and D). In conclusion, the CD4+ T cells in unexposed donors that recognize SARS-CoV-2 epitopes, and epitopes from other HCoVs, have a memory phenotype. Overall, these data are consistent with the SARS-CoV-2-reactive CD4+ T cells in unexposed subjects being HCoV-specific memory CD4+ T cells with cross-reactivity to SARS-CoV-2.
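The memory subsets named above are defined by two surface markers, which makes the gating logic easy to state explicitly. In the sketch below, the effector memory and central memory definitions come from the text; the labels for the two CD45RA-positive quadrants (naive and TEMRA) follow the common convention and are an assumption here, since the text does not spell them out. The function name is hypothetical.

```python
def memory_subset(cd45ra_pos: bool, ccr7_pos: bool) -> str:
    """Classify a CD4+ T cell by CD45RA and CCR7 expression."""
    if cd45ra_pos:
        # Assumption: conventional labels for the CD45RA+ quadrants.
        return "naive" if ccr7_pos else "TEMRA"
    # CD45RA-negative quadrants, as defined in the text.
    return "central memory" if ccr7_pos else "effector memory"


for cd45ra in (False, True):
    for ccr7 in (False, True):
        print(f"CD45RA+={cd45ra!s:5} CCR7+={ccr7!s:5} -> {memory_subset(cd45ra, ccr7)}")
```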

Fig. 3 Phenotypes of antigen-specific CD4+ T cells from SARS-CoV-2-unexposed and COVID-19 responding to HCoV epitopes homologous to SARS-CoV-2 epitopes.

(A) Example of flow cytometry gating strategy for antigen-specific CD4+ T cell subsets after overnight stimulation of PBMCs with HCoV or SARS-CoV-2 peptides ex vivo. (B and C) Phenotype of antigen-specific CD4+ T cells (OX40+CD137+) responding to indicated pools of SARS-CoV-2 and HCoV epitopes in unexposed and recovered COVID-19 patients. Data are shown as mean +/− SD. Each dot represents an individual subject. Statistical pairwise comparisons in (B) and (C) were performed with the Wilcoxon test. (D) Overall averages of antigen-specific CD4+ T cell subsets detected in unexposed and COVID-19 subjects. See also fig. S5.

Identification of SARS-CoV-2 epitopes cross-reactive with other common HCoVs

The epitopes derived from the CD4-R30 and CD4-S31 pools were used to generate short term T cell lines derived by stimulation of PBMCs from unexposed subjects. PBMCs were stimulated with an individual SARS-CoV-2 cognate epitope demonstrated to be recognized by T cells from that subject (Fig. 1 and table S1). Overall, T cell lines specific for a total of 42 SARS-CoV-2 epitopes could be derived.

These T cell lines were next tested for cross-reactivity against various coronavirus homologs, analogous to an approach previously successful in flavivirus studies (31). Cross-reactivity between SARS-CoV-2 epitope recognition and other HCoV epitope recognition was detected for 10/42 (24%) of the T cell lines (Fig. 4, A to J). Cross-reactivity was associated with epitopes derived from SARS-CoV-2 spike, N, nsp8, nsp12, and nsp13. In three cases, HCoV analogs were better antigens than the SARS-CoV-2 peptide, suggesting that they may be the cognate immunogen (Fig. 4, E, I, and J). One SARS-CoV-2 spike epitope was tested in two different donors with similar findings, suggesting that HCoV cross-reactivity patterns are recurrent across individuals. Non-cross-reactive SARS-CoV-2 T cell lines are also shown (Fig. 4, K to L, and fig. S4). It is possible that cross-reactivity to these epitopes might be detected if T cell lines from additional individuals were tested. In addition, these epitopes might be homologous to some other as yet unidentified viral sequence, or recognized by cognate naive T cells expanding in the in vitro culture (32). In addition, only 3/18 cases of strong response epitopes (defined in Fig. 1D) were cross-reactive, compared to 4/5 of weaker epitopes (p=0.02, Fisher’s Exact test). To further demonstrate that the cross-reactive responses in unexposed donors are indeed derived from memory T cells, we stimulated purified memory and naïve CD4+ T cells with the CD4-[S31] epitope pool. After 14 days, we detected responses to the CD4-[S31] peptide pool from cultures of memory CD4+ T cells but not naïve CD4+ T cells (fig. S8). In sum, these data demonstrate that memory CD4+ T cells recognizing common cold coronaviruses including HCoV-OC43, HCoV-HKU1, HCoV-NL63, and HCoV-229E can exhibit substantial cross-reactivity to the homologous epitope in SARS-CoV-2.
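The comparison of 3/18 strong-response epitopes versus 4/5 weaker epitopes can be reproduced with a two-sided Fisher's exact test on the corresponding 2x2 table. The sketch below implements the test from the hypergeometric distribution using only the standard library; the function name is hypothetical, and the two-sided rule used (summing all tables no more probable than the observed one) is the common convention, which is assumed to match the authors' software.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x "successes" in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum every table at most as probable as the observed one
    # (small tolerance guards against floating-point ties).
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Rows: strong-response vs weaker epitopes; columns: cross-reactive vs not.
p_value = fisher_exact_two_sided(3, 15, 4, 1)
print(f"p = {p_value:.3f}")  # p = 0.017, which rounds to the reported 0.02
```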

Fig. 4 Cross-reactivity of SARS-CoV-2 and homologous HCoV peptides.

Twelve short-term cell lines were generated using specific SARS-CoV-2 epitope/donor combinations selected on the basis of the primary screen. After 14 days of in vitro expansion, each T cell line was tested with the SARS-CoV-2 epitope used for stimulation and peptides corresponding to analogous sequences from other HCoVs at six different concentrations (1μg/mL, 0.1μg/mL, 0.01μg/mL, 0.001μg/mL, 0.0001μg/mL, 0.00001μg/mL). Spot forming cells per million (SFC/10^6) PBMCs are plotted for T cell lines stimulated with each peptide. See also fig. S7.

Next we examined, for each SARS-CoV-2:HCoV epitope pair, the degree of amino acid sequence homology and any relationship between homology and T cell cross-reactivity, considering different ranges of potentially relevant homology. Only 1% (1/99) of peptide pairs with 33-40% homology were cross-reactive. In the 47-60% epitope homology range, we observed cross-reactivity in 21% of cases (7/33). Strikingly, epitope homology ≥ 67% was associated with cross-reactivity in 57% of cases (21/37; p=0.0001 or p=0.0033 by two-tailed Fisher’s Exact test, when compared against the 33-40% range epitopes or 47-60% range, respectively). A relationship was thus observed between epitope homology and CD4+ T cell cross-reactivity. The data demonstrated that the arbitrary selection utilized as described for Fig. 1D was indeed supported by the experimental data. Thus, ~67% amino acid homology appears to be a useful benchmark for consideration of potential cross-reactivity between class II epitopes.

In summary, here we identified more than 140 human T cell epitopes derived from across the genome of SARS-CoV-2. We provide direct evidence that numerous CD4+ T cells that react to SARS-CoV-2 epitopes actually cross-react with corresponding homologous sequences from any of multiple different commonly circulating HCoVs, and that these reactive cells are largely canonical memory CD4+ T cells. This finding of cross-reactive HCoV T cell specificities is in stark contrast to HCoV neutralizing antibodies, which are HCoV species-specific and did not show cross-reactivity against SARS-CoV-2 RBD (33–35). Based on these data, it is plausible to hypothesize that pre-existing cross-reactive HCoV CD4+ T cell memory in some donors could be a contributing factor to variations in COVID-19 patient disease outcomes, but this is at present highly speculative (36).
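The cross-reactivity rates reported above for the three homology ranges can be recomputed from the stated counts. The sketch below does that and also shows that the range boundaries line up with whole numbers of matching residues in a 15-mer (5/15 through 10/15); that correspondence is an observation made here, not something the text states. Variable names are hypothetical.

```python
# (cross-reactive pairs, total pairs) per homology range, as reported in the text.
homology_bins = {
    "33-40%": (1, 99),
    "47-60%": (7, 33),
    ">=67%": (21, 37),
}

# Cross-reactivity rate per range, in percent.
rates = {
    label: 100.0 * hits / total
    for label, (hits, total) in homology_bins.items()
}
for label, rate in rates.items():
    print(f"{label:>7}: {rate:.0f}% cross-reactive")

# Assumption: homology is counted over 15-mers, so each boundary is k/15.
boundaries = {k: round(100 * k / 15) for k in range(5, 11)}
print(boundaries)  # {5: 33, 6: 40, 7: 47, 8: 53, 9: 60, 10: 67}
```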


Materials and Methods

Figs. S1 to S8

Tables S1 to S8

References (31–47)

MDAR Reproducibility Checklist

This is an open-access article distributed under the terms of the Creative Commons Attribution license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

References and Notes

Acknowledgments: We would like to thank the Flow Cytometry Core Facility at the La Jolla Institute for Immunology for the technical assistance provided during FACS experiments. Funding: This work was funded by the NIH NIAID under awards AI42742 (Cooperative Centers for Human Immunology) (S.C. and A.S.), National Institutes of Health contract Nr. 75N9301900065 (A.S. and D.W.), and U19 AI118626 (A.S. and B.P.). This work was additionally supported by the NIAID under K08 award AI135078 (J.D.), UCSD T32s AI007036 and AI007384 Infectious Diseases Division (S.Ram., S.Raw.) and the John and Mary Tu Foundation (D.S). Author contributions: Conceptualization: D.W., S.C., and A.S.; Investigation, J.M., A.G., A.T, J.S, E.P, S.M, M.L, P.R, L.Q., A.S., E.Y, R.A.S, A.M, L.P and D.W.; Formal Analysis, J.M, A.G., D.W, J.M.D, A.M, L.P and S.C.; Resources, S.I.R., Z.C.B., S.A.R., D.M.S., S.C., and A.S.; Data Curation and Bioinformatic analysis, J.A.G. and B.P.; Writing, S.C., A.S., and D.W.; Supervision, B.P., A.M.d.S., S.C., A.S and D.W.; Project Administration, A.F.; Funding Acquisition, S.C., A.S., D.W., S.R., and J.D. Competing interests: A.S and S.C. are inventors on patent application no. 63/012,902, submitted by La Jolla Institute for Immunology, that covers the use of the megapools and peptides thereof for therapeutic and diagnostic purposes. A.S. is a consultant for Gritstone and Flow Pharma. A.S. and S.C are consultants for Avalia. All other authors declare no conflict of interest. Data and materials availability: All datasets generated for this study are included in the supplementary materials. All the epitopes identified in this study have been submitted to the Immune Epitope Database. Epitope pools utilized in this paper will be made available to the scientific community upon request and execution of a material transfer agreement (MTA) directed to D.W.
This work is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. This license does not apply to figures/photos/artwork or other content included in the article that is credited to a third party; obtain authorization from the rights holder before using such material.
