Research Article

Deep immune profiling of COVID-19 patients reveals distinct immunotypes with therapeutic implications

See allHide authors and affiliations

Science  15 Jul 2020:
DOI: 10.1126/science.abc8511


COVID-19 is currently a global pandemic, but human immune responses to the virus remain poorly understood. We analyzed 125 COVID-19 patients, and compared recovered to healthy individuals using high dimensional cytometry. Integrated analysis of ~200 immune and ~50 clinical features revealed activation of T cell and B cell subsets in a proportion of patients. A subgroup of patients had T cell activation characteristic of acute viral infection and plasmablast responses reaching >30% of circulating B cells. However, another subgroup had lymphocyte activation comparable to uninfected subjects. Stable versus dynamic immunological signatures were identified and linked to trajectories of disease severity change. These analyses identified three “immunotypes” associated with poor clinical trajectories versus improving health. These immunotypes may have implications for the design of therapeutics and vaccines for COVID-19.

The COVID-19 pandemic has to date caused >7 million infections resulting in over 400,000 deaths. Following infection with SARS-CoV2, COVID-19 patients can experience mild or even asymptomatic disease, or can present with severe disease requiring hospitalization and mechanical ventilation. The case fatality rate can be as high as ~10% (1). Some severe COVID-19 patients display an acute respiratory distress syndrome (ARDS), reflecting severe respiratory damage. In acute respiratory viral infections, pathology can be mediated by the virus directly, by an overaggressive immune response, or both (24). However, in severe COVID-19 disease, the characteristics of, and role for the immune response, as well as how these responses relate to clinical disease features, remain poorly understood.

SARS-CoV2 antigen-specific T cells have been identified in the central memory (CM), effector memory (EM), and CD45RA+ effector memory (EMRA) compartments (5) but the characteristics of these cells and their role in infection or pathogenesis remain unclear. Recovered subjects more often have evidence of virus-specific CD4 T cell responses than virus-specific CD8 T cell responses, though pre-existing CD4 T cell responses to other coronaviruses also are found in a subset of subjects in the absence of SARS-CoV2 exposure (6). Inflammatory responses have been reported, including increases in IL-6- or GM-CSF-producing CD4 T cells in the blood (7) or decreases in immunoregulatory subsets such as regulatory T cells (Treg) or ɣδ T cells (811). T cell exhaustion (12, 13) or increased inhibitory receptor expression on peripheral T cells has also been reported (7, 14), though these inhibitory receptors are also increased following T cell activation (15). Moreover, although there is evidence of T cell activation in COVID-19 patients (16), some studies have found decreases in polyfunctionality (12, 17) or cytotoxicity (12); however, these changes have not been observed in other studies (13). Furthermore, how this activation should be viewed in the context of COVID-19 lymphopenia (1820) remains unclear.

Most patients seroconvert within 7-14 days of infection and increased plasmablasts (PB) have been reported (16, 2123). However, the role of humoral responses in the pathogenesis of COVID-19 remains unclear. Whereas IgG levels reportedly drop slightly ~8 weeks after symptom onset (24, 25), recovered patients maintain high Spike-specific IgG titers (6, 26). IgA levels also can remain high and may correlate with disease severity (25, 27). Furthermore, neutralizing antibodies can control SARS-CoV2 infection in vitro and in vivo (4, 28, 29). Indeed, convalescent plasma containing neutralizing antibodies can improve clinical symptoms (30). However, not all patients that recover from COVID-19 have detectable neutralizing antibodies (6, 26), suggesting a complex relationship between humoral and cellular response in COVID-19 pathogenesis.

Taken together, this previous work provokes questions about the potential diversity of immune responses to SARS-CoV2 and the relationship of this immune response diversity to clinical disease. However, many studies describe small cohorts or even single patients, limiting a comprehensive interrogation of this diversity. The relationship of different immune response features to clinical parameters, as well as the changes in immune responses and clinical disease over time, remain poorly understood. Because potential therapeutics for COVID-19 patients include approaches to inhibit, activate, or otherwise modulate immune function, it is essential to define the immune response characteristics related to disease features in well-defined patient cohorts.

Acute SARS-CoV2 infection in humans results in broad changes in circulating immune cell populations

We conducted an observational study of hospitalized patients with COVID-19 at the University of Pennsylvania (UPenn IRB 808542) that included 149 hospitalized adults with confirmed SARS-CoV2 infection (COVID-19 patients). Blood was collected at enrollment (typically ~24-72 hours after admission; Fig. 1A). Additional samples were obtained from patients who remained hospitalized on day 7. Blood was also collected from non-hospitalized patients who had recovered from documented SARS-CoV2 infection (Recovered Donors (RD); n = 46), as well as from healthy donors (HD; n = 70) (UPenn IRB 834263) (Fig. 1A). Clinical metadata are available from the COVID-19 patients over the course of disease (table S1). Of the total patients and donors, flow cytometry data from PBMCs was collected from COVID-19 patients (n = 125), RDs (n = 36), and HDs (n = 60) along with clinical metadata (Fig. 1A and tables S2 to S4).

Fig. 1 Clinical characterization of patient cohorts, inflammatory markers, and quantification of major immune subsets.

(A) Overview of patient cohorts in study, including healthy donors (HD), recovered donors (RD), and COVID-19 patients. (B) Quantification of key clinical parameters in COVID-19 patients. Each dot is a COVID-19 patient; Healthy donor range indicated in green. (C) Spearman correlation and hierarchical clustering of indicated features for COVID-19 patients. (D) Representative flow cytometry plots and (E) frequencies of major immune subsets. (F) Ratio of CD4:CD8 T cells. (G) Spearman correlation of CD4:CD8 ratio and clinical lymphocyte count per patient. Dark and light gray shaded regions represent clinical normal range and normal range based on study HD, respectively. Vertical dashed line indicates clinical threshold for lymphopenia. (H) Spearman correlations of indicated subsets with various clinical features. (E and F) Each dot represents an individual healthy donor (green), RD (blue), or COVID-19 patient (red). Significance determined by unpaired Wilcoxon test with BH correction: *p < 0.05, **p < 0.01, ***p < 0.001, and ****p < 0.0001.

COVID-19 patients had a median age of 60 and were significantly older than HD and RD (median age of 41 and 29 respectively), though the age distributions for all three cohorts overlapped (Fig. 1A and fig. S1A). For COVID-19 patients, median BMI was 29 (range 16-78), and 68% were African American (table S2). Comorbidities in COVID-19 patients were dominated by cardiovascular risk factors (83% of the cohort). Nearly 20% of subjects suffered from chronic kidney disease and 18% had a previous thromboembolic event. A subset of patients (18%) were immunosuppressed, and 7% and 6% of patients were known to have a diagnosis of cancer or a pre-existing pulmonary condition, respectively. 45% of the patients were treated with hydroxychloroquine (HCQ), 31% with steroids, and 29% with remdesivir. Eighteen individuals died in the hospital or within a 30 day follow-up. The majority of the patients were symptomatic at diagnosis and were enrolled ~9 days after initiation of symptoms. Approximately 30% of patients required mechanical ventilation at presentation, with additional extracorporeal membrane oxygenation (ECMO) in four cases.

As has been reported for other COVID-19 patients (31), this COVID-19 cohort presented with a clinical inflammatory syndrome. C reactive protein (CRP) was elevated in over 90% of subjects and LDH and D-dimer were increased in the vast majority, whereas ferritin was above normal in ~75% of COVID-19 patients (Fig. 1B and fig. S1B). Similarly, troponin and NT-proBNP were increased in some patients (fig. S1B). In a subset of patients where it was measured, IL-6 levels were normal in 5 patients, moderately elevated in 5 patients (6-20 pg/ml), and high in 31 patients (21-738 pg/ml) (fig. S1B). Although white blood cell counts (WBC) were mostly normal, individual leukocyte populations were altered in COVID-19 patients (Fig. 1B). A subset of patients had high PMN counts (fig. S1B) as described previously (8, 32) and in a companion study (33). Furthermore, approximately half of the COVID-19 patients were clinically lymphopenic (ALC <1 THO/ul, Fig. 1B). In contrast, monocyte, eosinophil, and basophil counts were mostly normal (Fig. 1B and fig. S1B).

To examine potential associations between these clinical features, we performed correlation analysis (Fig. 1C and fig. S1C). This analysis revealed correlations between different COVID-19 disease severity metrics, as well as clinical features or interventions associated with more severe disease (e.g., D-dimer, vasoactive medication) (Fig. 1C and fig. S1C). WBC and PMN also correlated with metrics of disease severity (e.g., APACHE III), as well as with IL-6 levels (Fig. 1C and fig. S1C). Other relationships were also apparent, including correlations between age or mortality and metrics of disease severity and many other correlations between clinical measures of disease, inflammation, and co-morbidities (Fig. 1C and fig. S1C). Thus, COVID-19 patients presented with varied pre-existing comorbidities, complex clinical phenotypes, evidence of inflammation in many patients, and clinically altered leukocyte counts.

To begin to interrogate immune responses to acute SARS-CoV2 infection, we compared peripheral blood mononuclear cells (PBMC) of COVID-19 patients, RD, and HD subjects using high dimensional flow cytometry. We first focused on the major lymphocyte populations. B cell and CD3+ T cell frequencies were decreased in COVID-19 patients compared to HD or RD subjects, reflecting clinical lymphopenia, whereas the relative frequency of non-B and non-T cells was correspondingly elevated (Fig. 1, D and E). Although a numerical expansion of a non-B, non-T cell type is possible, loss of lymphocytes likely results in an increase in the relative frequency of this population. This non-B, non-T cell population is also interrogated in more detail in the companion study. Examining only CD3 T cells revealed preferential loss of CD8 T cells compared to CD4 T cells (Fig. 1, F and G, and fig. S1D); this pattern was reflected in absolute numbers estimated from the clinical data, where both CD4 and CD8 T cell counts in COVID-19 patients were lower than the clinical reference range, though the effect was more prominent for CD8 T cells (49/61 subjects below normal) than for CD4 T cells (38/61 subjects below normal) (fig. S1E). These findings are consistent with previous reports of lymphopenia during COVID-19 disease (1720) but highlight a preferential impact on CD8 T cells.

We next asked if the changes in these lymphocyte populations were related to clinical metrics (Fig. 1H). Lower WBC counts were associated preferentially with lower frequencies of CD4 and CD8 T cells and increased non-T non-B, but not with B cells (Fig. 1H). These lower T cell counts were associated with clinical markers of inflammation including ferritin, D-dimer, and hsCRP (Fig. 1H), whereas altered B cell frequencies were not. Thus, hospitalized COVID-19 patients present with a complex constellation of clinical features that may be associated with altered lymphocyte populations.

SARS-CoV2 infection is associated with CD8 T cell activation in a subset of patients

We next applied high-dimensional flow cytometric analysis to further investigate lymphocyte activation and differentiation during COVID-19 disease. We first used principal component analysis (PCA) to examine the general distribution of immune profiles from COVID-19 patients (n = 118), RD (n = 60), and HD (n = 36) using 193 immune parameters identified by high-dimensional flow cytometry (tables S5 and S6). COVID-19 patients clearly segregated from RD and HD in PCA space, whereas RD and HD largely overlapped (Fig. 2A). We investigated the immune features driving this COVID-19 immune signature. Given their role in response to viral infection, we focused on CD8 T cells. Six major CD8 T cell populations were examined using the combination of CD45RA, CD27, CCR7, and CD95 cell surface markers to define naïve (CD45RA+CD27+CCR7+CD95), central memory (CD45RACD27+CCR7+ [CM]), effector memory (CD45RACD27+CCR7 [EM1], CD45RACD27CCR7+ [EM2], CD45RACD27CCR7 [EM3]), and EMRA (CD45RA+CD27CCR7) (Fig. 2B) CD8 T cells. Among the CD8 T cell populations, there was an increase in the EM2 and EMRA populations and a decrease in EM1 (Fig. 2C). Furthermore, the frequency of CD39+ cells was increased in COVID-19 patients compared to HD (Fig. 2D). Although the frequency of PD-1+ cells was not different in the total CD8 population (Fig. 2D), it was increased for both CM and EM1 (fig. S2A). Finally, all major CD8 T cell naive/memory populations in RD were comparable to HD (Fig. 2, C and D, and fig. S2A).

Fig. 2 CD8 T cell subset skewing and activation patterns in COVID-19 patients and potential links to T cell driven cytokines.

(A) Principle Component Analysis (PCA) of aggregated flow cytometry data. (B) Representative flow cytometry plots of the gating strategy for CD8 T cell subsets. (C) Frequencies of CD8 T cell subsets as indicated. (D) Frequencies of PD1+ and CD39+ cells. Frequencies of (E) KI67+ and (F) HLA-DR+CD38+ cells and representative flow cytometry plots; green line at upper decile of HD. (G) (Top) Global viSNE projection of non-naïve CD8 T cells for all subjects pooled, non-naïve CD8 T cells from healthy donor (HD), recovered donor (RD), and COVID-19 patients concatenated and overlaid. (Bottom) viSNE projections of indicated protein expression. (H) viSNE projection of non-naïve CD8 T cell clusters identified by FlowSOM clustering. (I) Mean fluorescence intensity (MFI) as indicated, column-scaled z-score. (J) Percentage of non-naive CD8 T cells from each cohort in each FlowSOM cluster. Boxes represent IQR. (C, D, E, F, J) Each dot represents an individual healthy donor (HD; green), recovered donor (RD; blue), or COVID-19 patient (red). Significance determined by unpaired Wilcoxon test with BH correction: *p < 0.05, **p < 0.01, ***p < 0.001, and ****p < 0.0001.

Most acute viral infections induce proliferation and activation of CD8 T cells detectable by increases in KI67 or co-expression of CD38 and HLA-DR (34, 35). There was a significant increase in KI67+ and also HLA-DR+CD38+ non-naïve CD8 T cells in COVID-19 patients compared to HD or RD (Fig. 2, E and F). In COVID-19 patients, KI67+ CD8 T cells were increased compared to HD and RD across all subsets of non-naïve CD8 T cells, including CM and EM1 (fig. S2B). These data indicate broad T cell activation, potentially driven by bystander activation and/or homeostatic proliferation in addition to antigen-driven activation of virus-specific CD8 T cells. This activation phenotype was confirmed by HLA-DR and CD38 co-expression that was significantly increased for all non-naïve CD8 T cell subsets (Fig. 2F and fig. S2C). However, the magnitude of the KI67+ or CD38+HLA-DR+ CD8 T cells varied widely in this cohort. The frequency of KI67+ CD8 T cells correlated with the frequency of CD38+HLA-DR+ CD8 T cells (fig. S2D). However, the frequency of CD38+HLA-DR+, but not KI67+ CD8 T cells, was elevated in COVID-19 patients who had concomitant infection with another microbe but was not impacted by pre-existing immunosuppression or treatment with steroids (fig. S2E). Moreover, these changes in CD8 T cell subsets in COVID-19 patients did not show clear correlations with individual metrics of clinical disease such as hsCRP or D-dimer (fig. S2E), although the frequency of KI67+ CD8 T cells associated with IL-6 and ferritin levels. Although CD8 T cell activation was common, ~20% of patients had no increase in KI67+ or CD38+HLA-DR+ CD8 T cells above the level found in HD (Fig. 2, E and F). Thus, although robust CD8 T cell activation was a clear characteristic of many hospitalized COVID-19 patients, a substantial fraction of patients had little evidence of CD8 T cell activation in the blood compared to controls.

To gain more insights, we applied global high-dimensional mapping of the 27-parameter flow cytometry data. A tSNE representation of the data highlighted key regions of non-naïve CD8 T cells found preferentially in COVID-19 patients (Fig. 2G). A major region of this tSNE map present in COVID-19 patients, but not HD or RD, were CD8 T cells that enriched for expression of CD38, HLA-DR, KI67, CD39, and PD1 (Fig. 2G), highlighting the co-expression of these activation markers with other features including CD95 (i.e., FAS). Notably, although non-naïve CD8 T cells from RD were highly similar to those from HD, subtle differences existed, including in the lower region highlighted by T-bet and CX3CR1 (Fig. 2G). To further define and quantify these differences between COVID-19 patients and controls, we performed FlowSOM clustering (Fig. 2H) and compared expression of fourteen CD8 T cell markers to identify each cluster (Fig. 2I). This approach identified an increase in cells in several clusters including Clusters 1, 2, and 5 in COVID-19 patients, reflecting CD45RA+CD27CCR7 TEMRA-like populations that expressed CX3CR1 and varying levels of T-bet (Fig. 2, I and J). Clusters 12 and 14 contained CD27+HLA-DR+CD38+KI67+PD-1+ activated, proliferating cells and were more prevalent in COVID-19 disease (Fig. 2, I and J, and fig. S2F). In contrast, the central Eomes+CD45RACD27+CCR7 EM1-like Cluster 6 and T-bethiCX3CR1+ Cluster 11 were decreased compared to HD (Fig. 2, I and J, and fig. S2F). Thus, CD8 T cell responses in COVID-19 patients were characterized by populations of activated, proliferating CD8 T cells in a subgroup of patients.

SARS-CoV2 infection is associated with heterogeneous CD4 T cell responses and activation of CD4 T cell subsets

We next examined six well-defined CD4 T cell subsets as above for the CD8 T cells, including naïve, effector memory (EM1,2,3), central memory (CM), and EMRA (Fig. 3A). Given the potential role of antibodies in the response to SARS-CoV2 (27, 29), we also analyzed circulating Tfh (CD45RAPD1+CXCR5+ [cTfh] (36)) and activated circulating Tfh (CD38+ICOS+ [activated cTfh]), the latter of which may be more reflective of recent antigen encounter and emigration from the germinal center (37, 38) (Fig. 3A). These analyses revealed a relative loss of naïve CD4 T cells compared to controls, but increased EM2 and EMRA (Fig. 3B). The frequency of activated but not total cTfh was statistically increased in COVID-19 patients compared to HD, though this effect appeared to be driven by a subgroup of patients (Fig. 3B). It is worth noting that activated cTfh frequencies were also higher in RD compared to HD (Fig. 3B), perhaps reflecting residual COVID-19 responses in that group. Frequencies of KI67+ or CD38+HLA-DR+ non-naïve CD4 T cells were increased in COVID-19 patients (Fig. 3, C and E); however, this change was not equivalent across all CD4 T cell subsets. The most substantial increases in both KI67+ and CD38+HLA-DR+ cells were found in the effector memory populations (EM, EM2, EM3) and in cTfh (fig. S3, A and B). Although some subjects had increased activation of EMRA, this was less pronounced. In contrast, PD1 expression was increased in all other non-naïve populations compared to HD or RD (fig. S3C). Co-expression of CD38 and HLA-DR by non-naïve CD4 T cells correlated with the frequency of KI67+ non-naïve CD4 T cells (fig. S3D). Moreover, the frequency of total non-naïve CD4 T cells that were CD38+HLA-DR+ correlated with the frequency of activated cTfh (fig. S3E). In general, the activation of CD4 T cells was correlated with the activation of CD8 T cells (Fig. 3, D and F). However, whereas ~2/3 of patients had KI67+ non-naïve CD4 or CD8 T cell frequencies above controls, ~1/3 of the COVID-19 patients had no increase in frequency of KI67+ CD4 or CD8 T cells above that observed in HD (Fig. 3, D and F). Moreover, although most patients had similar proportions of activated CD4 T cells compared to CD8 T cells, there was a subgroup of patients that had disproportionate activation of CD4 T cells compared to CD8 T cells (Fig. 3, D and F). KI67+ and CD38+HLA-DR+ non-naïve CD4 T cell frequencies correlated with ferritin and with APACHE III score (fig. S3F), suggesting a relationship between CD4 T cell activation and disease severity. Immunosuppression did not impact CD4 T cell activation; however, early steroid administration was weakly associated with CD4 T cell KI67 (fig. S3F). Together, these data highlight T cell activation in COVID-19 patients similar to what has been observed in other acute infections or vaccinations (37, 39, 40), but also identify patients with high, low, or essentially no T cell response based on KI67+ or CD38+HLA-DR+ compared to control subjects.

Fig. 3 CD4 T cell activation in a subset of COVID-19 patients associates with distinct CD4 T cell subsets.

(A) Representative flow cytometry plots of the gating strategy for CD4 T cell subsets. (B) Frequencies of CD4 T cell subsets as indicated. (C) Frequencies of KI67+ cells; green line at upper decile of healthy donors (HD). Representative flow cytometry plots. (D) KI67+ cells from non-naïve CD4 T cells versus non-naïve CD8 T cells, Spearman correlation of COVID-19 patients. (E) Frequencies of HLA-DR+CD38+ cells; green line at upper decile of healthy donors (HD). Representative flow cytometry plots. (F) HLA-DR+CD38+ cells from non-naïve CD4 versus non-naïve CD8 T cells, Spearman correlation of COVID-19 patients. (G) (Top) Global viSNE projection of non-naïve CD4 T cells for all subjects pooled, non-naïve CD4 T cells from HD, RD, and COVID-19 patients concatenated and overlaid. (Bottom) viSNE projections of indicated protein expression. (H) viSNE projection of non-naïve CD4 T cell clusters identified by FlowSOM clustering. (I) Mean fluorescence intensity (MFI) as indicated, column-scaled z-score. (J) Percentage of non-naïve CD4 T cells from each cohort in each FlowSOM cluster. Boxes represent IQR. (B, C, E, J) Each dot represents an individual healthy donor (HD; green), recovered donor (RD; blue), or COVID-19 patient (red). Significance determined by unpaired Wilcoxon test with BH correction: *p < 0.05, **p < 0.01, ***p < 0.001, and ****p < 0.0001.

Projecting the global CD4 T cell differentiation patterns into the high-dimensional tSNE space again identified major alterations in the CD4 T cell response during COVID-19 infection compared to HDs and RDs (Fig. 3G). In COVID-19 infection, there was a notable increase in density in tSNE regions that mapped to expression of CD38, HLA-DR, PD1, CD39, KI67, and CD95 (Fig. 3G), similar to CD8 T cells. To gain more insight into these CD4 T cell changes, we again used a FlowSOM clustering approach (Fig. 3, H and I). This analysis identified an increase in Clusters 13 and 14 in COVID-19 patients compared to HD and RD that represent populations expressing HLA-DR, CD38, PD1, KI67 and CD95, as well as Cluster 15 that contained Tbet+CX3CR1+ “effector-like” CD4 T cells (Fig. 3, I and J, and fig. S3G). In contrast, this clustering approach identified reduction in CXCR5+ cTfh-like cells (Clusters 2, 3) in COVID-19 subjects compared to HD (Fig. 3, I and H). Taken together, this multidimensional analysis revealed distinct populations of activated/proliferating CD4 T cells that were enriched in COVID-19 patients.

A key feature of COVID-19 disease is thought to be an inflammatory response that, at least in some patients, is linked to clinical disease manifestation (2, 4) and high levels of chemokines/cytokines, including IL-1RA, IL-6, IL-8, IL-10, and CXCL10 (11, 41). To investigate the potential connection of inflammatory pathways to T cell responses, we performed 31-plex Luminex analysis on paired plasma and culture supernatants of anti-CD3/anti-CD28 stimulated PBMC from a subset of COVID-19 patients and HD controls. Due to biosafety restrictions, only eight COVID-19 patient blood samples that were confirmed SARS-CoV2 RNA negative in the blood by PCR could be studied (fig. S4A). Half of these COVID-19 patients had plasma CXCL10 concentrations that were ~15 fold higher than HD controls, whereas the remainder showed only a limited increase (fig. S4B). CXCL9, CCL2, and IL-1RA were also significantly increased. In contrast, chemokines involved in the recruitment of eosinophils (eotaxin) or activated T cells (CCL5) were decreased. IL-6 was not elevated in this group of patients, in contrast to the subset of individuals tested clinically (fig. S1B), potentially because IL-6 was measured in the hospital setting often when systemic inflammation was suspected. Following stimulation in vitro, PBMC from COVID-19 patients produced more CCL2, CXCL10, eotaxin, and IL-1RA than HD (fig. S4, C and D) and concentrations of CXCL10 and CCL2 correlated between the matched supernatant from stimulated PBMC and plasma samples (fig. S4E). Finally, we investigated whether CD8 T cells from COVID-19 subjects were capable of producing IFNɣ following polyclonal stimulation. Following ɑCD3+ɑCD28 stimulation, similar proportions of CD8 T cells from COVID-19 patients and HD controls produced IFNɣ, suggesting that PBMC from COVID-19 patients were responsive to TCR crosslinking (fig. S4, F to H). The ability of T cells to produce IFNɣ following stimulation occurred in patients with increases in KI67 as well as patients with low KI67 (fig. S4, F to H). Taken together, these data support the notion that a subgroup of COVID-19 patients has elevated systemic cytokines and chemokines, including myeloid recruiting chemokines.

COVID-19 infection is associated with increased frequencies of plasmablasts and proliferation of memory B cell subsets

B cell subpopulations were also altered in COVID-19 disease. Whereas naïve B cell frequencies were similar in COVID-19 patients and RD or HD, the frequencies of class-switched (IgDCD27+) and not-class-switched (IgD+CD27+) memory B cells were significantly reduced (Fig. 4A). Conversely, frequencies of CD27IgD B cells and CD27+CD38+ PB were often robustly increased (Fig. 4, A and B). In some cases, PB represented >30% of circulating B cells, similar to levels observed in acute Ebola or Dengue virus infections (42, 43). However, these PB responses were only observed in ~2/3 of patients, with the remaining patients displaying PB frequencies similar to HD and RD (Fig. 4B). KI67 expression was markedly elevated in all B cell subpopulations in COVID-19 patients compared to either control group (Fig. 4C). This observation suggests a role for an antigen-driven response to infection and/or lymphopenia-driven proliferation. Higher KI67 in PB may reflect recent generation in the COVID-19 patients compared to HD or RD. CXCR5 expression was also reduced on all major B cell subsets in COVID-19 patients (Fig. 4D). Loss of CXCR5 was not specific to B cells, however, as expression was also decreased on non-naïve CD4 T cells (Fig. 4E). Changes in the B cell subsets were not associated with co-infection, immune suppression, or treatment with steroids or other clinical features, though a possible negative association of IL-6 and PB was revealed (fig. S5A). These observations suggest that the B cell response phenotype of COVID-19 disease was not simply due to systemic inflammation.

Fig. 4 Deep profiling of COVID-19 patient B cell populations reveals robust plasmablast populations and other B cell alterations.

(A) Gating strategy and frequencies of non-PB B cell subsets. (B) Representative flow cytometry plots and frequencies of PB; green line at upper decile of HD. (C) Representative flow cytometry plots and frequencies of KI67+ B cells. (D) (Left) Representative histograms of CXCR5 expression and (right) CXCR5 GMFI of B cell subsets. (E) CXCR5 GMFI of non-naïve CD4 T cells and cTfh. (F) Spearman correlation between PB and activated cTfh. (G) Spearman correlation between PB and anti-SARS-CoV2 IgG. (H and I) Spearman correlation between activated cTfh and anti-SARS-CoV2 (H) IgM and (I) IgG. (J) (Top) Global viSNE projection of B cells for all subjects pooled, with B cell populations of each cohort concatenated and overlaid. (Bottom) viSNE projections of indicated protein expression. (K) Hierarchical clustering of Earth Mover’s Distance (EMD) using Pearson correlation, calculated pairwise for B cell populations for all subjects; row-scaled z-score. (L) Percentage of cohort in each EMD group. (M) Global viSNE projection of B cells for all subjects pooled, with EMD groups 1-3 concatenated and overlaid. (N) B cell clusters identified by FlowSOM clustering. (O) MFI as indicated; column-scaled z-score. (P) Percentage of B cells from each cohort in each FlowSOM cluster. Boxes represent IQR. (A to F, P) Dots represent individual healthy donor (HD; green), recovered donor (RD; blue), or COVID-19 (red) subjects. (A to E, P) Significance determined by unpaired Wilcoxon test with BH correction: *p < 0.05, **p < 0.01, ***p < 0.001, and ****p < 0.0001. (G to I) Black horizontal line represents positive threshold.

During acute viral infections or vaccination, PB responses are transiently detectable in the blood and correlate with cTfh responses (40). Comparing the frequency of PB to the frequency of total cTfh or activated cTfh, however, revealed a weak correlation only with activated cTfh (Fig. 4F and fig. S5, B and C). Furthermore, some patients had robust activated cTfh responses but PB frequencies similar to controls, whereas other patients with robust PB responses had relatively low frequencies of activated cTfh (Fig. 4F and fig. S5, B and C). However, there was also an association between PB frequency and CD38+HLA-DR+ or KI67+ CD4 T cells that might reflect a role for non-CXCR5+ CD4 T cell help (fig. S5D), but such a relationship did not exist for the equivalent CD8 T cell populations (fig. S5E). Although ~70% of COVID-19 patients analyzed made antibodies against SARS-CoV2 spike protein (79/111 IgG; 77/115 IgM (44)), antibody levels did not correlate with PB frequencies (Fig. 4G and fig. S5F). The occasional lack of antibody did not appear to be related to immunosuppression in a small number of patients (fig. S5G). The lack of PB correlation with antibody suggests that a proportion of these large PB responses were: i) generated against SARS-CoV2 antigens other than the spike protein or ii) inflammation-driven and perhaps non-specific or low affinity. Notably, anti-SARS-CoV2 IgG and IgM levels correlated with the activated, but not total, cTfh response, suggesting that at least a proportion of cTfh were providing SARS-CoV2-specific help to B cells (Fig. 4, H and I, and fig. S5, H and I). Although defining the precise specificity of the robust PB populations will require future studies, these data suggest that at least some of the PB response is specific for SARS-CoV2.

Projecting the flow cytometry data for B cells from HD, RD, and COVID-19 patients in tSNE space revealed a distinct picture of B cell populations in COVID-19 compared to controls, whereas RD and HD were similar (Fig. 4J and fig. S5J). The COVID-19 patient B cell phenotype was dominated by loss of CXCR5 and IgD compared to B cells from HD and RD (Fig. 4J). Moreover, the robust PB response was apparent in the upper right section, highlighted by CD27, CD38, CD138, and KI67 (Fig. 4J). The expression of KI67 and CD95 in these CD27+CD38+CD138+ PB (Fig. 4J) could suggest recent generation and/or emigration from germinal centers. We next asked whether there were different groups of COVID-19 patients (or HD and RD) with global differences in the B cell response. We used Earth Movers Distance (EMD) (45) to calculate similarities between the probability distributions within the tSNE map (Fig. 4J) and clustered so that individuals with the most similar distributions grouped together (Fig. 4K). The majority of COVID-19 patients fell into two distinct groups (EMD Groups 1 and 3, Fig. 4L), suggesting two major “immunotypes” of the B cell response. The remainder of the COVID-19 patients (~25%) clustered with the majority of the HD and all of the RD controls, supporting the observation that some individuals had limited evidence of response to infection in their B cell compartment. To identify the population differences between HD, RD and COVID-19 patients, we performed FlowSOM clustering on the tSNE map, and also overlaid each individual EMD group onto this same tSNE map (Fig. 4, M and N). EMD Group 2 containing mostly HD and RD was enriched for naive B cells (IgD+CD27, Cluster 10) and CXCR5+IgDCD27+ switched memory (Cluster 2) and, indeed, Clusters 2 and 10 were statistically reduced in COVID-19 patients (Fig. 4P). EMD Groups 1 and 3 displayed distinct patterns across the FlowSOM clusters. B cells from individuals in EMD Group 1 were enriched for the FlowSOM Clusters 1, 5, and 6, all of which were increased in COVID-19 patients (Fig. 4P). FlowSOM Clusters 1 and 6 captured T-bet+ memory B cells whereas FlowSOM Cluster 5 contained the CD27+CD38+CD138+KI67+ PB, all of which were enriched in COVID-19 patients compared to controls (Fig. 4, O and P, and fig. S5K). In contrast, B cells from COVID-19 patients in EMD Group 3 also showed enrichment for the PB FlowSOM Cluster 5, though less prominent than for EMD Group 1, but the T-bet+ memory B cell Cluster 1 was substantially reduced in EMD Group 3. Thus, B cell responses were evident in many hospitalized COVID-19 patients, most often characterized by elevated PB, decreases in memory B cell subsets, enrichment in a T-bet+ B cell subset, and loss of CXCR5 expression. Whether all of these changes in the B cell compartment were due to direct antiviral responses is unclear. Although there was heterogeneity in the B cell responses, COVID-19 patients fell into two distinct patterns containing activated B cell responses and a third group of patients with little evidence of an active B cell response.

Temporal changes in immune cell populations occur during COVID-19 disease

A key question for hospitalized COVID-19 patients is how immune responses change over time. Thus, we used the global tSNE projections of overall CD8 T cell, CD4 T cell, and B cell differentiation states to interrogate temporal changes in these populations between D0 and D7 of hospitalization (Fig. 5A). Combining data for all patients revealed considerable stability of the tSNE distributions between D0 and D7 in CD8 T cell, CD4 T cell, and B cell populations, particularly for key regions of interest discussed above. For example, for CD8 T cells, the region of the tSNE map containing KI67+ and CD38+HLA-DR+ CD8 T cell populations that was enriched in COVID-19 patients at D0 (Fig. 2) was preserved at D7 (Fig. 5A). A similar temporal stability of CD4 T cell and B cell activation was also observed (Fig. 5A).

Fig. 5 Temporal relationships between immune responses and disease manifestation.

(A) Global viSNE projection of non-naïve CD8 T cells, non-naïve CD4 T cells, and B cells for all subjects pooled, with cells from COVID-19 patients at D0 and D7 concatenated and overlaid. Frequencies of (B) KI67+ and HLA-DR+CD38+ CD4 T cells, (C) KI67+ and HLA-DR+CD38+ CD8 T cells, or (D) PBs as indicated for healthy donor (HD; green), recovered donor (RD; blue), or COVID-19 patients (red) with paired samples at D0 and D7 indicated by the connecting line. Significance determined by paired Wilcoxon test: *p < 0.05, **p < 0.01, ***p < 0.001, and ****p < 0.0001. Longitudinal patterns (see Methods) of (E) HLA-DR+CD38+ CD4 T cells or (F) PBs in COVID-19 patients shown as frequency and representative flow cytometry plots. (G) Spearman correlations of clinical parameters with longitudinal fold changes in immune populations.

Given this apparent stability between D0 and D7, we next investigated temporal changes in lymphocyte subpopulations of interest. Although there were no obvious temporal changes in major phenotypically defined CD4 and CD8 T cell or B cell subsets, including plasmablasts (Fig. 5D), the frequencies of HLA-DR+CD38+ and KI67+ non-naïve CD4 (Fig. 5B) and KI67+ non-naïve CD8 T cells were statistically increased at D7 compared to D0 (Fig. 5C).

However, in all cases, these temporal patterns were complex, with frequencies of subpopulations in individual patients appearing to increase, decrease, or stay the same over time. To quantify these inter-patient changes, we used a previously described data set (46) to define the stability of populations of interest in healthy individuals over time. We then used the range of this variation over time to identify COVID-19 patients with changes in immune cell subpopulations beyond that expected in healthy subjects (see methods). Using this approach, ~50% of patients had an increase in HLA-DR+CD38+ non-naïve CD4 T cells over time, whereas in ~30% of patients, these cells were stable and, in ~20%, they decreased (Fig. 5E). For KI67+ non-naïve CD8 T cells, there were no individuals in whom the response decreased. Instead, this proliferative CD8 T cell response stayed stable (~70%) or increased (~30%; fig. S6A). Notably, for patients in the stable category, the median frequency of KI67+ non-naïve CD8 T cells was ~10%, almost 5-fold higher than the ~1% detected for HD and RD subjects (Figs. 5C and 2E), suggesting a sustained CD8 T cell proliferative response to infection. A similar pattern was observed for HLA-DR+CD38+ non-naïve CD8 (fig. S6B), where only ~10% of patients had a decrease in this population, whereas ~65% were stable and ~25% increased over time. The high and even increasing activated or proliferating CD8 and CD4 T cell responses over ~1 week during acute viral infection contrasted with the sharp peak of KI67 in CD8 and CD4 T cells during acute viral infections, including smallpox vaccination with live vaccinia virus (47), live attenuated yellow fever vaccine YFV-17D (48), acute influenza virus infection (49), and acute HIV infection (35). Approximately 42% of patients had sustained PB responses, at high levels (>10% of B cells) in many cases (Fig. 5F). Thus, some patients displayed dynamic changes in T cell or B cell activation over 1 week in the hospital, but there were also other patients who remained stable. In the latter case, some patients remained stable without clear activation of key immune populations whereas others had stable T and or B cell activation or numerical perturbation (fig. S6C).

We next asked whether these T and B cell dynamics related to clinical measures of COVID-19 disease, by correlating changes in immune features from D0 to D7 with clinical information (Fig. 5G). These analyses revealed distinct correlations. Decreases in all populations of responding CD4 and CD8 T cells (HLA-DR+CD38+, KI67+, or activated cTfh) between D0 and D7 were positively correlated with PMN and WBC counts, suggesting a relationship between T cell activation and lymphopenia. Furthermore, decreases in CD4 and CD8 HLA-DR+CD38+ T cells positively correlated with APACHE III score. However, stable HLA-DR+CD38+ CD4 T cell responses correlated with coagulation complications and ferritin. Whereas decreasing activated cTfh over time was related to co-infection, the opposite pattern was observed for PB. Increases in proliferating KI67+ CD4 and CD8 T cells over time were positively correlated to increasing anti-SARS-CoV2 antibody from day 0 to day 7, suggesting that some individuals might have been hospitalized during the expansion phase of the antiviral immune response (Fig. 5G). Finally, neither Remdesivir nor HCQ treatment correlated with any of these immune features in Fig. 5G). Examining categorical rather than continuous clinical data, 80% of patients with decreasing PB over time had hyperlipidemia, whereas only 20% of patients with increasing PB over time had this comorbidity (fig. S6D). All patients who had decreasing CD38+HLA-DR+ CD8 T cells from day 0 to day 7 were treated with early vasoactive medication or inhaled nitric oxide whereas these treatments were less common for patients with stable or increasing CD38+HLA-DR+ CD8 T cells (fig. S6E). In contrast, vasoactive medication, inhaled nitric oxide, and early steroid treatment were equally common in patients with increasing or decreasing PB (fig. S6D). Similar patterns were apparent for other T cell populations and these categorical clinical data (fig. S6F). Thus, the trajectory of change in the T and B cell response in COVID-19 patients was strongly connected to clinical metrics of disease.

Identifying “immunotypes” and relationships between circulating B and T cell responses with disease severity in COVID-19 patients

To further investigate the relationship between immune responses and COVID-19 disease trajectory, we stratified the COVID-19 patients (n = 125) into eight different categories according to the NIH Ordinal Severity Scale ranging from COVID 1 (death) and COVID 2 (requiring maximal clinical intervention) to COVID 8 (at home with no required care) (Fig. 6A). We then asked how changes in T and B cell populations defined above on D0 were related to disease severity. More severe disease was associated with lower frequencies of CD8 and CD4 T cells, with a greater effect on CD8 T cells in less severe disease (Fig. 6B). Taking all patients together, there were no statistically significant changes in the major T cell and B cell subsets related to disease severity though some trends were present (fig. S7, A to C). In contrast, HLA-DR+CD38+ CD8 T cells as well as both KI67+ and HLA-DR+CD38+ CD4 T cells were increased in patients with more severe disease (fig. S7, D and E).

Fig. 6 High dimensional analysis of immune phenotypes with clinical data reveals distinct COVID-19 patient immunotypes.

(A) NIH ordinal scale for COVID-19 clinical severity. (B) Frequencies of major immune subsets. Significance determined by unpaired Wilcoxon test with BH correction: *p < 0.05, **p < 0.01, ***p < 0.001, and ****p < 0.0001. (C) Heatmap of indicated immune parameters by row; donor type, disease severity, and mortality indicated across top. (D) UMAP projection of aggregated flow cytometry data. (E) Transformed UMAP projection; density contours drawn separately for healthy donor (HD), recovered donor (RD), and COVID-19 subjects (see Methods). (F) Bars represent mean of UMAP Component 1. Dots represent individual subjects; bars shaded by subject group and/or severity score. (G) Density contour plots indicating variation of specified immune features across UMAP Component coordinates. Relative expression (according to heat scale) shown for both individual patients (points) and overall density (contours). Spearman’s Rank Correlation coefficient (ρ) and p-value for each feature vs. Component 1 (C1) and Component 2 (C2) shown. (H) (Left) Spearman correlation between UMAP Components 1 and 2 and FlowSOM clusters. (Right) Select FlowSOM clusters and their protein expression. (I) Spearman correlation between UMAP Components 1 and 2 and clinical metadata. (J) Heatmap of immune parameters used to define Immunotype 3 indicated by row; disease severity and mortality indicated across top. (K) (Left) Transformed UMAP projection; patient status for Immunotype 3 indicated by color. (Right) Spearman correlation between Immunotype 3 and disease severity, mortality, and UMAP Components.

There were two challenges with extracting meaning from these data. First, there was considerable inter-patient heterogeneity for each of these immune features related to disease severity score. Second, these binary comparisons (e.g., one immune subset versus one clinical feature) vastly underutilized the high dimensional information in this dataset. Thus, we next visualized major T and B cell subpopulation data as it related to clinical disease severity score (Fig. 6C). Data were clustered based on immune features and then overlaid with the disease severity score over time for each patient. This analysis revealed groups of patients with similar composite immune signatures of T and B cell populations (Fig. 6C). When individual CD8 T cell, CD4 T cell, or B cell populations were examined, a similar concept of patient subgroups emerged (fig. S7, F, G, and H). These data suggested the idea of “immunotypes” of COVID-19 patients based on integrated responses of T and B cells, though some individual cell types and/or phenotypes separated patients more clearly than others.

These approaches provided insight into potential immune phenotypes associated with patients with severe disease, but suffered from the use of a small number of manually selected T or B cell subsets or phenotypes. We therefore next employed Uniform Manifold Approximation and Projection (UMAP) to distill the ~200 flow cytometry features (see tables S5 and S6) representing the immune landscape of COVID-19 disease in two dimensional space, creating compact meta features (or Components) that could then be correlated with clinical outcomes. This analysis revealed a clear trajectory from HD to COVID-19 patients (Fig. 6D), which we centered and aligned with the horizontal axis (“Component 1”) to facilitate downstream analysis (Fig. 6E). An orthogonal vertical axis coordinate (“Component 2”) also existed that captured non-overlapping aspects of the immune landscape. We next calculated the mean of Component 1 for each patient group, with COVID-19 patients separated by severity (Fig. 6E). The contribution of Component 1 clearly increased in a stepwise manner with increasing disease severity (Fig. 6F). Interestingly, RD were subtly positioned between HD and COVID-19 patients. Component 1 remained an independent predictor of disease severity (P = 5.5 × 10−5) even after adjusting for the confounding demographic factors of age, sex, and race.

We next investigated how the UMAP Components were associated with individual immune features (tables S5 and S6). UMAP Component 1 captured immune features, including the relative loss of CD4 and CD8 T cells and increase in nonB:nonT cells (Fig. 6G). PB also associated with Component 1 (Fig. 6G). Other individual B cell features were differentially captured by UMAP Component 1 and 2. Component 1 contained a signal for T-bet+ PB populations (table S5) whereas Component 2 was enriched for T-bet+ memory B cells and CD138+ PB populations (table S6). Activated HLA-DR+CD38+ and KI67+ CD4 and CD8 T cells had contributions to both Component 1 and Component 2, with these features residing in the upper right corner of the UMAP plot (Fig. 6, G and H, and fig. S8, A to D). In contrast, T-bet+ non-naïve CD8 T cells were strongly associated with Component 2 whereas T-bet+ non-naïve CD4 T cells were also linked to Component 1 (Fig. 6G and tables S5 and S6). Eomes+ CD8 or CD4 T cells were both associated with Component 2 and negatively associated with Component 1 (Fig. 6G and tables S5 and S6).

We next took advantage of the FlowSOM clustering in Figs. 2 to 4 that identified individual immune cell types most perturbed in COVID-19 patients and linked these FlowSOM clusters to UMAP Components (Fig. 6H). For non-naïve CD8 T cells, FlowSOM Cluster 11 that contained T-bet+CX3CR1+ but non-proliferating effector-like cells was positively correlated with UMAP Component 2 and negatively correlated with Component 1 (Fig. 6H). In contrast, FlowSOM Cluster 14 contained activated, proliferating PD-1+CD39+ cells that might reflect either recently generated effector or possibly exhausted CD8 T cells (50) and was strongly associated with UMAP Component 1 (Fig. 6H). For CD4 T cells, FlowSOM Cluster 14, containing activated, proliferating CD4 T cells, was captured by both UMAP Components, whereas a second activated CD4 T cell population that also expressed CD95 (FlowSOM Cluster 13) was only captured by UMAP Component 1 (Fig. 6H). In addition, Component 1 was negatively correlated with CD4 T cell FlowSOM Clusters 2 and 3 that contained cTfh (Fig. 6H). Finally, for B cells, the FlowSOM Cluster of T-bet+CD138+ PB (Cluster 5) was positively correlated with Component 1 whereas the Tbet-CD138+ Cluster 3 was negatively correlated with Component 1 (Fig. 6H). Locations in the UMAP immune landscape were dynamic, changing from D0 to D7 for both Component 1 and Component 2 consistent with the data in Fig. 5 and fig. S9, A to F. The most dynamic changes in Component 1 were associated with the largest increases in IgM antibody levels (fig. S9G).

Given the association of the UMAP Component 1 with disease severity, we next examined the connections between UMAP Components with individual clinical features. UMAP Component 1 correlated with several clinical measurements of inflammation (e.g., ferritin, hsCRP, IL-6), co-infection, organ failure (APACHE III), and acute kidney disease and renal insufficiency (Fig. 6I). It was interesting, however, that, although D-dimer was elevated, this feature did not correlate with UMAP Component 1, but coagulation complication did (Fig. 6I). Several antibody features also correlated with Component 1 consistent with some of the immune features discussed above. In contrast, Component 2 lacked positive correlation to many of these clinical features of disease and rather was negatively correlated only to eosinophil count, NSAID use, and subsequent treatment with Remdesivir (Fig. 6I). UMAP Component 1, but not Component 2, also correlated with mortality, although there were clearly patients with high Component 2, but low Component 1 who succumbed to COVID-19 disease (Fig. 6E). These data indicate that the immune features captured by UMAP Component 1 have a strong relationship to many features of disease severity, whereas other features of immune dynamics during COVID-19 disease captured by UMAP Component 2 have a distinct relationship with clinical disease presentation.

More positive values in UMAP Components 1 or 2 captured mainly signals of change or differences in individual immune features in COVID-19 disease compared to HD and RD. UMAP Component 1 captured an immunotype (Immunotype 1) that was characterized by effector or highly activated CD4 T cells, low cTfh, some CD8 TEMRA-like activation, possibly hyperactivated CD8 T cells, and Tbet+ PB, whereas Component 2 or Immunotype 2 captured Tbetbright effector-like CD8 T cells, lacked some of the robust CD4 T cell activation but has some features of proliferating B cells (Fig. 6G and fig. S8). However, the data presented in Figs. 1 to 5 also suggested a subset of patients with minimal activation of T and B cell responses. To investigate this immune signature, we identified 20 patients who had responses more similar to HD and RD for five activated/responding B and T cell populations (Fig. 6J, middle, and fig. S10). If the UMAP Components 1 and 2 captured two distinct “immunotypes” of patient responses to SARS-CoV2 infection, this group of 20 patients represent a third immunotype. Immunotype 3 was negatively associated with UMAP Components 1 and 2 and negatively associated with disease severity, suggesting that a less robust immune response during COVID-19 was associated with less severe pathology (Fig. 6K and fig. S10), despite the fact that these patients were hospitalized with COVID-19 disease. These data further emphasize the different ways patients can present and possibly succumb to COVID-19 disease. These patterns may be related to pre-existing conditions in combination with immune response characteristics. It is likely that additional immune features, such as comprehensive serum cytokine measurements, will improve this model. Nevertheless, the current computational approach integrating deep immune profiling with disease severity trajectory and other clinical information revealed distinct patient immunotypes linked to distinct clinical outcomes (fig. S11).


The T and B cell response to SARS-CoV2 infection remains poorly understood. Some studies suggest an overaggressive immune response leading to immunopathology (51) whereas others suggest T cell exhaustion or dysfunction (1214). Autopsies revealed high virus levels in the respiratory tract and other tissues (52), suggesting ineffective immune responses. Nevertheless, non-hospitalized subjects who recovered from COVID-19 had evidence of virus-specific T cell memory (53). SARS-CoV2-specific antibodies are also found in convalescent subjects and patients are currently being treated with convalescent plasma therapy (30, 54). However, COVID-19 ICU patients have SARS-CoV2-specific antibodies (30), raising the question of why patients with these antibody responses are not controlling disease. In general, these studies report on single patients or small cohorts and comprehensive deep immune profiling of a large number of COVID-19 hospitalized patients is limiting. Such knowledge would address the critical question of whether there is a common profile of immune dysfunction in critically ill patients. Such data would also help guide testing of therapeutics to enhance, inhibit, or otherwise tailor the immune response in COVID-19 patients.

To interrogate the immune response patterns of COVID-19 hospitalized patients, we studied a cohort of ~125 COVID-19 patients. We used high dimensional flow cytometry to perform deep immune profiling of individual B and T cell populations, with temporal analysis of immune changes during infection, and combined this profiling with extensive clinical data to understand the relationships between immune responses to SARS-CoV2 and disease severity. Using this approach, we made several key findings. First, a defining feature of COVID-19 disease in hospitalized patients was heterogeneity of the immune response. Many COVID-19 patients displayed robust CD8 T cell and/or CD4 T cell activation and proliferation and PB responses, though a considerable subgroup of patients (~20%) had minimal detectable response compared to controls. Furthermore, even within those patients who mounted detectable B and T cell responses during COVID-19 disease, the immune characteristics of this response were heterogeneous. By deep immune profiling, we identified three immunotypes in hospitalized COVID-19 patients including: (1) patients with robust activation and proliferation of CD4 T cells, relative lack of cTfh, together with modest activation of TEMRA-like as well as highly activated or exhausted CD8 T cells and a signature of T-bet+ PB; (2) Tbetbright effector-like CD8 T cell responses, less robust CD4 T cell responses, and Ki67+ PB and memory B cells; and (3) an immunotype largely lacking detectable lymphocyte response to infection, suggesting a failure of immune activation. UMAP embedding further resolved the T cell activation immunotype, suggesting a link between CD4 T cell activation, Immunotype 1, and increased severity score. Although differences in age and race existed between the cohorts and could impact some immune variables, the major UMAP relationships were preserved even when correcting for these variables. Thus, these immunotypes may reflect fundamental differences in the ways patients respond to SARS-CoV2 infection.

A second key observation from these studies was the robust PB response. Some patients had PB frequencies rivaling those found in acute Ebola or Dengue infection (34, 42, 43, 55). Furthermore, blood PB frequencies are typically correlated with blood activated cTfh responses (40). However, in COVID-19 patients, this relationship between PB and activated cTfh was weak. The lack of relationship between these two cell types in this disease could be due to T cell-independent B cell responses, lack of activated cTfh in peripheral blood at this time point, or lower CXCR5 expression observed across lymphocyte populations, making it more difficult to identify cTfh. Indeed, activated (CD38+HLA-DR+) CD4 T cells could play a role in providing B cell help, perhaps as part of an extrafollicular response, but such a connection was also not robust in the current data. Most ICU patients made SARS-CoV2-specific antibodies, suggesting that at least part of the PB response was antigen-specific. Indeed, the cTfh response did correlate with antibodies suggesting that at least some of the humoral response is targeted against the virus. Future studies will be needed to address the antigen specificity, ontogeny, and role in pathogenesis for these robust PB responses.

A striking feature of some patients with strong T and B cell activation and proliferation was the durability of this response. This T and B activation was interesting considering clinical lymphopenia in many patients. This lymphopenia, however, was preferential for CD8 T cells. It may be notable that such focal lymphopenia preferentially affecting CD8 T cells is also a feature of acute Ebola infection of macaques and is associated with CD95 expression and severe disease (55). Indeed CD95 was associated with activated T cell clusters in COVID-19 disease. Nevertheless, the frequency of the KI67+ or CD38+HLA-DR+ CD8 and CD4 T cell responses in COVID-19 patients was similar in magnitude to other acute viral infections or live attenuated vaccines in humans (4749). However, during many acute viral infections, peak CD8 or CD4 T cell responses and the window of detectable PB in peripheral blood are relatively short (43, 56, 57). The stability of CD8 and CD4 T cell activation and PB responses during COVID-19 disease suggests a prolonged period of peak immune responses at the time of hospitalization or perhaps a failure to appropriately down-regulate responses in some patients. These ideas would fit with an overaggressive immune response and/or “cytokine storm” (2) in this subset of patients. Indeed, in some patients, we found elevated serum cytokines and that stimulation of T cells in vitro provoked cytokines and chemokines capable of activating and recruiting myeloid cells. A key question will be how to identify these patients for selected immune regulatory treatment while avoiding treating patients with already weak T and B cell responses.

An additional major finding was the ability to connect immune features not only to disease severity at the time of sampling but also to the trajectory of disease severity change over time. Using correlative analyses, we observed relationships between features of the different immunotypes, patient comorbidities, and clinical features of COVID-19 disease. By integrating ~200 immune features with extensive clinical data, disease severity scores, and temporal changes, we built an integrated computational model that connected patient immune response phenotype to disease severity. Moreover, this UMAP embedding approach allowed us to connect these integrated immune signatures back to specific clinically measurable features of disease. The integrated immune signatures captured by Components 1 and 2 in this UMAP model provided support for the notion of Immunotypes 1 and 2. These analyses suggested that Immunotype 1, comprised of robust CD4 T cell activation, paucity of cTfh with proliferating effector/exhausted CD8 T cells and T-bet+ PB involvement, was connected to more severe disease whereas Immunotype 2, characterized by more traditional effector CD8 T cells subsets, less CD4 T cell activation and proliferating PB and memory B cells, was better captured by UMAP Component 2. Immunotype 3, in which minimal lymphocyte activation response was observed, may represent ~20% of COVID-19 patients and is a potentially important scenario to consider as patients who may have failed to mount a robust antiviral T and B cell response. This UMAP integrated modeling approach could be improved in the future with additional data on other immune cell types and/or comprehensive data for circulating inflammatory mediators for all patients. Nevertheless, these findings provoke the idea of the tailoring clinical treatments or future immune-based clinical trials to patients whose immunotype suggests greater potential benefit.

Respiratory viral infections can cause pathology as a result of an immune response that is too weak and results in virus-induced pathology, or an immune response that is too strong and leads to immunopathology (58). Our data suggest that the immune response of hospitalized COVID-19 patients may fall across this spectrum of immune response patterns, presenting as distinct immunotypes linked to clinical features, disease severity, and temporal changes in response and pathogenesis. This study provides a compendium of immune response data and also an integrated framework as a “map” for connecting immune features to disease. By localizing patients on an immune topology map built on this dataset, we can begin to infer which types of therapeutic interventions may be most useful in specific patients.

Materials and methods

Patients, subjects, and clinical data collection

Patients admitted to the Hospital of the University of Pennsylvania with a positive SARS-CoV2 PCR test were screened and approached for informed consent within 3 days of hospitalization. Healthy donors (HD) were adults with no prior diagnosis of or recent symptoms consistent with COVID-19. Normal reference ranges for HDs were the UPenn clinical laboratory values shaded in green. Recovered COVID-19 subjects (RD) were adults with a prior positive COVID-19 PCR test by self-report who met the definition of recovery by the Centers for Disease Control. HD and RD were recruited initially by word of mouth and subsequently through a centralized University of Pennsylvania resource website for COVID-19-related studies. Peripheral blood was collected from all subjects. For inpatients, clinical data were abstracted from the electronic medical record into standardized case report forms. ARDS was categorized in accordance with the Berlin definition reflecting each subject’s worst oxygenation level and with physician adjudication of chest radiographs. APACHE III scoring was based on data collected in the first 24 hours of ICU admission or the first 24 hours of hospital admission for subjects admitted to general inpatient units. Clinical laboratory data were abstracted from the date closest to research blood collection. HD and RD completed a survey about symptoms. After enrollment, the clinical team determined three patients to be COVID-negative and/or PCR false positive. Two of these patients were classified as Immunotype 3. In keeping with inclusion criteria, these subjects were maintained in the analysis. The statistical significance reported in Fig. 6K did not change when analysis was repeated without the three patients. All participants or their surrogates provided informed consent in accordance with protocols approved by the regional ethical research boards and the Declaration of Helsinki.

Sample processing

Peripheral blood was collected into sodium heparin tubes (BD, Cat#367874). Tubes were spun (15 min, 3000 rpm, RT), plasma removed, and banked. Remaining whole blood was diluted 1:1 with 1% RPMI (table S7) and layered into a SEPMATE tube (STEMCELL Technologies, Cat#85450) pre-loaded with lymphoprep (Alere Technologies, Cat#1114547). SEPMATE tubes were spun (10 min, 1200xg, RT) and the PBMC layer collected, washed with 1% RPMI (10 min, 1600 rpm, RT) and treated with ACK lysis buffer (5 min, ThermoFisher, Cat#A1049201). Samples were filtered with a 70 μm filter, counted, and aliquoted for staining.

Antibody panels and staining

Approximately 1-5×106 freshly isolated PBMCs were used per patient per stain. See table S7 for buffer information and table S8 for antibody panel information. PBMCs were stained with live/dead mix (100 μl, 10 min, RT), washed with FACS buffer, and spun down (1500 rpm, 5 min, RT). PBMCs were incubated with 100 μl of Fc block (RT, 10 min) before a second wash (FACS buffer, 1500 rpm, 5 min, RT). Pellet was resuspended in 25 μl of chemokine receptor staining mix, and incubated at 37°C for 20 min. Following incubation, 25 μl of surface receptor staining mix was directly added and the PBMCs were incubated at RT for a further 45 min. PBMCs were washed (FACS buffer, 1500 rpm, 5 min, RT) and stained with 50 μl of secondary antibody mix for 20 min at RT, then washed again (FACS buffer, 1500 rpm, 5 min, RT). Samples were fixed and permeabilized by incubating in 100 μl of Fix/Perm buffer (RT, 30 min) and washed in Perm Buffer (1800 rpm, 5 min, RT). PBMCs were stained with 50μl of intracellular mix overnight at 4°C. The following morning, samples were washed (Perm Buffer, 1800 rpm, 5 min, RT) and further fixed in 50 μl of 4% PFA. Prior to acquisition, samples were diluted to 1% PFA and 10,000 counting beads added per sample (BD, Cat#335925). Live/dead mix was prepared in PBS. For the surface receptor and chemokine staining mix, antibodies were diluted in FACS buffer with 50% BD Brilliant Buffer (BD, Cat#566349). Intracellular mix was diluted in Perm Buffer.

Flow cytometry

Samples were acquired on a 5 laser BD FACS Symphony A5. Standardized SPHERO rainbow beads (Spherotech, Cat#RFP-30-5A) were used to track and adjust PMTs over time. UltraComp eBeads (ThermoFisher, Cat#01-2222-42) were used for compensation. Up to 2 × 106 live PBMC were acquired per each sample.


PBMCs from patients were thawed and rested overnight at 37°C in complete RPMI (cRPMI, table S7). 96-well flat bottom plates were coated with 1 μg/mL of anti-CD3 (UCHT1, #BE0231, BioXell) in PBS at 4°C overnight. The next day, cells were collected and plated at 1 × 105/well in 100 μl in duplicate. 2 μg/mL of anti-human CD28/CD49d was added to the wells containing plate-bound anti-CD3 (Clone L293, 347690, BD). PBMCs were stimulated or left unstimulated for 16 hours, spun down (1200 rpm, 10 min) and 85 μL/well of supernatant was collected. Plasma from matched subjects was thawed on ice, spun (3000 rpm, 1 min) to remove debris, and 85 μl collected in duplicate. Luminex assay was run according to manufacturer’s instructions, using a custom human cytokine 31-plex panel (EMD Millipore Corporation, SPRCUS707). The panel included: EGF, FGF-2, Eotaxin, sIL-2Ra, G-CSF, GM-CSF, IFN-α2, IFN-γ, IL-10, IL-12P40, IL-12P70, IL-13, IL-15, IL-17A, IL-1Ra, HGF, IL-1β, CXCL9/MIG, IL-2, IL-4, IL-5, IL-6, IL-7, CXCL8/IL-8, CXCL10/IP-10, CCL2/MCP-1, CCL3/MIP-1α, CCL4/MIP-1β, RANTES, TNF-α, and VEGF. Assay plates were measured using a Luminex FlexMAP 3D instrument (Thermofisher, Cat#APX1342).

Data acquisition and analysis were done using xPONENT software Data quality was examined based on the following criteria: The standard curve for each analyte has a 5P R2 value > 0.95 with or without minor fitting using xPONENT software. To pass assay technical quality control, the results for two controls in the kit needed to be within the 95% of CI (confidence interval) provided by the vendor for >25 of the tested analytes. No further tests were done on samples with results out of range low (<OOR). Samples with results that were out of range high (>OOR) or greater than the standard curve maximum value (SC max) were not tested at higher dilutions without further request.

Intracellular stain after CD3/CD28 stimulation

96-well flat bottom plates were coated with 1μg/mL of anti-CD3 (UCHT1, #BE0231, BioXell) in PBS at 4°C overnight. The next day, cells were collected and plated at 1 × 105/well in 100 μl with 1/1000 of GolgiPlug (BD #555029). 2μg/mL of anti-human CD28/CD49d was added to the wells containing plate-bound anti-CD3 (Clone L293, 347690, BD). GolgiPlug-treated PBMCs were stimulated or left unstimulated for 16 hours, spun down (1200 rpm, 10 min) and were stained for intracellular IFNɣ.

Longitudinal analysis D0-D7 and patient grouping

To identify subjects where the frequency of specific immune cell populations increased, decreased or stayed stable over time (day 0 to day 7), where data was available we used a previously published dataset to establish a standard range of fold change over time in a healthy cohort (44). A fold change greater than the mean fold change ± 2 standard deviations was considered an increase, less than this range was considered a decrease, and within this range was considered stable. Where this data was not available, a fold change from day 0 to day 7 of between 0.5 and 1.5 was considered stable. A fold change <0.5 was considered decreased, and >1.5 was considered increased. In order to eliminate redundant tests and maximize statistical power, the pairwise statistical tests shown in Fig. 5G were performed using fold-change as a continuous metric, irrespective of the discrete up/stable/down classification described above. Similarly, in fig. S9G, pairwise association tests between changes in UMAP Component coordinates and clinical data were performed using each difference value as a continuous metric, irrespective of the up/stable/down classification.

Correlation plots and heatmap visualization

Pairwise correlations between variables were calculated and visualized as a correlogram using R function corrplot. Spearman's Rank Correlation coefficient (ρ) was indicated by square size and heat scale; significance indicated by: *p < 0.05, **p < 0.01, and ***p < 0.001; black box indicates FDR < 0.05. Heatmaps were created to visualize variable values using R function pheatmap or complexheatmap.


Due to the heterogeneity of clinical and flow cytometric data, non-parametric tests of association were preferentially used throughout this study unless otherwise specified. Correlation coefficients between ordered features (including discrete ordinal, continuous scale, or a mixture of the two) were quantified by the Spearman rank correlation coefficient and significance was assessed by the corresponding non-parametric methods (null hypothesis: ρ = 0). Tests of association between mixed continuous versus non-ordered categorical variables were performed by unpaired Wilcoxon test (for n = 2 categories) or by Kruskal-Wallis test (for n > 2 categories). Association between categorical variables was assessed by Fisher-exact test. For association testing illustrated in heatmaps, categorical variables with more than 2 categories (e.g., ABO blood type) were transformed into binary dummy variables for each category versus the rest. All tests were performed two-sided, using a nominal significance threshold of P < 0.05 unless otherwise specified. When appropriate to adjust for multiple hypothesis testing, false discovery rate (FDR) correction was performed using the Benjamini-Hochberg procedure at the FDR < 0.05 significance threshold. Joint statistical modeling to adjust for confounding of demographic factors (age, sex, and race) when testing for association of UMAP Components 1 and 2 with the NIH Ordinal Severity Scale was performed using ordinal logistic regression provided by the polr function of the R package MASS. Statistical analysis of flow cytometry data was performed using R package rstatix. Other details, if any, for each experiment are provided within the relevant figure legends.

High dimensional data analysis of flow cytometry data

viSNE and FlowSOM analysis were performed on Cytobank ( B cells, non-naïve CD4 T cells, and non-naïve CD8 T cells were analyzed separately. viSNE analysis was performed using equal sampling of 1000 cells from each FCS file, with 5000 iterations, a perplexity of 30, and a theta of 0.5. For B cells, the following markers were used to generate the viSNE maps: CD45RA, IgD, CXCR5, CD138, Eomes, TCF-1, CD38, CD95, CCR7, CD21, KI67, CD27, CX3CR1, CD39, T-bet, HLA-DR, CD16, CD19 and CD20. For non-naïve CD4 and CD8 T cells, the following markers were used: CD45RA, PD1, CXCR5, TCF-1, CD38, CD95, Eomes, CCR7, KI67, CD16, CD27, CX3CR1, CD39, CD20, T-bet, and HLA-DR. Resulting viSNE maps were fed into the FlowSOM clustering algorithm (59). For each cell subset, a new self-organizing map (SOM) was generated using hierarchical consensus clustering on the tSNE axes. For each SOM, 225 clusters and 10 or 15 metaclusters were identified for B cells and T cells respectively.

To group individuals based on B cell landscape, pairwise Earth Mover’s Distance (EMD) value was calculated on the B cell tSNE axes for all COVID-19 day 0 patients, healthy donors, and recovered donors using the emdist package in R as previously described (60). Resulting scores were hierarchically clustered using the hclust package in R.

Batch correction

During the sample acquisition period, the flow panel was changed to remove one antibody. Batch correction was performed for samples acquired before and after this change to remove potential bias from downstream analysis. Because the primary flow features were expressed as a fraction of the parent population (falling in the 0-to-1 interval) a variance stabilizing transform (logit) was first applied to each data value, prior to re-centering the second panel to have the same mean as the first. After mean-centering, data were transformed back to the original fraction of parent scale by inverse transform. This procedure was applied separately to all 553 flow features annotated in the main text and supplemental data. Notably, this procedure avoids any batch-corrected feature values artificially falling outside of the original 0 to 1 range. Following batch correction, neither UMAP Component 1 nor Component 2 had a statistically significant difference between panels by unpaired Wilcoxon test.

Visualizing variation of flow cytometric features across the UMAP embedding space

A feature-weighted kernel density was computed across all COVID-19 patients, and was displayed as a contour plot (Fig. 6G and fig. S8, A to D). Whereas traditional kernel density methods apply the same base kernel function to every point to visualize point density, here the base kernel function centered at each individual COVID-19 patient sample was instead weighted (multiplied) by the Z-transform (mean-centered and standard deviation-scaled) of the log-transformed input feature prior to computing the overall kernel density. This weighting procedure facilitated visualization of the overall feature gradients (going from relatively low-to-high expression) across UMAP coordinates independent of the different range of each input feature. A radially symmetric two-dimensional Gaussian was used as the base kernel function with a variance parameter equal to one-half, which was tuned to be sufficiently broad in order to smooth out local discontinuities and best visualize feature gradients.

Definition of immunotype 3

To define COVID-19 patients with low or absent immune responses, classified as immunotype 3, the intersection of the bottom 50% of 5 different flow parameters was used: PB as % of B cells, KI67+ as % of non-naïve CD4 T cells, KI67+ as % of non-naïve CD8 T cells, HLA-DR+CD38+ as % of non-naïve CD4 T cells, HLA-DR+CD38+ as % of non-naïve CD8 T cells—graphically displayed in fig. S10.

The UPenn COVID Processing Unit

Zahidul Alam, Mary M. Addison, Katelyn T. Byrne, Aditi Chandra, Hélène C. Descamps, Yaroslav Kaminskiy, Jacob T. Hamilton, Julia Han Noll, Dalia K. Omran, Eric Perkey, Elizabeth M. Prager, Dana Pueschl, Jennifer B. Shah, Jake S. Shilan, Ashley N. Vanderbeck

All affiliated with the University of Pennsylvania Perelman School of Medicine Philadelphia, PA, USA.

Supplementary Materials

This is an open-access article distributed under the terms of the Creative Commons Attribution license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

References and Notes

Acknowledgments: The authors thank patients and blood donors, their families and surrogates, and medical personnel. We thank L. Bershaw for recruitment of HD and RD. We thank S. Ngiow for essential infrastructure support and C. Ash for donation of computational equipment and design of schematic figures. We thank the Wherry lab for discussions and critically reading the manuscript. Funding: This work was supported by the University of Pennsylvania Institute for Immunology Glick COVID-19 research award (MRB), NIH AI105343, AI082630, and the Allen Institute for Immunology (EJW) and NIH grants HL137006 and H137915 (NJM). ACH was funded by grant CA230157 from the NIH. DM and JG were funded by T32 CA009140. ZC was funded by NIH grant CA234842. DAO was funded by NHLBI StARR: 1R38HL143613. NJM reports funding to her institution from Athersys, Inc., Biomarck, Inc., and the Marcus Foundation for Research. JRG is a Cancer Research Institute-Mark Foundation Fellow. JRG, JEW, CA, AH, and EJW are supported by the Parker Institute for Cancer Immunotherapy which supports the Cancer Immunology program at the University of Pennsylvania. The authors declare no conflicts of interest. Author contributions: DM, NJM, MJB, and EJW conceived the project; DM, JRG, AEB, and EJW designed experiments. NM conceived the clinical cohort, obtained clinical samples and metadata from COVID-19 patients and provided clinical input; OK and JD provided clinical samples from HD and RD. AEB and KD coordinated clinical sample procurement and processing. DM, ARG, LKC, MBP, NH, JK, AP, FC, and SFL processed patient samples. DM, ZC, and YJH stained and JEW acquired flow cytometry samples; JRG, AEB, and KN performed downstream flow cytometry analysis. HR and SCC performed qRT-PCR of PBMC. DM, SFL, and FC performed Luminex experiments. ECG, EMA, MEW, SG, CPA, MJB, and SEH analyzed COVID-19 patient plasma and provided antibody data. ACH and LAV provided additional clinical data; CA compiled and JRG, DO, and CA analyzed clinical metadata with input from ACH and LAV. JRG, DAO, SM, and EJW designed data analysis and JRG, ARG, CA, DAO, and SM performed computational and statistical analyses. DM, JRG, ARG, CA, and DAO compiled figures. LKC, MBP, SA, ACH, LAV, NJM and MB provided intellectual input. DM, AEB, ARG, JEW, and EJW wrote the manuscript; all authors reviewed the manuscript. Competing interests: E.J.W. has consulting agreements with and/or is on the scientific advisory board for Merck, Roche, Pieris, Elstar, and Surface Oncology. E.J.W. is a founder of Surface Oncology and Arsenal Biosciences. E.J.W. has a patent licensing agreement on the PD-1 pathway with Roche/Genentech. E.J.W. is an inventor on a patent (US Patent number 10,370,446) submitted by Emory University that covers the use of PD-1 blockade to treat infections and cancer. Data and materials availability: Flow Cytometry data collected in this study was deposited to the Human Pancreas Analysis Program (HPAP-RRID:SCR_016202) Database and Cytobank (61) ( B cell data ( Non Naive CD4 T cells ( Non Naive CD8 T cells ( This work is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. To view a copy of this license, visit This license does not apply to figures/photos/artwork or other content included in the article that is credited to a third party; obtain authorization from the rights holder before using such material.
View Abstract

Stay Connected to Science


Navigate This Article