Research Article

The Long-Term Stability of the Human Gut Microbiota

See allHide authors and affiliations

Science  05 Jul 2013:
Vol. 341, Issue 6141, 1237439
DOI: 10.1126/science.1237439

Structured Abstract


Understanding the dynamics and stability of the human gut microbiota is important if its characterization is to play a role in the diagnosis, treatment, and prevention of disease. To characterize stability in related and unrelated individuals and its responsiveness to physiologic change (weight loss), we developed a method for bacterial 16S rRNA amplicon sequencing at high depth with high precision. We also sequenced the genomes of anaerobic bacteria represented in culture collections prepared from fecal samples collected from individuals over time.


Low-error amplicon sequencing (LEA-Seq) is a quantitative method based on redundant sequencing of bacterial 16S rRNA genes. A dilute, barcoded, oligonucleotide primer solution is used to create ~150,000 linear PCR extensions of the template DNA. The labeled, bottlenecked linear PCR pool is amplified with exponential PCR, using primers that specifically amplify only the linear PCR molecules. The exponential PCR pool is sequenced at sufficient depth to obtain ~20× coverage. Multiple reads enable the generation of an error-corrected consensus sequence for each barcoded template molecule. LEA-Seq can be used for a variety of other applications.

Embedded Image

Relationship among time, physiology, and microbiota stability. (A) Stability of fecal microbiota follows a power-law function (n = 37 females sampled over time; >1 week to <5 years). Dashed lines show 95% confidence bounds over 10- and 50-year extrapolations (inset). (B) Microbiota stability is inversely related to the stability of each individual’s body mass index.

Results and Discussion

LEA-Seq of fecal samples from 37 healthy U.S. adults sampled 2 to 13 times up to 296 weeks apart revealed that they harbored 195 ± 48 bacterial strains, representing 101 ± 27 species. On average, their individual microbiota was remarkably stable, with 60% of strains remaining over the course of 5 years. Stability followed a power law, which, when extrapolated, suggests that most strains in an individual’s intestine are residents for decades (figure, panel A). Members of Bacteroidetes and Actinobacteria are significantly more stable components than the population average. LEA-Seq of four individuals sampled during an 8- to 32-week period during a calorie-restricted dietary study showed that weight stability is a significantly better predictor of microbiota stability than the time interval between samples (figure, panel B). After generating clonally arrayed collections of anaerobic bacteria from frozen fecal samples collected from six weight-stable individuals sampled 7 to 69 weeks apart, we produced draft genome sequences for 534 isolates representing 188 strains and 75 species. A targeted approach focused on Methanobrevibacter smithii isolates from two sets of twin pairs and their mothers and Bacteroides thetaiotaomicron strains from nine donors including sister-sister and mother-daughter pairs. Strains, defined as isolates sharing >96% of their genome content, were maintained over time within an individual and between family members but not between unrelated individuals. Thus, early gut colonizers, such as those acquired from our parents and siblings, have the potential to exert their physiologic, metabolic, and immunologic effects for most, and perhaps all, of our lives.

Inheritance Guts

We know little about the stability of the constituent microbiota in the human gut or the extent to which the gut microbiota are a potential target for long-term health interventions. Faith et al. (p. 10.1126/science.1237439) analyzed the fecal microbiota of 37 individuals and found that, on average, 60% of bacterial strains remained stable for up to 5 years and many were estimated to remain stable for decades.


A low-error 16S ribosomal RNA amplicon sequencing method, in combination with whole-genome sequencing of >500 cultured isolates, was used to characterize bacterial strain composition in the fecal microbiota of 37 U.S. adults sampled for up to 5 years. Microbiota stability followed a power-law function, which when extrapolated suggests that most strains in an individual are residents for decades. Shared strains were recovered from family members but not from unrelated individuals. Sampling of individuals who consumed a monotonous liquid diet for up to 32 weeks indicated that changes in strain composition were better predicted by changes in weight than by differences in sampling interval. This combination of stability and responsiveness to physiologic change confirms the potential of the gut microbiota as a diagnostic tool and therapeutic target.

Our growing understanding of the human gut microbiota as an indicator of and contributor to human health suggests that it will play important roles in the diagnosis, treatment, and ultimately prevention of human disease. These applications require an understanding of the dynamics and stability of the microbiota over the life span of an individual. Amplicon sequencing of the bacterial 16S rRNA gene from fecal microbial communities (microbiota) has revealed that each individual harbors a unique collection of species (13). Estimates of the number of species present in an individual’s microbiota have varied greatly. Culture-based techniques (4) indicate ~100 such species, whereas culture-independent deep shotgun sequencing of fecal community DNA (5) indicates ~160 such species. Several times these numbers of species are suggested by the results of 16S rRNA amplicon sequencing, even after in silico attempts to remove chimeric molecules formed in the course of a polymerase chain reaction (PCR) and errors introduced during sequencing (2). These artifacts complicate tracking of individual bacterial taxa across time by inflating the set of strains in each sample with false positives. Shotgun sequencing of the community’s microbiome is another approach for defining diversity (6), but it is difficult to associate gene sequences with their genome of origin.

With these limitations in mind, we have developed a method for amplicon sequencing to assay the bacterial composition of the gut microbiota of individuals at high depth with high precision over time. When combined with high-throughput methods for culturing and sequencing the genomes of anaerobic bacteria, these results reveal that the majority of the bacterial strains in an individual’s microbiota persist for years, and suggest that our gut colonizers have the potential to shape many aspects of our biological features for most and in some cases all of our lives.

A Method for Low-Error Amplicon Sequencing (LEA-Seq) of Bacterial 16S rRNA genes

A 16S rRNA sequencing method for assaying the stability of an individual’s microbiota over time would ideally retain high precision at high sequencing depth [precision = (true positives)/(true positives + false positives)]. Low-precision data complicate comparison of sequences between samples, as it becomes difficult to differentiate species (typically defined as isolates that share ≥97% sequence identity in their 16S rRNA genes) and strains (isolates of a given species with more minor variations in their 16S rRNA gene sequences) from sequencing errors. Standard amplicon sequencing is limited in its precision by the overall error rate of the sequencing method. If sequencing depth is low, it becomes impossible to determine whether a strain has dropped out of a given individual’s microbiota or has fallen below the limits of detection at the sampling depth used.

In many applications it would be advantageous to exchange sequencing depth for improved sequence quality. Despite several optimizations we developed to increase the precision of standard amplicon sequencing at shallow depths, we found that sequencing a sample beyond 10,000 reads did not substantially increase the lower detection limit possible at high precision (7). Exchanging sequence quantity for sequence quality is inherent in shotgun genome sequencing, where redundant sequencing of genomes at 10× to 50× coverage enables a far lower error rate than is attainable from single reads alone. In general, to redundantly sequence DNA fragments, it is necessary to create a finite DNA pool that is smaller than the amount of sequencing available (i.e., create a bottleneck) and to have a method of labeling the molecules in the pool (810). To adapt these techniques to redundantly sequence PCR amplicons, the initial template DNA could be diluted to create a bottleneck. However, this dilution would likely need to be empirically determined for every input sample (e.g., using quantitative PCR), and one would still need to label each template molecule. As an alternative, we developed a method called low-error amplicon sequencing (LEA-Seq).

As outlined in Fig. 1A, LEA-Seq is based on redundant sequencing of a set of linear PCR template extensions of 16S rRNA genes to trade sequence quantity for quality. In this method, we create the bottleneck with a linear PCR extension of the template DNA with a dilute, barcoded, oligonucleotide primer solution. Each oligonucleotide is labeled with a random barcode positioned 5′ to the universal 16S rRNA primer sequence (Fig. 1A and fig. S1). We then amplify the labeled, bottlenecked linear PCR pool with exponential PCR, using primers that specifically amplify only the linear PCR molecules. During the exponential PCR, an index primer is added to the amplicons with a third primer to allow pooling of multiple samples in the same sequencing run (fig. S1). This exponential PCR pool is then sequenced at sufficient depth to redundantly sequence (~20× coverage) the bottlenecked linear amplicons. The resulting sequences are separated by sample, using the index sequence, and the amplicon sequences within each sample are separated by the unique barcode; the multiple reads for each barcode allow the generation of an error-corrected consensus sequence for the initial template molecule. In LEA-Seq, the linear PCR primers are diluted to a concentration that generates ~150,000 amplicon reads at 20× coverage per amplicon on an Illumina HiSeq DNA sequencer.

Fig. 1 Multiplex bacterial 16S rRNA gene sequencing using LEA-Seq; comparison with previous methods using mock communities composed of sequenced gut bacterial species.

(A) Schematic of how the LEA-Seq method is used to redundantly sequence PCR amplicons from a set of linear PCR template extensions of bacterial 16S rDNA. This approach results in amplicon sequences with a higher precision than standard amplicon sequencing at lower abundance thresholds. (B) Performance of 16S rRNA amplicon sequencing methods assayed as the precision obtained for different sequence abundance thresholds. Standard methods for amplicon sequencing using the 454 pyrosequencer and the Illumina MiSeq instrument exhibit increased precision as less abundant reads are filtered out. By redundantly sequencing each amplicon with LEA-Seq, the precision of amplicon sequencing is increased at lower abundance thresholds for both the V1V2 region of the bacterial 16S rRNA gene (compare red and blue lines) and the V4 region (compare magenta and blue lines), thereby enabling detection of lower-abundance bacterial taxa at high precision.

To empirically test LEA-Seq against existing 16S rRNA amplicon sequencing methods, we first generated nine in vitro “mock” communities composed of different proportions of strains from a 48-member collection of phylogenetically diverse, cultured human gut bacteria whose genomes had been characterized (7) (table S1). To calculate precision, we compared amplicons generated using two sequencing platforms (Illumina MiSeq and 454 FLX instruments), targeting different variable (V) regions of the 16S rRNA gene with different PCR primers. We defined a true positive sequence as 100% identical across 100% of its length to the 16S rRNA gene sequence(s) in the reference genome. We calculated precision at different abundance thresholds by including only those sequences representing at least a minimal portion of the total sequencing reads (0.5%, 0.1%, 0.05%, 0.01%, or 0.005%).

Relative to the existing standard approaches, LEA-Seq produced amplicon sequences with higher precision at lower abundance thresholds (Fig. 1B). For 16S rRNA sequences representing ≥ 0.01% of the reads, LEA-Seq enabled a precision of 0.83 ± 0.02 (V4) and 0.63 ± 0.03 (V1V2) versus 0.08 ± 0.064 and 0.09 ± 0.005, respectively, for the same regions with standard amplicon sequencing (table S2). These performance improvements are dependent on generating the consensus sequence from the redundant amplicon reads (table S2, method “LEA-Seq without consensus”). LEA-Seq also produced slower saturation in performance (precision of >0.7 for reads representing 0.001% of the total; fig. S2 and table S2). Similar results were obtained using the several different mock communities [see (7) for additional details of the analysis, including V1V2 versus V4 comparisons]. On the basis of this assessment of its attributes, we used LEA-Seq to quantify the stability of the gut microbiota within individuals as a function of time and change in body mass index (BMI) while they consumed controlled monotonous or free diets.

Applying LEA-Seq to Define the Stability of the Fecal Microbiota of 37 Healthy Adults

Stability of a Microbiota Best Fits a Power-Law Function

We used LEA-Seq to characterize the microbiota in 175 fecal samples obtained from 37 healthy adults residing throughout the United States; 33 of these donors were sampled 2 to 13 times up to 296 weeks apart (1, 11) (table S3). The remaining four individuals were sampled on average every 16 days for up to 32 weeks while consuming a monotonous liquid diet as part of a controlled in-patient weight loss study (see methods) (1214). None of the individuals took antibiotics for at least 2 months before sampling. All fecal samples were cooled to –20°C immediately after they were produced and then to –80°C within 24 hours. DNA was isolated from all samples by bead beating in phenol and chloroform.

Using an Illumina HiSeq 2000 instrument to sequence amplicons from the V1V2 region of bacterial 16S rRNA genes, we generated 108,677 ± 60,212 (mean ± SD) LEA-Seq reads per fecal DNA sample. Reads were then filtered using a minimum sequence abundance threshold cutoff of eight reads (i.e., to detect strains present in the fecal microbiota at an average relative abundance of 0.007%). On the basis of our mock community data, the precision at this threshold for the V1V2 region is 0.63. We defined the number of strains in a sample as the number of unique amplicon sequences, and the number of species-level operational taxonomic units (OTUs) in the sample as the number of clusters with 97% shared sequence identity. To correct for false positives, we multiplied the number of strains by the precision (i.e., if we detect 100 unique sequences, we expect 63 of them to be true). For individuals sampled over multiple time points, we calculated the number of species and strains for each sample individually and averaged them. The results indicated that individuals in this cohort harbored 195 ± 48 bacterial strains in their fecal gut microbiota, representing 101 ± 27 species.

To study each individual’s microbiota over time, we took all possible pairs of samples from the time series of each individual (table S3) and calculated the time in weeks between the sample dates as well as the fraction of shared strains between them, as measured by the binary Jaccard index (an unweighted metric of community overlap):Jaccard index(sample A, sample B)=sample A  sample Bsample A  sample B (1)Control experiments using mock communities (table S1) established that LEA-Seq of V1V2 16S rRNA amplicons produced highly accurate estimates of the Jaccard index [correlation between known and measured Jaccard index = 0.996; see (7)]. To characterize the stability of an individual’s microbiota, we binned fecal samples into intervals (<3 weeks, 3 to 6 weeks, 6 to 9 weeks, 9 to 12 weeks, 12 to 32 weeks, 32 to 52 weeks, 52 to 104 weeks, 104 to 156 weeks, 156 to 208 weeks, 208 to 260 weeks, and >260 weeks) and determined the average Jaccard index values for each bin. The results disclosed that the bacterial composition of each individual’s fecal microbiota changed over time, with more strains shared between shorter sampling intervals relative to long intervals (Fig. 2A). Nonetheless, the set of microbial strains was remarkably stable overall, with more than 70% of the same strains remaining after 1 year and few additional changes occurring over the following 4 years. The stability of a microbiota best fits a power-law function (R2 = 0.96; Fig. 2A, blue line, and table S4); large differences in community composition occur on shorter time scales, whereas a stable core set of strains persists at longer time scales.

Fig. 2 Measuring the stability of an individual’s fecal microbiota over time with LEA-Seq.

(A) The Jaccard index (fraction of shared strains) was calculated between all possible pairwise combinations of fecal samples collected from each individual, where bacterial strains were considered shared if the nucleotide sequence was 100% identical across 100% of the length of the V1V2 region of their 16S rRNA genes. Jaccard indices were binned into intervals of <3 weeks, 3 to 6 weeks, 6 to 9 weeks, 9 to 12 weeks, 12 to 32 weeks, 32 to 52 weeks, 52 to 104 weeks, 104 to 156 weeks, 156 to 208 weeks, 208 to 260 weeks, and >260 weeks apart (mean ± SE for each bin is shown). The decay in the Jaccard index as a function of time between two samples best fits a power law (blue line). (B) Four individuals losing 10% of their body weight in the study involving consumption of a monotonous low-calorie liquid diet (magenta) had significantly less stable microbiota than the mean of the 33 remaining individuals (blue). Means ± SE for the Jaccard index are plotted. (C) At the phylum level, Bacteroidetes (blue) and Actinobacteria (red) were more stable components of the microbiota than Proteobacteria and Firmicutes (hypergeometric distribution).

To define the stability of a given strain as a function of its relative abundance in the microbiota, we used all pairwise combinations of fecal samples obtained from each individual to calculate (i) the mean abundance of the strains shared by two or more samples, and (ii) the mean abundance of strains that were not shared between any two samples. Strains that were shared across two time points were roughly 3 times as abundant as those that were not shared [0.030 ± 0.013 fraction of the community versus 0.011 ± 0.011 (mean ± SD); P = 2.2 × 10−9 (t test); fig. S3A]. We also binned the strain abundances for each donor using five fractional abundance thresholds of 0.1, 0.01, 0.001, 0.0001, and <0.0001 (e.g., bin 0.01 contains all strains ≤0.1 and >0.01) and calculated the probability that strains in a given bin were shared between samples. We found that the higher the fractional abundance of a strain, the more likely the strain was shared between samples (r = 0.96, P < 0.0087; fig. S3B). Together, these results suggest that the more stable components of the microbiota are also the most abundant members.

Effects of a Monotonous Low-Calorie Diet and Associated Weight Loss on Diversity

To explore the role of weight loss on the microbiota, we applied LEA-Seq to the fecal microbiota of four individuals sampled over the course of an 8- to 32-week period in a three-phase study that used different caloric intakes of a defined monotonous liquid diet to first stabilize initial weight, then decrease weight by 10%, and finally maintain weight at the 10% reduced level (Fig. 2B and table S3). Daily caloric intake was 2988 ± 290 kcal, 800 kcal, and 2313 ± 333 kcal for the three phases of the study, respectively (14, 15). While on this diet, these four individuals experienced significantly reduced stability of their microbiota, as measured by the Jaccard index (Fig. 2B). For each individual, we found no significant correlation between time and diversity/richness (i.e., number of strains in a sample; minimum P = 0.17). Additionally, we found no significant correlation between the change in composition of the microbiota (Jaccard index between two samples) and the change in diversity/richness (absolute difference in the number of species or strains between two samples) (P = 0.09 and 0.44 for strains and species, respectively). Considering family-level taxonomic bins, there were several groups whose abundance was strongly positively or negatively correlated with time during the weight loss period, including Clostridiaceae [average correlation (r) across donors during weight loss = 0.60], Coriobacteriaceae (r = 0.53), Bifidobacteriaceae (r = 0.55), Enterobacteriaceae (r = 0.58), Lachnospiraceae (r = –0.65), Oscillospiraceae (r = –0.53), and Oxalobacteraceae (r = –0.74).

Modeling the Relationship Among Time, Body Composition, and Microbiota Stability

Given the correlation between weight loss and changes in the microbiota of individuals consuming a monotonous 800 kcal/day diet, we took a broader view across all 37 individuals in our study to determine whether this correlation was due to the monotonous diet that the four individuals had consumed, or if there is a generalizable and quantifiable relationship between weight stability and microbiota stability. To explore this question, we not only calculated the time (Δtime) and Jaccard index between all pairs of fecal samples collected from an individual (Fig. 2), but also the absolute value of the change in natural logarithm of the BMI value (abbreviated ΔlnBMI) between all pairs. We found a significant negative correlation between ΔlnBMI and Jaccard index (Fig. 3A; r = –0.68, P = 2.98 × 10−73) that was even greater than that between Δtime and Jaccard index (Fig. 3B; r = –0.42, P = 1.45 × 10−43). These relationships held when we removed the data generated from the four individuals on the monotonous diet (ΔlnBMI: r = –0.69, P = 3.27 × 10−54; Δtime: r = –0.65, P = 9.05 × 10−46).

Fig. 3 Relationship among weight stability, time, and fecal microbiota stability.

(A) The microbiota sampled from a given individual during periods of weight loss or gain has decreased stability (lower Jaccard index). (B) The Jaccard index decreased as the time between samples increased (see also Fig. 2). (C) Across samples from 37 individuals, a linear model of microbiota stability as a function of changes in lnBMI and changes in time explained 46% of the variation in the stability of the microbiota (Jaccard index). Note that changes in lnBMI explained more of the variation in microbiota stability than did the passage of time. Color changes correspond to the Jaccard index values in the color scale at right. Blue dots show the change in Jaccard index, time, and lnBMI between two samples from a given individual.

To quantify the relationship among Δtime, ΔlnBMI, and the Jaccard index between pairs of samples (Fig. 3C), we fit the following model:Microbiota stability = β0+βlnBMIXlnBMI+βtimeXtime (2)

where microbiota stability is the Jaccard index between samples, XlnBMI is the change in lnBMI between any two samples collected from the individual (ΔlnBMI), Xtime is the time between the two samples being compared (Δtime), β0 is the estimated parameter for the intercept, and βlnBMI and βtime are the linear regression estimated parameters for ΔlnBMI and Δtime, respectively. Remarkably, this model explained 46% of the variance in the stability of the microbiota (Jaccard index) within the individuals over time (R2 = 0.46, P = 1.94 × 10−72; when the monotonous dieters were excluded, R2 = 0.51, P = 1.40 × 10−58). Once again, the weight stability of an individual [ΔlnBMI; analysis of variance (ANOVA), P = 1.18 × 10−51] was a better predictor of fecal microbiota stability than the time between samples (Δtime; ANOVA, P = 0.09), with Δtime only being a significant predictor of stability when the monotonous dieters were excluded (ANOVA, P = 2.82 × 10−7). Together, these relationships among time, BMI, and the stability of an individual’s microbiota highlight the role that longitudinal surveys of a microbiota could play in health diagnostics.

Sequenced Collections of Fecal Bacteria Obtained from Individuals over Time

As in previous studies (1, 1619), we found that each individual’s microbiota at a given time point was most similar to their own at other time points (Jaccard index, 0.82 ± 0.022), followed by their family members (Jaccard index, 0.38 ± 0.020), and then unrelated individuals (Jaccard index, 0.30 ± 0.005). The Jaccard index estimates with LEA-Seq suggest that on average any two unrelated individuals share ~30% of the strains in their microbiota. However, it is possible that unrelated individuals on average share no strains in their microbiota, and that this 30% represents the lower resolving limit of 16S rRNA amplicon sequencing of the targeted variable region (V1V2) and currently available maximum read lengths on the Illumina HiSeq 2000 instrument (paired-end, 101 base pairs).

Whole-genome alignments between bacteria isolated and sequenced from different samples provide many orders of magnitude of additional resolving power to determine which strains (now defined at the level of whole-genome sequence identity rather than 16S rRNA identity) remain in an individual’s microbiota over time, or reside in two unrelated individuals. Isolation and sequencing of extensive collections of organisms from the human gut microbiota (20) can provide a practical method to look at the plasticity and evolution of the gene content of microbial strains harbored in individuals’ intestines over time. Therefore, adapting a high-throughput method we had developed for generating clonally arrayed collections of anaerobic bacteria in multiwell format from frozen fecal samples (20), we produced draft genome sequences for 444 bacterial isolates recovered from the frozen fecal microbiota of five donors who had been sampled across periods from 7 to 69 weeks apart (n = 1 to 4 time points per donor; 11 total samples; mean coverage per microbial genome = 118×; see tables S5 and S6) (7). These genomes span a broad phylogenetic range within the four dominant bacterial phyla that constitute the human gut microbiota (Bacteroidetes, Firmicutes, Proteobacteria, and Actinobacteria; table S6).

To look for changes in bacterial genome content across time in each individual, we performed whole-genome alignment with nucmer (21) and calculated the fraction of DNA sequence aligned between each pair of genomes [coverage score = (Xaln + Yaln)/(X + Y), where X and Y are the lengths of genome X and Y, respectively, and Xaln and Yaln are the number of aligned bases of genomes X and Y, respectively] (7, 22). We found that the shared genome content between isolates from unrelated individuals was broadly distributed for taxa from the same genus (coverage score = 0.30 ± 0.20) or species (0.77 ± 0.12), with a maximum of 0.956 (Fig. 4A, blue; fig. S4). We then compared the shared genome content between isolates within each fecal sample (i.e., self-versus-self at a single time point) and found isolates that shared a very high proportion of their content (0.965 to 0.999) (Fig. 4A, red). Remarkably, we found the same high proportion of shared genome content between isolates from a given donor between different time points (i.e., self-versus-self over time; Fig. 4A, green), which suggests that the same strains of bacteria persisted in these individuals over the course of the sampling period.

Fig. 4 Comparison of genome stability in fecal bacterial isolates recovered from individuals over time.

The fraction of aligned nucleotides between any two microbial genomes was calculated using the coverage score (see text). (A) Histogram of the fraction of aligned genome content between all sequenced bacterial isolates from unrelated individuals (blue; only coverage scores ≥ 0.01 are shown) shows that the alignable genome content never exceeded 96% (dashed line). However, highly conserved strains with coverage scores exceeding this threshold were readily detected in the microbiota of individuals at a single time point (red) or between samples from an individual taken up to 15 months apart (green). Counts denote the number of times a sample fell into each coverage score bin. (B and C) Sequencing the genomes of M. smithii strains (B) and B. thetaiotaomicron strains (C) revealed that no two isolates from unrelated individuals had more than 96% shared (alignable) gene content (blue), whereas highly conserved strains above this threshold were found between isolates obtained from a single individual’s fecal microbiota at a single time point (red), as well as from isolates taken from different members of the same family (brown).

Defining replicate bacterial strains as those with a coverage score of >0.96 and species as those with a coverage score of >0.5 (fig. S4), we subsequently clustered the genome isolates by sample and by individual (table S5); this effort yielded a total of 165 strains and 69 species across the five donors (Table 1). Across the four donors with multiple time points, on average 36% of an individual’s bacterial strains were isolated from multiple time points. This fraction of shared bacterial strains across time at the level of the genome is lower than that measured by LEA-Seq; however, this likely reflects the increased sampling depth and culture independence of LEA-Seq [detecting isolates at depths of 1:10,000 to 1:100,000 (0.01 to 0.001%) compared with 0.14 to 0.06% for high-throughput culturing]. For the most deeply sampled individual (F3T1 in table S3), where isolates were sequenced from four samples taken over the course of ~16 months, more than 60% of the strains were isolated from multiple samples.

Table 1 Species composition of the sequenced arrayed culture collections from six donors.

Alternative names for species are in parentheses.

View this table:

Stability Viewed from the Perspective of Phylum-Level Membership

When we assigned phylum-level taxonomy to all LEA-Seq 16S rRNA amplicons from each of the 37 individuals in our study (23), we found that members of the Bacteroidetes and Actinobacteria were significantly more stable components of the microbiota relative to the population average (hypergeometric distribution comparing the total number of shared/not shared strains within a given phylum for all samples versus the total number of shared/not shared strains across all phyla, except the phylum of interest; P = 7.54 × 10−28 and 0.0068, respectively), whereas the Firmicutes and Proteobacteria were significantly less stable (Fig. 2C; P = 1.83 × 10−11 and 0.0015, respectively). The cultured bacterial strains manifested similar trends for the Bacteroidetes and Firmicutes, where 52% and 21%, respectively, of the strains were isolated and sequenced across multiple time points (table S7), thus demonstrating at a whole-genome level the strain stability initially identified when only the 16S rRNA gene was targeted for analysis.

Strains Shared Between Members of Human Families

The power-law response of the Jaccard index as a function of the sample collection interval makes it possible to extrapolate beyond the sampling time frame of the current study and suggests that the majority of strains in the microbiota represent a stable core that persists in an individual’s intestine for his or her entire adult life, and could represent strains acquired during childhood from parents or siblings (fig. S5). Therefore, we used LEA-Seq to measure the fraction of shared strains between family members (sister-sister or mother-daughter). As in previous studies (1), we found the microbiota of related individuals was more similar than unrelated ones, with a significantly larger proportion of shared V1V2 16S rRNA sequences [Jaccard index = 0.38 ± 0.020 (related), 0.30 ± 0.005 (unrelated); P = 0.00053].

To determine whether this increased similarity between family members manifested itself at the level of their gut microbial genome sequences, we used a targeted approach to look at genome content differences in (i) two families using previously sequenced Methanobrevibacter smithii isolates (24) from two sets of twin pairs and their mothers (six total donors, 19 genomes; table S3), and (ii) five families where 26 Bacteroides thetaiotaomicron strains were isolated with a species-specific monoclonal antibody (7, 25) from nine donors including sister-sister and mother-daughter pairs (all isolates were from a single sample from each donor; table S3). M. smithii, a methanogen, is the dominant archaeon in the human gut microbiota and facilitates fermentation of polysaccharides by saccharolytic bacteria such as B. thetaiotaomicron by virtue of its ability to remove hydrogen (24). As with our untargeted large-scale genome sequencing of personal bacterial culture collections described above, we found that unrelated individuals had no pair of isolates of either species that shared >96% of their genome content. However, within an individual we once again found replicate isolates of the same strain (Fig. 4, B and C, blue and red). Strikingly, we also found replicate strains of M. smithii or B. thetaiotaomicron shared across family members (Fig. 4, B and C, brown; table S3).

In contrast with the results obtained using this taxon-targeted whole-genome sequencing approach, our untargeted sequencing of the clonally arrayed personal bacterial culture collections had only involved two related individuals (female dizygotic co-twins 1 and 2 from family 60; F60T1 and F60T2, table S3) and had revealed no strains with >96% of their genomes aligned. Therefore, we isolated and sequenced an additional 89 genomes from two time points of the dizygotic twin sister (F61T2) of subject F61T1 (yielding a total of 188 strains and 75 species across the six donors). As with the previous donors, we were able to isolate numerous strains shared across the two time points (8 of 25 = 32%). In addition, we were able to isolate two strains (B. thetaiotaomicron and Escherichia coli) in both of the sisters, showing that even nontargeted genome isolation and sequencing can retrieve the same strain across family members. We did not explicitly sample members of our cohort of females during significant physiological transitions such as menarche and menopause. However, the presence of the same bacterial strain in mothers and their adult daughters who had progressed through one or both of these life-cycle milestones suggests that components of the microbiota are retained during these events.


The objects we touch and consume during the course of our lives are covered with diverse microbial life. Despite this, LEA-Seq revealed that on average 60% of the approximately 200 microbial strains harbored in each adult’s intestine were retained in their host over the course of a 5-year sampling period. Our results are supported by a microarray-based profiling of fecal microbiota collected from three males and two females over ~8 years (19) but differ from those of a similar analysis using standard 16S rRNA amplicon sequencing, which found high variability in microbiota composition in two individuals sampled for up to 15 months (26). This difference likely reflects the fact that the sequencing depth and precision limitations of standard 16S rRNA amplicon sequencing are overcome to some extent with microarrays where amplicons are mapped and hybridized to a finite pool of target sequences (i.e., sacrificing resolution for precision). The differences could also be due to true differences in the stabilities in microbiota of the individuals, as both studies surveyed only a small number of individuals. Our findings are also supported by a recent report that mapped deep shotgun sequencing data sets of the fecal microbiome to a set of reference bacterial genomes (6) and found that the gut communities of these individuals were more similar to each other at the microbiome level than to unrelated individuals (average maximum time between samples = 32 weeks with two individuals sampled over a period of >1 year). Applying LEA-Seq to longitudinal surveys of the fecal microbiota of 37 twins sampled for up to 5 years allowed us to show that the stability of an individual’s microbiota follows a power-law function. Using this function, we could extrapolate the stability of the microbiota over decades. The resolution and accuracy of these predictions should improve as advances in sequencing chemistry enable longer regions of 16S rRNA genes to be characterized. LEA-Seq itself can be generalized to any application that requires deep amplicon sequencing with high precision (e.g., the VDJ regions of immunoglobulin and T cell receptor genes, or targeted searches for variants in candidate or known disease-producing genes).

Our study also illustrates how a highly personalized analysis of the gut community at strain-level microbial genome resolution can be conducted using collections of cultured bacteria (or archaea) generated from frozen fecal samples collected over time from a given subject. This strain-level analysis can be part of a broad phylogenetic survey, or it can target a particular species.

The stability of the microbiota that we document in healthy individuals has important implications for future use of the microbiota (and microbiome) as a diagnostic tool as well as a therapeutic target for individuals of various ages. Our findings suggest that obtaining a routine fecal sample as part of a yearly physical examination designed to promote disease prevention would be sufficient to monitor changes in the composition and stability of an individual’s fecal microbiota. For example, in the case of inflammatory bowel diseases, the concordance for Crohn’s disease and ulcerative colitis among monozygotic twin pairs is only 38% and 15%, respectively (27). Our results suggest that these twins likely share identifiable unique subsets of their microbiota that represent long-term environmental exposures for their immune systems that should be considered when trying to predict disease risk, or infer which species or strains may have a causal role in disease initiation, progression, relapse, and treatment responses. Moreover, the effects of travel, changes in diet, weight gain and loss, diarrheal disease, antibiotics, immunosuppressive therapy, or clinical trials designed to deliberately manipulate the microbiota (e.g., through administration of existing or new prebiotics, probiotics, synbiotics, antibiotics, or transplantation of microbiota from healthy individuals to those with various diseases attributed to a dysfunctional microbiota) can be more accurately quantified by applying the methods we describe. Finally, the stability we document highlights the impact of early colonization events on our microbiota in later life; earlier colonizers, such as those acquired from our parents and siblings, have the potential to provide their metabolic products and exert their immunologic effects for our entire lives.


Diet Studies

Four obese (BMI > 30 kg/m2) female subjects with a mean (± SD) age of 26 ± 3 years were admitted to the General Clinical Research Center at Columbia University Medical Center and remained as in-patients throughout the study. The protocol for recruitment and for the weight loss study was approved by the Institutional Review Board of New York Presbyterian Medical Center and is consistent with guiding principles for research involving humans. Written informed consent was obtained from all subjects. The diet protocol has been described in detail (12, 13). Briefly, subjects were fed a liquid-formula diet with 40% of energy as fat (corn oil), 45% as carbohydrate (glucose polymer), and 15% as protein (casein hydrolysate). Diet composition but not quantity was constant throughout the study. The diet had a caloric density of 1.25 digestible kcal of energy per gram and was supplemented with vitamins and minerals in quantities sufficient to maintain a stable weight, defined as an average daily weight variation of <10 g/day for ≥2 weeks. This weight plateau is designated as Wtinitial. The four individuals in this study consumed 2600 to 3300 kcal/day of the diet to maintain Wtinitial. After a brief period at Wtinitial, subjects were provided 800 kcal/day of the same liquid-formula diet until they had lost ~10% of Wtinitial. The duration of the weight loss phase ranged from 36 to 62 days (table S3). Once 10% weight loss had been achieved, intake was adjusted upward until subjects were again weight-stable. Weight maintenance calories were disproportionately reduced (~22%) below those required to maintain initial weight and ranged from 2050 to 2800 kcal/day for the four individuals. Subject F72 also received triiodothyronine (25 μg/day) during this second weight-stable period (table S3). Fecal samples were obtained throughout the study (table S3) and frozen at –80°C until processed for DNA extraction (1).

Twin Participants

Twins were selected from a general population cohort of female like-sex twin pairs, born in Missouri to Missouri-resident parents between 1 July 1975 and 30 June 1985, and first assessed at median age 15 with multiple waves of follow-up (28, 29). Selected twins were drawn from (i) a study, which included biological mothers where available, contrasting stably concordant lean twin pairs (both twins had BMIs in the range 18.5 to 24.9 by self-report at all completed assessments) and concordant obese twin pairs (both twins had BMIs ≥ 30, but with pairs prioritized where at least one twin had BMI > 35, to maximize separation from the concordant lean pairs) (1); (ii) a small-scale study of concordant lean monozygotic twin pairs contrasting free diet with free diet supplemented by twice-daily consumption of a fermented milk product (11); and (iii) an ongoing study of twin pairs selected for BMI discordance (either discordant lean/obese or quantitatively discordant).

Other Protocols

See (7) for procedures for (i) creating mock bacterial communities to benchmark standard methods for 16S rRNA sequencing and LEA-Seq, (ii) generating robotically arrayed personal bacterial culture collections from human fecal samples, (iii) isolating B. thetaiotaomicron strains from fecal samples collected over time from individuals and family members, and (iv) sequencing microbial genomes.

Supplementary Materials

Supplementary Methods

Supplementary Text

Figs. S1 to S8

Tables S1 to S15

References (3041)

References and Notes

  1. See supplementary materials on Science Online.
  2. Acknowledgments: We thank D. Hopper and S. Marion for their contributions to the recruitment of twins from the MOAFTS cohort and for obtaining fecal samples for the present study; J. Hoisington-Lopez, M. Meier, and S. Deng for technical assistance; and members of the Gordon lab for their many helpful suggestions during the course of this study. Supported by NIH grants DK30292, DK078669, DK70977, DK64774, and UL1TR000040; the Crohn’s and Colitis Foundation of America; Groupe Danone (through partial support of the postdoctoral stipend of J.J.F.); and the Howard Hughes Medical Institute. Draft genome assemblies are available from the European Bioinformatics Institute (EBI) under accession numbers PRJEB3419 to PRJEB3951. Raw and processed LEA-Seq data, as well as experimental protocols for LEA-Seq and phased 16S rRNA amplicon sequencing, can be found at The authors do not declare any conflicts of interest. Author contributions: J.J.F. and J.I.G. designed the experiments; A.C.H., R.L.L., and M.R. oversaw human studies; J.J.F., J.L.G., M.C., S.S., H.S., and A.L.G. generated the data; J.J.F., R.K., J.C.C., S.S., and J.I.G. analyzed data; J.J.F. and J.I.G. wrote the paper.
View Abstract

Stay Connected to Science

Navigate This Article