Intratumor heterogeneity in localized lung adenocarcinomas delineated by multiregion sequencing

Science  10 Oct 2014:
Vol. 346, Issue 6206, pp. 256-259
DOI: 10.1126/science.1256930


Cancers are composed of populations of cells with distinct molecular and phenotypic features, a phenomenon termed intratumor heterogeneity (ITH). ITH in lung cancers has not been well studied. We applied multiregion whole-exome sequencing (WES) on 11 localized lung adenocarcinomas. All tumors showed clear evidence of ITH. On average, 76% of all mutations and 20 out of 21 known cancer gene mutations were identified in all regions of individual tumors, which suggested that single-region sequencing may be adequate to identify the majority of known cancer gene mutations in localized lung adenocarcinomas. With a median follow-up of 21 months after surgery, three patients have relapsed, and all three patients had significantly larger fractions of subclonal mutations in their primary tumors than patients without relapse. These data indicate that a larger subclonal mutation fraction may be associated with increased likelihood of postsurgical relapse in patients with localized lung adenocarcinomas.

Space, time, and the lung cancer genome

Lung cancer poses a formidable challenge to clinical oncologists. It is often detected at a late stage, and most therapies work for only a short time before the tumors resume their relentless growth. Two independent analyses of the human lung cancer genome may help explain why this disease is so resilient (see the Perspective by Govindan). Rather than take a single “snapshot” of the cancer genome, de Bruin et al. and Zhang et al. identified genomic alterations in spatially distinct regions of single lung tumors and used this information to infer the tumor's evolutionary history. Each tumor showed tremendous spatial and temporal diversity in its mutational profiles. Thus, the efficacy of drugs may be short-lived because they destroy only a portion of the tumor.

Science, this issue p. 251, p. 256; see also p. 169

Intratumor heterogeneity (ITH) may have impacts on tumor biopsy strategy, characterization of actionable targets, treatment planning, and drug resistance (1–6). ITH has recently been elucidated in substantial detail in several cancer types with the use of next-generation sequencing (NGS) approaches (7–14). Recent evidence supports a model of branched evolution leading to variable ITH in different tumors (9, 13, 15, 16). Studies in clear cell renal carcinoma (ccRCC) have demonstrated substantial ITH, with the majority of mutations in known cancer genes confined to spatially separated tumor regions, except that VHL loss is a ubiquitous event (16, 17). These data suggest that a single biopsy may be inadequate for identifying all cancer gene mutations from a tumor, as it presents an incomplete view of potential targets for therapy. Critically, the extent to which these observations in ccRCC apply to other solid tumors is currently not clear.

To characterize ITH in localized lung adenocarcinomas, we applied multiregion whole-exome sequencing (WES) on 48 tumor regions from 11 resected lung adenocarcinomas (eight stage I, two stage II, and one stage III tumors, tumor size 2 to 4.6 cm), all from patients who had surgery with curative intent (Fig. 1 and table S1). WES was conducted at a mean depth of 277×. In total, 7269 mutations were identified, and 7026 (97%) somatic mutations were validated by a separate bespoke capture sequencing experiment at mean depth of 863× (table S2). The numbers of mutations varied substantially between tumors (fig. S1), but no significant correlations were identified between mutation burden and age, gender, tumor size, lymph node status, or smoking status.

Fig. 1 Assessment of ITH of 11 lung adenocarcinomas by multiregion sequencing.

An example of regional mutation distribution (case 330) of resected tumors (represented by the ellipse) is shown at the upper left corner. Mutated cancer genes are indicated to the right of representative hematoxylin and eosin–stained pathology image of each sequenced tumor region. Numbers of trunk, branch, and private branch mutations for each region are indicated in associated windows. A phylogenetic tree was generated from all validated mutations by using the Wagner parsimony method in PHYLIP. Blue, yellow, and red lines represent trunk, branch, and private branches, respectively. Trees are anchored at a germline DNA sequence obtained from peripheral blood of the relevant patients. Known cancer gene mutations are mapped to the trunks and branches as indicated. Point mutations, amplifications, and deletions of known cancer genes are presented as black, +red, and –green, respectively. Trunk and branch lengths are proportional to the numbers of mutations acquired on the corresponding trunk or branch. Note: Five tumors have their trunk lengths reduced to 10%* or 2%** of original length for visualization purposes.

A useful approach when considering ITH is to depict a given tumor as a tree structure with the trunk representing ubiquitous mutations present in all regions of the tumor, branches representing heterogeneous mutations present in only some regions of the tumor, and private branches representing mutations that are present only in one region of the tumor, analogous to a phylogenetic tree. Placement of mutations on trunks versus branches reflects relative molecular time of acquisition, with branch mutations occurring, by definition, subsequent to trunk mutations. We applied this approach to multiregion sequencing data from these 11 lung adenocarcinomas. Evidence for ITH was found in each tumor studied. On average, 76% of all mutations were detected in all regions of the same tumors. However, the phylogenetic structure varied considerably between tumors (Fig. 1). We then characterized known cancer gene mutations, defined as nonsynonymous mutations identical to those previously reported in known cancer genes (18–23) or truncating mutations in known tumor suppressor genes, in the context of the derived phylogenetic tree structures. Thirteen of 14 known cancer gene point mutations were mapped to the trunks of the phylogenetic trees (Fig. 1 and table S3), which indicated that these mutations were acquired relatively early during evolution of these 11 tumors. In contrast to ccRCC, these data suggest that single-region sampling may be sufficient to identify the majority of known cancer gene mutations in localized lung adenocarcinomas.
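
The trunk, branch, and private branch definitions above can be illustrated with a short Python sketch. It is not the pipeline used in the study (the trees in Fig. 1 were built with PHYLIP); it only shows how a mutation could be labeled from the set of regions in which it is detected. The function names, region labels, and toy mutation calls are hypothetical.

```python
from collections import Counter


def classify_mutation(regions_with_mutation, all_regions):
    """Label one mutation as 'trunk', 'branch', or 'private'.

    trunk   : detected in every sequenced region of the tumor
    private : detected in exactly one region
    branch  : detected in some, but not all, regions
    """
    sequenced = set(all_regions)
    present = set(regions_with_mutation) & sequenced
    if not present:
        raise ValueError("mutation not detected in any sequenced region")
    if present == sequenced:
        return "trunk"
    if len(present) == 1:
        return "private"
    return "branch"


def nontrunk_fraction(calls, all_regions):
    """Fraction of mutations that are branch or private (i.e., not trunk)."""
    counts = Counter(classify_mutation(r, all_regions) for r in calls.values())
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no mutations supplied")
    return (counts["branch"] + counts["private"]) / total


# Hypothetical tumor with four sequenced regions and four mutations.
regions = ["R1", "R2", "R3", "R4"]
calls = {
    "mutA": {"R1", "R2", "R3", "R4"},  # trunk
    "mutB": {"R1", "R2"},              # branch
    "mutC": {"R3"},                    # private
    "mutD": {"R1", "R2", "R3", "R4"},  # trunk
}
print(nontrunk_fraction(calls, regions))
```

The nontrunk fraction computed here corresponds to the "branch plus private branch" proportion that the text later compares between patients with and without relapse.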

We were also able to evaluate copy number changes relative to ITH. In contrast to ccRCC (16, 17, 24), we did not observe substantial differences in large-scale chromosome aberrations (fig. S2A), and the log2 ratio profiles were similar between different regions within the same tumors (fig. S2B and table S4). Furthermore, amplification or deletion of known cancer genes (22), as well as their relative placement on the phylogenetic trees, were delineated for these 11 lung adenocarcinomas. All of these events were mapped to the trunks of the phylogenetic trees (Fig. 1), which suggested that, like known cancer gene point mutations discussed above, amplification and/or deletion of known cancer genes were also early molecular events for these 11 tumors. Previous work in breast cancer also suggested that known cancer gene mutations were relatively early genetic events shared by all subclones of individual breast cancers (13). Taken together, these results indicate that different cancer types may have different relative timing of acquisition of cancer gene mutations. Further, the data suggest that, in this subset of lung adenocarcinomas, mutations in noncanonical cancer genes likely drive tumor development and subclonal divergence.

With a median follow-up of 21 months after surgery at the time of this report, three patients have had disease relapse. These three patients had a significantly larger proportion of subclonal nontrunk mutations (branch plus private branch mutations) in their primary tumors than patients without relapse (average 40% in relapsed patients versus 17% in patients without relapse, P = 0.006 by t test) (Fig. 1). Although the sample size is small, these findings suggest the possibility that subclonal mutations may be important for cancer progression and that larger subclonal mutation fraction may be associated with an increased likelihood of postsurgical relapse in this subset of lung adenocarcinoma patients.

Analysis of NGS data relies heavily on adequate sequencing depth to make high-accuracy consensus base calls. We compared our WES data (average sequencing depth 277×) with deep sequencing data (average sequencing depth 863×) to assess the effect of sequencing depth on detecting known cancer gene mutations. In tumor 499, a canonical KRAS p.G12C mutation was detected in only one of four tumor regions at exome depth but was detected in all four tumor regions at increased sequencing depth (table S2). Extending this analysis, we then compared deep sequencing data with WES data in defining ITH. The result showed many branch and private branch mutations defined by WES were detectable in all regions of individual tumors with increasing sequencing depth (Fig. 2). Taken together, these results indicate that considerable depth of sequencing will be necessary to detect cancer gene mutations and to accurately characterize ITH of lung adenocarcinomas.
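
One way to see why depth matters is a simple binomial sampling model, sketched below in Python. This is an illustrative simplification and not the variant-calling procedure used in the study: it assumes that each read independently carries the variant with probability equal to the variant allele frequency, ignores sequencing error and mapping artifacts, and uses a hypothetical minimum of five supporting reads as the detection threshold.

```python
from math import comb


def detection_probability(depth, vaf, min_alt_reads):
    """P(at least `min_alt_reads` variant reads) under Binomial(depth, vaf).

    depth         : total number of reads covering the site
    vaf           : true variant allele frequency in the sample (0..1)
    min_alt_reads : hypothetical number of supporting reads needed for a call
    """
    p_below = sum(
        comb(depth, k) * vaf**k * (1 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_below


# Compare the two mean depths reported in the text for a low-frequency variant.
for depth in (277, 863):
    p = detection_probability(depth, vaf=0.01, min_alt_reads=5)
    print(f"depth {depth}x: P(detect) = {p:.3f}")
```

Under these assumptions, a variant present at low frequency in a region is far more likely to reach the read threshold at the deeper coverage, which is consistent with branch mutations at exome depth being reclassified as ubiquitous after deep sequencing.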

Fig. 2 Distribution of trunk, branch, and private branch mutations defined by exome sequencing (average sequencing depth of 277×) versus deep sequencing (average sequencing depth of 863×).

Only validated mutations meeting the following criteria are included: total counts in tumor DNA ≥ 100; total counts in germline DNA ≥ 50; variant allele frequency (VAF) of ≥5% in tumor DNA, and VAF = 0 in germline DNA.
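
The inclusion criteria in this caption can be written as a small Python predicate. This is a sketch under stated assumptions: "total counts" is taken to mean reference plus variant reads at the site, VAF is taken as variant reads divided by total reads, and the `MutationCounts` record and function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MutationCounts:
    """Read counts at one mutated site (hypothetical record layout)."""
    tumor_ref: int
    tumor_alt: int
    germline_ref: int
    germline_alt: int


def passes_depth_vaf_filter(m):
    """Apply the Fig. 2 criteria to one mutation in one tumor sample."""
    tumor_total = m.tumor_ref + m.tumor_alt
    germline_total = m.germline_ref + m.germline_alt
    # Total counts: tumor DNA >= 100 and germline DNA >= 50.
    if tumor_total < 100 or germline_total < 50:
        return False
    tumor_vaf = m.tumor_alt / tumor_total
    # VAF >= 5% in tumor DNA and VAF = 0 (no variant reads) in germline DNA.
    return tumor_vaf >= 0.05 and m.germline_alt == 0


print(passes_depth_vaf_filter(MutationCounts(190, 10, 60, 0)))   # meets all criteria
print(passes_depth_vaf_filter(MutationCounts(190, 10, 60, 1)))   # variant read in germline
```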

Next, we analyzed the mutational spectrum of these 11 lung adenocarcinomas. Consistent with previous studies (18–20, 25), different mutation spectra were observed in smokers and nonsmokers. Three never-smokers (cases 292, 339, and 356) showed C>T-predominant mutation profiles. Three former smokers who had quit more than 20 years before (cases 270, 472, and 4990) and one former smoker who had a 25 pack-year (pack-year = number of packs per day × number of years) history of smoking and quit 6 years ago (case 283) also showed C>T-predominant mutation profiles, as in nonsmokers. Two former smokers who had a >50 pack-year history of smoking and had quit 5 years before (cases 317 and 499) and one former smoker who had a 25 pack-year history of smoking and had quit only 2.5 years before (case 330) showed C>A-predominant mutation profiles consistent with the mutation profile of cigarette smoke exposure. The only current smoker, who had a 20 pack-year history of smoking but had cut down to two cigarettes a day at the time of cancer diagnosis (case 324), showed roughly equivalent proportions of C>T (26%) and C>A (21%) substitutions in her tumor (Fig. 3A). These results indicate that tumor mutation spectra in former smokers reflect not only quantity of smoking exposure but also time since smoking cessation.
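
A mutation spectrum of the kind described above can be tabulated with a few lines of Python. The sketch assumes the common convention of collapsing complementary substitutions onto the pyrimidine reference base (so that G>T is counted as C>A), which yields six classes; the source does not spell out its exact procedure, and the function names and toy input are hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
CLASSES = ("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")


def substitution_class(ref, alt):
    """Collapse a single-base substitution onto the pyrimidine strand."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if ref in ("A", "G"):
        # Purine reference: report the change as seen on the opposite strand.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def spectrum(substitutions):
    """Return the fraction of mutations in each of the six classes."""
    counts = Counter(substitution_class(r, a) for r, a in substitutions)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no substitutions supplied")
    return {c: counts[c] / total for c in CLASSES}


# Hypothetical list of (reference base, variant base) pairs.
toy = [("G", "T"), ("C", "A"), ("C", "T"), ("G", "A")]
print(spectrum(toy))
```

Comparing the C>A and C>T entries of such a table across tumors is the kind of contrast the text draws between smoking-associated and nonsmoker-like profiles.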

Fig. 3 Mutation spectrum of the 11 lung adenocarcinomas.

(A) Mutation spectrum of all validated mutations. (B) Mutation spectrum of trunk mutations. (C) Mutation spectrum of nontrunk mutations. The difference of mutation spectrum between trunk and nontrunk mutations in each patient was evaluated with Fisher’s exact test, and significant P values are shown as *P < 0.05 and **P < 0.01. (D) APOBEC mutation signature enrichment odds ratio for trunk and nontrunk mutations. The 95% confidence intervals for Fisher’s exact test are indicated. PY, pack-year. ¶ indicates that the patient had cut down to two cigarettes a day at the time of cancer diagnosis.

We next compared the mutational spectrum of trunk versus nontrunk mutations to explore the relative contribution of mutational processes over time. Significant differences in mutational spectrum were observed in six tumors, which indicated that specific mutational processes were likely operative at different times during development of these tumors (Fig. 3, B and C). Of interest, two former smokers (cases 317 and 330) and the current smoker (case 324) showed significant differences between trunk and nontrunk mutation spectrum with a shift from smoking-associated C>A transversions in trunk mutations to nonsmoker-associated C>T transitions in nontrunk mutations.

Recent evidence has suggested that APOBEC activity is a major source for C>T and C>G mutations (12, 26). We therefore investigated whether there is evidence of an APOBEC mutational process in this subset of lung adenocarcinomas. On average, 28% of all mutations had a specific substitution pattern (C>T/G at TpCpW sites, where W is A or T), consistent with an APOBEC-mediated process (fig. S3). APOBEC mutation signature enrichment was found to be more pronounced for nontrunk mutations compared with trunk mutations in 7 of the 11 patients; however this difference was statistically significant only for case 330 (Fig. 3D). These data suggest that an APOBEC-like process is contributing substantially to the mutations found in this subset of lung adenocarcinomas and that the process tends to be more pronounced in later, subclonal mutations; this further highlights the dynamic nature of mutational processes in play.
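
The substitution pattern named above (C>T or C>G at TpCpW sites, where W is A or T) can be checked per mutation as in the following Python sketch. It only tests the motif on either strand; it does not reproduce the enrichment odds ratio shown in Fig. 3D, which also depends on background context counts. The function names and the trinucleotide input format (5' base, mutated base, 3' base) are assumptions.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def is_apobec_pattern(trinucleotide, alt):
    """True if the mutation is C>T or C>G at a TpCpW site on either strand.

    trinucleotide : reference context, mutated base in the middle (5'->3')
    alt           : variant base on the same strand as `trinucleotide`
    """
    tri, alt = trinucleotide.upper(), alt.upper()
    if len(tri) != 3 or any(b not in COMPLEMENT for b in tri) or alt not in COMPLEMENT:
        raise ValueError("expected a 3-base context and a single variant base")
    if tri[1] == "G":
        # A G mutation is a C mutation on the opposite strand.
        tri, alt = reverse_complement(tri), COMPLEMENT[alt]
    return tri[0] == "T" and tri[1] == "C" and tri[2] in ("A", "T") and alt in ("T", "G")


def apobec_fraction(mutations):
    """Fraction of (trinucleotide, alt) pairs matching the APOBEC pattern."""
    if not mutations:
        raise ValueError("no mutations supplied")
    return sum(is_apobec_pattern(t, a) for t, a in mutations) / len(mutations)


# Hypothetical mutations: two match the pattern, two do not.
toy = [("TCA", "T"), ("TGA", "A"), ("ACA", "T"), ("TCG", "G")]
print(apobec_fraction(toy))
```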

Substantial variation in the allele frequency of somatic mutations within each individual tumor region from a given tumor was observed in this set of lung adenocarcinomas (fig. S4A). To more formally characterize subclonal fraction within each tumor region, we used the ABSOLUTE algorithm (27). These analyses demonstrated that at least 29 out of 48 individual tumor regions showed evidence of intraregional subclonal populations. The distribution of clonal and subclonal mutations was different among the sampled regions within the same tumors in some patients (fig. S4B), which suggests that single-biopsy analysis would be inadequate to fully represent ITH in these tumors.

To explore the possible implications of these data for translation to routine ITH assessment in clinical practice, we repeated the ABSOLUTE analysis on the combined sequencing data from all tumor regions of each patient to assess the global ITH on a per-patient level, defined by the relative proportion of subclonal mutations. Similar to the phylogenetic analyses, all three patients with relapsed disease had larger subclonal fractions in their primary tumors (average 41% in patients with relapse versus 24% in patients without relapse, P = 0.045 by t test) (fig. S5A). Use of a complementary Bayesian Dirichlet process (13) on the per-patient combined data revealed the same trend (average subclonal mutations 66% in patients with relapse versus 36% in patients without relapse, P = 0.035 by t test) (fig. S5B). These results suggest that a measure of overall subclonal fraction may be of interest from a prognostic standpoint in this population of patients.

Resectable localized disease accounts for 30 to 50% of all non–small cell lung cancers, with increasing prevalence as screening is more widely implemented (28–30). Given that this subset of patients has tumors surgically resected as standard of care, there is an opportunity to confirm these preliminary observations by deep sequencing multiregion samples obtained from resected tumors. The questions of whether sequencing-targeted cancer gene panels versus WES will yield sufficient mutation data for meaningful analyses and the most appropriate algorithms for analyses will need to be addressed in order to fully test whether the clinical correlation suggested in these data is borne out in larger patient cohorts.

Evidence of marked regional ITH in ccRCC suggested substantial challenges to personalized oncology based on single-tumor biopsy to portray the mutational landscape. This study, however, provides evidence that ITH patterns may be different between cancer types. With the caveat of limited sample size fully acknowledged, these data suggest that, although multiregion sampling is needed to fully assess ITH complexity, single-biopsy analysis at appropriate depth might be sufficient to identify the majority of known cancer gene mutations in this subset of lung adenocarcinomas. Studies in much larger cohorts, ideally with comprehensive clinical annotation and repeat biopsy at relapse, are needed to fully understand the clinical impact of ITH and insights afforded by these types of analyses. Furthermore, extension of research to epigenetic and phenotypic assessment through regional DNA methylation, chromatin state, and RNA and/or protein expression studies over time and under treatment is needed to fully understand the impact of ITH on the biology of the cancer itself and its impact on the clinical phenotype of cancer patients.

Supplementary Materials

Materials and Methods

Figs. S1 to S6

Tables S1 to S4

References (31–44)

References and Notes

  1. Acknowledgments: This study was supported by the Cancer Prevention and Research Institute of Texas (R120501), the University of Texas (UT) Systems Stars Award (PS100149), the Welch Foundation Robert A. Welch Distinguished University Chair Award (G-0040), Department of Defense PROSPECT grant (W81XWH-07-1-0306), the UT Lung Specialized Programs of Research Excellence grant (P50CA70907), the MD Anderson Cancer Center Support grant (CA016672), NIH T32 Research Training in Academic Medical Oncology grant (CA-009666), the MD Anderson Institutional Support for the Center for Translational and Public Health Genomics, and the A. Lavoy Moore Endowment Fund. The authors thank L. Chin and R. Verhaak for constructive discussions. Sequence data have been deposited at the European Genome-phenome Archive (EGA), which is hosted by the European Bioinformatics Institute, under accession number EGAS00001000930.