Research Article

Radiation-related genomic profile of papillary thyroid cancer after the Chernobyl accident

See allHide authors and affiliations

Science  22 Apr 2021:
DOI: 10.1126/science.abg2538


The 1986 Chernobyl nuclear power plant accident increased papillary thyroid cancer (PTC) incidence in surrounding regions, particularly for 131I-exposed children. We analyzed genomic, transcriptomic, and epigenomic characteristics of 440 PTCs from Ukraine (359 with estimated childhood 131I exposure and 81 unexposed children born after 1986). PTCs displayed radiation dose-dependent enrichment of fusion drivers, nearly all in the mitogen-activated protein kinase pathway, and increases in small deletions and simple/balanced structural variants that were clonal and bore hallmarks of non-homologous end-joining repair. Radiation-related genomic alterations were more pronounced for those younger at exposure. Transcriptomic and epigenomic features were strongly associated with driver events but not radiation dose. Our results point to DNA double-strand breaks as early carcinogenic events that subsequently enable PTC growth following environmental radiation exposure.

The accidental explosion in reactor 4 at the Chernobyl (Chornobyl in Ukrainian) nuclear power plant in April 1986 resulted in the exposure of millions of inhabitants of the surrounding areas of Ukraine, Belarus, and the Russian Federation to radioactive contaminants (1). Epidemiologic and clinical research in the ensuing decades has demonstrated increased risk of papillary thyroid carcinoma (PTC) with increasing thyroid gland exposure to radioactive iodine (131I) from fallout, which was deposited on pastures with grazing cows and ingested through milk and leafy greens, particularly during early childhood (2). Together with data from populations exposed to other types of radiation, compelling evidence indicates that PTC risk increases following childhood exposure to ionizing radiation, a recognized carcinogen (25).

Currently, there are no established molecular biomarkers for cancers induced by radiation, nor have there been large-scale analyses of the genomic landscape of human cancers occurring after a well-quantified radiation exposure. Classical cytogenetic studies have demonstrated radiation dose-associated increases in large chromosomal aberrations (such as inversions and translocations) that reflect DNA double-strand breaks and are the current standard for biodosimetry; however, these assays are typically performed in peripheral blood lymphocytes from individuals exposed to whole body irradiation and have not been directly linked to tumor characteristics (6, 7). Next-generation sequencing of 12 second primary tumors of various types that occurred within the field of previous therapeutic ionizing radiation suggested an excess of small deletions and balanced inversions (8), but radiation dose estimates were not available. RNA sequencing (RNA-seq) analyses of 65 PTCs (mean age at diagnosis = 24.7 years) occurring after the Chernobyl accident identified that higher doses were associated with an increased likelihood of gene fusion drivers (9). In a genomic landscape analysis of 496 primarily unexposed PTCs (mean age at diagnosis = 46.8 years; 16 patients with known prior radiation exposure), The Cancer Genome Atlas (TCGA) reported a low density of somatic point mutations, especially for PTCs in younger patients, and a high frequency of activating somatic alterations in the mitogen-activated protein kinase (MAPK) pathway, including point mutations in BRAF (61.7%) and NRAS/HRAS/KRAS (12.9%) as well as fusions with RET (6.8%), BRAF (2.7%) and other MAPK-related genes (5.0%) (10).

Here, we report a comprehensive characterization of the genomic, transcriptomic, and epigenomic profile of PTC as well as non-tumor thyroid tissue and/or blood for 440 individuals from Ukraine who developed PTC after the Chernobyl accident (mean age at PTC = 28.0 years), affording an opportunity to investigate the contribution of environmental radiation to PTC characteristics. The study analyzed a collection of pre-treatment fresh frozen tumor tissues with pathological confirmation of first primary PTC by an international panel of experts through the Chernobyl Tissue Bank (CTB) (11, 12). Our study included 359 individuals with PTC with well-quantified 131I-exposure before adulthood (≤18 years of age; mean = 7.3 years) and, as controls, 81 131I-unexposed individuals with PTC born >9 months after the Chernobyl accident (all were born after March 1, 1987) (13).

Samples, clinical data, and analytic approach

Based on the availability of sufficient DNA and RNA extracted from CTB samples, we analyzed up to 440 individuals with whole genome sequencing (WGS) and/or mRNA-seq of pathologically confirmed tumor (374 both, 57 mRNA-seq only, 9 WGS only) (figs. S1 and S2) (13). Matched normal tissue with WGS and/or mRNA-seq included non-tumor thyroid tissue and/or blood (233 both, 182 non-tumor tissue only, 16 blood only, 9 normal tissue not available). The genomic landscape characterization was augmented by single nucleotide polymorphism (SNP) microarray genotyping (Illumina Infinium HumanOmniExpress-24 array) and relative telomere length quantification on all samples, and DNA methylation profiling (Illumina Infinium MethylationEPIC array) and microRNA (miRNA)-seq for PTC and non-tumor thyroid tissue (fig. S3). A total of 357 individuals had tumor sample data across all platforms.

The majority of individuals with PTC were female (n = 335, 76.1%), resided in the Kiev (Kyiv in Ukrainian) region at the time of the accident (n = 286, 65.0%), and were diagnosed during young adulthood (mean = 28.0 years, range: 10.0-45.6); 131I-unexposed individuals with PTC were born at least 9 months after the accident and thus had a younger average age than the exposed individuals (mean: unexposed = 20.7 years, exposed = 29.7 years) (table S1 and fig. S4). For 131I-exposed individuals, mean age at exposure was 7.3 years (range: in utero to 18.9 years) and mean time from the accident to PTC diagnosis was 22.4 years (range: 12.5-29.9). Radiation doses to the thyroid were reconstructed by an international team of dosimetry experts (1416). For 53 individuals, doses were estimated using detailed information derived from individual direct thyroid radioactivity measurements taken within 8 weeks of the accident, with (n = 49) or without (n = 4) personal interviews regarding residential history and dietary patterns. For the remaining individuals, dose estimates were derived from direct measurements taken for other individuals who lived in the same residential area (n = 249), neighboring area (n = 9), or other areas (n = 39), or based on dose estimates to the mother for individuals who were in utero at the time of the accident (n = 9). Mean estimated radiation dose was 250 mGy (range: 11-8,800) (figs. S4 and S5).

Our primary analyses investigated the relationship between 131I dose and 68 PTC molecular characteristics derived from a comprehensive genomic landscape analysis (Fig. 1) using multivariable linear, proportional odds, or logistic regression models adjusted for sex and age at PTC diagnosis (13). For associated variables (defined as P < 7.4 × 10−4 based on a Bonferroni correction for 68 tests), further analyses were conducted by specific molecular characteristics, as well as by age at PTC, age at exposure, and time since exposure (latency) because these factors influence radiation-related thyroid cancer risk (17). In addition, we conducted sensitivity analyses to assess whether the results were consistent when we restricted the population to 131I-exposed individuals with lower radiation dose (<500 mGy, resulting in n = 326 with mean dose = 110 mGy) (18, 19).

Fig. 1 Landscape of somatic alterations in 440 papillary thyroid carcinomas, by radiation dose from 131I exposure.

Blank (white) spaces represent unavailable data due to lack of data from a specific platform (figs. S1 to S3). Signature analyses were restricted to high purity samples, defined as those with tumor purity >20% and no evidence of tumor contamination in the normal tissue. Abbreviations: BRAFV600E-RAS score (BRS), copy neutral loss of heterozygosity (CNLOH), deletion (DEL), dinucleotide polymorphism (DNP, i.e., doublet), ERK-activity score (ERK), indel (ID), insertion (INS), microsatellite (ms), papillary thyroid carcinoma (PTC), probability of causation (POC), single nucleotide variant (SNV), somatic copy number alteration (SCNA), single nucleotide variant (SNV), structural variant (SV), thyroid differentiation score (TDS), trinucleotide polymorphism (TNP, i.e., triplet).

Simple somatic variants

WGS analysis of tumor/normal pairs (n = 383; mean sequencing depth, tumor = 89X, non-tumor thyroid tissue = 33X, blood = 33X; table S2) revealed a low burden of simple somatic variants (SSV) (mean = 0.27 nonsynonymous mutations per Mb) (Fig. 1 and figs. S6 and S7), which was lower than in older TCGA PTC cases (0.41 nonsynonymous mutations per Mb) (10) and comparable to mutationally quiet tumors typically reported for pediatric cancers (20). A total of 318,956 SSVs were identified, the majority (93.3%) of which were single nucleotide variants (SNVs) (n = 297,512 in 383 tumors; mean per tumor = 776.8), whereas small insertions and deletions (indels) were less common (insertions: n = 5,842, 1.8%, mean = 37.2; deletions: n = 14,231, 4.5%, mean = 15.3), and doublet and triplet base substitutions were rare (dinucleotide polymorphism [DNP] or doublet: n = 1,351, 0.4%, mean = 3.5; trinucleotide polymorphism [TNP] or triplet: n = 20, 0.006%) (table S3). Among the 3,886 coding mutations (1.2% of total; 0.35/Mb), most were nonsynonymous (3,023/3,886 = 77.8%). Approximately one-third of mutations (n = 114,898, 36.0%) were clonal (cancer cell fraction ≥0.9), regardless of mutation type (SNV = 35.9%, insertions = 36.7%, deletions = 38.0%) (table S3 and fig. S8) (13).

In multivariable analyses restricted to n = 356 samples with both high tumor and normal tissue purity (Fig. 1 and fig. S9) (13), increased radiation dose was associated with an increase in small deletions (P = 8.0 × 10−9) as well as the deletion:SNV ratio (P = 4.9 × 10−21), but not other SSV types (Fig. 2 and table S4). In addition, we observed the expected increase in the burden of SNVs (P = 3.2 × 10−6), doublet mutations (P = 2.7 × 10−5), insertions (P = 1.5 × 10−6), and deletions (P = 7.4 × 10−16) with increasing age at PTC diagnosis (table S4) (10). Few of these mutations were clustered (22 clusters [>2 mutations within 150 base pairs (bp)] in 18 cases; 83 clusters [>2 mutations within 1 kb] in 36 cases) and were not associated with radiation dose (P > 0.3). In an analysis of the frequency and types of microsatellite indels in the tumors, detected using MSMuTect (21), all tumors were microsatellite stable (mean [range] per tumor, insertions = 1.8 [0-7], deletions = 7.3 [0-24]), and radiation dose was not significantly associated with the number of microsatellite insertions or deletions (table S4).

Fig. 2 Relationship between radiation dose from 131I exposure and small deletions.

(A) Total small deletion count and restricted to (B) clonal and (C) subclonal small deletions. (D) Total deletion:SNV ratio and restricted to (E) clonal and (F) subclonal deletions and SNVs. (G) Total ID5 count and restricted to (H) clonal and (I) subclonal ID5. (J) Total ID8 count and restricted to (K) clonal and (L) subclonal ID8. β per 100 mGy and P-value were derived from multivariable linear regression models adjusting for age at PTC and sex. Gray shading indicates 95% confidence interval (CI). Full model results are provided in table S18.

An investigation of mutational processes in PTC was conducted using SigProfiler to determine both single base substitution (SBS) and small indel (ID) mutational signatures (20, 22). Comparing the PTC mutations with known signatures from the Catalogue of Somatic Mutations in Cancer (COSMIC v3, (20), the majority of the SBS signatures (69.9%) were attributable to clock-like signatures (SBS1 = 9.8%, SBS5 = 60.2%), with smaller fractions due to APOBEC (apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like) cytidine deaminase DNA-editing activity (SBS2 = 6.2%, SBS13 = 6.4%), damage from reactive oxygen species (SBS18 = 0.9%), and two signatures of unknown etiology (SBS8 = 15.1%, SBS23 = 1.6%) (mean cosine similarity between actual mutations and attributed patterns = 0.94) (Fig. 1, table S5, and figs. S10 to S12). In multivariable analyses, no SBS signatures were significantly associated with radiation dose, whereas increased age at PTC diagnosis was associated with an increase in clock-like SBS mutations (PSBS1 = 1.9 × 10−7; PSBS5 = 6.8 × 10−17) as well as SBS8 (P = 4.3 × 10−11) and SBS18 (P = 2.0 × 10−8) (table S4).

The majority (54.0%) of indels were attributed to clock-like signatures (ID1 = 14.2%, ID5 = 39.8%), 21.2% to ID3 (tobacco smoking, which is not a major risk factor for PTC), 19.0% to repair of DNA double-strand breaks by end-joining mechanisms (ID6 = 3.3%, ID8 = 15.8%), and 5.8% to ID4 (unknown etiology) (mean cosine similarity = 0.77) (Fig. 1, table S5, and figs. S10 to S12). In multivariable analyses, radiation dose was strongly associated with end-joining-related indel mutational patterns (P = 1.5 × 10−10), particularly ID8 (P = 7.3 × 10−9), and more weakly with the clock signature ID5 (P = 1.3 × 10−4) (Fig. 2 and table S4). In comparison, increased age at PTC diagnosis was associated with significantly increased numbers of ID3 (P = 1.9 × 10−7), ID5 (P = 1.9 × 10−9), and ID8 (P = 6.6 × 10−4) mutational patterns (table S4). De novo signature extraction did not reveal a novel signature related to environmental exposure to ionizing radiation but identified 4 SBS (mean cosine similarity = 0.96) and 2 ID (mean cosine similarity = 0.83) signatures highly correlated with COSMIC signatures described above (table S4, table S6, and figs. S10 to S12). Similarly, no novel signature was identified when we restricted the analysis to PTCs in individuals who received ≥200 mGy (table S5).

Structural variation

Overall, 479 structural variants (SV) were identified in 356 tumors; approximately one-quarter of SVs (n = 132, 27.6%) were simple/balanced events (balanced interchromosomal translocations and inversions), one-half (n = 253, 52.8%) were simple/unbalanced (such as deletions, unbalanced interchromosomal translocations), and the remaining (n = 94, 19.6%) were complex (such as >2 breaks repaired together in a cluster) (table S6 and figs. S13 and S14) (13). Approximately one-third of tumors (n = 113, 31.7%) had no SV, one-third (n = 126, 35.4%) had one SV, and the remaining third had two (n = 61, 17.1%) or more (n = 56, 15.7%) SVs. Two tumors had >10 SV events (fig. S15), one of which was the only tumor with evidence for chromothripsis (age at PTC = 30.2 years, age at exposure = 1.3 years, dose = 1000 mGy).

Multivariable analyses (n = 354, excluding the 2 outliers) demonstrated that increasing radiation dose was significantly associated with increased SV count (P = 1.4 × 10−8), particularly simple/balanced SVs (P = 1.2 × 10−14) but not those classified as complex (P = 0.52) or simple/unbalanced (P = 5.6 × 10−3) (Fig. 3A and table S4). Increasing radiation dose also was not associated with occurrence of chromoplexy (P = 0.70) (table S4), which was identified in 19 tumors (n = 15 single event, n = 4 two events) (13, 23), nearly all unexposed or in the lower dose groups (nunexposed = 7, n1-99 mGy = 8, n100-199 mGy = 2, n≥500 mGy = 2).

Fig. 3 Relationship between radiation dose from 131I exposure and selected SV metrics.

(A) Number of simple/balanced SVs. (B) Likelihood of having a fusion versus mutation driver. (C) Number of clonal ≥5 bp EJ small deletions. (D) Number of confirmed clonal simple/balanced/end-joining SVs. (E) Number of confirmed clonal other SVs (E). Different scales are used for each panel to reflect the distributions and uncertainties of the excess odds ratio (EOR) estimates. Referent group for categorical analyses: EOR = 0 (which is equivalent to odds ratio = 1). EOR per 100 mGy and P-value were derived from multivariable proportional odds or logistic regression models adjusting for age at PTC and sex. Full model results are provided in table S18.

Somatic copy number alteration

A total of 40.3% (n = 143/355) of the tumors evaluated for somatic copy number alterations (SCNA) had one (n = 96, 27.0%) or more (n = 47, 13.2%) such events (fig. S16) (13). Four tumors had ≥20 SCNAs each (fig. S17): the tumor with chromothripsis and three additional tumors (age at PTC = 19.0-29.3 years, age at exposure = 1.3-2.3 years, dose = 125-175 mGy). Those three tumors predominantly had gains or copy neutral loss of heterozygosity (CNLOH) and were the only tumors with ploidy>2.5, with one of the three displaying extensive CNLOH (>20 arms), similar to previous reports for the rare thyroid Hürthle cell carcinoma (24, 25). Exclusion of the four tumors with ≥20 SCNAs each yielded 239 total SCNAs: 69 (28.9%) at the chromosome level, of which 48 were deletions, and 170 (71.1%) sub-chromosomal, of which 106 were deletions (table S6).

In multivariable models, radiation dose was related to the number of sub-chromosomal SCNAs (P = 3.5 × 10−5), particularly deletions (P = 7.0 × 10−4) but not gains (P = 0.32) or CNLOH (P = 0.52); radiation dose also was not related to the number of chromosomal SCNAs (P = 0.20) (table S4). The most frequent recurrent event was loss of 22q (n = 47/353, 13.3%) (figs. S18 and S19), but occurrence of 22q deletions was not associated with radiation dose (P = 0.37) (table S4).

Drivers of PTC

We identified at least one candidate driver for 433 of 440 (98.4%) tumors (fig. S20 and table S7) (13), with the majority (n = 401; 92.6%) having a single candidate driver, underscoring the parsimony of events driving PTC carcinogenesis. We designated 429 (97.5%) drivers for analysis (Fig. 1). Over half the designated driver events (n = 253, 59.0%) were mutations (SSVs; table S8), predominantly activating point mutations in genes previously implicated in PTC. The most commonly mutated gene was BRAF (n = 194, 45.2%), where all the mutations either were canonical BRAFV600E substitutions (n = 190) or disrupted the V600 sequence context (n = 4). RAS genes were the next most commonly mutated (n = 44, 10.3%), specifically NRAS (n = 20, 4.7%), HRAS (n = 15, 3.5%), and KRAS (n = 9, 2.1%). Additional mutation drivers were identified in TSHR (n = 6, 1.4%), DICER1 (n = 3, 0.7%), APC (n = 2, 0.5%), TSC1/TSC2 (n = 2, 0.5%), and NFE2L2 (n = 2, 0.5%). In TCGA, 9.4% of PTC harbored TERT promoter mutations, often in older individuals (8), but only one individual with a TERT promoter mutation was observed in our study (age at PTC diagnosis = 40.7 years, designated driver = BRAFV600E).

Fusion drivers accounted for 176 (41.0%) PTC cases (table S9). The most frequently involved genes were RET (n = 73, 17.0%) as well as other receptor tyrosine kinase (RTK) genes, specifically NTRK3 (n = 36, 8.4%), NTRK1 (n = 13, 3.0%), ALK (n = 12, 2.8%), and LTK (n = 3, 0.7%). Additional fusion drivers included BRAF (n = 20, 4.7%) and PPARG (n = 13, 3.0%), as well as SVs that resulted in overexpression of IGF2 or IGF2BP3 (n = 6, 1.4%). Of the 23 chromoplexy events described above, 16 generated driver fusions. All 22q deletions co-occurred with known driver mutations, most frequently RAS mutations (Pheterogeneity = 2.8 × 10−10; n = 22/38, 56.4%) (table S10).

In multivariable analyses, fusion drivers in PTC were more common in individuals exposed to higher radiation dose (P = 6.6 × 10−8) and in those diagnosed at younger ages (P = 5.4 × 10−9) relative to those with mutation drivers (Fig. 3B and table S4). There was a suggestion of a heterogeneous effect of dose by specific gene fusion (Pheterogeneity = 0.020), with higher doses on average for PTCs with IGF2/IGF2BP3 or BRAF fusion drivers, whereas the dose distribution did not differ significantly among mutation drivers (BRAF, RAS, other; Pheterogeneity = 0.17) (Fig. 4). We extended our observations by inclusion of 45 non-overlapping individuals with PTC (excluding the 20 individuals already in our analyses) drawn from a previous Chernobyl study with known drivers identified with RNAseq (7); over half had doses ≥500 mGy (mean age at PTC = 24.2 years, mean age at exposure = 7.2 years, mean dose = 1050 mGy) (table S1). That smaller sample set also suggested a radiation dose-related increase in fusion drivers (P = 0.069), which was consistent with the results from our study (Pheterogeneity = 0.90; pooled analysis of fusion vs. mutation driver, adjusting for age at PTC, sex, and study: P = 4.6 × 10−9).

Fig. 4 Distribution of radiation dose from 131I exposure by driver type and pathway.

Gene expression and methylation patterns

We conducted several analyses to assess whether gene expression and methylation patterns were related to radiation dose. First, unsupervised clustering analyses restricted to PTC tumor tissue yielded 5 mRNA clusters, 5 miRNA clusters, and 3 methylation clusters (figs. S21 and S22) (13). None of these clusterings were associated with radiation dose (PmRNA = 0.85; PmiRNA = 0.38; Pmethylation = 0.10), but each closely correlated with the driver gene pathway (PmRNA = 1.6 × 10−64; PmiRNA = 1.0 × 10−9; Pmethylation = 6.4 × 10−43) (Fig. 5A, tables S4 and S11, and fig. S23), supporting the overriding importance of the driver for RNA expression patterns (10). Second, we identified three transcriptional patterns important in PTC based on the TCGA analysis (10): the BRAFV600E-RAS score (BRS), estimating the degree to which the mRNA, miRNA, and methylation profiles resemble either BRAFV600E or RAS-mutated PTC; the thyroid differentiation score (TDS), based on expression of 16 thyroid metabolism and function genes; and the ERK-activity score of 52 expressed genes responsive to MEK inhibition (13). As expected, the mRNA, miRNA, and methylation BRS scores were highly correlated with one another (r = 0.78-0.92) (table S12). Consistent with TCGA (10), the three different mRNA-based scores also were significantly correlated, particularly the BRS with both the TDS (r = 0.69) and the ERK score (r = -0.66). None of the scores were associated with radiation dose after correction for multiple testing (PmRNA-BRS = 2.1 × 10−3; PmiRNA-BRS = 5.5 × 10−3; Pmethylation-BRS = 0.082; PERK = 0.011, PTDS = 7.8 × 10−3) (table S4 and fig. S24A), whereas each was strongly related to driver gene pathway (P < 1.0 × 10−30 for all scores) (fig. S24B).

Fig. 5 Selected RNA-seq results.

(A) Differential expression by driver and cluster. (B) Differential expression for all genes by radiation dose from 131I exposure. (C) Differential expression of CLIP2 by radiation dose from 131I exposure. Abbreviation: principal component (PC).

To confirm the lack of association between radiation exposure and gene expression patterns, we conducted exploratory analyses of the differential expression of specific genes and gene sets by dose (13). In multivariable linear regression models adjusted for age at PTC, sex, and batch, the Padjusted for dose was <0.05 for five genes (Fig. 5B and table S13), with the smallest P-value (Padjusted = 8.0 × 10−3; log10-fold expression change/100 mGy = 0.059) for transfer RNA asparagine (anticodon GUU) (TRNAN-GUU-2), which is a target of the transcription factor MYBL2, a key regulator of cell cycle progression and apoptosis (26, 27). However, each of these associations were attenuated when the model was further adjusted for the driver gene pathway (table S13). In contrast, over half the genes were differentially expressed (Padjusted < 0.05) among the different driver gene pathways (table S13). Despite previous reports suggesting radiation dose could be linked to CLIP2 expression (2830), no such relationship was observed in our substantively larger study, which included 33 overlapping samples from the previously-published analyses (29) (fig. S25), either in the overall set of PTC cases (P = 0.42, Fig. 5C) or in subsets defined by early age at radiation exposure (fig. S26). An exploration of expression signatures through gene set enrichment analyses was pursued in the Molecular Signatures Database (MSigDB; (31). For 3,213 gene sets (those related to hallmark biological processes, thyroid, radiation, and the genes included in germline analyses below) (13), multivariable regression models adjusting for age at PTC, sex, and batch revealed similar results as those above for single gene differential expression analyses, namely no gene set expression patterns were significantly associated with radiation dose (Padjusted < 0.05), whereas over half were strongly associated with the driver gene pathway (fig. S27 and table S14).

Germline genetic variation

Possible contribution of germline genetic variation to radiation-related PTC was investigated in individuals of comparable Ukrainian ancestry (n = 383 individuals, including 305 exposed, 78 unexposed) (fig. S28). Twelve previously reported risk SNPs for sporadic PTC were used to generate a polygenic risk score (PRS) (32). Multivariable analyses adjusting for population substructure revealed that unexposed individuals with PTC and those who received lower radiation doses were more likely to have higher genetic risk (P = 4.7 × 10−4) (Fig. 6 and table S4). Analyses of the 12 individual SNPs, albeit underpowered, yielded three possible associations with radiation dose: rs1588635 (9q22.33; P = 0.012), rs2289261 (15q22.33, SMAD3; P = 0.030) and rs10069690 (5p15.33, TERT; P = 0.054) (table S15).

Fig. 6 Relationship between radiation dose from 131I exposure and PRS.

Data for the 12 single nucleotide polymorphisms that comprise the PRS are provided in table S15.

Investigation of rare potentially protein-damaging variants in genes and pathways related to thyroid or other cancer predisposition, clinical radiation sensitivity syndromes, and DNA damage response revealed no major differences in the burden of these variants among individuals who developed PTC after different radiation doses (tables S4, S16, and S17). Only four individuals (n = 2, 2.6% unexposed; n = 2, 1.2% <100 mGy; 0, 0% ≥100 mGy) carried potentially protein-damaging variants in known thyroid cancer susceptibility genes (tables S16 and S17).

Detailed analyses of molecular characteristics associated with radiation dose

Analyses by clonality for each of the deletion metrics (total deletion count, deletion:SNV ratio, and ID5 and ID8 mutational patterns) revealed that the radiation dose-related associations were restricted consistently to clonal rather than subclonal deletions (Fig. 2 and table S18). Similarly, analyses of SCNAs also demonstrated associations only with clonal but not subclonal sub-chromosomal deletions (table S18). Because distinct repair mechanisms can generate deletions of different lengths (3335), we further stratified the clonal deletion count by length (fig. S29) and found the strongest association between radiation dose and ≥5 bp clonal deletions with patterns characteristic of end-joining repair (P = 4.9 × 10−31) (Fig. 3C and table S18). These results are consistent with the ID8 mutational association and suggest a key role for end-joining mechanisms in repairing radiation-induced DNA double-strand breaks. Analyses of the ≥5 bp clonal deletions by the amount of microhomology at the deletion boundary revealed consistent associations between radiation dose and deletions with 0-1 bp microhomology as well as those with ≥2 bp microhomology (table S18). These results implicate non-homologous end-joining (NHEJ) repair mechanisms, which are employed regardless of the amount of microhomology, whereas alternative end-joining (alt-EJ) repair mechanisms such as theta-mediated end-joining (TMEJ) typically generate deletions with ≥2 bp microhomology (3335). In an ancillary analysis, we quantified in the small insertions the number of TINS (locally templated insertions), which are characteristic of TMEJ repair (34), and found that TINS were not associated with radiation dose (P = 0.69) (fig. S30). Further exploration of insertions and deletions by genomic sequence context (8) revealed only weak correlations for radiation dose with occurrence of deletions classified by flanking GC content (P = 0.015), proximity to CPG islands (P = 0.014), and the mean replication timing at the variant locus (P = 0.010); in each case deletions in the higher radiation dose groups were more similar to a random background distribution (table S19) (13). No such correlations between radiation dose and genomic sequence context were observed for insertions.

We undertook similar analyses of SVs after confirming each event (table S6) (13), identifying those SVs with <20 bp of intervening loss/gain at the breakpoint, which indicates repair by end-joining mechanisms (3335). Increased radiation dose was strongly associated with simple/balanced SVs that were clonal (P = 1.4 × 10−16) but not subclonal (P = 0.91), with a pronounced association for clonal simple/balanced SVs enriched for patterns characteristic of end-joining mechanisms (P = 5.5 × 10−19) (Fig. 3D) versus other clonal SVs (P = 0.41, Fig. 3E) (table S18). Further analyses demonstrated consistent associations for radiation dose with clonal simple/balanced/end-joining SVs with <4 bp and 4-<20 bp of intervening loss/gain (table S18). Similar to our observations for small deletions, these results specifically implicate the importance of NHEJ repair, which accounts for almost all <4 bp events but which could contribute regardless of the amount of intervening loss/gain. By comparison, alt-EJ repair mechanisms primarily give rise to events with ≥4 bp of intervening loss/gain (3335). Additional analyses of clonal simple/balanced/end-joining SVs by type revealed a strong association between radiation dose and inversions (P = 3.6 × 10−14), consistent with a previous report (8), but also an association with translocations (P = 4.4 × 10−4) (table S18).

For each of the radiation dose-associated variables, the results were similar when we restricted the study population to exposed individuals (table S20). Albeit based on limited statistical power, further restriction to individuals with exposures 1-<500 mGy revealed consistent associations for dose only with the clonal deletion:SNV ratio, enrichment of fusion drivers, and presence of clonal simple/balanced/EJ SVs (table S20). Linear-quadratic and linear-exponential models of radiation dose generally did not improve the model fit compared with a linear model for any of the variables, except for clonal small deletions (total and restricted to ≥5 bp EJ deletions) (table S20).

Notably, radiation dose-related increases in clonal deletions (particularly the deletion:SNV ratio, ID8, and the number of clonal ≥5 bp EJ small deletions), as well as fusion PTC drivers were substantially more pronounced for individuals exposed at younger ages (Fig. 7 and tables S21 and S22), albeit based on small numbers for certain analyses. In contrast, the radiation dose-related increase in SCNA clonal sub-chromosomal deletions was most pronounced at longer latencies (table S21).

Fig. 7 Relationship between radiation dose from 131I exposure and selected genomic characteristics by age at exposure.

Clonal deletion:SNV ratio for (A) <5 years at exposure, (B) 5-9 years at exposure, and (C) ≥10 years at exposure. Number of clonal ID8 mtuations for (D) <5 years at exposure, (E) 5-9 years at exposure, and (F) ≥10 years at exposure. Number of clonal ≥5 bp EJ small deletions for (G) <5 years at exposure, (H) 5-9 years at exposure, and (I) ≥10 years at exposure. Likelihood of having a fusion versus mutation driver for (J) <10 years at exposure and (K) ≥10 years at exposure. All analyses exclude 131I-unexposed individuals. β or EOR per 100 mGy and P-value were derived from multivariable linear, proportional odds, or logistic regression models adjusting for age at PTC and sex. Full model results are provided in table S22. * Models evaluating the effect of dose on driver type restricted to <5 years of age at exposure did not converge, so individuals exposed at <5 and 5-9 years were combined in panel J. EOR/100 mGy for 5-9 years alone = 1.78, 95%CI = 0.12-226. ^^ in Panel H indicates that the upper 95% CI exceeds the y-axis maximum value.

Radiation-related acceleration of PTC development

Exploratory analyses to address previous reports that ionizing radiation exposure accelerates aging and cancer development (36, 37) revealed no such evidence in our study population. First, we stratified analyses of the relationship of clock-like SBS and ID signatures with age at PTC and latency but found no effect modification by radiation dose (age at PTC: PSBS = 0.63, PID = 0.93; latency: PSBS = 0.28, PID = 0.21) (figs. S31 and S32). Additionally, radiation dose-dependent associations with key molecular characteristics did not appear to be strongly modified by latency, after accounting for age at exposure and age at PTC (table S21). Analyses of relative telomere length demonstrated the expected association between decreased telomere length and increased age at PTC in blood (P = 3.2 × 10−5) but not in normal thyroid tissue (P = 0.99) or PTC (P = 0.81), and there was no association between relative telomere length and thyroid radiation dose (P > 0.4 for all tissues). Methylation profiles were evaluated to estimate epigenetic age acceleration using two established metrics (38, 39). Regressing epigenetic age against chronological age in the non-tumor thyroid tissue and then comparing the residuals from this predicted age in the PTC tissue (13) revealed no association between age acceleration and radiation dose using either metric (P > 0.1).


Our large-scale integrated analyses of the genomic landscape of PTC that developed following the 1986 Chernobyl nuclear power plant accident provide consistent evidence that ionizing radiation-induced DNA damage, particularly double-strand breaks, represents an early carcinogenic event in thyroid tumorigenesis following radiation exposure. These findings substantially extend preliminary reports of radiation-related human tumor characteristics (8, 9) by integrating data from multiple platforms with large sample size and detailed radiation dose data. Increasing radiation dose was strongly associated with increased likelihood of fusion versus point mutation drivers, simple/balanced SVs, and small deletions, with the strongest associations observed for those that bore hallmarks of NHEJ repair and were clonal, particularly for individuals exposed at a young age. However, no unique radiation-related biomarker was identified. Together, our results indicate that thyroid tumorigenesis following radiation exposure results from DNA double-strand breaks in the genome that have an impact on key thyroid cell growth and differentiation genes, which in turn drive the expression and epigenetic characteristics of individual PTCs.

Most tumors had evidence for only a single, known oncogenic driver, which involved the MAPK pathway in nearly all cases, which is consistent with previously published studies of sporadic PTCs (10, 40, 41). These findings combined with the low mutational burden in thyroid tumors emphasize the efficiency of driver mutations in thyroid tumorigenesis even following ionizing radiation exposure, in contrast to other environmentally-driven cancers, such as cigarette smoking and lung adenocarcinoma or ultraviolet light and melanoma, that often require multiple drivers and have multiple subclones together with substantial somatic burden (23).

Based on multiple lines of evidence, our study demonstrates striking radiation dose-related increases in DNA double-strand breaks in human thyroid cancers developing after the Chernobyl accident, extending results from in vitro and animal radiobiological experiments (3335). In contrast, the PTCs did not have evidence of radiation-related specific base mutations or clustered mutations (42). Cells with DNA double-strand breaks can recruit various repair mechanisms, each of which leaves characteristic evidence in the repaired sequence. A series of analyses consistently implicated NHEJ as the most important repair mechanism for the radiation dose-associated DNA double-strand breaks observed in the PTCs. While the importance of end-joining repair in human tumors has been reported previously (8), our detailed examination of the local sequence context for the SVs (including fusion drivers) and small deletions enabled us to identify that radiation dose was most clearly associated with NHEJ rather than alt-EJ or other repair mechanisms. The lack of association between radiation dose and TINS further demonstrated the lack of importance of alt-EJ mechanisms. The importance of NHEJ repair also was supported by the lack of significant association between radiation dose and mutation signatures associated with APOBEC, which preferentially targets intermediates in replication and repair by homologous recombination (43). Our results necessitate further research such as using genetically modified organoids (44, 45) to establish the causal role of radiation-related DNA double-strand breaks predominantly repaired by NHEJ in human carcinogenesis.

The role of radiation-related DNA damage as an early step in PTC carcinogenesis following the Chernobyl accident is further supported by the lack of association between radiation dose and PTC transcriptomic and epigenomic features, despite the use of various analytic approaches, including clustering, differential expression by gene or miRNA, and gene set enrichment analyses. With our large sample size, we did not confirm the previously reported association between radiation dose and CLIP2 expression (2830), even when we restricted our analyses to individuals exposed at younger ages. Notably, however, the PTC transcriptomic and epigenomic features differed strikingly by driver gene/pathway, supporting the importance of the specific driver in shaping the tumor profile (10, 40, 41). Utilization of both WGS and RNA-seq enabled us to identify a driver in 98% of the PTCs in our study. Deletion of chromosome 22q has been suggested as a driver for PTC, but all cases in our study with 22q deletions also had other known PTC drivers, suggesting that 22q did not act independently in our set of individuals who developed PTC during young adulthood. Intriguingly, however, 22q deletions were strongly related to the driver pathway, occurring most commonly in RAS-mutated PTCs, suggesting that 22q deletion could provide a growth advantage or otherwise enhance the effect of certain MAPK drivers.

With our large sample size, we were able to explore the independent effects of radiation dose, age at PTC, age at exposure, and latency on PTC molecular characteristics. The pronounced evidence of radiation-related damage that we observed for individuals exposed at younger ages is consistent with epidemiologic analyses that have identified higher thyroid cancer risks with radiation exposure at younger ages (17). The relationship of a number of molecular characteristics, particularly total mutational burden and driver type, with age at PTC warrants further investigation across a broader age range (10). Additional research with detailed dose data are needed to understand whether our findings extend across a broader dose range, to other types of radiation, as well as to other tumor types, and whether radiation-related genomic characteristics have an impact on histopathological parameters (4648). It has been hypothesized that ionizing radiation exposure could accelerate tumor development, and substantial evidence demonstrates that cancer survivors exposed to high-dose radiotherapy exhibit an aging phenotype (36, 37). However, exploratory analyses within our data did not support this hypothesis.

Our results have important implications for radiation protection and public health, particularly for low dose exposure, from two perspectives. First, the lack of a unique radiation-related pattern of molecular characteristics in the PTCs in our study, due in part to the random nature of ionizing radiation-related damage across the genome as well as the fact that other mutagens can cause DNA double-strand breaks, suggests that we are yet to establish a reliable biomarker to distinguish tumors induced by radiation versus other causes. Nevertheless, further investigation is warranted to consider how the radiation dose-associated characteristics we identified could be incorporated into estimates of the probability that a specific papillary thyroid tumor was caused by 131I exposure (probability of causation [POC] (fig. S5) (49, 50)), which is currently based on prior epidemiologic studies (17). Second, our data are consistent with a linear dose-response for the key molecular characteristics associated with radiation dose in the range examined in our analysis (≤1 Gy), which aligns with the extensive radiobiological literature and other epidemiologic evidence regarding DNA damage and cancer risk following ionizing radiation exposure (51, 52).

Our study population included a substantial number of PTCs occurring after <100 mGy exposure, likely reflecting both the availability of samples from the Chernobyl Tissue Bank as well as the increased detection of pre-existing PTC in the population that may not become clinically evident until later, if at all, due to intensive screening and heightened awareness of thyroid cancer risk in Ukraine. The increased genetic risk based on the PRS was notable among PTCs that occurred after lower doses despite limited statistical power to investigate germline genetic variants. The low overall mutational burden of early adulthood PTC, small sample sizes in certain population subgroups, and uncertainties in radiation dose estimates also limited our statistical power to thoroughly investigate the shape of the dose-response curve, precisely identify the magnitude of radiation-related effects (as reflected by the wide confidence intervals for many effect estimates), or reliably identify new radiation signatures.

In conclusion, we have characterized the genomic landscape of PTC, the most frequent cancer observed after the Chernobyl nuclear accident. Our results demonstrate a dose-dependent carcinogenic effect of radiation derived primarily from DNA double-strand breaks repaired by NHEJ that initiate subsequent thyroid tumor growth, the patterns of which are shaped not by radiation exposure but rather by the specific driver gene. The consistency of the spectrum of PTC drivers in our study population compared with previous PTC series suggests that current therapeutic approaches for PTC are appropriate even for tumors that arise following radiation exposure (53). Our work provides a foundation for further investigation of radiation-induced cancer, particularly with respect to differences in risk as a function of both dose and age, and underscores the deleterious consequences of ionizing radiation exposure.

Supplementary Materials

Materials and Methods

Figs. S1 to S32

Tables S1 to S22

References (54123)

Data S1 to S3

MDAR Reproducibility Checklist

References and Notes

  1. See supplementary materials.
Acknowledgments: The authors gratefully acknowledge the commitment of the staff of the Laboratory of Morphology of Endocrine System and the staff of the Department of Surgery of Endocrine System of IEM, who prepared the pathological material for the study and operated on the patients; the confirmation of thyroid tumor diagnoses provided by the International Pathology Panel of the Chernobyl Tissue Bank, including Professors A. Abrosimov, T. Bogdanova, G. Fadda, J. Hunt, M. Ito, V. Livolsi, J. Rosai, and E.D. Williams, and Dr. N. Dvinskyh; Nathan Appel (Information Management Services, Inc., Calverton, MD) for programming support; Elizabeth C. Sasse (Radiation Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute) for support in creating figures and editorial review of the manuscript; Dr. Clara Bodelon (Integrative Tumor Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute) for consultation regarding methylation age calculations; Dr. Jia Liu (Cancer Genomics Research Laboratory, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research) for consultation regarding analyses of mRNA expression and assessment of cross-platform concordance; Dr. Bin Zhu (Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute) for consultation regarding analyses of mRNA expression; Cameron Palmer (Cancer Genomics Research Laboratory, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research) for assistance computing principal components for ancestry analyses; Timothy Myers (Laboratory of Genetic Susceptibility, Division of Cancer Epidemiology and Genetics, National Cancer Institute) for independent review of SV sequence context; and the leadership of The Cancer Genome Atlas for conduct of an initial pilot study. This work utilized the computational resources of the NIH HPC Biowulf cluster ( The opinions expressed by the authors are their own, and this material should not be interpreted as representing the official viewpoint of the U.S. Department of Health and Human Services, the National Institutes of Health, or the National Cancer Institute. Funding: This project was supported by the Intramural Program of the National Cancer Institute, National Institutes of Health. The Chernobyl Tissue Bank is supported by the National Cancer Institute (U24CA082102). Author contributions: The following authors conceptualized the study: L.M.M, D.M.K., C.S., T.I.B., E.T.D., A.V.B., A.B.d.G., G.A.T., M.D.T., G.G., and S.J.C. The following authors designed the study methodology: L.M.M, D.M.K., C.S., T.I.B., E.T.D., M.K.S., J.D., S.W.H., S.J.S., J.N.S., Y.M., V.K., D.A.R., J.C.G., C.M.P., J.S.P., M.Y., O.L., M.J.M., A.V.B., V.D., S.M., M.C., L.Y.Z., G.A.T., G.G., and S.J.C. The following authors analyzed and synthesized the data: L.M.M, D.M.K., C.S., E.T.D., M.K.S., J.D., S.W.H., S.J.S., Y.M., V.K., M.Y., O.L., M.J.M., V.D., A.B.d.G., G.A.T., G.G., and S.J.C. The following authors collected or generated the study data: C.S., T.I.B., E.T.D., M.K.S., J.D., M.Y., J.F.B., A.H., B.D.H., C.L.D., J.M.G.F., J.B., E.K.C., S.M., M.C., L.Y.Z., G.A.T., and M.D.T. The following authors contributed resources (study materials, patients, computing resources, or other analysis tools): T.I.B., M.K.S., J.D., M.K., A.V.B., K.M., V.D., M.H., A.B.d.G., G.A.T., M.D.T., G.G., and S.J.C. The following authors curated the research data: L.M.M, D.M.K., C.S., T.I.B., E.T.D., V.K., M.K., A.H., C.L.D., J.M.G.F., and J.B. The following authors wrote the original draft: L.M.M, D.M.K., C.S., E.T.D., M.K.S., J.D., S.W.H., S.J.S., M.Y., G.A.T., G.G., and S.J.C. The following authors visualized the data: L.M.M, D.M.K., C.S., T.I.B., E.T.D., M.K.S., J.D., V.K., O.L., and G.G. The following authors supervised or managed the research: L.M.M, D.M.K., C.S., G.A.T., M.D.T., G.G., and S.J.C. The following authors had responsibility for managing and coordinating the research activity: L.M.M, D.M.K., C.S., T.I.B., E.T.D., E.K.C., A.B.d.G., G.A.T., M.D.T., G.G., and S.J.C. The following authors acquired funding for the study: G.A.T., M.D.T., and S.J.C. All authors edited the final manuscript. G.A.T., M.D.T., G.G., and S.J.C. jointly directed this work. Competing interests: E.T.D. is an employee of Nvidia Corporation and owns stock in Nvidia, Illumina, and Pacific Biosciences. G.G. receives research funds from IBM and Pharmacyclics, and is an inventor on patent applications related to MuTect, ABSOLUTE, MutSig, MSMuTect, MSMutSig, MSIdetect, POLYSOLVER and TensorQTL. G.G. is a founder, consultant and holds privately held equity in Scorpion Therapeutics. All other authors declare no competing interests. Data and materials availability: Analytic data are available in Data S1-S3. The Materials and Methods text specifies code that has been posted to GitHub and is archived on Zenodo (13). Raw molecular data are available from the Genomic Data Commons, accessed through the database of Genotypes and Phenotypes (dbGaP, accession phs001134; Samples are available from G.A.T. under a material agreement with the Chernobyl Tissue Bank.

Stay Connected to Science


Navigate This Article