Special Reviews

Quality Control by DNA Repair

Science  03 Dec 1999:
Vol. 286, Issue 5446, pp. 1897-1905
DOI: 10.1126/science.286.5446.1897


Faithful maintenance of the genome is crucial to the individual and to species. DNA damage arises from both endogenous sources such as water and oxygen and exogenous sources such as sunlight and tobacco smoke. In human cells, base alterations are generally removed by excision repair pathways that counteract the mutagenic effects of DNA lesions. This serves to maintain the integrity of the genetic information, although not all of the pathways are absolutely error-free. In some cases, DNA damage is not repaired but is instead bypassed by specialized DNA polymerases.

The large genomes of mammalian cells are vulnerable to an array of DNA-damaging agents, of both endogenous and environmental origin. This situation requires constant excision and replacement of damaged nucleotide residues by DNA repair pathways to counteract potentially mutagenic and cytotoxic accidents. Consequently, DNA exhibits very slow but substantial turnover in vivo, despite its role as carrier of stable genetic information. No correction procedure is going to be absolutely exact and error-free, but repair of common DNA lesions clearly demands highly accurate performance. In practice, an altered nucleotide residue is usually replaced after the removal of a short segment of the damaged strand and a copying of the intact complementary strand. The most frequent DNA lesions are efficiently removed by this route, with a pathway called base excision repair (BER) working mainly on common modifications caused by endogenous agents, and another, nucleotide excision repair (NER), operating mainly on helix-distorting damage caused by environmental mutagens (Fig. 1). Even these restricted procedures require special strategies to function at high accuracy, such as protection of reactive DNA intermediates by protein-DNA interactions, and access to a high-fidelity DNA-copying machine with editing capacity.

Figure 1

Several pathways used by human cells to withstand alterations to DNA bases. From left to right, the MGMT enzyme can remove a methyl group from O6-methylguanine, directly reversing the modification. Base excision repair (BER) corrects common endogenous modifications such as a uracil arising from deamination of cytosine, excising the damaged base and usually replacing just one nucleotide residue. Nucleotide excision repair (NER) removes lesions such as the T-C pyrimidine dimer in oligonucleotide form, by excision and replacement of a segment of ∼27 residues. Sometimes lesions are not eliminated; instead, specialized DNA polymerases are used to insert residues opposite damaged sites so that DNA replication can proceed.

More complex and unusual forms of DNA damage can only be dealt with by cellular procedures that appear surprisingly crude and error-prone. Radiolysis of water and generation of hydroxyl radicals along a track of ionizing radiation can yield clustered sites of base damage in both strands, and attempts to correct such damage by standard BER can result in a DNA double-strand break (1); subsequent fusion of the double-strand break by nonhomologous end-joining may involve loss of nucleotide residues around the break, leading to a mutagenic or abortive lethal repair event. Other challenges to the cell are various noncoding DNA lesions that block the normal replication machinery. In emergency situations, cells sometimes use specialized DNA polymerases that can read through a lesion, but at the expense of sometimes inserting incorrect bases (2).

It remains a matter of debate whether most spontaneous mutations in humans arise from mis-copying a damaged DNA template before repair has had time to occur, or whether the intrinsic marginal inaccuracy of DNA replication factories copying an undamaged template is responsible. The main DNA replicative enzyme in eukaryotes, DNA polymerase δ (POL δ), has both polymerase and 3′ → 5′ exonuclease activity and can efficiently excise a misincorporated residue by exonucleolytic proofreading. The well-characterized DNA mismatch correction system (3) further minimizes replication errors by a systematic survey of newly synthesized strands. In addition, accessory factors such as the DNA helicases encoded by the genes defective in Werner syndrome and Bloom syndrome apparently serve to improve accuracy during DNA elongation, possibly due to resolution of stalled replication forks. Despite all these precautions, occasional misincorporated nucleotides, deletions, and insertions may remain to be expressed as rare mutations.

A comprehensive overview of quality control in DNA would include a discussion of DNA polymerase fidelity and postreplicative mismatch correction and would also consider the damage-responsive cell-cycle checkpoints and the signal transduction systems that lead to cellular effects. Here we focus more closely on the main DNA lesions in human cells and on rapidly accumulating information about the distinct strategies used to repair or tolerate these adducts.

DNA Lesions

Cellular DNA is susceptible to accidental damage from a variety of reactive normal metabolites; active oxygen poses a special threat. Similarly, spontaneous hydrolysis of nucleotide residues occurs to a marked extent at 37°C and is another unavoidable form of DNA damage that necessitates constant repair. The problem has existed since the origin of life, and DNA repair enzymes acting on endogenous lesions often show strong evolutionary conservation from microbes to humans. Correction of such damage usually proceeds through the BER pathway, being initiated by excision of an altered base residue in free form by a DNA glycosylase, followed by a short-patch excision-repair event. Several distinct DNA glycosylases have been identified, and the substrate specificities of these enzymes (Table 1) are a reliable indicator of DNA lesions of sufficient frequency and importance to have provoked the evolution of specific cellular repair functions. The key lesions can generally be removed by both a main repair pathway and one or more backup pathways. This is particularly the case in mammalian cells, as deduced from comparative studies on microbial mutants and ongoing work with gene knockout mice.

Table 1

DNA glycosylases in human cell nuclei.

The main lesions generated in DNA by hydrolysis, reactive oxygen species, and small reactive intracellular molecules such as S-adenosylmethionine have been reviewed previously (4). These apparently innocuous agents, including water and oxygen, jeopardize the integrity of the genome. More precise numerical estimates of the rates of spontaneous DNA decay are gradually becoming available with the development of new sensitive methods for DNA damage analysis, such as the detection of small amounts of abasic sites (5), but the main conclusions about the importance of such endogenous DNA damage have not changed.

A technical difficulty that has caused confusion in the field is the problem of isolating DNA for measurements of oxygen radical–induced damage without causing artifactual DNA oxidation during the work-up procedure (6). In particular, DNA guanine residues are readily oxidized to 8-hydroxyguanine (8-oxo G). Initial estimates of the background amount of this promutagenic DNA lesion in normal human cells and organs yielded surprisingly high values, but current estimates (7) are of steady-state levels <100 to 1000 8-oxo G residues in normal cells. Such results indicate that previous reports on the presence of high quantities of 8-oxo G in organ DNA as an apparent consequence of physiological aging should be interpreted with caution.

The major new development over the last few years with regard to identification of endogenous lesions in mammalian DNA has been the detection of lipid peroxidation by-products as exocyclic DNA base adducts. The most abundant of these appears to be the exocyclic pyrimidopurinone called M1G, which is generated by reaction between a G residue in DNA and the lipid peroxidation product malondialdehyde (8). The M1G adduct is one of a small group of bulky oxidative DNA lesions that are repaired through the NER pathway rather than by BER. The mutagenic and cytotoxic M1G lesion is not chemically stable in DNA but undergoes decomposition to a secondary ring-opened derivative. In addition to malondialdehyde, lipid peroxidation may yield acrolein and crotonaldehyde, which are readily metabolized to epoxides and then can generate exocyclic etheno modifications of DNA bases. Two such bases, etheno-A and etheno-C, are excised efficiently by DNA glycosylases (9), which strongly suggests that generation of such adducts occurs at sufficiently high rates in vivo to endanger genomic stability.

Large numbers of various environmental mutagens exist. For humans, the most important self-inflicted mutagen is tobacco smoke, which is responsible for more cancer deaths worldwide than any other identifiable compound. Inadvertent exposure to more than fairly harmless traces of other environmental mutagens seems rare in the Western world, with one exception. For many organisms, including humans, the most important environmental mutagen by far is the ultraviolet (UV) component of sunlight. The main DNA lesions are dipyrimidine photoproducts, principally cyclobutane pyrimidine dimers and (6-4) photoproducts. These lesions are both mutagenic and cytotoxic. Mutations that inactivate tumor suppressor genes in skin cancer, for example, p53, often exhibit the signature pattern of UV-induced sequence changes in two adjacent pyrimidine residues in DNA. The main function of the ubiquitous NER pathway is the removal of UV-induced lesions from DNA, and defects in this pathway in human cells lead to the serious cancer-prone inherited disease xeroderma pigmentosum (XP). Remarkably, humans have no backup pathway for this important cellular defense mechanism, and NER-defective individuals are totally unable to excise pyrimidine dimers from DNA. This situation seems unique to placental mammals, because lower eukaryotes, plants, and bacteria all have additional defense systems against UV light, such as DNA photolyases to monomerize dimers or DNA glycosylases or nucleases to specifically incise DNA at pyrimidine dimers. A likely scenario is that early mammalian precursors were small, furry, nocturnal animals, in which there was no selection pressure to preserve a backup system to NER for handling large amounts of UV damage. With more recent changes in human life-style involving frequent exposure by poorly pigmented individuals to intense sunlight, it seems unfortunate that an additional UV repair system was not retained during recent evolution.

Besides dipyrimidine adducts, near-UV light also causes covalent changes in DNA reminiscent of oxidative damage, such as ring-saturated pyrimidine derivatives. Although these are largely screened and removed by BER enzymes, such near-UV irradiation is nevertheless implicated as a possible causative agent of malignant melanoma in humans. Further work is required to identify the DNA lesion or lesions responsible and to explain the breakdown in DNA quality control in this case.

Damage Reversal

Although direct reversal of UV damage by photoreactivating enzymes has not been detected in placental mammals, another unusual form of DNA repair involving active reversal of DNA damage is present in human cells. This involves correction of the miscoding alkylation lesion O6-methylguanine, which is generated endogenously in small amounts by reactive cellular catabolites (10). A specific methyltransferase removes the deleterious methyl group from the DNA guanine residue and transfers it to one of its own cysteine residues in a rapid and error-free repair process (11). However, S-methylcysteine is chemically very stable, so the methylated repair protein is not regenerated; consequently, the repair system is readily saturated when cells are exposed to external alkylating agents. This mode of DNA quality control seems well suited for removing a rare but highly mutagenic DNA lesion by the energetically expensive approach of sacrificing an entire protein molecule for each lesion corrected.
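The one-protein-per-lesion stoichiometry described above, and the saturation it implies, can be illustrated with a toy calculation. This is a minimal sketch rather than a kinetic model: the function name and all molecule counts are hypothetical and are not taken from the studies cited here.

```python
def mgmt_repair(lesions: int, active_mgmt: int) -> tuple[int, int, int]:
    """Toy stoichiometric model of O6-methylguanine reversal.

    Each repair event consumes one methyltransferase molecule, because the
    S-methylcysteine formed on the protein is not regenerated.
    Returns (repaired, lesions_left, active_mgmt_left).
    """
    if lesions < 0 or active_mgmt < 0:
        raise ValueError("counts must be non-negative")
    repaired = min(lesions, active_mgmt)
    return repaired, lesions - repaired, active_mgmt - repaired


# Hypothetical numbers, chosen only to contrast the two regimes.
low_dose = mgmt_repair(lesions=50, active_mgmt=1000)     # endogenous-like load
high_dose = mgmt_repair(lesions=5000, active_mgmt=1000)  # external alkylating agent
```

In the first case every lesion is reversed and most of the protein pool survives; in the second the pool is exhausted and lesions remain, which is the saturation behavior the text describes.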

O6-Methylguanine pairs ambiguously with both C and T, causing transition mutations. The methylated nucleoside is probably flipped out from DNA to be accommodated in an active-site pocket of the repair enzyme. This epigenetically controlled damage reversal function is occasionally not expressed, leading to cytotoxic abortive attempts to correct the lesion by mismatch repair. Increased resistance, or tolerance, to alkylating agents in such cells can be achieved at the price of an increased amount of spontaneous mutagenesis, by loss of the mismatch repair system (12).

Base Excision Repair

Release of altered bases by BER is initiated by DNA glycosylases that hydrolytically cleave the base–deoxyribose glycosyl bond of a damaged nucleotide residue. The three-dimensional structures and modes of action of several DNA glycosylases have been clarified and reviewed recently (13). Eight human proteins of this type (Table 1) in general show little sequence similarity to one another, even for enzymes that act on the same or closely related substrates. However, the enzymes hNTH1, hOGG1, MYH, and MBD4, which recognize quite different types of DNA damage, clearly have some structural similarities. The human DNA glycosylases have a catalytic domain of ∼250 amino acid residues and use an NH2-terminal or COOH-terminal region for additional interactions. These extra regions are largest in the enzymes that have the complicated task of removing a normal but mismatched base from DNA. The two DNA glycosylases TDG and MBD4 remove a deaminated 5-methylcytosine (= thymine) residue, and the MYH enzyme excises an A residue misincorporated opposite to an 8-oxo G; these DNA glycosylases have to make specific interactions with the complementary strand in addition to recognizing altered bases.

A common strategy for DNA glycosylases, largely deduced from structural studies, appears to be facilitated diffusion along the minor groove of DNA until a specific type of damaged nucleotide is recognized. The enzyme then kinks the DNA by compression of the flanking backbone in the same strand as the lesion, flips out the abnormal nucleoside residue to accommodate the altered base in a specific recognition pocket, and mediates cleavage (14). The DNA glycosylase then may remain clamped to the damaged site until displaced by the next enzyme in the BER pathway, the endonuclease APE1 (also called HAP1), which has greater affinity for the abasic site (Fig. 2). This strategy (14, 15) protects the cytotoxic abasic residue and may delay the rearrangement of the base-free deoxyribose into a reactive free aldehyde conformation that could cause cross-linking and other unwanted side effects.

Figure 2

Single-nucleotide replacement pathway for BER. The example shown is for repair of a T residue arising when 5-methylcytosine (meC) in a CpG sequence (A) is deaminated (B). TDG glycosylase removes the thymine and recruits the APE1 endonuclease (C). APE1 cleaves the chain on the 5′ side of the abasic site and recruits POL β; TDG dissociates (D). POL β releases the remnant 5′-deoxyribosephosphate (dRp), inserts a C residue, and recruits the LIG3-XRCC1 complex (E). LIG3 seals the nick as POL β dissociates (F). The LIG3-XRCC1 complex is liberated (G). To restore the DNA to its original methylation state, a DNA methyltransferase would need to act on the newly synthesized C residue. Sequential binding of protein monomers would be expected to improve repair accuracy and specificity (85). Not shown here is an alternative longer patch BER pathway that can act after chain cleavage in step (D) and involves POL β, POL δ, or POL ɛ together with PCNA, FEN1, and LIG1.
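The ordered hand-offs in steps (B) to (G) can be followed in a small toy walk-through. The sketch below is illustrative only: the function name, the string representation of the two strands, and the example sequence are assumptions made for this example, and nothing about binding affinities or kinetics is modeled.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
ABASIC = "_"  # placeholder symbol for a base-free deoxyribose residue


def short_patch_ber(strand: str, template: str, pos: int) -> tuple[str, list[str]]:
    """Toy walk-through of single-nucleotide BER at a T opposite G.

    `template` is the complementary strand written in the same left-to-right
    register as `strand`, so template[i] pairs with strand[i].
    Returns the repaired strand and a log of the ordered steps.
    """
    if len(strand) != len(template):
        raise ValueError("strands must have equal length")
    if strand[pos] != "T" or template[pos] != "G":
        raise ValueError("this sketch only handles a T opposite G")

    residues = list(strand)
    log = []

    # (C) TDG removes the thymine base, leaving an abasic site.
    residues[pos] = ABASIC
    log.append("TDG: excise T, recruit APE1")

    # (D) APE1 cleaves the chain 5' of the abasic site; TDG dissociates.
    nicked = True
    log.append("APE1: cleave 5' of abasic site, recruit POL beta")

    # (E) POL beta releases the dRp remnant and copies the intact strand.
    if not nicked or residues[pos] != ABASIC:
        raise RuntimeError("POL beta needs a nicked abasic site")
    residues[pos] = COMPLEMENT[template[pos]]
    log.append("POL beta: release dRp, insert C, recruit LIG3-XRCC1")

    # (F) LIG3 seals the nick as POL beta dissociates.
    nicked = False
    log.append("LIG3: seal nick")

    # (G) The LIG3-XRCC1 complex is liberated.
    log.append("LIG3-XRCC1: released")
    return "".join(residues), log


# Hypothetical sequence: the C at index 2 was 5-methylcytosine before deamination.
repaired, steps = short_patch_ber("AATGA", "TTGCT", pos=2)
```

As the caption notes, the inserted C is unmethylated, so the sketch restores the sequence but not the original methylation state.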

Several of the mammalian DNA glycosylases deal with the mutagenic threat of uracil and 5-methyluracil opposite G in DNA. This common type of lesion mainly is generated by hydrolytic deamination of cytosine and 5-methylcytosine, although enzymatic deamination of 5-methylcytosine residues might also occur in cells with anomalously low amounts of the S-adenosylmethionine methylation cofactor. The most abundant enzyme, UNG, which occurs both in nuclei and mitochondria, is partly sequestered in replication factories (16), where it is well placed to remove occasional U residues that have been incorporated instead of T by use of deoxyuridine triphosphate (dUTP) as a precursor for DNA synthesis. This is a problem separate from C deamination; incorporation of U instead of T is not a directly mutagenic change but results in altered binding affinities for transcription factors and other regulatory proteins. Consequently, complete substitution of T with U in cellular DNA is not compatible with viability. The recently discovered hSMUG1 enzyme (17) removes uracil from single-strand regions of DNA. Such regions are transiently generated during transcription and replication and are particularly susceptible to C deamination, which is >100 times as fast in single-strand as in double-strand DNA. Generation of uracil and removal by hSMUG1 in a DNA bubble structure before reannealing could be one strategy to facilitate strand specificity in the subsequent excision-repair process. Two other DNA glycosylases, TDG and MBD4, remove both uracil and thymine opposite G residues, implicating repair of deaminated 5-methylcytosine residues as one of their roles (18). In agreement with this notion, both enzymes act particularly efficiently at TpG/GpC sequences, interacting not only with the complementary DNA strand but also with the flanking residue in the same strand as the deaminated 5-methylcytosine.

Our recent data on uracil-DNA glycosylase activities in cell extracts from UNG knockout mice have revealed a fifth distinct activity of this type, emphasizing that correction of uracil in DNA is a major biological problem that demands substantial and partly overlapping activities to retain a high extent of DNA quality control. Even in nongrowing cells, expedient removal of uracil from DNA appears necessary to avoid transcriptional base substitution that would generate mutant proteins and phenotypic changes (19). Additional distinct uracil- or thymine-DNA glycosylases, not clearly related to mammalian enzymes, have been described in thermophilic Archaea; such organisms also have DNA polymerases with read-ahead functions that stall incorporation when a uracil residue is detected in the template strand (20).

The same multiplicity of DNA glycosylases to deal with a single type of lesion is not observed for other forms of DNA damage, although oxidation of G to 8-oxo G in DNA is handled in two different stages by two different DNA glycosylases, as first described in Escherichia coli (21). An 8-oxo G residue opposite a C is removed by a specific DNA glycosylase, called OGG1 in eukaryotes (22), but there appears to be no backup DNA glycosylase for 8-oxo G-C base pairs in mammalian cells (7, 23). Replication of an 8-oxo G residue before repair has had time to take place is prone to errors, because either a C or an A can be incorporated in the daughter strand. The MYH glycosylase (24) exists to remove misincorporated A residues opposite 8-oxo G.

Oxidized pyrimidines, such as thymine glycol and cytosine glycol, are excised by the hNTH1 glycosylase. This enzyme is surprisingly ineffective by itself, and both in vivo and in vitro data implicate the NER enzyme XPG as a cofactor for hNTH1 (25, 26). In a role quite distinct from that in NER, the XPG protein appears to help load this DNA glycosylase onto damaged DNA. Interestingly, XPG deficiency results in a severe phenotype with developmental defects not observed for defects in other XP genes (27), and XPG deletion mutations in humans cause serious and complex clinical disease, combining features of XP and Cockayne syndrome (CS); these effects may be related to the inefficiency of repair of pyrimidine glycol residues in DNA (25, 26). The mode of interaction between XPG and hNTH1 at damaged DNA appears similar to that between the bacterial DNA mismatch repair functions MutL and MutS, where MutL enhances binding of MutS to DNA mismatches (28).

MPG glycosylase (29) is a broad-spectrum enzyme that excises various derivatized adenine residues. The main substrate is probably the alkylation product 3-methyladenine. The main recognition feature may be a delocalized positive charge in the purine ring rather than a structural distortion. The versatility of this DNA glycosylase means that absolute discrimination between normal and damaged bases is particularly difficult; as a result, the mammalian enzyme can contribute to the release of normal purines from DNA in vivo, although the effect is much smaller than that of nonenzymatic hydrolytic depurination of DNA (30). The problem seems most acute for the MPG enzyme of Saccharomyces cerevisiae, called MAG1, which shows relatively poor ability to discriminate between damaged and undamaged purines in DNA (30). In apparent consequence, overexpression of this yeast enzyme results in a strong mutator phenotype attributable to the error-prone bypass of an abundance of apurinic sites (31).
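The substrate assignments discussed in the preceding paragraphs can be gathered into a small lookup. The sketch below is transcribed only from the prose of this section and is not a reproduction of Table 1; the dictionary keys and the function name are labels invented for this example, and the mapping is deliberately incomplete (for instance, the glycosylases acting on etheno adducts are not named in the text and are therefore omitted).

```python
# Lesion context -> glycosylases named for it in the text of this section.
GLYCOSYLASES_BY_LESION = {
    "U incorporated instead of T": ["UNG"],
    "U in single-strand DNA": ["hSMUG1"],
    "U opposite G": ["TDG", "MBD4"],
    "T opposite G": ["TDG", "MBD4"],
    "8-oxo G opposite C": ["OGG1"],
    "A opposite 8-oxo G": ["MYH"],
    "thymine glycol": ["hNTH1"],
    "cytosine glycol": ["hNTH1"],
    "3-methyladenine": ["MPG"],
}


def candidate_glycosylases(lesion: str) -> list[str]:
    """Return the enzymes listed for a lesion, or an empty list if none is recorded."""
    return list(GLYCOSYLASES_BY_LESION.get(lesion, []))


def has_backup(lesion: str) -> bool:
    """True if more than one glycosylase is recorded for the lesion."""
    return len(candidate_glycosylases(lesion)) > 1
```

Reading the table this way makes the contrast in the text visible: deaminated bases opposite G have overlapping activities, whereas 8-oxo G opposite C is recorded with a single enzyme.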

How many different DNA glycosylases exist? Although the extensive biochemical studies on this problem over the past 25 years have possibly uncovered most of these enzymes, such that the list in Table 1 of human activities is relatively close to completion, the novel proteomics approach used in the recent discovery of the hSMUG1 enzyme promises to provide a general method for isolation of additional DNA glycosylases without initial knowledge of their substrates. Because DNA glycosylases remain bound to abasic sites in DNA after cleavage of the base-sugar bond, DNA containing noncleavable analogs of abasic sites can be used to specifically bind such enzymes in protein libraries obtained by expression cloning, followed by cDNA sequencing (17).

The main human apurinic/apyrimidinic (AP) endonuclease APE1 occupies a pivotal position in BER of anomalous residues such as uracil, recognizing and cleaving at the 5′ side of abasic sites generated both by spontaneous hydrolysis and by DNA glycosylases. Abasic sites generated by nonenzymatic depurination probably greatly outnumber those generated by all the DNA glycosylases; consequently, APE1 and subsequent key proteins in the BER pathway are essential, whereas mice with knockouts of various DNA glycosylases so far investigated have been viable (32). Structurally, APE1 belongs to the superfamily of nucleases that also contains pancreatic DNase I (33), but additional protein loops in the active-site region ensure complete specificity for abasic sites in DNA. In a substrate recognition process similar to that of DNA glycosylases, APE1 flips out the base-free deoxyribose residue from the double helix before chain cleavage (14, 33). When bound to DNA, the APE1 protein interacts with the next enzyme in the BER pathway, POL β, and recruits the polymerase to the site of repair (34). POL β and subsequent protein factors in the main mammalian BER pathway have no direct counterparts in microorganisms, which makes genetic studies of the completion of BER in mammalian cells time-consuming. POL β has two distinct domains that are well suited for its main physiological role as the polymerase used for DNA gap-filling during BER. The larger domain is the polymerase domain itself, whereas a small basic NH2-terminal domain contains an AP lyase activity that excises the abasic sugar-phosphate residue at the strand break (35). POL β also interacts with the noncatalytic XRCC1 subunit of the XRCC1–DNA ligase III heterodimer. Consequently, XRCC1 acts as a scaffold protein by bringing the polymerase and ligase together at the site of repair; further stabilization of the complex may be achieved by direct binding of the NH2-terminal region of XRCC1 to the DNA single-strand break (36).

In cases where the terminal sugar-phosphate residue has a more complex structure that is relatively resistant to cleavage by the AP lyase function of POL β, DNA strand displacement may instead occur, involving either POL β or a larger polymerase such as POL δ, for filling-in of gaps a few nucleotides long (37). The FEN1 structure-specific nuclease removes the displaced flap and the PCNA protein stimulates these reactions (38), acting as a scaffold protein in this alternative pathway in a way similar to that of XRCC1 in the main pathway. Another replication factor, DNA ligase I (LIG1) then completes this longer patch form of repair. An important property of FEN1 here, in addition to the processing of 5′ ends of Okazaki fragments during lagging-strand DNA replication, is to minimize the possibility of hairpin loop formation and slippage during strand displacement and subsequent DNA synthesis, which might otherwise result in local expansion of sequence repeats (39). Temporary inefficiency in this process during early mammalian development could explain the origin of several human syndromes such as Huntington's disease, which are associated with expansion of triplet repeats in relevant genes.

A series of pairwise interactions between the relevant proteins in BER seems to occur (Fig. 2), in most cases without any direct strong protein-protein interactions in the absence of DNA. The XRCC1-LIG3 heterodimer is the only preformed complex, and no large preassembled multiprotein BER complex is likely to exist. Nevertheless, the consecutive ordered interactions may serve to protect reaction intermediates and ensure efficient completion of the correction process after the initial recognition of DNA damage.

Repair of Strand Breaks

Reactive oxygen species cause DNA base damage and also produce chain breaks by destruction of deoxyribose residues. Such single-strand interruptions are processed and rejoined by the same enzymes that are responsible for the later stages of BER, sometimes with the additional steps of exonucleolytic removal of frayed ends and phosphorylation of 5′-termini by DNA kinase. In contrast to the continuous protection of DNA reaction intermediates when an altered base residue is replaced, however, the initial strand break is fragile and attracts unwelcome recombination events. An abundant nuclear protein, poly(ADP-ribose) polymerase-1 (ADP is adenosine diphosphate), called PARP1, appears to have as its main role the temporary protection of DNA single-strand interruptions, and consequently acts as an antirecombinogenic factor (40). PARP1 rapidly shuttles on and off strand breaks in DNA, with NAD-dependent synthesis of poly(ADP-ribose) as its release mechanism. Although not detected in yeast, PARP protein is present in mammalian cells as well as in lower eukaryotes (such as dinoflagellates) that carry a substantial proportion of tandemly repeated sequences in their genomes. PARP1 knockout mice are viable but show increased numbers of spontaneous sister chromatid exchanges and sensitivity to ionizing radiation. Extracts from cells of such mice still contain low concentrations of other PARP enzymes, which may have distinct unknown roles but also could serve in backup functions. Interestingly, crossing PARP1 knockout mice with severe combined immunodeficiency disease (SCID) knockout mice that lack DNA-dependent protein kinase, which is required for V(D)J recombination during lymphocyte development, alleviates the DNA processing defect in the latter mice and allows some low-fidelity recombination (41).

PARP1 plays no clear role in the BER process itself, but, like POL β and LIG3, it interacts with the scaffold protein XRCC1 and may in this way accelerate the recruitment of these repair enzymes to strand interruptions (42).

In addition to single-strand breaks, 1/15 to 1/20 as many DNA double-strand breaks are generated after exposure to ionizing radiation; such breaks are also intermediates during site-specific recombination by nonhomologous end-joining. An unprotected double-strand break is a dangerously cytotoxic lesion. A surprisingly large number of nuclear proteins bind specifically to double-strand breaks. Besides protecting the lesion, they apparently serve to signal the presence of such damage and to instruct cell-cycle control proteins about the imminent hazard. These proteins include the DNA-dependent protein kinase with its DNA-binding Ku subunits, the related large protein kinases ATM and ATR, and also PARP, the three-component exonuclease RAD50-MRE11-p95, the DNA ligase IV–XRCC4 heterodimer, and the homologous recombination factor RAD52 (43). Repair of double-strand breaks by homologous recombination with another allele can be achieved with high fidelity, whereas repair by nonhomologous end-joining, the principal pathway in mammalian cells, may result in lost or changed genetic information. The balance between these two pathways is apparently influenced by the relative amounts of RAD52 and Ku (44).

Nucleotide Excision Repair

NER acts on a wide variety of helix-distorting DNA lesions such as the pyrimidine dimers caused by UV light and such chemical adducts as those caused by benzpyrene, aflatoxin, and cisplatin. The most important function of NER in humans is to remove UV-induced damage from DNA. This is apparent from the existence of the inherited disorder XP, where NER-defective individuals have a risk of skin cancer 1000 times as great as that of normal individuals (45). The incidence of common internal cancers in XP patients is increased by at most only a small fraction, indicating that DNA damage created by agents capable of initiating such tumors is usually corrected by routes other than NER. A rare but heterogeneous condition, XP includes seven genetic complementation groups (XP-A through XP-G) that represent different proteins in the NER pathway as well as a separate form, XP-V or variant.

NER in human cells involves recognition of DNA damage, incision of the DNA strand containing a lesion, and DNA synthesis and ligation to replace an excised oligonucleotide (46). Six core factors, comprising 15 to 18 polypeptides, are required for dual incision of damage, and another dozen or so polypeptides are needed for the repair synthesis step (Fig. 3). The dual-incision factors are the XPA protein, the single-strand DNA binding heterotrimer RPA, the XPC-hHR23B complex, the 6-9 subunit TFIIH complex, and two nucleases, XPG and the heterodimeric ERCC1-XPF. A key intermediate is an open, unwound structure formed around a lesion in a reaction that uses the ATP-dependent helicase activities of XPB and XPD, two of the subunits of TFIIH. This creates sites for cutting by the XPG and ERCC1-XPF enzymes, which recognize junctions between single-strand and duplex DNA and cut with specific polarities. A 24- to 32-residue oligonucleotide is released, and the gap is filled in by POL δ or ɛ holoenzyme (47) and is sealed by a DNA ligase, probably LIG1 (48).

Figure 3

Nucleotide excision repair in nontranscribed regions (the bulk of DNA). Initially, a distortion is recognized, probably by the XPC-hHR23B protein (A). An open bubble structure is then formed around a lesion in a reaction that uses the ATP-dependent helicase activities of XPB and XPD (two of the subunits of TFIIH) and also involves XPA and RPA (B). Formation of this open complex creates specific sites for cutting on the 3′ side by the XPG nuclease and then on the 5′ side by the ERCC1-XPF nuclease (C). After a 24- to 32-residue oligonucleotide is released, the gap is filled in by PCNA-dependent POL ɛ or δ and sealed by a DNA ligase, presumably LIG1 (D).
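The excision and resynthesis steps (C) and (D) can be sketched on a toy duplex. This is a minimal illustration under stated assumptions: the function name, the use of "X" to mark the damaged residue, and in particular the two cut offsets are hypothetical choices for this example, since the text gives only the 24- to 32-residue length of the released oligonucleotide and not the cut positions.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def dual_incision(strand: str, template: str, lesion: int,
                  five_prime_offset: int, three_prime_offset: int) -> tuple[str, str]:
    """Excise an oligonucleotide around `lesion` and refill the gap.

    `strand` is the damaged strand (5'->3'); `template` is the intact
    complementary strand in the same register, so template[i] pairs with strand[i].
    The offsets are the numbers of residues kept between each cut and the lesion.
    Returns (excised_oligonucleotide, repaired_strand).
    """
    start = lesion - five_prime_offset        # 5' cut (ERCC1-XPF side)
    end = lesion + three_prime_offset + 1     # 3' cut (XPG side)
    if start < 0 or end > len(strand):
        raise ValueError("cut sites fall outside the strand")
    excised = strand[start:end]
    if not 24 <= len(excised) <= 32:
        raise ValueError("excised length outside the 24-32 range quoted in the text")
    # Gap filling: copy the intact complementary strand, then ligate.
    patch = "".join(COMPLEMENT[base] for base in template[start:end])
    return excised, strand[:start] + patch + strand[end:]


# Hypothetical 50-residue duplex with one damaged residue at index 25.
template = ("ACGT" * 13)[:50]
intact = "".join(COMPLEMENT[base] for base in template)
damaged = intact[:25] + "X" + intact[26:]
oligo, repaired = dual_incision(damaged, template, lesion=25,
                                five_prime_offset=20, three_prime_offset=6)
```

With these illustrative offsets the released fragment is 27 residues long, and the repaired strand is again fully complementary to the template.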

The mechanism of DNA damage recognition in NER is a long-standing problem. The efficiency of repair of different kinds of lesions varies over several orders of magnitude. To a first approximation this roughly correlates with the extent of distortion caused by an adduct. However, to be well-repaired by NER, a lesion must both distort the structure and covalently modify the DNA. Distortion alone is not sufficient, given that very disruptive lesions such as small loops and mismatches are repaired very poorly, if at all (49). Conversely, nondistorting adducts such as seemingly harmless modifications of sugar residues are readily removed, if the altered nucleoside is placed within a mismatch (50). The best way to explain these observations currently is by a “bipartite” or two-step model for recognition. In the first step, a distortion is recognized; in the second, the damaged strand and chemical alteration are located. The second step seems likely to involve interruption of the strand translocation activities of the TFIIH helicases when a damaged site is encountered (51).
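The logic of the bipartite model can be written as a two-step check. The sketch below is a deliberate simplification: it reduces repair efficiency, which the text says varies over orders of magnitude, to a yes/no answer, and the function name and return strings are invented for this example.

```python
def bipartite_recognition(distorts_helix: bool, covalent_modification: bool) -> str:
    """Toy version of the two-step recognition model for NER.

    Step 1 asks whether a distortion is present; step 2 asks whether a
    chemical alteration can be located on one strand.
    """
    if not distorts_helix:
        return "not recognized: no distortion (step 1 fails)"
    if not covalent_modification:
        return "poorly repaired: distortion without chemical alteration (step 2 fails)"
    return "repaired: distortion and chemical alteration both present"


# The three situations discussed in the text.
mismatch_only = bipartite_recognition(distorts_helix=True, covalent_modification=False)
sugar_adduct_alone = bipartite_recognition(distorts_helix=False, covalent_modification=True)
sugar_adduct_in_mismatch = bipartite_recognition(distorts_helix=True, covalent_modification=True)
```

Only the third case, a nondistorting adduct placed within a mismatch, passes both steps, matching the observations cited in (49) and (50).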

The XPC-hHR23B, XPA, RPA, and TFIIH factors all have some preference for binding to damaged over nondamaged DNA. Among these, XPC-hHR23B has by far the strongest discrimination for damaged sites; several lines of evidence point to it as the earliest distortion recognition factor (52), although this view has been challenged (53). The homologous complex in S. cerevisiae, Rad4–Rad23, also has a high discrimination for damage (54). XPC-hHR23B is an ATP-independent binding factor, and the wide variety of substrates repaired by NER makes it unlikely that it senses individual DNA lesions, or even that it can alone determine which of the two strands is damaged.

Another cellular activity with a high affinity and discrimination for some lesions in DNA is the UV-DNA damage binding factor UV-DDB. It is formed of two subunits, p125 and p48, and causative mutations in the p48 subunit occur in XP group E cells (55). UV-DDB expression clearly contributes to the efficiency of pyrimidine dimer removal in cells, but its role in DNA repair is enigmatic (56). One possibility is that UV-DDB is specialized for detecting damage within chromatin (57). Almost certainly, factors other than those in Fig. 3 will be needed to facilitate access of NER enzymes to damaged DNA in the tightly packed chromatin of cell nuclei. Some of the same chromatin-modifying complexes that are coupled to transcription and DNA replication may be involved.

It is increasingly evident that the overall strategy for NER in eukaryotes has many similarities to the process initiated by the UvrABC nuclease in bacteria, even though the latter uses many fewer enzyme subunits. In both human cells and E. coli, there is an energy-independent distortion recognition factor (XPC in humans and UvrA in E. coli), followed by energy-dependent recognition of DNA damage using DNA helicases (TFIIH in humans and UvrB in E. coli). In both cases, the helicases create an open preincision complex that is cleaved by structure-specific nucleases and an oligonucleotide is then released by dual incision (58).

The model (Fig. 3) is a general one used for nucleotide excision repair of nontranscribed DNA (the bulk of the genome). In the special case of active genes, the transcribed DNA strand is corrected 5 to 10 times as fast as the nontranscribed strand, and this type of NER is termed transcription-coupled repair (59). All of the factors required for NER of nontranscribed DNA are used except for XPC. This shows that XPC is not an obligatory factor required to recruit the NER complex to a lesion. The arrest of RNA POL II progression at a lesion in the template serves as an alternative damage-recognition signal, and the rest of the NER factors are then attracted. Such recruitment involves additional factors, including the CS group A and B proteins (CSA and CSB), which serve to couple RNA POL II stalling to repair and perhaps to polymerase displacement. In every case, however, TFIIH is required, showing that this factor truly carries out two quite separate functions in cells, using its strand-separating helicases to create an open structure around DNA lesions during NER, as well as to open the DNA near promoters during mRNA transcription initiation. Transcription coupling of BER has also emerged and is of crucial importance in repair of oxidative base lesions (25).

One fascinating feature of mammalian NER proteins is that most of them have dual functions, participating in other aspects of DNA metabolism. For example, RPA is the major single-strand DNA binding protein in cells, necessary for semiconservative replication and recombination as well as NER. Disruptions of RPA genes are incompatible with life, as shown in yeast. Similarly, complete disruptions of the TFIIH subunits XPB or XPD are not viable because they knock out the vital transcription function. More subtle alterations in XPD are tolerated, and mouse models for the inherited disorders XP, trichothiodystrophy, and CS have been obtained with directed mutations similar to those found in human XP-D patients (60). Disruption of the ERCC1 subunit of the ERCC1-XPF nuclease has severe consequences. Animals with this disorder are abnormally small, die before weaning, and show chromosome and tissue abnormalities in the liver and other organs (61). The explanation apparently rests with a second function of ERCC1-XPF in a pathway of double-strand break repair, where the enzyme trims away nonhomologous 3′ single-strand tails (62). Knockouts of XPG in mice and in humans are also very severe. Such mice die early because of failure to properly develop the intestine (27); affected humans have a shortened life-span and developmental defects usually classified as CS (63). These effects may be related to the finding that XPG promotes the activity of hNTH1 DNA glycosylase. Homozygous disruptions of the XPA and XPC genes are the most straightforward in the NER group, leading to mice that are prone to mutagen-induced skin cancer but are otherwise normal (64). XP-A mice are more sensitive than XP-C mice because the latter can still perform transcription-coupled NER. XPA protein is a core factor in the NER complex, showing some affinity for damaged DNA and key interactions with RPA, ERCC1, and TFIIH. 
XPA is the only factor in which disruption completely obliterates NER and that has no other known function in metabolism. From this point of view, it seems to present the most suitable and specific target for disrupting NER by inhibition of its function. It is intriguing that some testis tumor cells that are readily curable by cisplatin show both a reduced rate of removal of cisplatin lesions and a reduced content of XPA protein (65). One prospect is that reducing XPA activity in other tumors would make cisplatin therapy more effective.

Although many physical interactions between different NER proteins are known, it is not clear whether the NER reaction is accomplished by a sequential assembly of freely diffusing individual factors, by subcomplexes of them, or even by a completely preassembled repairosome. As for BER, at present, no convincing evidence is available that a large preassembled protein complex carries out NER in human cells. In a recent study with green fluorescent protein–labeled NER proteins and fluorescence recovery after photobleaching, the authors concluded that the ERCC1-XPF, XPA, and TFIIH enzymes diffuse freely in the nucleus, behaving as individual factors separate from a large complex, and that the enzymes are only temporarily immobilized during repair (66). DNA replication and transcription take place in localized factories within cell nuclei (67), which seems reasonable for processes that require a systematic progression along DNA. Repair might operate more efficiently, however, by relying on diffusion of factors through the nucleus and formation of transient complexes at sites of damage only when correction is needed. In cell-free lysates, NER is a distributive process, and this is also likely to be true within nuclei (68).

Accuracy in DNA Gap Filling

Replicative DNA polymerases bind efficiently to DNA and copy the template in a processive reaction of high fidelity, approaching only one error in 10⁵ nucleotides. Additional accuracy is provided by the intrinsic 3′ exonuclease editing function of the main replicative enzyme, POL δ, and by a similar activity of POL ɛ. During NER, gaps of ∼30 nucleotides are filled in by one of these enzymes (47), so the reaction should proceed at a level of faithfulness similar to that seen in replication.
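As a rough worked example, the figures in this paragraph can be combined to estimate how often a single NER gap-filling event introduces an error. The sketch below takes the fidelity quoted above as one error per 10^5 nucleotides and the gap length as 30 nucleotides. It assumes, purely for illustration, that errors at different positions are independent, and it ignores the additional accuracy supplied by 3′ exonuclease editing, so it should be read as an upper bound. The function names are hypothetical.

```python
GAP_LENGTH_NT = 30        # from the text: gaps of about 30 nucleotides
ERROR_RATE_PER_NT = 1e-5  # from the text: about one error in 10^5 nucleotides


def expected_errors(gap_length: int, error_rate: float) -> float:
    """Expected number of misincorporations in one filled gap."""
    return gap_length * error_rate


def prob_error_free(gap_length: int, error_rate: float) -> float:
    """Probability that every position is copied correctly.

    ASSUMPTION: errors at different positions are independent.
    """
    return (1.0 - error_rate) ** gap_length


print(f"expected errors per gap: {expected_errors(GAP_LENGTH_NT, ERROR_RATE_PER_NT):.1e}")
print(f"probability of an error-free gap: {prob_error_free(GAP_LENGTH_NT, ERROR_RATE_PER_NT):.5f}")
```

Under these assumptions the expected number of errors is about 3 × 10^-4 per gap, that is, roughly one misincorporation in every 3000 or so repair patches before any editing is taken into account.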

The absence of 3′ exonuclease activity of two DNA polymerases, POL α and POL β, raises a major problem of fidelity, which is solved by entirely different strategies in the two cases. POL α is a DNA primase/polymerase required during initiation of DNA synthesis at the replication origin and also during lagging-strand synthesis. The enzyme initiates Okazaki fragments by synthesizing the short RNA primer and then continues to add ∼15 deoxyribonucleotides before POL δ takes over and synthesizes most of the ∼100-residue-long fragment. Thus, the first ∼15% of the Okazaki fragment has initially been made with the lower fidelity of a polymerase that lacks a 3′ exonuclease function. The RNA primer, except for the final ribonucleotide adjacent to the DNA sequence, apparently is digested away by RNase H1, with the structure-specific 5′ exonuclease FEN1 then removing the final ribonucleotide in a reaction essential to complete DNA replication. A second important role of FEN1 in the process has recently been shown to be a 5′-editing function (69). If POL α occasionally misreads the template and includes a mismatched deoxyribonucleotide residue close to the 5′ end of the Okazaki fragment, the decreased stability of the product allows FEN1 to excise the short 5′ terminal sequence up to the mismatch; completion of synthesis of the preceding Okazaki fragment by POL δ will then include the missing sequence. By this strategy, high-fidelity DNA synthesis of the lagging strand can be achieved without the need for a 3′ exonuclease activity to complement the role of POL α.

During BER, a different approach is needed, because the major pathway involves filling in a one-nucleotide gap by POL β. The enzyme shows the relatively high error frequency of one mismatched nucleotide per 3000 to 5000 residues (70), despite discrimination against mismatches by an induced fit mechanism at the active site. The required improvement in accuracy (71) is apparently achieved by two combined events. First, the DNA ligase involved, DNA ligase III (48), discriminates against joining a strand interruption with a 3′ mismatched residue (72), thereby creating a window of time for excision of the terminal nucleotide. Then, following the same strategy as used by the E. coli Pol III holoenzyme where the dnaE and dnaQ genes encode separate subunits for polymerization and editing (73), a distinct mammalian 3′ exonuclease can remove the mismatched residue, creating a second opportunity for correct gap-filling. A human homolog of the DnaQ 3′ exonuclease was identified recently that allows error-free processing of mistakes during DNA short-patch gap filling by POL β in vitro, making the new enzyme the main candidate for an editing role during BER (74). The high amount of hydrolytic DNA depurination in human cells combined with the low accuracy of POL β makes such a proofreading step during BER obligatory to avoid an unacceptably high spontaneous mutation rate caused by synthesis errors during correction of endogenous DNA damage.
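The argument that proofreading during BER is obligatory is essentially arithmetic, and a small sketch makes it concrete. The error frequencies below are the ones quoted above for POL β. The number of repair events per cell per day is NOT taken from this section: it is an illustrative placeholder, and the function name is hypothetical.

```python
def expected_misinsertions(repair_events: float, errors_per_insertion: float) -> float:
    """Expected wrong nucleotides when each repair event fills a one-nucleotide gap."""
    return repair_events * errors_per_insertion


# From the text: one mismatched nucleotide per 3000 to 5000 residues.
ERROR_LOW = 1 / 5000
ERROR_HIGH = 1 / 3000

# ASSUMPTION: illustrative number of single-nucleotide BER events per cell per day.
ASSUMED_EVENTS_PER_DAY = 10_000

low = expected_misinsertions(ASSUMED_EVENTS_PER_DAY, ERROR_LOW)
high = expected_misinsertions(ASSUMED_EVENTS_PER_DAY, ERROR_HIGH)
print(f"unedited misinsertions per cell per day: {low:.1f} to {high:.1f}")
```

With the assumed event count, unedited gap filling by POL β would leave on the order of two to three misinserted nucleotides per cell every day. The result scales linearly with the assumed number of events, so a different estimate of the endogenous damage load changes the total but not the conclusion that an editing step is needed.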

DNA Polymerases That Bypass Damage

Cell-cycle checkpoints are often thought of as a mechanism for cells to repair damage in their DNA before replication or cell division proceeds. But this is an oversimplification, and it is becoming widely appreciated that cells can tolerate much damage in their genomes without removing it. For example, human XP-A fibroblasts cannot remove UV-induced pyrimidine dimers from their DNA. Yet after irradiation with a dose of 0.3 J/m² of UV light, which creates 10,000 cyclobutane pyrimidine dimers in the genome, 50% of the XP-A cells survive and divide to form colonies. Normal cells divide with an even larger amount of UV damage (75). Pyrimidine dimers are effective blocks to the progression of replicative DNA polymerases, so how is such tolerance achieved? It is now clear that part of the solution is supplied by specialized enzymes that can bypass DNA damage and extend replication forks through damaged sites. DNA polymerase ζ in the yeast S. cerevisiae is composed of a catalytic subunit Rev3 (in the same family as POL α, δ, and ɛ) and an accessory factor Rev7, which together have the ability to bypass pyrimidine dimers and other adducts in DNA (76). The enzyme often incorporates incorrect nucleotides during bypass of damage. Remarkably, most mutagenesis induced by UV light and some chemical agents is dependent on Rev1, 3, and 7 in yeast. Human cells also contain DNA polymerase ζ, with a large catalytic subunit of 3130 amino acids, and inhibition of its expression reduces UV-induced mutagenesis (76).
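To put the dimer load quoted above in perspective, it can be converted into an average spacing between lesions. The dimer count is from the text. The genome size is an assumption introduced here (an approximate value for a diploid human cell), and the calculation treats the dimers as evenly spread, which real UV damage is not. The function name is hypothetical.

```python
# From the text: the quoted UV dose creates about 10,000 cyclobutane pyrimidine dimers per genome.
DIMERS_PER_GENOME = 10_000

# ASSUMPTION: approximate size of a diploid human genome in base pairs (not given in the text).
ASSUMED_GENOME_BP = 6e9


def mean_spacing_bp(genome_bp: float, lesions: int) -> float:
    """Average distance between lesions if they were spread evenly over the genome."""
    return genome_bp / lesions


spacing = mean_spacing_bp(ASSUMED_GENOME_BP, DIMERS_PER_GENOME)
print(f"about one dimer per {spacing / 1000:.0f} kb")
```

Under the assumed genome size this gives about one dimer per 600 kb. Each of these unrepaired lesions must still be passed during replication in the XP-A cells that go on to divide and form colonies.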

Rev1 (hREV1 in humans) is a different type of enzyme, a template-dependent dCMP transferase that can insert a C residue opposite a site without a base (77). Rev1 falls into a family of other recently described DNA polymerases that are related to terminal transferase but are dependent on a template. They are devoid of proofreading activity. The prototype in this family is the UmuD′₂C complex, which facilitates bypass of DNA damage in bacteria and is responsible for most damage-induced mutagenesis in E. coli. UmuD′₂C (E. coli Pol V) can insert a base opposite a lesion and extend the template a short distance further (78). A related enzyme first described in E. coli is DinB (Pol IV), which has a propensity to extend misaligned templates and promote frameshift mutagenesis (79). These enzymes are apparently used only to bypass damage before a replicative polymerase takes over.

An important human DNA polymerase in this family is designated DNA polymerase η. Consisting of 713 amino acids, this protein is homologous to yeast Rad30 (80). POL η can bypass thymine-thymine dimers, and usually two A residues are correctly inserted opposite the lesion. Human xeroderma pigmentosum variant (XP-V) cells lack this particular bypass activity because of inactivating mutations in the POL η gene (81). This explains previous observations of a defect in bypass of damage in XP-V cells and cell extracts (82) and accounts for the marked hypermutability of XP-V cells in response to UV light and some chemical agents. POL η is missing in most or all individuals with XP-V, indicating that it is not essential for life. The clinical picture is very similar in the NER-deficient (XP-A to XP-G) and XP-V forms of xeroderma pigmentosum, apparently because many unrepaired lesions in XP-A to XP-G cells are bypassed by translesion DNA synthesis (Fig. 1). Lesions bypassed by polymerases that insert incorrect residues result in mutations, leading to the high incidence of skin cancer. In XP-V cells, NER can remove a large fraction of UV-induced thymine-thymine dimers, but because POL η is missing, any remaining dimers are more likely to be bypassed by polymerases such as POL ζ that incorporate incorrect residues.

Presumably, further activities exist for bypass of DNA damage in human cells. Recent candidates in the POL η family are the distinct gene RAD30B and a DinB homolog (79). Many questions remain to be answered. So far, bypass has been studied for only a few lesions, and only cyclobutane T-T dimers have been looked at in detail. Does POL η also somehow insert the correct bases opposite other lesions, such as cyclobutane T-C dimers and (6-4) photoproducts? Perhaps POL η adheres more faithfully than does POL ζ to the tendency of many polymerases to incorporate A residues opposite noncoding sites (83). Consistent insertion of A residues opposite a T–T dimer would limit UV mutagenesis, given that these dimers are the predominant photoproducts formed. The polymerases may have very different influences on mutagenesis at other lesions. In any case, it is not clear why cells retain both mutagenic and nonmutagenic bypass polymerases for the same type of damage. Perhaps some bypass polymerases are reserved for lesions that are particularly difficult to traverse or for cases where coding information has been completely lost.

Cancer and Quality Control by DNA Repair

The existence of human diseases associated with defects in DNA repair graphically illustrates the importance of this process of quality control. The disorder XP shows the key role of NER in human avoidance of skin cancer. Inherited diseases associated with altered processing of double-strand breaks such as ataxia telangiectasia and Nijmegen breakage syndrome underline the relevance of defense pathways against other types of carcinogenic DNA damage. Although complete disruption of BER seems to be incompatible with viability, it remains possible that mutations leading to partially reduced function of this repair pathway might be found in humans. A systematic study of human polymorphisms in DNA repair genes (84) should reveal the extent to which mutations in repair genes are a risk factor for cancer in the general population.

