A Structural Explanation for the Recognition of Tyrosine-Based Endocytotic Signals


Science  13 Nov 1998:
Vol. 282, Issue 5392, pp. 1327-1332
DOI: 10.1126/science.282.5392.1327


Many cell surface proteins are marked for endocytosis by a cytoplasmic sequence motif, tyrosine-X-X-(hydrophobic residue), that is recognized by the μ2 subunit of AP2 adaptors. Crystal structures of the internalization signal binding domain of μ2 complexed with the internalization signal peptides of epidermal growth factor receptor and the trans-Golgi network protein TGN38 have been determined at 2.7 angstrom resolution. The signal peptides adopted an extended conformation rather than the expected tight turn. Specificity was conferred by hydrophobic pockets that bind the tyrosine and leucine in the peptide. In the crystal, the protein forms dimers that could increase the strength and specificity of binding to dimeric receptors.

The localization and movement of compartment-specific proteins within the cell are largely achieved through the recognition of short sequence motifs by targeting proteins. One of the most studied processes involving such signal recognition is clathrin-mediated endocytosis, which occurs in vesicle trafficking and the internalization of nutrient and growth factor receptors when bound to their appropriate cargo molecules [reviewed in (1)]. During the internalization of activated growth factor receptors such as the epidermal growth factor receptor (EGFR) tyrosine kinase [reviewed in (2)], receptors are removed from the cell surface in clathrin-coated vesicles and ultimately directed to the endosome and lysosome, where they are inactivated by proteolytic degradation (3, 4).

The first stage of endocytosis is the formation of a clathrin-coated pit. During this stage, clathrin mechanically invaginates a patch of membrane as it forms a polyhedral lattice, and adaptor complexes (APs) preferentially sort selected transmembrane proteins into the pits. At least three similar AP complexes (AP1, AP2, and AP3) have been identified and appear to be associated with different cell compartments. The APs comprise four types of subunit: two large (∼100 kD) (α and β2 in AP2), one medium (∼50 kD) (μ2 in AP2), and one small (∼17 kD) (σ2 in AP2). AP2 adaptors link the proteins to be endocytosed (via the μ2 subunit) with the nascent clathrin coat (via the α and β2 subunits), and via the α subunit, they recruit the components (such as EPS15, amphiphysin, and dynamin) needed to drive and regulate the formation of clathrin-coated vesicles [reviewed in (5, 6)]. The short linear sequence motifs that act as internalization signals mainly fall into two classes. The first, and most common, contains a critical tyrosine residue, and members of this group mostly conform to the consensus sequence YxxØ [where Ø is a bulky hydrophobic residue (Leu, Ile, Met, or Phe) (7)] that binds directly to μ2 subunits (8); the second is the “dileucine” motif DxxxLL (9), which interacts with the β1 subunit of AP1 (10) but may also bind indirectly to the μ subunit via an “adaptor” protein (11, 12).
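The YxxØ consensus described above can be written as a simple pattern search. The following is a minimal sketch in Python, assuming single-letter amino acid codes and taking Ø to be Leu, Ile, Met, or Phe as stated in the text. The function name `find_yxxphi` is hypothetical, and a match indicates only that a sequence conforms to the consensus, not that it binds μ2.

```python
import re

# Ø residues as listed in the text: Leu, Ile, Met, Phe.
# The lookahead lets overlapping candidate motifs be reported as well.
YXXPHI = re.compile(r"(?=(Y..[LIMF]))")

def find_yxxphi(sequence):
    """Return (0-based index of the Tyr, four-residue motif) for each YxxØ match."""
    return [(m.start(), m.group(1)) for m in YXXPHI.finditer(sequence.upper())]

if __name__ == "__main__":
    # The two peptides used in this study, plus the NPVY signal for contrast.
    for peptide in ("FYRALM", "DYQRLN", "NPVY"):
        print(peptide, find_yxxphi(peptide))
```

Both peptides used here (FYRALM and DYQRLN) contain one such match, whereas NPVY, whose tyrosine is the final residue, contains none.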

To investigate the nature and selectivity of the binding of YxxØ internalization signals to APs, we have solved, to 2.7 Å resolution, the crystal structures of the signal binding domain of μ2 (residues 158 to 435) (13) complexed with the internalization signal peptides from EGFR (FYRALM) (14) and TGN38 (DYQRLN) (15, 16). The protein has an elongated, banana-shaped, all β-sheet structure. It can be considered as two β-sandwich subdomains (A and B), with subdomain B inserted between strands 6 and 15 of subdomain A, and joined edge to edge such that the convex surface is a continuous nine-stranded mixed β sheet that runs the whole length of the molecule (Figs. 1 and 2).

Figure 1

The structure. (A and B) Orthogonal views of μ2 with subdomain A shown in gold, subdomain B in blue, and the peptide in magenta. Dotted lines represent disordered loops. The strands of the β sheet (arrows) are numbered. The two subdomains are linked into a continuous β sheet through strands 14 and 16/17. (C) Sequence alignment of μ2 from rat (Rat), human (Humn), Drosophila (Drph), Caenorhabditis elegans (Celg), Dictyostelium (Dict), Arabidopsis thaliana (Plnt), Schizosaccharomyces pombe (Spmb), μ1 (AP47) from rat, and μ3A (p47A) from rat. Identical residues are shaded in red, conserved residues are in gold, and those involved in internalization signal binding are in blue.

Figure 2

(A) Stereo view of the binding site for the tyrosine residue in the EGFR internalization signal FYRALM, showing part of the experimental electron density map, with phases calculated using the peptide complex data as native with the Xe and EMTS derivatives, and solvent flattening with a 70% solvent content. The peptide is represented with magenta bonds, and the residues at the top right with green bonds come from the other subunit in the crystallographic dimer. (B) Stereo view of the binding site for the TGN38 internalization signal DYQRLN, in the same view as (A). The difference electron density shown was calculated using the model from the FYRALM peptide structure with the peptide removed; density for the arginine in the Y+2 position is clearly visible, packed against Trp421. [Drawn with BOBSCRIPT (32).]

The two peptides bind in an identical manner to a site on the surface of two parallel β-sheet strands (β1 and β16) in subdomain A (Fig. 3). The peptide assumes an extended conformation when bound, not a tight β turn as has been proposed (17). Hydrophobic pockets exist for the binding of both the tyrosine and the Ø residue on either side of edge strand β16. These pockets are positioned such that when the side chains of the target peptide are correctly bound, three additional hydrogen bonds are made between the backbone of the peptide and β-strand 16, forming an extra strand on the inner edge of the nine-stranded β sheet (represented schematically in Fig. 3C). A similar mechanism of increased strength of binding through β-strand formation on correct recognition of key side chains has been demonstrated in a number of cases, including the interactions of protein kinases with their substrates (18) and protein phosphatases with their regulatory subunits (19).

Figure 3

The peptide binding site. (A) The binding of the tyrosine residue of the internalization signal peptide is in a hydrophobic pocket created by Phe174, Trp421, and Arg423, with a hydrogen-bonding network between the tyrosine OH and Asp176, Lys203, and Arg423. The structure shown is that of the DYQRLN TGN38 peptide. (B) The binding pocket for the bulky hydrophobic residue at Y+3 (Leu in both peptides) is lined with aliphatic side chains of Leu173, Leu175, Val401, Leu404, Val422, and the aliphatic portion of Lys420. ArgY+2 of the TGN38 peptide is packed against Trp421. (C) Schematic representation of the interactions between the internalization signal of TGN38 and μ2, showing both side chain contacts and the short stretch of β sheet formed between the peptide and β strand 16. The peptide is shown with bold lines.

The tyrosine residue of the internalization peptide makes extensive interactions with side chains in its binding pocket. There are hydrophobic interactions between the tyrosine ring and Trp421 and Phe174 as well as stacking on the guanidinium group of Arg423. The hydroxyl group of the tyrosine participates in a network of hydrogen bonds with Asp176, Lys203 (from β2), and again Arg423, explaining why a Phe at this position gives only poor binding (20). As well as contributing directly to the strength of binding via a direct hydrogen bond to the tyrosine OH, Asp176 appears to play an important role in correctly orienting the guanidinium group of Arg423. The critical role of Asp176 is reflected in its absolute conservation among all μ2, μ1, and μ3 sequences (Fig. 1C). The other major determinant, as defined by sequence and combinatorial peptide library analysis of internalization signals, is the presence of a bulky hydrophobic residue at the Y+3 position (7). The binding site for this residue is a cavity lined with aliphatic residues (Fig. 3B). The size and flexibility of the side chains within this pocket would allow for the accommodation of any of the residues (Leu, Phe, Met, Ile) that are possible at this position.

Peptide library screening has revealed a preference for an arginine residue at either Y+2 (strong) or Y+1 (weak) (7). In the DYQRLN (TGN38) complex, the arginine forms hydrophobic interactions mainly with Trp421 but also with Ile419 (Fig. 3), with its guanidinium group exposed to solvent and making a hydrogen bond between Nε and the carbonyl of Lys420 (Fig. 2B). The favorable hydrophobic interaction outweighs the unfavorable electrostatic interaction with the marked positive potential of the peptide binding surface (Fig. 4, C and D). The FYRALM (EGFR) peptide contains an arginine at the Y+1 position that is not well ordered, implying that it has no significant interaction with μ2. The nature and disposition of the pockets explain why the dileucine type of internalization motif is unable to bind to μ2, because there would be no residue capable of filling the tyrosine binding pocket. It also indicates that if the low density lipoprotein receptor internalization signal NPVY does bind weakly to μ2 (7) and not via an adaptor protein, it would have to do so in the reverse orientation, that is, with its Asn residue in the Y+3 pocket.
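The sequence features discussed above (a tyrosine to fill its pocket, a bulky hydrophobic residue at Y+3, and the library preference for arginine at Y+2 or Y+1) can be collected into a small annotation routine. This is an illustrative sketch only: the function `describe_signal` and its output keys are hypothetical, the labels are qualitative restatements of the text, and nothing here estimates an affinity.

```python
# Ø residues named in the text as acceptable at Y+3.
PHI = set("LIMF")

def describe_signal(peptide, tyr_index):
    """Qualitatively annotate a peptide relative to its putative signal tyrosine.

    tyr_index is the 0-based position of the residue assumed to occupy
    the tyrosine pocket.
    """
    def residue(offset):
        i = tyr_index + offset
        return peptide[i] if 0 <= i < len(peptide) else None

    notes = {
        "tyrosine_pocket_filled": residue(0) == "Y",
        "phi_at_y_plus_3": residue(3) in PHI,
        "arg_preference": "none",
    }
    # Library screening: Arg preferred strongly at Y+2, weakly at Y+1.
    if residue(2) == "R":
        notes["arg_preference"] = "Y+2 (strong)"
    elif residue(1) == "R":
        notes["arg_preference"] = "Y+1 (weak)"
    return notes

if __name__ == "__main__":
    print(describe_signal("DYQRLN", 1))  # TGN38 peptide
    print(describe_signal("FYRALM", 1))  # EGFR peptide
```

For the TGN38 peptide the routine reports the arginine at Y+2, and for the EGFR peptide at Y+1, matching the positions described in the two complexes.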

Figure 4

The crystallographic dimer. (A and B) Orthogonal views of the dimer formed in the crystal, along and perpendicular to the crystallographic twofold axis. The A subdomains are colored gold and green; the B subdomains are blue and purple. (C and D) The surface of the μ2 dimer colored according to electrostatic surface potential (blue positive, red negative; scale from –30 to +30 kT e⁻¹) in the same view as (A) and (B). The planar face at the top of (D) may interact with the membrane. [Drawn with GRASP (33).]

Src homology region 2 (SH2) domains bind similar YxxØ motifs in an extended conformation, with the tyrosine phosphorylated (21, 22), but there is no homology either in the structure of the proteins or in their mode of binding. In the case of SH2 domains, the specificity and strength of binding to the target peptide arise predominantly from ionic interactions with the phosphate moiety. The structure of the complex demonstrates that if the tyrosine residue were to be phosphorylated, it would be incapable of binding to μ2, both because the tyrosine pocket is too small and because Asp176 would repel the phosphate. This is supported by data suggesting that phosphorylated peptides will not bind to the μ2 subunit (20) and that phosphotyrosine cannot displace EGFR that is bound to AP2 (23).

The residues involved in signal recognition are conserved in μ2 subunits from all species (Fig. 1C). The binding sites in the μ1 subunit of AP1 (AP47) are also very similar, although the change Lys420 → Pro may alter the specificity for the Y+3 residue. In the AP3 homolog (μ3A or p47A), the residues Lys203 and Arg423 in μ2 involved in binding the tyrosine of the YxxØ motif are replaced by Cys and Lys, respectively, which would be expected to reduce the affinity of μ3A for tyrosine signals. The substitutions Leu173 → Ala and Leu175 → Phe in the Y+3 pocket (Fig. 1C) may alter the selectivity for residues at this position. The exchange of Trp421 in μ2 for a glycine in μ3A would remove the specificity for an arginine at the Y+2 position.

How does the machinery of endocytosis recognize a relatively nonspecific signal such as the sequence YxxØ? One possibility arises from the observation that most receptors are internalized as dimers, often induced by ligand binding on the outside of the cell, which could place two internalization signals adjacent to each other. Recognition of this dimer would increase the avidity of binding, relative to the monomer, without necessarily precluding binding of monomeric receptors. In the crystal structure, the μ2 molecules form a dimer around a crystallographic twofold axis, placing the internalization signal peptides close to each other in a large groove (Fig. 4). The dimer buries 1100 Å² of accessible surface, which is smaller than most stable dimer interfaces (typically at least 1200 Å²), but μ2 is only a small part of the whole AP2 molecule, and additional interactions may be formed between other subunits of AP2 in a dimer. This provides an attractive explanation for the recognition of dimeric receptors, particularly as peptide binding would favor dimerization because the peptide contributes 17% of the interface. Dimerization of AP2 complexes has been suggested by the observation that they bind in a 1:1 molar ratio with ligand-activated, and therefore dimeric, EGFRs (24). Binding of dimeric receptors to AP2 dimers, which in turn bind multimers of clathrin, provides an implicit mechanism for the formation of the clathrin lattice. The position of the peptide binding sites in the groove of the dimer predicts that the internalization signal must be presented as an accessible region without defined secondary structure, which is in agreement with the observation that EGFR binding to AP2 is increased by the presence of urea (23).
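The interface figures quoted above can be put on a common scale with a short calculation. The sketch below assumes that the 17% peptide contribution refers to the same 1100 Å² of buried accessible surface; the text does not state this explicitly, and the function name `interface_split` is hypothetical.

```python
def interface_split(buried_area, peptide_fraction):
    """Split a buried surface area (in Å²) into peptide and protein-only parts."""
    peptide_area = buried_area * peptide_fraction
    return peptide_area, buried_area - peptide_area

if __name__ == "__main__":
    # Values from the text; treating 17% as a fraction of 1100 Å² is an assumption.
    peptide_area, protein_area = interface_split(1100.0, 0.17)
    print(f"peptide: {peptide_area:.0f} Å², protein only: {protein_area:.0f} Å²")
```

Under that assumption the peptide accounts for roughly 187 Å² and the protein-protein contacts for roughly 913 Å², which illustrates how much of an already small interface would depend on the bound signal.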

The striking positive electrostatic potential of the μ2 dimer may reflect an ability to interact with negatively charged moieties, including proteins (for example, the domain after the internalization signal in EGFR) or the head groups of negatively charged phospholipids such as phosphatidylserine. The planar face (Fig. 4D, top) would provide a large nonspecific ionic interaction with the membrane. This interaction would increase the strength of binding to membrane proteins containing appropriately positioned internalization signals, in a manner similar to proteins such as Src and HIV-1 gag (25), and may also contribute to recruiting AP2 complexes to the plasma membrane.

The novel structure of the μ2 subunit of the plasma membrane AP2 complexed with the FYRALM and DYQRLN peptides explains the specific binding of YxxØ internalization motifs and the absolute requirement for the motif to be in an extended β-strand conformation and for the tyrosine residue to be nonphosphorylated. The dimeric packing of the molecules in the crystal suggests that the strength and selectivity of binding of receptors may be enhanced by their binding as dimers to dimeric μ subunits.

  • * To whom correspondence should be addressed. E-mail: pre{at}


Statistics on data collection and phasing. Data collection values in parentheses are for the high-resolution shell.
