Antigen Recognition by Variable Lymphocyte Receptors


Science  26 Sep 2008:
Vol. 321, Issue 5897, pp. 1834-1837
DOI: 10.1126/science.1162484


Variable lymphocyte receptors (VLRs) rather than antibodies play the primary role in recognition of antigens in the adaptive immune system of jawless vertebrates. Combinatorial assembly of leucine-rich repeat (LRR) gene segments achieves the required repertoire for antigen recognition. We have determined a crystal structure for a VLR-antigen complex, VLR RBC36 in complex with the H-antigen trisaccharide from human blood type O erythrocytes, at 1.67 angstrom resolution. RBC36 binds the H-trisaccharide on the concave surface of the LRR modules of the solenoid structure where three key hydrophilic residues, multiple van der Waals interactions, and the highly variable insert of the carboxyl-terminal LRR module determine antigen recognition and specificity. The concave surface assembled from the most highly variable regions of the LRRs, along with diversity in the sequence and length of the highly variable insert, can account for the recognition of diverse antigens by VLRs.

In the lamprey and hagfish, the only surviving jawless vertebrates, variable lymphocyte receptors (VLRs) play the major role in recognition of foreign antigens (1, 2). In contrast to the variable, diverse, and joining gene segments (VDJs) of immunoglobulins in jawed vertebrates, the jawless vertebrates have solved the receptor diversity problem by somatic DNA rearrangement of diverse leucine-rich repeat (LRR) modules into incomplete vlr genes. The resulting mature vlr genes encode an N-terminal LRR capping region (LRRNT), the first LRR (LRR1), up to seven 24-residue variable LRRs (LRRVs) (3), a terminal or end LRRV (LRRVe), a connecting peptide (CP), a C-terminal LRR capping region (LRRCT), and a threonine/proline-rich stalk region that connects the protein to a glycosylphosphatidylinositol (GPI) anchor and a hydrophobic tail (Fig. 1A) (1, 2, 4).

Fig. 1.

Overall architecture of the VLR RBC36-ECD in complex with the H-trisaccharide. (A) Schematic diagram of RBC36. Regions from left to right: signal peptide (SP), N-terminal LRR (LRRNT), five variable LRRs (LRR1, LRRVs), connecting peptide (CP), C-terminal LRR (LRRCT), threonine/proline-rich stalk region, GPI anchor, and hydrophobic tail. (B) Ribbon diagram of RBC36-ECD in complex with H-trisaccharide. LRRNT, LRRs, and LRRCT are colored blue, green, and red, respectively. Carbons, nitrogens, and oxygens of the H-trisaccharide are colored yellow, blue, and red, respectively. Disulfide bridges are shown in orange. Green dotted lines represent hydrogen bonds; black dotted lines indicate hydrophobic effects. (C) View rotated 90° from (B) that highlights the continuous β sheet and the H-trisaccharide binding site on the concave surface.

From these somatic gene rearrangements, a potential repertoire of about 10¹⁴ unique VLRs has been estimated (2), which compares favorably with the equivalent diversity attainable through VDJ recombination in antibodies. Different numbers and combinations of LRR modules, coupled with amino acid sequence variation in the LRR segments, thereby contribute to VLR diversity. The LRR repeats form a curved solenoid, as in Toll-like receptors (TLRs) (5, 6), and its concave surface has been suggested as the antigen-binding site from evolutionary, sequence, and mutational analyses (2, 7, 8). Crystal structures of three unliganded hagfish VLRs with different numbers of LRRV modules have been determined (7), whereas antigen-binding specificity [erythrocyte H-trisaccharide (9) and Bacillus collagen-like protein of B. anthracis (BclA) (8)] has been reported only for lamprey VLRs. However, the mode of antigen recognition has not yet been determined in either system, nor has it been shown whether complementarity-determining region (CDR) equivalents are present in VLRs that would endow them with specificity and affinity for any given antigen, as for antibodies.

We determined the crystal structure of the VLR RBC36 ectodomain (ECD) in complex with the H-trisaccharide derived from the H-antigen of human blood group O erythrocytes at 1.67 Å resolution by molecular replacement, using our lamprey VLR2913 crystal structure [Protein Data Bank (PDB) ID 2R9U]. Lampreys were previously shown to produce high-titer agglutinins against the H-antigens of human O erythrocytes (10, 11). When lampreys were immunized with human blood group O erythrocytes, they elicited VLRs that recognize the dominant H-trisaccharide antigen on Chinese hamster ovary cells transfected with 1,2-fucosyltransferase (9). H-antigens contain the characteristic disaccharide α-L-Fucp-(1→2)-β-D-Galp-OR, where R is glycoprotein or glycolipid (12). The type II H-antigen trisaccharide, α-L-Fucp-(1→2)-β-D-Galp-(1→4)-β-D-GlcNAcp-OH, was used as the antigen in the crystal structure with the RBC36-ECD (Fig. 1, Fig. 2, A and B, and fig. S1) (13).

Fig. 2.

The H-trisaccharide binding site of RBC36. (A) VLR residues involved in recognizing the H-trisaccharide. After refinement, a 2Fobs − Fcalc electron density map was calculated and contoured at 2σ as a blue mesh around the H-trisaccharide. Colors are as in Fig. 1. (B) H-trisaccharide interaction with RBC36 including solvent molecules, modified from the ligand interaction calculation by the program MOE (33). O in a circle represents waters. Hydrogen bonds with RBC36 residues and solvent molecules are drawn with green and pale green lines, respectively. Trp204, which is important for stabilizing the galactose via hydrophobic and stacking effects, is shown in a green circle beside the galactose sugar ring (14). (C) Sequence variability plot for amino acid residues from LRRNT to LRRCT of known VLRs. Green bars on the bottom represent residues on the concave surface; the red bar in the LRRCT shows the location of the highly variable insert. (D) Sequence alignment of LRR modules of RBC36 (14). The green bar shows the residues on the concave surface. Blue and red asterisks represent residues forming the β sheet and side chains that face the concave surface, respectively. Letters on yellow, blue, and red backgrounds show conserved hydrophobic residues, asparagine residues, and residues in the highly variable insert, respectively. Key residues on the concave surface (Asp103, Asp152, and Gln153) for the H-trisaccharide interaction are shown as red letters; Trp204 in the highly variable insert is indicated by a black asterisk at the left. (E) The conformation of the LRRVe module highlights the tight packing of the conserved hydrophobic residues, with their van der Waals radii outlined in dots. The seven residues that form the concave surface are numbered from the N terminus to the C terminus of the LRR.

Lamprey RBC36-ECD (residues 22 to 238, lacking the N-terminal signal sequence) forms a horseshoe-shaped assembly that is more abbreviated and crescent-shaped relative to TLRs. This assembly consists of an LRRNT, an 18-residue LRR1, three LRRVs, an LRRVe, a CP, and an LRRCT, all of which adopt a right-handed solenoidal structure, except for LRRCT. The inner, concave surface is formed from eight β strands (two from LRRNT, five from LRRs, and one from CP), which assemble into a continuous β sheet. The convex (outer) surface is composed of the more diverse secondary structure elements, including loops of varying length, one α helix, and six 3₁₀ helices (Fig. 1). Lamprey RBC36-ECD contains five canonical LRR-signature motifs, xL₂xxL₅xxL₈xL₁₀xxN₁₃Q₁₄L₁₅xxL₁₈P₁₉xG₂₁V₂₂F₂₃D₂₄ [where L represents obligate hydrophobic residues, which for RBC36-ECD include leucine (most prevalent), isoleucine, or methionine, and N, Q, P, G, F, and D are conserved asparagine, glutamine, proline, glycine, phenylalanine, and aspartic acid residues, respectively] (Fig. 2D) (4, 14). The side chains of nine conserved residues in each LRR (at relative positions 2, 5, 8, 10, 13, 15, 18, 22, and 23) assemble within the solenoidal structure and form a tight hydrophobic core that laterally stabilizes the repeating LRR modules (Fig. 2E). In RBC36-ECD, LRRNT comprises residues 22 to 52, in which Thr32, Val33, and Asp34 initiate an antiparallel β strand that extends the continuous parallel β sheet, and LRRNT and LRRCT cover the exposed edges of the hydrophobic core of the solenoidal LRR structure, as observed in other LRR proteins, including TLRs (5, 6). In LRRNT, the characteristic four-cysteine motif (CxₙCxCxₙC) forms two sets of disulfides (Cys22 to Cys28 and Cys26 to Cys35), whereas in LRRCT, a similar motif (CxCxₙCxₙC) gives rise to disulfides Cys182 to Cys217 and Cys184 to Cys237 (Fig. 1).
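The 24-residue signature motif described above can be written as a pattern and checked against a candidate module. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, the test sequence is a made-up module rather than an RBC36 sequence, and strict matching is an assumption, since real LRR modules may deviate from the consensus at individual positions.

```python
import re

# Consensus as given in the text: "x" = any residue, "L" = obligate
# hydrophobic position, other letters = conserved residues.
CONSENSUS = "xLxxLxxLxLxxNQLxxLPxGVFD"

# Hydrophobic residues the text lists for RBC36-ECD at the "L" positions.
HYDROPHOBIC = "LIM"

# Relative positions (1-based) that the text assigns to the hydrophobic core.
CORE_POSITIONS = (2, 5, 8, 10, 13, 15, 18, 22, 23)


def consensus_to_regex(consensus: str = CONSENSUS) -> re.Pattern:
    """Translate the consensus string into a compiled regular expression."""
    parts = []
    for symbol in consensus:
        if symbol == "x":
            parts.append("[A-Z]")            # any residue (one-letter code)
        elif symbol == "L":
            parts.append(f"[{HYDROPHOBIC}]")  # Leu, Ile, or Met
        else:
            parts.append(symbol)              # conserved residue, matched literally
    return re.compile("".join(parts))


def matches_consensus(module: str) -> bool:
    """True if the whole module matches the strict consensus (illustrative only)."""
    return consensus_to_regex().fullmatch(module) is not None


def core_residues(module: str) -> str:
    """Residues at the nine core positions named in the text."""
    return "".join(module[p - 1] for p in CORE_POSITIONS)


# Made-up 24-residue module used only to exercise the pattern.
TOY_MODULE = "ALAALSTLKLDQNQLAALPAGVFD"

if __name__ == "__main__":
    print(matches_consensus(TOY_MODULE))
    print(core_residues(TOY_MODULE))
```

A tolerant variant would score the number of matching positions instead of requiring a full match; the strict form is used here only because it maps one-to-one onto the motif as written.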

After accounting for the protein, extra electron density on the concave surface remained, which corresponded to the H-trisaccharide antigen (fig. S2). Specificity between RBC36 and H-trisaccharide is mainly mediated by four hydrogen bonds (Fig. 2A) on the inner concave surface: between Asp103 OD2 and N-acetylglucosamine OAZ, between Asp152 OD1 and galactose O4′, between Asp152 OD2 and galactose O3′, and between Gln153 NE2 and fucose OAL (Fig. 2, A and B). Asp103 is located on LRRV2, and Asp152 and Gln153 on LRRVe.

With the concave surface of RBC36 firmly established as the antigen-binding site, we analyzed the variability in amino acids represented on this surface in other VLRs. From BLASTP searches (15) with RBC36, 24 VLR sequences were found with three LRRVs and sequence identity of >60%. The amino acid variation was higher on the concave surface of each LRR module (Fig. 2C, fig. S3, and table S2) and hence, to some extent, is analogous to the hypervariable regions (CDRs) in antibodies. In each canonical 24-residue LRR module, seven residues, xxL₈xL₁₀xx, are located on the concave surface and, of these, only the two obligate hydrophobic residues (L) face inward to form the hydrophobic core of the solenoidal structure (Fig. 2, D and E). Consequently, the other five residues could potentially contribute to antigen recognition, and correspond to the first, second, fourth, sixth, and seventh positions of this seven-residue segment in each LRR. Asp103, Asp152, and Gln153, which contribute significantly to the interaction between RBC36 and H-trisaccharide, represent the fourth residue of the LRRV2 and the sixth and seventh residues of the LRRVe concave surfaces. Eight other residues on the concave surface (His57, Tyr79, Thr106, Phe127, Cys129, Ala150, Tyr174, and Phe176) stabilize the H-trisaccharide interaction via 16 van der Waals contacts, as calculated with CONTACSYM (16). The carbohydrate antigen buries ∼303 Å² on the VLR, whereas the corresponding buried surface on the antigen is ∼246 Å² calculated with a 1.4 Å probe radius (17, 18), which is comparable to buried surfaces of haptens (∼150 to 350 Å²) with antibodies (19).
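The position bookkeeping in this paragraph can be sketched in a few lines. This is an illustration under stated assumptions: the seven-residue segment xxL₈xL₁₀xx is taken to span positions 6 to 12 of a 24-residue module, the three aligned modules are made-up sequences built so that only the exposed positions differ, and the count of distinct residues per column is a stand-in for whatever variability measure underlies Fig. 2C, which the text does not specify.

```python
# Positions 6-12 (1-based) of a 24-residue module, i.e. the segment x x L8 x L10 x x.
CONCAVE_SLICE = slice(5, 12)

# Within that seven-residue segment (1-based): residues free to face the antigen,
# and the two obligate hydrophobic residues that face the core.
EXPOSED_IN_SEGMENT = (1, 2, 4, 6, 7)
CORE_IN_SEGMENT = (3, 5)


def concave_segment(module: str) -> str:
    """Return the seven concave-surface residues of a 24-residue module."""
    if len(module) != 24:
        raise ValueError("expected a canonical 24-residue LRR module")
    return module[CONCAVE_SLICE]


def exposed_residues(module: str) -> list:
    """Residues at the five potentially antigen-facing positions."""
    segment = concave_segment(module)
    return [segment[i - 1] for i in EXPOSED_IN_SEGMENT]


def distinct_per_position(modules: list) -> list:
    """Number of distinct residues at each aligned position (a crude variability proxy)."""
    return [len(set(column)) for column in zip(*modules)]


# Made-up aligned modules; NOT sequences from the paper.
TOY_MODULES = [
    "ALAALSTLKLDQNQLAALPAGVFD",
    "ALAALHYLRLEWNQLAALPAGVFD",
    "ALAALFCLTLAYNQLAALPAGVFD",
]

if __name__ == "__main__":
    counts = distinct_per_position(TOY_MODULES)
    for position, count in enumerate(counts, start=1):
        print(position, count)
```

Applied to a real alignment, the same column-wise pass would show whether variation concentrates at module positions 6, 7, 9, 11, and 12, as the paragraph argues.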

Another key interaction with the H-trisaccharide is between Trp204 and the galactose. The Trp204 indole is stacked parallel to the sugar ring, as observed in other sugar-protein complexes (Fig. 1 and Fig. 2A) (20). Trp204 is located in the middle of LRRCT, where the VLR sequences are extremely diverse and a highly variable insert is often present (21). The variability plot (Fig. 2C) illustrates that highly variable inserts of 2 to 12 residues occur in LRRCT (fig. S3). In RBC36, a 10-residue insert forms a β hairpin and Trp204 is located at the end of the first β strand, prior to the β-hairpin turn (Fig. 1 and Fig. 2A). Superposition of the crystal structures of lamprey RBC36 and the three hagfish VLRs reveals not only high sequence variability, but also secondary structure variation in their inserts. The VLRB.59 eight-residue insert is a loop, but similar in overall shape to the RBC36 β hairpin, whereas the VLRA.29 three-residue insert points toward the horseshoe side rather than its concave surface; VLRB.61 has no insert (Fig. 3A).

Fig. 3.

Highly variable inserts of VLRs. (A) Crystal structures of lamprey RBC36 and three hagfish VLRs are superposed. Cα trace for different VLRs: RBC36 in green, VLRA.29 in blue (PDB ID 2O6Q), VLRB.59 in orange (PDB ID 2O6S), and VLRB.61 in magenta (PDB ID 2O6R), respectively. Highly variable inserts are drawn in cartoon representation. (B) Superposition of RBC36–H-trisaccharide complex and GpIbα-VWF A1 domain complex (PDB ID 1M10). Overall RBC36-ECD structure is rotated 180° vertically from (A) to highlight the comparison of the highly variable insert of RBC36 and the β switch of GpIbα. RBC36 is depicted as a green trace, GpIbα as an orange trace, VWF A1 domain as a surface representation in cyan, and the H-trisaccharide as in Fig. 1. The β hairpin of the highly variable insert of RBC36 and the β switch of GpIbα are shown in cartoon representation.

The RBC36 insert lies in close proximity to the concave β sheet, and its overall conformation is remarkably similar to the β switch in the C-terminal flank region of human glycoprotein Ibα (GpIbα) that interacts with von Willebrand factor (VWF) A1 domain (22) (Fig. 3B). A search for structural homologs of RBC36, using the DALI server (23), selected GpIbα (PDB ID 1M10) as one of the top three hits along with hagfish VLRB.59 (PDB ID 2O6S) and human Slit protein (PDB ID 2V9T), all of which are LRR-containing proteins. Interestingly, GpIbα and Slit also have crystal structures with their binding partners, in which their mode of interaction is very similar to the RBC36–H-trisaccharide complex where the same first, second, fourth, sixth, and seventh residues on the concave surface in each LRR module play the key role in ligand binding, but the insert in the Slit LRRCT does not contact its ligand (Fig. 3B and fig. S4). Considering that the secondary structures of the highly variable inserts of VLRs in the PDB are diverse (Fig. 3A), and the equivalent β switch of human GpIbα adopts a loop structure when GpIbα is crystallized by itself (22), these structural differences may also play an important role in antigen selection, recognition, and affinity. Conformational changes or isomerism in the inserts would also increase possible binding modes in much the same way as induced fit in the CDR loops of antibodies, especially for CDR H3 (24), or equivalent loops (α3, β3) in T cell receptors (25).

The concave molecular surface area of RBC36 is estimated to be ∼1720 Å² calculated with a 1.4 Å probe radius (17, 18), compared with a buried surface of ∼700 to 1000 Å² on average for the antigen-binding regions of immunoglobulins for proteins and other large antigens. Considering that the number of LRRVs in VLRs can be as many as seven and the concave surface area of one LRRV is about 220 Å², the total antigen-binding surface could extend to around 2600 Å² and could potentially accommodate binding sites for multiple antigens. However, this size could well be an overestimate as were the corresponding early predictions for the antigen-binding surfaces of antibodies (26), although the possibility of multiple paratopes in VLRs is certainly an intriguing concept.
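The extrapolation to ∼2600 Å² can be reproduced with simple arithmetic. The sketch below assumes one reading of the estimate: start from the measured RBC36 concave surface (three LRRVs) and add one LRRV-sized increment for each additional LRRV up to seven. The constant and function names are hypothetical.

```python
# Values quoted in the text (all in Å²).
RBC36_CONCAVE_AREA = 1720.0   # concave surface of RBC36, which has three LRRVs
AREA_PER_LRRV = 220.0         # approximate concave surface contributed by one LRRV

RBC36_LRRV_COUNT = 3          # LRRVs in RBC36
MAX_LRRV_COUNT = 7            # maximum number of LRRVs reported for VLRs


def extrapolated_concave_area(n_lrrv: int) -> float:
    """Concave surface estimate for a VLR with n_lrrv LRRVs (linear extrapolation)."""
    if not 0 <= n_lrrv <= MAX_LRRV_COUNT:
        raise ValueError("number of LRRVs must be between 0 and 7")
    extra_modules = n_lrrv - RBC36_LRRV_COUNT
    return RBC36_CONCAVE_AREA + extra_modules * AREA_PER_LRRV


if __name__ == "__main__":
    # Four additional LRRVs at about 220 Å² each on top of 1720 Å².
    print(extrapolated_concave_area(MAX_LRRV_COUNT))
```

The linear form treats every LRRV as contributing the same area, which is the simplification that the overestimate caveat in the text refers to.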

With few exceptions, five residues on the concave surface of each LRR module are available for antigen binding (Fig. 2E and Fig. 4A). Hence, we mapped these key residues for antigen recognition onto a coordinate system that corresponds to the LRR modules along the x axis and key residue positions on the concave surface along the y axis (Fig. 4B). This interaction matrix should also be useful when other VLR-antigen complexes are determined.
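A coordinate system of this kind can be sketched as a small lookup table. The example below is only an illustration: it places the three hydrogen-bonding residues whose module and position are stated in the text (Asp103, Asp152, Gln153), and leaves out the van der Waals residues because their matrix positions are not given here. The labels LRRV1 and LRRV3, the restriction to the five LRR modules, and the helper names are assumptions.

```python
# x axis: LRR modules of RBC36 (LRRV1 and LRRV3 are assumed labels).
MODULES = ["LRR1", "LRRV1", "LRRV2", "LRRV3", "LRRVe"]

# y axis: antigen-facing positions within the seven-residue concave segment.
POSITIONS = [1, 2, 4, 6, 7]

# Hydrogen-bonding residues with the module and position stated in the text.
HBOND_CONTACTS = {
    "Asp103": ("LRRV2", 4),
    "Asp152": ("LRRVe", 6),
    "Gln153": ("LRRVe", 7),
}


def build_matrix(contacts: dict, mark: str = "H") -> dict:
    """Return {position: {module: symbol}} with contacts marked and '.' elsewhere."""
    matrix = {pos: {module: "." for module in MODULES} for pos in POSITIONS}
    for residue, (module, pos) in contacts.items():
        if module not in MODULES or pos not in POSITIONS:
            raise ValueError(f"{residue} is outside the matrix")
        matrix[pos][module] = mark
    return matrix


def render(matrix: dict) -> str:
    """Format the matrix as aligned text, modules across and positions down."""
    lines = ["pos " + " ".join(f"{module:>6}" for module in MODULES)]
    for pos in POSITIONS:
        cells = " ".join(f"{matrix[pos][module]:>6}" for module in MODULES)
        lines.append(f"{pos:<3} " + cells)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(build_matrix(HBOND_CONTACTS)))
```

A second mark for van der Waals contacts could be added the same way once the positions of those residues are read from the alignment in Fig. 2D.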

Fig. 4.

The interaction matrix of VLRs. (A) Seven residues on the concave surface of each LRR module are shown as a main-chain stick model in the same view as Fig. 1C, and the Cα atoms are connected by black lines. Five residues of the seven on the concave face (1, 2, 4, 6, and 7) are available for antigen recognition and are connected laterally by black lines; the third and fifth residues face inward to the hydrophobic core and are connected by yellow lines. (B) The simplified interaction matrix of (A). Residues involved in hydrogen bonds or in van der Waals contacts are labeled in a circle or in a square, respectively.

Antigen recognition by antibodies in the vertebrate immune system is well documented and has revealed how the immunoglobulin (Ig) fold with its CDR loops can form a high-affinity binding site for virtually any antigen it encounters, whether natural or synthetic (24). Similarities and differences can now be assessed for antigen recognition by the Ig fold and the LRRs of VLRs, as well as TLRs. High sequence variability in the Ig fold is concentrated in CDRs H1, H2, H3, L1, L2, and L3, whereas that of VLRs is confined to the concave surface of each LRR module (Fig. 2C). In the Ig fold, a wide range of specificities and affinities for the antibody-combining sites is ensured not only by variability in amino acid composition, but also by insertions in the CDRs, especially in CDR H3 (27). However, VLRs contain few insertions on the concave surface of each LRR module, although diversity is attained by variation in the number (up to seven) and amino acid composition of the LRRV modules. The only insertion in VLRs is observed in the middle of LRRCT, which shows highly variable amino acid composition, length, and secondary structure. To what extent the highly variable insert in LRRCT contributes to the specificity, affinity, and shape of the antigen-binding site of VLRs awaits further VLR-antigen complex structures.

Recently, three crystal structures of TLR-ligand complexes—TLR4-MD2–Eritoran (28), TLR1-TLR2–lipopeptide (29), and TLR3-dsRNA (30)—have been determined. So far, only the binding mode of the TLR4-MD2 complex is similar to that of antigen recognition by VLRs, in that residues on the concave surface of the N-terminal and central domains of TLR4 interact with MD2. However, no interaction is seen between LRRCT of TLR4 and MD2 as observed between the highly variable insert in LRRCT of VLRs and antigens. Because we do not yet have sufficient VLR and TLR complex structures to make statistically significant conclusions, and the number of LRR modules in TLRs is much greater than in VLRs, it may be too early to infer evolutionary relationships between VLRs and TLRs.

The crystal structure of RBC36-ECD in complex with the H-trisaccharide has provided structural insight into how VLRs recognize their antigens and provides a basis for rational design and modification of other antigen-specific VLRs. This VLR-antigen structure sheds light on the adaptation and evolution of primordial LRR proteins into their more specialized roles in pathogen recognition (e.g., TLRs) by the mammalian innate immune system.

Supporting Online Material

Materials and Methods

Figs. S1 to S4

Tables S1 to S2


References and Notes
