Crystal Structure of Hemolin: A Horseshoe Shape with Implications for Homophilic Adhesion


Science  14 Aug 1998:
Vol. 281, Issue 5379, pp. 991-995
DOI: 10.1126/science.281.5379.991


Hemolin, an insect immunoglobulin superfamily member, is a lipopolysaccharide-binding immune protein induced during bacterial infection. The 3.1 angstrom crystal structure reveals a bound phosphate and patches of positive charge, which may represent the lipopolysaccharide binding site, and a new and unexpected arrangement of four immunoglobulin-like domains forming a horseshoe. Sequence analysis and analytical ultracentrifugation suggest that the domain arrangement is a feature of the L1 family of neural cell adhesion molecules related to hemolin. These results are relevant to interpretation of human L1 mutations in neurological diseases and suggest a domain swapping model for how L1 family proteins mediate homophilic adhesion.

Insects have developed highly efficient innate forms of immunity against invading microorganisms such as bacteria and fungi (1). In the giant silkmoth Hyalophora cecropia and the tobacco hornworm Manduca sexta, many proteins are up-regulated in larvae or pupae upon bacterial infection. Hemolin is present in low amounts in the hemolymph of naïve insects, but is highly induced upon bacterial infection, and is assumed to be an integral component of the insect immune response (2).

Hemolin is a member of the immunoglobulin superfamily (IgSF), containing four Ig-like domains (3). It shares significant sequence similarity with the first four domains of the IgSF portion of transmembrane cell adhesion molecules (CAMs) of the L1 family, whose extracellular regions consist of six IgSF domains followed by five fibronectin III repeats (4) [∼38% amino acid sequence identity between hemolin and the four NH2-terminal IgSF domains of neuroglian (3), the insect ortholog of mammalian L1]. L1 family members mediate homophilic and heterophilic adhesion events that facilitate neurite outgrowth and fasciculation. Mutations in the human L1 gene are found in a variety of neurological disorders (4, 5).

Protein phylogenetic analyses suggest that hemolin evolved from an L1-like ancestor, developing an immune system function independently of vertebrate members of the IgSF (6). Although its exact function in insect immunity remains elusive, it shares homophilic adhesion properties with related neural CAMs; for example, hemolin is found in a membrane form on hemocytes and can mediate homophilic adhesion (7). Secreted hemolin binds to hemocytes, inhibiting their aggregation (3, 8), perhaps by preventing homophilic interactions of the membrane form (7). Hemolin also binds to bacteria (3, 9) and to lipopolysaccharide (LPS) (10), a component of bacterial outer membranes.

We determined the 3.1 Å crystal structure of H. cecropia hemolin (Table 1) (11, 12). Hemolin is composed entirely of β-structure, consisting of four Ig-like domains (D1, D2, D3, and D4) that adopt the I-set folding topology (Fig. 1A) (13). The D2-D3 interface of hemolin includes a bound phosphate ion near two positively charged residues (D2 Arg153 and D3 Arg266) (Fig. 1B) and is primarily associated with a loop between D3 strands D and E that contains residues typically found in phosphate-binding sites (Fig. 2A legend) (14). The two arginines and the sequence of the D-to-E loop are conserved in hemolin sequences, but not in the related L1 family members (Fig. 2A). Phosphate binding may be a feature related to hemolin's interactions with negatively charged LPS (10), such that this site, which is in a particularly basic portion of the protein (Fig. 1B), could represent the binding site for phosphate groups of LPS.

Figure 1

(A) Structure of hemolin compared with other IgSF or IgSF-related structures (15) (ABE sheets are green; GFC sheets are purple). Bend angles (indicated with blue arrows between domains related by the angle) were calculated by determining the angle between the long axes of adjacent domains, approximated by ellipsoids calculated from the coordinates using the program Dom_angle (23). Arrows beside domain names indicate the NH2- to COOH-terminal directions. KIR and human growth hormone receptor (hGHR) were oriented based upon the superposition of their D1 domains upon the hemolin D2 domain. (B) Bound phosphate ion is shown on the molecular surface of hemolin (left) [colors highlight the electrostatic potential calculated by GRASP (12); negative potential is in red and positive potential is in blue] and on a 3.1-Å 2Fobs − Fc annealed omit electron density map (12) (right; contoured at 0.9σ).
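The bend-angle calculation described in this legend can be illustrated with a minimal sketch. It is not the Dom_angle program (23): it assumes that each domain's long axis can be approximated by the principal axis of its atomic coordinates, and it assumes a convention in which collinear tandem domains give 180° and domains folded back on each other give 0°. The function names and any coordinates passed in are hypothetical.

```python
import math

def long_axis(coords, iterations=200):
    """Unit vector along the longest principal axis of a set of 3D points."""
    n = len(coords)
    centroid = [sum(p[i] for p in coords) / n for i in range(3)]
    # 3x3 covariance matrix of the coordinates about their centroid
    cov = [[0.0] * 3 for _ in range(3)]
    for p in coords:
        d = [p[i] - centroid[i] for i in range(3)]
        for i in range(3):
            for j in range(3):
                cov[i][j] += d[i] * d[j] / n
    # Power iteration converges to the eigenvector with the largest eigenvalue
    v = [1.0, 0.7, 0.4]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def oriented_axis(coords):
    """Long axis, signed so that it points from the first to the last atom."""
    axis = long_axis(coords)
    direction = [coords[-1][i] - coords[0][i] for i in range(3)]
    if sum(a * b for a, b in zip(axis, direction)) < 0:
        axis = [-a for a in axis]
    return axis

def bend_angle(domain_a, domain_b):
    """Angle in degrees between two consecutive domains.

    Assumed convention: the axis of the first domain is reversed so that
    both axes point away from the hinge; 180 means a straight arrangement.
    """
    a = oriented_axis(domain_a)
    b = oriented_axis(domain_b)
    cos_theta = -sum(x * y for x, y in zip(a, b))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
    return math.degrees(math.acos(cos_theta))
```

Under this convention, passing the Cα coordinates of two adjacent domains (in NH2- to COOH-terminal order) returns a small angle for a sharp fold-back such as the one reported for D2-D3 and a value near 180° for rod-like arrangements.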

Figure 2

(A) Sequence alignments (3, 24). Numbers refer to H. cecropia hemolin. Locations of β-strands in hemolin are indicated above the sequences with letter names. Residues at the D2-D3 and D1-D4 interdomain interfaces (those that contribute more than 10 Å² of buried surface area to the interfaces) are indicated with an asterisk and are green if they are identical or chemically similar in L1 and a hemolin and neuroglian sequence, blue if they are identical or chemically similar in a hemolin and neuroglian sequence, and red if they are identical or chemically similar in a hemolin and L1 sequence. Many of the highlighted residues at the D2-D3 and D1-D4 interfaces are also conserved in axonin 1, a vertebrate axon surface protein with which hemolin shares significant sequence identity (28%) (7). Substituted residues in L1 mutants (4, 5) are indicated below the L1 sequence. Hemolin's interactions with the bound phosphate ion include the side chains of His264, Arg153, and Tyr243, and the main-chain nitrogens of Asn265, Arg266, Thr267, and Ser268. (B) Stereoview of the Cα backbone of hemolin. Highlighted residues at the D2-D3 and D1-D4 interfaces [color-coded as in (A)] are identical or chemically similar in hemolin and L1 family proteins. Cα atoms at positions corresponding to pathological mutations in human L1 (4, 5) are marked with a black sphere.

Table 1

Data collection, phasing, and refinement statistics. Purified hemolin (11) was crystallized (space group P2₁2₁2₁, two molecules per asymmetric unit) using macroseeding from protein solutions at ∼6 mg/ml in 50 mM phosphate, 0.15 M NaCl, and 1.8 M Na,K phosphate (pH 8.1). Native and heavy-atom derivative data sets were collected at room temperature on a Xentronics multiwire area detector mounted on a Siemens rotating anode generator. Crystals soaked in 1.6 M (NH4)2SO4 (Native I; a = 85.0 Å, b = 90.3 Å, c = 143.1 Å) diffract to higher resolution but are nonisomorphous with untreated crystals (Native II; a = 87.3 Å, b = 90.3 Å, c = 141.3 Å). Untreated crystals were used for MIR phase determination, and the data from (NH4)2SO4-soaked crystals were used for refinement. Data from crystals derivatized with xenon were collected as described (12). Data were processed with XDS and merged and scaled using CCP4 programs (12). Heavy-atom refinement and phasing were performed with the CCP4 version of MLPHARE to a figure of merit of 0.398 to 3.5 Å resolution. The MIR map was improved using NCS averaging and solvent flipping using the program Solomon (12). The program O (12) was used for all model building, and the model was refined as described (12). Statistics in parentheses refer to the highest resolution bin.


Although the individual hemolin domains resemble IgSF domains in other proteins, they are arranged into an unusual globular shape resembling a horseshoe. A sharp bend at the D2-D3 domain interface is responsible for the horseshoe shape, such that the almost linearly arranged D3-D4 segment folds back upon the D1-D2 segment, which is also almost linearly arranged. Hemolin's shape resembles the four-domain structures of T cell receptors or the Fab portions of antibodies rather than the “beads on a string” arrangement of domains in CD4, the only other single-chain four-domain IgSF protein of known structure (15). Unlike T cell receptors and Fabs, however, the interacting segments of hemolin are antiparallel to each other (Fig. 1A).

The angle relating the D2 and D3 domains of hemolin is the most acute interdomain angle in the available structures of IgSF or IgSF-related proteins. Specifically, hemolin D2 and D3 are related by an angle of 25°, whereas other bent IgSF structures show interdomain angles of 65° or more (Fig. 1A). As a result of the sharp bend between D2 and D3, the hemolin domains interact strongly in pairwise combinations: D1 with D4 and D2 with D3. A total of 1217 and 1382 Å² of surface area is buried (12) at the D1-D4 and D2-D3 interfaces, respectively; this is slightly more than is buried at the killer inhibitory receptor (KIR) D1-D2 interface (1076 Å²) (15), but less than is buried at the interfaces in an Fab [the VH-VL interface buries 1634 Å² and the CH1-CL interface buries 2051 Å² in the Fab (15) shown in Fig. 1A]. Both the D1-D4 interface of hemolin and the KIR interdomain interface are formed largely by contacts between the β-sheets containing strands G, F, and C (GFC sheets), whereas the ABED sheets of hemolin mediate the D2-D3 contact (Fig. 1A). The use of opposite faces to mediate the two lateral interdomain contacts is reminiscent of the organization of an Fab. By contrast to the extensive interactions between the paired hemolin domains, the adjacent hemolin interdomain interfaces (D1-D2 and D3-D4) bury little surface area: 576 and 280 Å², respectively, similar to surface areas buried between adjacent domains in elongated rod-like multidomain IgSF structures such as CD4, CD2, and VCAM-1 (∼400 to 950 Å²) (15). Thus in isolation, the D1-D2 and D3-D4 segments of hemolin resemble more elongated IgSF molecules such as CD4. Hemolin's unusual shape is therefore a direct result of the sharp D2-D3 bend and pairwise interactions between the D2-D3 and D1-D4 domains.

The significant sequence identity between hemolin and the first four domains of neuroglian ensures that the hemolin-related domains of L1 family members fold into tertiary structures resembling their counterpart hemolin domains (16). To determine if the four hemolin-related domains of L1 proteins share a common interdomain quaternary arrangement with hemolin (that is, an antiparallel interaction of D1 with D4 and D2 with D3), we compared residues at the hemolin D1-D4 and D2-D3 interfaces with their L1 and neuroglian counterparts. Many are identical or chemically similar in hemolin, neuroglian, and human L1 (Fig. 2, A and B). Sequence conservation is notable in the strand F and G regions of D1 and D4, the strand D to E region of D2, and the strand B to C region of D3, which are main areas of interdomain contacts. The length (but not the sequence) of the D2-D3 linking region is conserved between hemolin and the L1 family, allowing the third and fourth domains of L1 proteins to fold back and make antiparallel interactions with the first two domains. The conservation of critical residues at the hemolin and L1 interdomain interfaces justifies use of the hemolin structure as a model for the organization of the first four domains of L1 family proteins.

By using velocity sedimentation analytical ultracentrifugation to compare the shapes of hemolin and a soluble version of the related portion of Drosophila melanogaster neuroglian (Nrg-4D) (11), we also obtained experimental evidence that domains 1 through 4 of an L1 family member are arranged similarly to hemolin. Sedimentation coefficients for each protein were determined and used to calculate frictional coefficients (17). The resulting comparison of hemolin and Nrg-4D with elongated and globular proteins of similar molecular masses (17) supports the conclusion derived from the sequence data, corroborating that the solution structures of both hemolin and Nrg-4D are the horseshoe shape observed in the hemolin crystal structure. In addition, recent studies of homophilic adhesion mediated by Drosophila neuroglian are consistent with the hypothesis that the hemolin-related domains of neuroglian are folded into a shape requiring all four domains for structural stability and function. In these studies, the first four neuroglian domains were both necessary and sufficient to mediate homophilic adhesion when expressed at the surface of S2 cells, whereas single domains alone or molecules in which any single domain was deleted did not mediate significant adhesion (18). The final confirmation of the postulated structure of Nrg-4D awaits a crystallographic analysis, because the limited resolution of electron microscopic studies precludes detailed structural interpretations of interdomain arrangements (19).
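The conversion from a sedimentation coefficient to a frictional coefficient mentioned above can be sketched with the Svedberg relation, f = M(1 − v̄ρ)/(N_A s), and compared with the Stokes value f0 for an anhydrous sphere of the same mass. This is an illustrative sketch only, not the procedure of note (17): the default partial specific volume, solvent density, and viscosity are assumed typical values for a protein in water at 20°C, and the function names are hypothetical. A ratio f/f0 near 1.2 to 1.3 is typical of compact globular proteins, and larger values indicate a more elongated shape.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole

def frictional_coefficient(mass, s_svedberg, vbar=0.73, rho=0.998):
    """Frictional coefficient f (g/s) from the Svedberg relation.

    mass: molar mass in g/mol; s_svedberg: sedimentation coefficient in S.
    vbar (cm^3/g) and rho (g/cm^3) are assumed typical values.
    """
    s = s_svedberg * 1e-13  # 1 Svedberg = 1e-13 s
    return mass * (1.0 - vbar * rho) / (AVOGADRO * s)

def sphere_frictional_coefficient(mass, vbar=0.73, eta=0.01002):
    """Stokes f0 (g/s) for an anhydrous sphere of the same mass.

    eta is the solvent viscosity in poise (assumed: water at 20 C).
    """
    radius = (3.0 * mass * vbar / (4.0 * math.pi * AVOGADRO)) ** (1.0 / 3.0)
    return 6.0 * math.pi * eta * radius

def frictional_ratio(mass, s_svedberg, vbar=0.73, rho=0.998, eta=0.01002):
    """Ratio f/f0; values well above ~1.3 suggest an elongated molecule."""
    f = frictional_coefficient(mass, s_svedberg, vbar, rho)
    f0 = sphere_frictional_coefficient(mass, vbar, eta)
    return f / f0
```

For example, frictional_ratio(45000.0, 3.5) evaluates the ratio for a hypothetical 45-kD protein sedimenting at 3.5 S; these inputs are placeholders, not the measured values for hemolin or Nrg-4D.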

Having obtained sequence-based and experimental evidence that hemolin and the related domains of L1 proteins are structurally similar, we can use the hemolin structure as a first-order model to predict the structural effects of pathological missense mutations in this region of human L1. L1 mutations were previously mapped onto models of individual domains (5). The structural consequences of substitutions can now be interpreted, assuming a horseshoe shape for the hemolin-related L1 domains. Six of 13 mutations within the D1 to D4 region of L1 (all six of which affect residues that are identical or chemically similar in hemolin and human L1) fall at the predicted D2-D3 or D1-D4 interfaces (Fig. 2, A and B), consistent with the hypothesis that pairwise D1-D4 and D2-D3 interactions are important for the functions of L1 proteins in cell adhesion.

Structure-based models proposed for cell adhesion mediated by other proteins include head-to-head interactions, as postulated for CD2-related proteins (15), or formation of a zipperlike structure, as proposed for homophilic recognition by cadherins (20). These models were based on interactions observed in crystals, in which the millimolar protein concentration environment was assumed to reproduce weak adhesive interactions that would normally occur only at the cell surface. The packing in the hemolin crystals does not suggest any obvious mechanism for homophilic adhesion that would be induced solely by high concentrations of protein (21). However, the significance of the antiparallel packing of tandem Ig-like domains observed in hemolin and predicted for L1 family members may lie in the potential ability of such structures to form oligomers that could function in homophilic adhesion events. Hemolin dimers and oligomers with the same interdomain contacts as observed in the monomeric version that was crystallized could form by the mechanism of three-dimensional (3D) domain swapping, which has been documented for a variety of protein structures as occurring when a domain from a monomeric protein is replaced by the same domain from an identical protein chain (22). A domain-swapped dimer of hemolin or an L1 family member would consist of intermolecular (rather than intramolecular) D2-D3 and D1-D4 pairs, which could be formed between open (straight) monomers with a minor repositioning of the loop joining the D2 and D3 domains (Fig. 3). Repositioning of the D2-D3 loop would also allow formation of higher-order domain-swapped multimers of hemolin-related proteins.

Figure 3

Schematic representation of a mechanism for homophilic adhesion mediated by 3D domain swapping (22) in hemolin and related proteins. On the left, the four NH2-terminal domains of an L1 protein or the hemolin monomer (color coded interfaces as in Fig. 1A) are depicted in the closed (bent) conformation. The black line indicates the remaining Ig-like, fibronectin type III, and transmembrane domains in the case of the L1 proteins (4) or attachment to the membrane by posttranslational modification in the case of hemolin (7). Transient formation of an open form would lead to formation of domain-swapped dimers (middle) or multimers (right) through homophilic interactions with open proteins on another cell. This model predicts that formation of a ribbon of domain-swapped proteins is more likely on a membrane than in solution: on a membrane where molecules are tethered to a surface, a ribbon of domain-swapped proteins could be nucleated through interactions of neighboring molecules with open proteins, rationalizing why soluble hemolin and Nrg-4D are monomeric (21) and soluble versions of homophilic IgSF-containing CAMs are generally monomeric. An antiparallel interaction of IgSF domains may be a general mechanism for homophilic adhesion mediated by neural CAMs in addition to proteins in the L1 family. For example, recent studies of homophilic adhesion mediated by N-CAM are consistent with an interaction of its five IgSF domains to create antiparallel D1-D5, D2-D4, and D3-D3 pairs (25), resulting in N-CAM dimers or multimers similar to those depicted in this figure.

Formation of domain-swapped multimers of hemolin and L1 family proteins is an attractive model for a structural mechanism of homophilic adhesion between two cell membranes, which can be stated as follows: (i) At the cell surface, the closed (bent) and open (straight) forms of the protein are in equilibrium favoring the bent form. (ii) Transient formation of the open (straight) form (perhaps in response to an extracellular signal) results in pairing with an open (straight) protein on another cell. (iii) Formation of additional domain-swapped dimers is facilitated by the close proximity of the adhering cell membranes. Alternatively, unpaired domains of the open (straight) proteins could nucleate formation of a ribbon of domain-swapped proteins, resulting in cell adhesion. Structural features of hemolin and L1 family members that are relevant for this model are that the D2-D3 linker is long enough to allow repositioning to form dimers or oligomers, or both, and that the D1-D4 and D2-D3 domain interfaces are fairly hydrophilic (Fig. 2A) and could therefore tolerate transient formation of open (straight) proteins before oligomerization.

The hemolin structure reveals a new arrangement of IgSF domains that may be shared by related portions of neural CAMs. The antiparallel arrangement of tandem IgSF domains observed in hemolin and hypothesized to occur in L1 family members suggests a testable model for the mechanism of homophilic adhesion.

  • * Present address: Department of Molecular Biophysics, Center for Chemistry and Chemical Engineering, Post Office Box 124, Lund University, S-221 00 Lund, Sweden.

  • Present address: AFMB, CNRS UPR 9039, 31 Chemin Joseph Aiguier, 13402 Marseille Cedex 20, France.

  • Present address: Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA.

  • § To whom correspondence should be addressed. E-mail: bjorkman{at}

