Review

The Extracellular Matrix: Not Just Pretty Fibrils


Science  27 Nov 2009:
Vol. 326, Issue 5957, pp. 1216-1219
DOI: 10.1126/science.1176009

Abstract

The extracellular matrix (ECM) and ECM proteins are important in phenomena as diverse as developmental patterning, stem cell niches, cancer, and genetic diseases. The ECM has many effects beyond providing structural support. ECM proteins typically include multiple, independently folded domains whose sequences and arrangement are highly conserved. Some of these domains bind adhesion receptors such as integrins that mediate cell-matrix adhesion and also transduce signals into cells. However, ECM proteins also bind soluble growth factors and regulate their distribution, activation, and presentation to cells. As organized, solid-phase ligands, ECM proteins can integrate complex, multivalent signals to cells in a spatially patterned and regulated fashion. These properties need to be incorporated into considerations of the functions of the ECM.

All cells make close contact with the extracellular matrix (ECM), either continuously or at important phases of their lives (for instance, as stem or progenitor cells or during cell migration and invasion). The ECM is well known for its ability to provide structural support for organs and tissues, for cell layers in the form of basement membranes, and for individual cells as substrates for migration. The role of the ECM in cell adhesion and signaling to cells through adhesion receptors such as integrins has received much attention (1–3), and, more recently, mechanical characteristics of the matrix (stiffness, deformability) have also been recognized to provide inputs into cell behavior (4, 5). Thus, ECM proteins and structures play vital roles in the determination, differentiation, proliferation, survival, polarity, and migration of cells. ECM signals are arguably at least as important as soluble signals in governing these processes, and probably more so. Here, I will emphasize different contributions of the ECM and ECM proteins to cell and tissue behavior, namely their roles in binding, integrating, and presenting growth factor signals to cells.

The Complex Domain Structures of ECM Proteins

There are hundreds of ECM proteins encoded in vertebrate genomes. Many of the genes are ancient, such as those composing the basement membrane toolkit (type IV collagens, laminins, nidogen, perlecan, and type XV/XVIII collagen), which is found in most metazoa, and one can argue that basement membranes were crucial to the evolution of metazoa (6). However, many vertebrate ECM proteins and genes evolved much more recently, during evolution of the deuterostome lineage, and that expansion includes not only elaboration of preexisting families (for example, laminins and collagens) but also novel proteins [e.g., fibronectins (FNs) and tenascins]. What purposes are served by this proliferation of ECM proteins? ECM proteins are large and complex, with multiple distinct domains, and are highly conserved among different taxa (Fig. 1). It is not necessary for proteins to be large or complex to generate strong, stable fibrils—intermediate filament proteins and type I collagen provide notable examples to the contrary. So why are most ECM proteins so large, complex, and conserved? Many ECM proteins have dozens of individually folded domains, but in most cases, we understand the functions of only a few of them. What is the purpose of the other domains? The conserved domains are arranged in specific juxtapositions, sometimes controlled by highly regulated alternative splicing. The clear implication is that the specific domains and architectures of ECM proteins contain information of biological importance and evolutionary value. This article will explore that hypothesis in light of recent discoveries concerning representative ECM proteins.

Fig. 1

The complex domain structures of ECM proteins. Representative ECM proteins illustrating multiple, independently folded domains, which occur in differing combinations in different ECM proteins through exon shuffling during evolution. Domain structures were generated with SMART (36) and edited for details of individual proteins. (A) Fibronectin. Encoded by a single gene but alternatively spliced at three regions [blue circles and box and V (variable) segment] to generate 12 proteins in rodents and 20 in humans. FN3 domains are widespread in ECM proteins. Binding sites for other matrix proteins are marked. The heparan sulfate–binding site can interact with PGs or with syndecan, an integral-membrane PG. Integrin-binding sites; RGD (indicated by an asterisk) and LDV (Leu-Asp-Val, indicated by a pound sign). FN is a proangiogenic molecule, whose function depends on both the RGD site and the two alternatively spliced FN3 domains (37, 38). FN also binds the proangiogenic growth factors VEGF and HGF (16, 17). (B) Fibrillin-1. Fibrillins include EGF-like domains, found in many ECM proteins, as well as TB (TGFβ-binding, denoted by T) and hybrid (H) domains, specific to fibrillins and LTBPs (21, 22). Binding sites for other matrix proteins and growth factors are marked. (C) LTBP-1. Four-gene family with structures related to fibrillins. Known binding sites for TGF-β/LAP latent complex (SLC, blue), fibrillin, and FN are marked. RGD (asterisk) sequences in fibrillins and LTBPs may bind integrins. (D) Thrombospondin-1 (TSP-1). TSPs contain TSP1 repeats (also found in other ECM proteins), EGF-like repeats, and a VWC domain, known in other proteins to bind BMPs. TSP3 repeats (purple) and C-terminal domains are unique to TSPs and bind multiple Ca++ ions. The RGD (asterisk) sequence is known to bind to integrins. TSPs 1 and 2 have the structure shown, and both have antiangiogenic activity located in the TSP1 repeats, which bind to the CD36 receptor (39).
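The isoform counts quoted in the Fig. 1A caption (12 in rodents, 20 in humans) can be reproduced by simple counting. The sketch below is illustrative only: it assumes each of the two alternatively spliced FN3 domains is either included or skipped, and that the V segment takes 3 forms in rodents and 5 in humans. Those V-segment counts are an assumption not stated in this text, and the function and variable names are hypothetical.

```python
from itertools import product


def count_isoforms(v_variants: int) -> int:
    """Count splice combinations for one fibronectin gene.

    Two alternatively spliced FN3 domains are each modeled as
    included or skipped; the V segment is modeled as taking one of
    `v_variants` forms (an assumed number, not given in the text).
    """
    first_fn3 = ("included", "skipped")
    second_fn3 = ("included", "skipped")
    v_segment = range(v_variants)
    # Every combination of the three independently spliced regions
    return sum(1 for _ in product(first_fn3, second_fn3, v_segment))


rodent_isoforms = count_isoforms(3)  # assumed 3 V-segment forms
human_isoforms = count_isoforms(5)   # assumed 5 V-segment forms
print(rodent_isoforms, human_isoforms)
```

Under these assumptions the counts are 2 × 2 × 3 and 2 × 2 × 5, which match the totals given in the caption.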

ECM Proteins and Growth Factor Signaling

One long-standing idea is that the ECM binds growth factors, which is certainly true. Many growth factors [e.g., fibroblast growth factors (FGFs) and vascular endothelial growth factors (VEGFs)] bind avidly to heparin and to heparan sulfate, a component of many ECM proteoglycans (PGs). Hence, a generally held view is that heparan sulfate PGs act as a sink or reservoir of growth factors and may assist in establishing stable gradients of growth factors bound to the ECM; such gradients of morphogens play vital roles in patterning developmental processes. It is also often proposed that growth factors can be released from the ECM by degradation of ECM proteins or of the glycosaminoglycan components of PGs. Those models place the ECM in a distal role, acting as localized reservoirs for soluble growth factors that will be released from the solid phase to function as traditional, soluble ligands. However, some growth factors actually bind to their signaling receptors with heparan sulfate as a cofactor. The binding of FGF to its receptor (FGFR) depends on a heparan sulfate chain binding simultaneously (7), and transforming growth factor–β (TGF-β) ligands bind first to integral-membrane PGs that “present” these ligands during signaling (8); effectively they act as solid-phase ligands. Such phenomena may well be more widespread than the few, well-studied examples that are currently known. There are also increasing numbers of examples of growth factors binding to ECM proteins themselves, without the involvement of glycosaminoglycans, supporting the notion that the presentation of growth factor signals by ECM proteins is an important part of ECM function.

There are several related concepts that need to be kept separate in thinking about and analyzing functions of the ECM in signaling to cells. First, standard ECM receptors, such as integrins and discoidin domain tyrosine kinase receptors, are themselves signal transduction receptors. Their ligands are specific domains and motifs embedded in the ECM proteins, and ECM-integrin interactions lead to signal transduction responses that are at least as complex and important as those triggered by soluble ligands such as EGF, platelet-derived growth factor, and VEGF (1–3). Second, and less clearly, there are numerous reports of “cross talk” and “synergy” between signaling by integrins and by various growth factors (9). In most cases, it is uncertain whether such cross talk involves (i) membrane-proximal interactions or (ii) cooperation in the downstream signal transduction pathways. Another concept is that intrinsic domains within ECM proteins might act as ligands for canonical growth factor receptors. This suggestion arose from the observation that laminin contains multiple EGF-like domains, as do many ECM proteins [e.g., laminins, tenascins, thrombospondins (TSPs), fibrillins; see Fig. 1], which might bind to EGF receptors and signal as solid-phase ligands (10). EGF-like domains from laminin (11, 12) or tenascin (13) presented as soluble ligands can bind to EGFR and modulate its signaling, and it is often hypothesized that fragments of ECM proteins can be released by proteolysis (for instance, by matrix metalloproteases) and act as soluble ligands, similar to the idea that matrix-bound growth factors can be released by ECM degradation. In both cases, the ECM acts as a reservoir of growth factors (bound or intrinsic), which can be released as soluble factors to bind their receptors.
However, the interesting idea that intrinsic growth factor–like ligands can act from the solid-phase deserves more intensive investigation and careful experimental distinction from alternatives such as release of bound or intrinsic ligands. We will explore this idea and the related concept that ECM proteins bind and present growth factors as organized solid-phase ligands.

Growth Factor Binding to ECM Proteins

There is increasing evidence for specific, direct binding of growth factors to ECM proteins (13, 14). Both FN and vitronectin bind hepatocyte growth factor (HGF) and form complexes of Met (the HGF receptor) and integrins (the ECM receptors), leading to enhanced cell migration (15). Similarly, VEGF binds to specific FN type III (FN3) domains in both FN and tenascin-C, and these associations promote cell proliferation (16, 17). In the case of the FN-VEGF binding, the effect on proliferation requires the binding sites for integrins and VEGF to be in the same molecule, suggesting a requirement for juxtaposition of the two receptors (integrin α5β1 and VEGFR2), rather than merely downstream cross talk (16). FN3 domains are prevalent in many ECM proteins and membrane receptors, and their potential for binding soluble factors needs further investigation.

Other widely distributed ECM domains can bind and present growth factors. Drosophila collagen IV binds Dpp [a bone morphogenetic protein (BMP) homolog] and enhances its interactions with BMP receptors; this collagen-BMP interaction is crucial in regulating the dorsoventral axis and the numbers of germinal stem cells in the ovary, both processes that are dependent on gradients of Dpp (18). Collagen IV is a universal constituent of basement membranes, and the key Dpp-binding motif identified in the C-terminal domains of the two Drosophila collagen IV subunits is highly conserved across phyla, suggesting that this interaction may be important in other contexts (18). Another instructive example is collagen II, the major collagen of cartilage that, near its N terminus, contains a chordin-like VWC domain that binds TGF-β1 and BMP-2, two chondrogenic growth factors. The VWC domain is alternatively spliced, included in prechondrogenic mesoderm and early developing cartilage but excluded in mature cartilage (19). The VWC or chordin domain is found in many ECM proteins and in known regulators of BMPs, and it typically acts as a negative regulator of their functions (20). These examples illustrate the capacity of conserved elements of ECM proteins to regulate, either positively or negatively, the functions of diffusible morphogens of the BMP family.

TGF-β Regulation by ECM Binding

The regulation of TGF-β signaling by ECM proteins is one of the best developed examples of this capacity. Each of the precursors of TGF-β isoforms 1 to 3 is cleaved by a furin protease to the mature TGF-β and its propeptide, known as latency-associated peptide (LAP). The LAP and TGF-β remain noncovalently associated in a complex called the small latency complex (SLC), and in this form, TGF-βs are inactive (21, 22). The LAPs then S-S bond to one of the latent TGF-β–binding proteins (LTBPs) to form large latent complexes (LLCs), and many cells secrete TGF-βs already assembled into such complexes. In turn, the LTBPs bind to other ECM proteins (including fibrillins and FNs), thereby incorporating the different TGF-β isoforms into extracellular matrices in latent form (Figs. 1 and 2A). LTBP-mediated incorporation into the ECM is necessary for subsequent effective activation of TGF-βs. There are several mechanisms for activation (Fig. 2B), including degradation of ECM proteins such as fibrillin or LTBPs. Activation can also occur by cleavage or conformational change in LAP, exposing or releasing the TGF-βs to bind and activate their receptors (21, 22). Another ECM protein, TSP, can activate TGF-βs by binding and dissociating LAP or by activating metalloproteases; mice lacking TSP-1 develop pneumonia because of reduced levels of active TGF-β in their lungs (23). Yet another mechanism for activation of TGF-βs involves αvβ6 and αvβ8 integrins, which bind to Arg-Gly-Asp (RGD) sequences in LAP1 and LAP3 (24, 25). αvβ8 integrin appears to cooperate with metalloproteases to release TGF-β. However, αvβ6 integrin activates TGF-β without any requirement for proteolysis. Instead, it binds to LAP and, in the presence of mechanical strain between the cells expressing the integrin and the ECM to which the SLC is attached, deforms LAP to expose the associated TGF-β (Fig. 2B). 
The activated TGF-β is not released in soluble, diffusible form but appears to act only at short range, perhaps as a bound solid-phase ligand. Thus, the binding, sequestration in latent form, and subsequent activation of TGF-βs all intimately involve a variety of ECM proteins (Fig. 2). The whole assemblage acts like a regulated machine incorporating both negative and positive regulation; incorporation of TGF-β into the matrix anchors and localizes the growth factor in a latent form, which can subsequently be locally activated by proteolysis or by mechanical strain (21–25). Mutations in many of the ECM proteins, integrins, and the RGD sites in the LAPs confirm the relevance of these interactions in vivo.

Fig. 2

ECM interactions regulating TGF-β. (A) Incorporation into the ECM. Cleavage by furin protease of Pro–TGF-β to the small latent complex (SLC) comprising TGF-β and LAP (blue) is inhibited by emilin, an ECM protein. The SLC binds to LTBP, via S-S bonding to a TB domain, to form the LLC, in which form the TGF-β is inactive (21, 22). LTBP then binds to fibrillin and to FN (see Fig. 1 for specific interaction domains). Fibulins compete for LTBP binding to fibrillin (40). Fibrillin binds to preexisting FN fibrils or assembles into microfibrils, and both fibrillin and FN undergo further homomeric and heteromeric interactions within the ECM. (B) Activation of ECM-bound latent TGF-β. TGF-β can be activated by proteolysis of the ECM proteins and/or of LAP or directly by thrombospondin (see text). TGF-β can also be activated by mechanical strain (large green arrow). This strain arises from cytoskeletal force applied through αvβ6 integrin, which binds to an RGD site in LAP and requires attachment of the TGF-β/LAP complex through LTBP to the FN-rich matrix, which, in turn, is attached via α5β1 integrin to other cells. Fibrillin might also be attached to cells via integrins.

Further analyses of LAPs, LTBPs, and fibrillins show that the TGF-β–LAP complex binds to LTBP-1 through a specific TGF-binding (TB) domain and adjacent EGF domains (Fig. 1). TB domains, as well as hybrid domains (hybrids of TB and EGF domains), are unique to fibrillins and LTBPs, and there are several in each of those proteins, suggesting that they may be able to bind other BMP family members (Fig. 1). Indeed, proBMP-7 can bind to fibrillin-1 in an N-terminal region containing a hybrid and a TB domain (26). Furthermore, fibrillin-2 and BMP-7 mutations interact in causing syndactyly and polydactyly in mice (27), and a related human disease, congenital contractural arachnodactyly, arises from mutations in fibrillin-2 (28, 29). Other functionally important interactions between members of the TGF/BMP and LTBP/fibrillin families probably remain to be discovered. The interactions of different LTBPs and fibrillins with diverse TGF/BMP family members potentially target different signals to different locations.

The implications of ECM-based regulation of TGF-β function for human disease have recently become abundantly clear in the case of Marfan syndrome, a genetic disease resulting from mutations in the gene for fibrillin-1 (28, 29). Like many other genetic diseases whose target genes encode ECM proteins, this disease is associated with defective assembly of ECM components—in this case, the microfibrils of which fibrillins are components. The phenotype was originally attributed to mechanical consequences of these structural defects. However, the known associations of fibrillins with LTBPs suggested that activation of TGF-βs might also play a role. In mouse models of Marfan syndrome, activation of TGF-β is markedly increased, and many of the phenotypic consequences of mutations in fibrillin-1 can be ameliorated by TGF-β antagonists, an insight that already has clinical applications (28, 29).

ECM Proteins as Localized, Multivalent Signal Integrators

Thus, discrete domains in ECM proteins can bind and regulate functions of canonical growth factors. Many such domains are found in multiple ECM proteins in different combinations and arrangements, and presumably, many more ECM/growth factor interactions remain to be discovered. Other domains and motifs in these ECM proteins have the potential to bind directly to cell surface–adhesion receptors such as integrins. At the very least, the coexistence in the same ECM proteins of sites for cell adhesion and binding sites for growth factors concentrates the growth factors close to their own cell surface receptors. Thus, localization of growth factors at the cellular level by binding to the ECM can localize their signaling, and binding of growth factors to the ECM probably contributes to establishment of stable gradients. According to this model, morphogen gradients are composed jointly of soluble, diffusible factors and the ECM—and both are necessary. ECM-bound growth factors could be released locally or presented as complexes still bound to the ECM proteins; as mentioned earlier, there is also the potential (as yet unproven) for specific intrinsic domains in ECM proteins (such as EGF-like domains) to bind directly to growth factor receptors.

ECM proteins are highly conserved, not only in the sequences of specific domains but also in the arrangements of those domains. Furthermore, specific domains are often inserted or omitted by regulated alternative splicing, thus changing the complement of domains. This could alter the binding of specific growth factors, as in the case of the VWC domain in type II collagen (19), or interactions with cell surface receptors. In the case of agrin, inclusion of two small exons confers on agrin the ability to bind to heparan sulfate and dystroglycan and greatly enhances the clustering of acetylcholine receptors (30). ECM proteins can also synergize with growth factors in affecting cell proliferation and migration (9). Although such synergy does not in principle require juxtaposition, experiments on VEGF binding by FN show that the synergy requires the binding sites for integrins and VEGF to be coupled in the same molecule—presenting them as two separate, substrate-bound fragments of FN does not suffice (16). If such proximity is important, ECM molecules, by virtue of their ordered-domain organization, could act to organize complexes of receptors in the plane of the membrane. Such complexes could enhance membrane-proximal regulation among the receptors and promote integration of the signals transduced (Fig. 3). An instructive parallel can be found in the clustering of immunoregulatory receptors in immunological synapses [which also involve cross talk among integrins and other receptors (31, 32)]. Immunological synapses have substructure: Different receptors occupy different zones within the synapse. ECM-mediated clusters could have highly detailed substructure, and the juxtaposition of different receptors could be driven by the arrangement of domains in the ECM protein at a resolution of several nanometers. 
One could think of ECM proteins and their associated partners (growth factors and other ECM proteins) as solid-phase growth factors metaphorically playing chords, in contrast with soluble growth factors that one could view as playing single notes (Fig. 3).

Fig. 3

Multidomain interactions of ECM proteins with cells. The example shown is FN (41). Multiple domains are known to bind to integrins, other ECM proteins, and growth factors, as shown. Integrins α5β1 and α4β1 bind, respectively, to RGD and LDV motifs; heparan sulfate chains of syndecan (purple/blue) bind to FN3-13 as does VEGF. Evidence suggests that VEGF (V, yellow) signals through its own receptor (VEGFR2) more effectively when bound to FN (16). The same is proposed here for HGF (H) and its receptor (Met, pink). As shown in Figs. 1 and 2, fibrillin binds to an N-terminal region of FN and, in turn, binds LTBP, which recruits TGF-β in a latent complex with LAP (blue crescent). αvβ6 integrin can bind an RGD site in LAP, activating TGF-β, so that it can bind its own receptors (orange). The proposal is that FN organizes and integrates all these signals at two levels. First, by recruiting growth factors to the ECM, FN localizes those signals at the cellular level. Second, the close juxtaposition of the domains in FN brings the different receptors together into an organized submicron patch in the cell surface membrane. Each domain is 2 to 4 nm in diameter, and the entire FN subunit shown is 60 to 70 nm long, so the receptors will be brought into close apposition such that their signals provide complex, integrated information to the cell—metaphorically generating “chords” and “melodies” in contrast with the “single notes” generated by each receptor. FN is essential for angiogenesis, and most of the bound receptors and ligands have been shown to play roles in angiogenesis. This model suggests that FN and its associated ECM proteins orchestrate and integrate these signals. In addition, alternatively spliced domains of FN (blue circles; see Fig. 1A) are also necessary for proper vascular development, and it is a reasonable hypothesis that they introduce additional ligands and/or receptors into the mix.

The very nature of the ECM imposes spatial context on the signaling. Cells are often polarized by their associations with the ECM—the basement membranes to which epithelial sheets attach define the base and polarity of the cells and confer ability to respond to soluble growth factors such as EGF. The deformability of the ECM also affects the responses of cells (24, 33, 34). ECM molecules are flexible and extendable, and mechanical tension can uncover cryptic sites within them (35). Such mechanically exposed cryptic sites could bind additional cell surface receptors or growth factors. Mechanical extension or the inclusion or exclusion of alternatively spliced domains could also alter the physical relations among other domains, thus affecting the composition and spatial arrangement of the hypothesized organized patches of receptors.

Implications for Future Research

The ideas explored here need further experimental tests. There are relatively few well-documented examples of specific growth factor binding by domains in ECM proteins, but this possibility could be readily investigated. There are even fewer cases where it is clear whether ECM-bound growth factors need to be released to soluble form or can act as solid-phase ligands. The proposition that intrinsic domains of ECM proteins can directly affect canonical growth factor receptors, either as solid-phase ligands or as locally released soluble ligands, needs more study. The idea that specific arrangements of domains confer important information can be tested. The possible effects of mechanical strain on exposure of cryptic binding sites for growth factors, receptors, or other ECM proteins are just beginning to be explored. The nature of ECM-induced receptor complexes in the membrane can be investigated by methods such as single-molecule tracking, fluorescence energy transfer methods, correlation microscopy, high-resolution electron microscopy, and chemical cross-linking. The effects of regulated alternative splicing of ECM proteins on all of these questions and the implications of the diversity within families of proteins such as LTBPs and fibrillins need to be investigated further.

The ECM is a fundamental component of the microenvironment of cells and has been substantially expanded during the evolution of vertebrates. Some of that elaboration has contributed to structural components such as bones and teeth, but it is evident that this is only one role of the ECM. The ECM provides much more than mechanical support and a locus for cell adhesion, with potential roles in basement membranes, stem cell niches, and tumors. All epithelial cells are in association with basement membranes for at least part of their lives, and many stem cell niches include the ECM. ECM composition and organization undergo radical alterations in cancer and could affect survival, proliferation, and other properties of both tumor and stromal cells. Ever since McKusick’s initial cataloguing of a diverse set of genetic diseases affecting the ECM more than 50 years ago, it has been implicitly assumed that the pathological consequences were a direct result of defects in ECM assembly. Although those defects do exist, and no doubt contribute, in Marfan syndrome and related diseases, many phenotypic consequences are indirect effects of dysregulation of TGF-β signaling consequent on the ECM defects. Structural defects are difficult to treat in the absence of gene therapy or stem cell therapies, but growth factor signaling offers simpler and more accessible targets for intervention. Further investigations of the roles of ECM proteins in regulating signaling events should yield additional leads.

References and Notes

  1. I thank A. Naba and K. Certel for constructive criticisms of the text and figures, and I gratefully acknowledge support from the Howard Hughes Medical Institute and NIH.