An Analytical Solution to the Kinetics of Breakable Filament Assembly


Science  11 Dec 2009:
Vol. 326, Issue 5959, pp. 1533-1537
DOI: 10.1126/science.1178250


We present an analytical treatment of a set of coupled kinetic equations that governs the self-assembly of filamentous molecular structures. Application to the case of protein aggregation demonstrates that the kinetics of amyloid growth can often be dominated by secondary rather than by primary nucleation events. Our results further reveal a range of general features of the growth kinetics of fragmenting filamentous structures, including the existence of generic scaling laws that provide mechanistic information in contexts ranging from in vitro amyloid growth to the in vivo development of mammalian prion diseases.

Molecular self-assembly is the basis of phenomena ranging from the construction of materials for nanotechnology (1) to the formation of molecular machineries within living cells (2). The assembly of these frequently complex and highly intricate structures typically depends on a series of individual steps that are inherently simple and are therefore amenable in principle to a quantitative analysis based on physical principles. An important class of molecular structures that emerges from the self-assembly of simpler components is that of filamentous assemblies of biological macromolecules. Many proteinaceous aggregates of this type, which are increasingly linked with normal and aberrant biological processes (2), form through a nucleation mechanism followed by a self-templated growth where the ends of existing filaments recruit soluble molecules into aggregates that can themselves multiply through secondary nucleation processes such as fragmentation (Fig. 1A).

Fig. 1

Kinetics of self-templated aggregation. (A) Growth through nucleation (1), elongation (2), and fragmentation (3) leads to sigmoidal kinetic curves (B) for the mass concentration of fibrils as a function of time. Dashed blue line, first moment M(t) computed from the numerical solution of the master equation, Eq. 1; solid blue curve, analytical solution given in Eqs. 2a and 2b, obtained through fixed-point iteration of the early time limit (dotted curve). The parameters are $k_+ = 5 \times 10^{4}\ \mathrm{M^{-1}\,s^{-1}}$, $k = 2 \times 10^{-8}\ \mathrm{s^{-1}}$, $m_{\mathrm{tot}} = 5 \times 10^{-6}\ \mathrm{M}$, $k_n = 2 \times 10^{-5}\ \mathrm{M^{-1}\,s^{-1}}$, $n_c = 2$, and M(0) = P(0) = 0. The definitions of the lag phase $\tau_{\mathrm{lag}}$ and the maximal growth rate $v_{\max}$ are shown.

One of the key questions in molecular self-assembly phenomena is to determine the relative importance of different microscopic processes and their contribution to the overall reaction (3, 4). Master equation approaches are particularly powerful in this context as they enable the explicit description of microscopic processes and have thus offered a series of insights (5–10) into phenomena including the formation of amyloid fibrils, species that are of increasing interest particularly because of their association with clinical disorders ranging from Alzheimer’s disease to type II diabetes (2). The lack of analytical solutions to such master equations has, however, represented a challenge in the quest to establish general principles and laws governing filamentous growth. Scaling laws, also known in biology as allometric laws, define the relationships between different properties of a given system, and such laws have enabled a variety of fundamental principles to be revealed in fields ranging from condensed matter physics to ecology and from sociology to economics (11, 12). In this work, we illustrate how the availability of an analytical treatment of a master equation (Eq. 1) enables the rationalization of a wide range of experimental results relating to the self-assembly processes of peptides and proteins based on scaling laws and increases the predictive power of the analysis of such phenomena.

The basic processes that enter the master equation describing filamentous protein aggregation have been established previously. Amyloid formation has almost universally been shown to involve a nucleation-dependent polymerization reaction (13), where the formation of growth nuclei from soluble proteins is slower than the elongation of preformed nuclei. In addition, secondary nucleation mechanisms (3) have been identified (5, 7, 14) that lead to the formation of additional nuclei from preexisting filaments; we include here a common type of such mechanism, namely filament fragmentation (5), and other types can also readily be considered within the same scheme [see Supporting Online Material (SOM)]. These molecular processes lead to a master equation describing the time evolution of the concentration $f(t,j)$ of filaments of length $j$ in the system as an infinite set of coupled nonlinear differential equations of the form (6–10)

$$\frac{\partial f(t,j)}{\partial t} = 2m(t)k_+ f(t,j-1) - 2m(t)k_+ f(t,j) - k(j-1)f(t,j) + 2k\sum_{i=j+1}^{\infty} f(t,i) + k_n m(t)^{n_c}\,\delta_{j,n_c} \qquad (1)$$

where $m(t)$ is the concentration of monomers. The different terms in Eq. 1 represent the elementary microscopic processes. The first term $2m(t)k_+f(t,j-1)$ accounts for the increase in the number of filaments of length $j$ due to the addition of monomers to either end of a filament of length $j-1$. The term $-2m(t)k_+f(t,j)$ describes the decrease in their number due to filaments of length $j$ growing further to length $j+1$, the term $-k(j-1)f(t,j)$ reflects the possibility of a filament of length $j$ breaking at any of its $j-1$ internal links, and the term $2k\sum_{i=j+1}^{\infty}f(t,i)$ accounts for the fact that there are two links in any filament of length $i > j$ where breakage leads to a filament of length $j$. The last term represents the spontaneous formation of growth nuclei of size $n_c$ as a polynomial form in $m(t)$, which we shall see has the property that it allows the conclusions from the classical theory of homogeneous nucleated growth to be recovered as the limit of our results when setting the fragmentation rate to zero.

Using analytical techniques based on fixed-point mappings (15), we extend linearized early time solutions to describe the full time course of the reaction and reveal [see details in (16)] that the time evolution of the most important experimentally accessible observables, the principal moments of the distribution $f(t,j)$ (the number $P=\sum_{j=n_c}^{\infty}f(t,j)$ and mass $M=\sum_{j=n_c}^{\infty}j\,f(t,j)$ concentrations of filaments), can be written in closed form to a very good approximation (Fig. 1B) as the integrated rate laws

$$P(t) = \frac{m_{\mathrm{tot}}}{2n_c-1} - \frac{m_{\mathrm{tot}}\,k\,e^{-(2n_c-1)kt}\,\mathrm{Ei}\!\left(-C_+e^{\kappa t}\right)}{\kappa} + e^{-(2n_c-1)kt}B_2 \qquad (2a)$$

$$M(t) = m_{\mathrm{tot}}\left[1 - \exp\!\left(-C_+e^{\kappa t} + C_-e^{-\kappa t} + k_n m_{\mathrm{tot}}^{n_c-1}k^{-1}\right)\right] \qquad (2b)$$

where $\kappa=\sqrt{2m_{\mathrm{tot}}k_+k}$ represents the rate of multiplication of the filament population, $m_{\mathrm{tot}} = M(t) + m(t)$ is the total protein concentration, $\mathrm{Ei}(x)=-\int_{-x}^{\infty}(e^{-y}/y)\,dy$ denotes the exponential integral function, $C_\pm = k_+P(0)/\kappa \pm M(0)/(2m_{\mathrm{tot}}) \pm k_n m_{\mathrm{tot}}^{n_c-1}/(2k)$, and the constant $B_2$ is fixed by the initial condition $P(0)$. We note that the form of the rate of filament multiplication $\kappa$ (which, as discussed below, is a key parameter in our description of filamentous growth) is consistent with the idea that both the elongation rate $m_{\mathrm{tot}}k_+$ and the fragmentation rate $k$ have to double for the overall rate of production of new filaments to double. We point out that we have assumed that the fragmentation rate is independent of the length of the filaments; this assumption was primarily made because of the current difficulty of obtaining reliable experimental estimates of length-dependent fragmentation rates. The approach that we describe, however, can in principle be modified to take this effect, and indeed others, into consideration when more accurate measurements become available.
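To make the closed-form mass curve concrete, Eq. 2b can be evaluated directly. The following is a minimal Python sketch rather than the authors' implementation: the function names (`kappa`, `c_plus_minus`, `fibril_mass`) and the dictionary `FIG1` are hypothetical, `k_minus` stands for the fragmentation rate constant written k in the text, and the parameter values are those quoted in the Fig. 1 caption for an unseeded reaction.

```python
import math

def kappa(k_plus, k_minus, m_tot):
    """Filament multiplication rate, kappa = sqrt(2 * m_tot * k_plus * k_minus)."""
    return math.sqrt(2.0 * m_tot * k_plus * k_minus)

def c_plus_minus(k_plus, k_minus, k_n, n_c, m_tot, M0=0.0, P0=0.0):
    """Constants C+ and C- of Eq. 2b, fixed by the initial conditions M(0), P(0)."""
    kap = kappa(k_plus, k_minus, m_tot)
    growth = k_plus * P0 / kap
    offset = M0 / (2.0 * m_tot) + k_n * m_tot ** (n_c - 1) / (2.0 * k_minus)
    return growth + offset, growth - offset

def fibril_mass(t, k_plus, k_minus, k_n, n_c, m_tot, M0=0.0, P0=0.0):
    """Fibril mass concentration M(t) according to Eq. 2b (t in seconds)."""
    kap = kappa(k_plus, k_minus, m_tot)
    c_plus, c_minus = c_plus_minus(k_plus, k_minus, k_n, n_c, m_tot, M0, P0)
    exponent = (-c_plus * math.exp(kap * t)
                + c_minus * math.exp(-kap * t)
                + k_n * m_tot ** (n_c - 1) / k_minus)
    return m_tot * (1.0 - math.exp(exponent))

# Parameters quoted in the Fig. 1 caption (no seeds added).
FIG1 = dict(k_plus=5e4, k_minus=2e-8, k_n=2e-5, n_c=2, m_tot=5e-6)

if __name__ == "__main__":
    for hours in (0, 5, 10, 15, 20, 30):
        fraction = fibril_mass(hours * 3600.0, **FIG1) / FIG1["m_tot"]
        print(f"t = {hours:2d} h   M/m_tot = {fraction:.3f}")
```

With these values κ is about 10⁻⁴ s⁻¹, so the sigmoidal transition unfolds over a few multiples of 1/κ, that is, over several hours. Evaluating Eq. 2a in the same way would additionally require an exponential-integral routine, which is left out of this sketch.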

In the following, we demonstrate that the theoretical framework provided by Eqs. 2a and 2b has the capacity both to account for and to predict the outcome of protein aggregation reactions of the type associated with protein misfolding diseases. As a first example, we consider the case of a representative protein, insulin; the data in Fig. 2, A and B, show that a comprehensive series of reactions performed in vitro with different initial conditions, and resulting in differently shaped kinetic curves ranging from sigmoidal to convex, can be completely accounted for by the solution Eqs. 2a and 2b, with the same values of the two global parameters corresponding to the microscopic elongation and breakage rate constants. According to our treatment, the sigmoidal growth kinetics can be observed as a result of secondary nucleation, here fibril fragmentation, even in the absence of rate-limiting primary nucleation events. This result contrasts with the classical theories of nucleated growth, where homogeneous nucleation is the only source of additional filaments and where such nucleation processes are consequently crucial for determining the duration of the lag phase (13), the characteristic time of nucleated polymerization reactions during which the initial growth rate is small, and in some cases not measurable using bulk techniques (17).

Fig. 2

Experimental measurements of the polymer mass concentration M(t) of fibrillar insulin in (A) and (B) are explained using Eqs. 2a and 2b with the two parameters $k_+ = 2.9 \times 10^{4}\ \mathrm{M^{-1}\,s^{-1}}$ and $k = 2.1 \times 10^{-9}\ \mathrm{s^{-1}}$. In (A), the total monomer concentration was (from top to bottom) $m_{\mathrm{tot}}$ = 149 μM, 98 μM, and 49 μM, and the seed mass concentration used was M(0) = 0.21 μM. In (B), for a constant $m_{\mathrm{tot}}$ = 98 μM, the seed concentration was (from left to right) M(0) = 2.4 μM, 0.8 μM, 0.21 μM, and 0 μM, and the average polymerization number of the seed fibrils $L_0 = M(0)/P(0) = 1380$ was estimated from AFM measurements (see SOM). The inset in (B) shows the polymer number concentration P(t) determined simultaneously for the data in (A) and (B) for the same values of $k_+$ and $k$. The effect of fragmentation is shown in (C); the polymerization of β-lactoglobulin was measured by Hill et al. (18) for increasing shear rates (right to left: 0, 25, 50, 100, and 200 s⁻¹) and was scaled here between 0% and 100%. The values $n_c = 2$, $k_n/k = 22.8$ were held constant in all the data sets, and $\kappa = 1.2 \times 10^{-4}\ \mathrm{s^{-1}}$, $4.3 \times 10^{-5}\ \mathrm{s^{-1}}$, $2.6 \times 10^{-5}\ \mathrm{s^{-1}}$, $3.7 \times 10^{-6}\ \mathrm{s^{-1}}$, and 0 s⁻¹. (D) Experimental measurements from Ferguson et al. (19) for the polymerization of the WW domain. The two parameters in Eqs. 2a and 2b were derived from the data at $m_{\mathrm{tot}}$ = 50 μM (blue curve); the prediction obtained with the resulting values $k_+k = 1.7 \times 10^{-4}\ \mathrm{M^{-1}\,s^{-2}}$ and $k_n/k = 13.2\ \mathrm{M^{-1}}$ for the polymerization time course at $m_{\mathrm{tot}}$ = 100 μM, 200 μM, and 500 μM is shown in gray, with the parameter $n_c = 2$ held fixed. The comparison between this prediction (gray curves) and the experimental measurements (gray circles) demonstrates that both the maximal rate and the time at which this rate occurs are predicted accurately. Data not considered in the fits are shown with filled gray squares.

The filament number concentration P(t) can be reconstructed experimentally (7) from measurements of the free monomer concentration $m(t) = m_{\mathrm{tot}} - M(t)$ and from the measurement of the rate of change $\dot{M}(t)$ in the mass concentration of polymers: $P(t)=\dot{M}(t)/\{2k_+[m_{\mathrm{tot}} - M(t)]\}$. Comparison of the results of this reconstruction with the analytical expression for P(t) from Eqs. 2a and 2b demonstrates that in this case also the use of the same values of $k_+$ and $k$ yields good agreement with the measurements (Fig. 2B, inset).
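The reconstruction formula above can be sketched numerically as follows. This is an illustrative Python fragment, not the procedure used for Fig. 2B: the function name `reconstruct_filament_number` is hypothetical, the time derivative is approximated by a central finite difference (an assumption of this sketch), and the input is a synthetic curve generated with a constant filament number so that the answer is known in advance.

```python
import math

def reconstruct_filament_number(times, mass, k_plus, m_tot):
    """Estimate P(t) = Mdot(t) / (2 k+ [m_tot - M(t)]) at interior sample points.

    Mdot is approximated with a central finite difference, so the first and
    last samples are dropped. Returns a list of (time, P) pairs.
    """
    result = []
    for i in range(1, len(times) - 1):
        m_dot = (mass[i + 1] - mass[i - 1]) / (times[i + 1] - times[i - 1])
        free_monomer = m_tot - mass[i]
        result.append((times[i], m_dot / (2.0 * k_plus * free_monomer)))
    return result

# Synthetic check: for a constant filament number P_TRUE, the elongation law
# dM/dt = 2 k+ (m_tot - M) P integrates to M(t) = m_tot * (1 - exp(-2 k+ P_TRUE t)).
K_PLUS, M_TOT, P_TRUE = 2.9e4, 98e-6, 1e-10
times = [10.0 * i for i in range(101)]
mass = [-M_TOT * math.expm1(-2.0 * K_PLUS * P_TRUE * t) for t in times]
estimates = reconstruct_filament_number(times, mass, K_PLUS, M_TOT)
```

On noise-free input the estimates recover `P_TRUE` closely. Measured curves would normally need smoothing before differentiation, since finite differences amplify noise.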

The importance of fibril fragmentation for the phenomenon of amyloid growth can be illustrated with the very high sensitivity of the growth kinetics to mechanical shear. We examine in this context the polymerization of β-lactoglobulin (18) under a series of different controlled rates of shearing that promote breakage. We considered a constant primary nucleation rate, kn, for the entire data set, and a one-parameter fit to each kinetic curve to allow the breakage rate to vary through different values of κ. As shown in Fig. 2C, the variation of this single parameter can account explicitly for the observed differences in both the lag time and the growth rate under different shear conditions. This result shows that even a moderate increase in fracture rate can have very important consequences for the number of protein molecules incorporated into the aggregates; hence, fragmentation emerges as possessing fundamental importance for determining key observables such as the lag phase. This result is also likely to lie at the heart of the well-known high sensitivity of the kinetics of fibril growth to agitation or sonication, processes known to introduce enhanced fragmentation but likely to leave largely unaffected other rate constants for the system.

Based on knowledge gained from the analysis of the rates of the individual processes underlying the growth of protein fibrils, we can now predict quantitatively how the course of this reaction is affected by changes in system parameters such as protein concentration. To illustrate this idea, we consider data from (19) for the amyloid growth rates of the WW domain in vitro as the concentration of the monomeric protein is varied over an order of magnitude. We extract the microscopic rates through a fit to one of the kinetic curves (Fig. 2D, blue line), and then quantitatively predict as a function of time, with no free parameters, the behavior of the system for a different set of initial conditions (Fig. 2D, gray line) simply through the scaling of the microscopic rates with concentration according to Eqs. 2a and 2b.

Our analytical theory further reveals general features of the growth of fragmenting filaments. First, we consider the normalized maximal growth rate, given as the steepest slope of the sigmoidal kinetic trace (Fig. 1B) divided by the initial soluble protein concentration, $v_{\max}=m_{\mathrm{tot}}^{-1}\max_t[\dot{M}(t)] \approx m_{\mathrm{tot}}^{-1}[\dot{M}(t)]_{t=t_{\max}}=\kappa/e$, where $t_{\max} = \kappa^{-1}\log(1/C_+)$ and we have kept the leading $C_+$ term in Eqs. 2a and 2b over constant and decaying terms. Our results show that this maximal rate depends on the nature of the aggregates and the environment in which they are formed only through the single parameter $\kappa$, and not on other system parameters such as the number of nuclei present initially or on the primary nucleation rate. This result is in contrast to classical linear growth theories (20), where the growth rate and lag phase have strong dependencies on the specific details of the homogeneous nucleation process.

A closer examination of the role of the parameter $\kappa$ reveals that it also essentially defines the lag phase. The lag phase exists only if the growth rate $m_{\mathrm{tot}}\kappa/e$ is maximal at the inflection point $t_{\max} = \log(1/C_+)\,\kappa^{-1}$, or equivalently for $M(0) < M_C = \kappa L_0/(2k_+e)$, where $M_C$ represents a critical seed concentration which, if exceeded, results in a reaction proceeding without a lag phase. This prediction for the existence of such a threshold can be verified experimentally as shown in Fig. 2B. The conditions used result in a threshold of $M_C$ = 0.9 μM, and it can be seen that the reaction starting with M(0) = 0.21 μM proceeds with a marked time lag, whereas for M(0) = 2.4 μM the rate profile is convex. If we define the lag time as shown in Fig. 1B by extrapolation from the maximal growth rate, we obtain the expression $\tau_{\mathrm{lag}} = [\log(1/C_+) - e + 1]\,\kappa^{-1}$.
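These expressions are simple enough to check by direct arithmetic. The sketch below uses the insulin parameters quoted in the Fig. 2 caption; the function names and the dictionary `INSULIN` are hypothetical, `k_minus` denotes the fragmentation rate constant written k in the text, and, as an assumption of this sketch, the primary nucleation term in $C_+$ is neglected for the seeded reactions (the caption fits them with elongation and breakage rate constants only).

```python
import math

def kappa(k_plus, k_minus, m_tot):
    """Filament multiplication rate, kappa = sqrt(2 * m_tot * k_plus * k_minus)."""
    return math.sqrt(2.0 * m_tot * k_plus * k_minus)

def max_growth_rate(k_plus, k_minus, m_tot):
    """Normalized maximal growth rate v_max = kappa / e."""
    return kappa(k_plus, k_minus, m_tot) / math.e

def critical_seed_concentration(k_plus, k_minus, m_tot, L0):
    """Seed mass concentration M_C = kappa * L0 / (2 k+ e) above which no lag phase exists."""
    return kappa(k_plus, k_minus, m_tot) * L0 / (2.0 * k_plus * math.e)

def c_plus_seeded(k_plus, k_minus, m_tot, M0, L0):
    """C+ for a seeded reaction with P(0) = M(0)/L0; primary nucleation neglected (assumption)."""
    kap = kappa(k_plus, k_minus, m_tot)
    return k_plus * (M0 / L0) / kap + M0 / (2.0 * m_tot)

def lag_time(c_plus, kap):
    """Lag time tau_lag = [log(1/C+) - e + 1] / kappa, in seconds."""
    return (math.log(1.0 / c_plus) - math.e + 1.0) / kap

# Insulin parameters from the Fig. 2 caption.
INSULIN = dict(k_plus=2.9e4, k_minus=2.1e-9, m_tot=98e-6)
L0 = 1380

if __name__ == "__main__":
    kap = kappa(**INSULIN)
    print("kappa (1/s):", kap)
    print("M_C (M):", critical_seed_concentration(L0=L0, **INSULIN))
    for seed in (0.21e-6, 2.4e-6):
        c_plus = c_plus_seeded(M0=seed, L0=L0, **INSULIN)
        print("M(0) =", seed, "M   tau_lag (s):", lag_time(c_plus, kap))
```

Under these assumptions the critical seed concentration comes out a little under 1 μM, in line with the threshold of about 0.9 μM quoted above, and the extrapolated lag time is positive for the 0.21 μM seed but negative for the 2.4 μM seed, meaning that no lag phase is expected in the latter case.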

It is well known that primary nucleation is an essential step in the phenomenon of amyloid growth in the absence of seeds, and a contribution to the duration of the lag phase can indeed result from such a nucleation process. Our results show, however, that the nucleation rate enters only as a logarithmic correction through C+ in the expression for τlag; therefore, when secondary nucleation pathways are active, we conclude that the experimental lag time is primarily determined by the exponential growth regime that takes place in the initial phase of the reaction. Under these circumstances, the formation of oligomeric species, a process that has been observed concomitantly with fibril formation and linked with cellular toxicity (2), is likely to contribute to such logarithmic terms and may therefore have a limited influence on the overall growth kinetics of fibrillar species. These results reveal that, although the kinetic curves for processes dominated by primary nucleation (20) and by secondary nucleation give rise to apparently similar kinetic traces, their origin is fundamentally different.

Our theory also provides a general demonstration of the validity of a conjecture about amyloid fibril growth (21, 22) that the lag time is commonly highly correlated with the inverse maximal growth rate. Indeed, for fragmentation-assisted growth, we have shown that both the lag phase and the maximal growth rate depend primarily on $\kappa$. Therefore, the correlation emerges naturally without the requirement for the responses of the processes of nucleation and growth to changes in system parameters to be identical, a situation generally likely to be incompatible with structural and mechanistic information but required to enforce this correlation under the assumption that the lag phase is defined by primary nucleation processes and the growth phase solely by fibril elongation. Although the coefficient $[\log(1/C_+) - e + 1]$ linking the lag time to the maximal growth rate is variable for different proteins and different experimental conditions, the fact that these differences only enter as logarithmic corrections implies that the general inverse correlation between lag time and growth rate remains valid. In Fig. 3, this idea is illustrated with a red band that shows that the overall correlation is maintained, even when the factor $C_+$ has been artificially varied by a factor of $10^{20}$, a value chosen to exceed greatly the range of variability accessible experimentally.

Fig. 3

Analysis of the lag phase $\tau_{\mathrm{lag}}$. A strong correlation exists (blue crosses) between the inverse lag time and the normalized growth rate for 2000 kinetic curves randomly generated on the basis of a uniform probability distribution, with 50 representative examples shown in the inset. All rate constants were varied over 2 orders of magnitude: $k_+ = 5 \times 10^{3}$ to $5 \times 10^{5}\ \mathrm{M^{-1}\,s^{-1}}$, $k = 5 \times 10^{-10}$ to $5 \times 10^{-8}\ \mathrm{s^{-1}}$, $m_{\mathrm{tot}}$ = 5 to 500 μM, $k_n = 10^{-11}$ to $10^{-9}\ \mathrm{M^{-1}\,s^{-1}}$, $n_c = 2$, and, in the absence of added seed, M(0) = 0. The red open circles are experimental data from (21). The red band results from a variation of a factor of $10^{20}$ in $C_+$.

More generally, the parameter $\kappa$, which corresponds to the rate of multiplication of the population of filaments, emerges as the most important quantity describing the overall properties of systems that self-assemble by processes that involve elongation and fragmentation. Our results show that in the regime where secondary nucleation through fragmentation of aggregates is a more effective source of additional structures than primary nucleation [$k/(k_n m_{\mathrm{tot}}^{n_c-1}) \gg 1$], observables such as the lag time and maximal growth rate depend primarily on just the single parameter $\kappa$.

A particularly striking result that follows from the dominance of a single kinetic parameter in this regime of elongation-fragmentation growth is that generic scaling laws emerge for the behavior of the system. For instance, the lag time is predicted to scale weakly with an exponent $\gamma = -0.5$ with respect to the monomer concentration (because $\tau_{\mathrm{lag}} \sim \kappa^{-1} \sim m_{\mathrm{tot}}^{\gamma}$ with $\gamma = -0.5$). We have analyzed experimental results for eight unrelated systems, including short peptides, proteins such as insulin and β2-microglobulin, and different types of yeast prions, and find that six of them exhibit this type of weak scaling with high accuracy (Fig. 4). Interestingly, in absolute value, this exponent, 0.5, is crucially smaller than the value of 1.0 or greater required by traditional models (20) in which primary nucleation is a dominant effect determining the duration of the lag phase. In these latter models, the lag time scales with an exponent $\gamma = -(n_c+1)/2$, which is always $|\gamma| \geq 1$ for a nucleus $n_c \geq 1$ (20), a result that is clearly inconsistent with the experimental observations for the majority of systems we have evaluated (Fig. 4). Furthermore, in the case of the aggregation of glutamine-rich peptides, nucleus sizes of less than 0 appear to emerge (23) when using classical theories of homogeneous nucleated growth, whereas this type of low concentration dependence emerges naturally from our less-than-linear scaling laws and suggests that breakage is likely to be a key factor in the aggregation of polyglutamine peptides and hence potentially an important contributor to the development of disorders such as Huntington’s disease (2).
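The scaling argument can be illustrated by computing the lag time over a range of concentrations and fitting the slope on a log-log plot. The sketch below is illustrative only: the function names are hypothetical, `k_minus` denotes the fragmentation rate constant written k in the text, the rate constants are arbitrary values chosen from within the ranges listed in the Fig. 3 caption, and the reaction is assumed to be unseeded so that $C_+ = k_n m_{\mathrm{tot}}^{n_c-1}/(2k)$.

```python
import math

def lag_time(m_tot, k_plus, k_minus, k_n, n_c):
    """Unseeded lag time tau_lag = [log(1/C+) - e + 1] / kappa, with C+ from primary nucleation."""
    kap = math.sqrt(2.0 * m_tot * k_plus * k_minus)
    c_plus = k_n * m_tot ** (n_c - 1) / (2.0 * k_minus)
    return (math.log(1.0 / c_plus) - math.e + 1.0) / kap

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    s_xx = sum((a - mean_x) ** 2 for a in lx)
    return s_xy / s_xx

# Illustrative rate constants (assumed, within the ranges of the Fig. 3 caption).
K_PLUS, K_MINUS, K_N, N_C = 5e4, 5e-9, 1e-10, 2

# Eleven concentrations spaced logarithmically from 5 uM to 500 uM.
concentrations = [5e-6 * 10 ** (i / 5) for i in range(11)]
lags = [lag_time(m, K_PLUS, K_MINUS, K_N, N_C) for m in concentrations]
inverse_kappas = [1.0 / math.sqrt(2.0 * m * K_PLUS * K_MINUS) for m in concentrations]

gamma_lag = loglog_slope(concentrations, lags)             # includes the log correction
gamma_kappa = loglog_slope(concentrations, inverse_kappas)  # pure 1/kappa scaling
```

The pure $\kappa^{-1}$ factor gives a slope of -0.5, while the fitted lag-time exponent is slightly more negative because the logarithmic prefactor also depends on concentration; it nevertheless remains well above the value of -1 or below expected when primary nucleation dominates.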

Fig. 4

Scaling of the lag phase with protein concentration. Data are shown for β2-microglobulin (β2M) (8) (blue upward triangles); the yeast prion fragment Sup35 (7) (blue crosses); Ure2p yeast prion (22) (blue squares); insulin (27) (blue downward triangles); WW domain (19) (blue diamonds); TI I27 immunoglobulin domain (Ig) (29) (red diamonds); apolipoprotein C-II (ApoCII) (28) (red circles); and times for onset of terminal disease in 50% of mice inoculated with a fixed dose of prions at t = 0 (PrP) (26) (blue circles). The slope corresponding to the exponent γ = –0.5 is shown with a blue solid line, and fits to the data sets with γ < –0.5 are shown in red. The inset shows the scaling exponents determined from the data in the main figure for the different systems. The box height in the inset shows standard errors from the linear fit.

It is interesting to note that in the limit of a vanishing breakage rate, $k \to 0$, an expansion of the exponentials in Eqs. 2a and 2b up to quadratic order in $\kappa$ yields a polynomial growth form $M(t) \approx M(0) + k_+k_n m_{\mathrm{tot}}^{n_c+1}t^2 + 2k_+m_{\mathrm{tot}}P(0)\,t + n_ck_n m_{\mathrm{tot}}^{n_c}\,t$ at the early stages of the reaction. Our result therefore contains in the appropriate limit the behavior characteristic of classical nucleated growth theories (3, 20). It is likely, therefore, that filamentous growth reactions that deviate significantly from the $\gamma = -0.5$ exponent, such as those shown in red in Fig. 4, are limited by complex primary or secondary nucleation processes rather than by simple breakage. For instance, application of the schema described in this paper to growth phenomena where secondary nucleation processes produce additional seeds at a rate that depends on the presence of both filaments and monomer (14), $dP(t)/dt = k_2M(t)\,m(t)^{n_2}$, yields a filament multiplication rate $\kappa=\sqrt{2k_+k_2m_{\mathrm{tot}}^{n_2+1}}$ and therefore a scaling exponent $|\gamma| = (n_2 + 1)/2 \geq 1$ (see SOM), where $k_2$ represents a rate constant for monomer-dependent secondary nucleation and $n_2$ is the reaction order of the secondary nucleation.

Finally, we investigate the applicability of the analytical results derived in the present work to in vivo systems. We examine the phenomenon of mammalian prion transmission that is characteristic of disorders such as bovine spongiform encephalopathy and its human analog Creutzfeldt-Jakob disease. The protein-only hypothesis for the propagation of prions describes the conversion of cellular prion proteins ($\mathrm{PrP^{C}}$) into a pathologically aggregated form ($\mathrm{PrP^{Sc}}$) in a manner that can be transmissible (24). In such cases, $\mathrm{PrP^{Sc}}$ aggregates are thought to elongate through the addition of further PrP molecules, and it has been suggested that they multiply rapidly enough through fragmentation of existing structures (5, 25) to allow for transmissibility; these mechanisms are equivalent to the microscopic processes described to the lowest order in Eq. 1. In agreement with this idea, we observe in Fig. 4 that the time for disease onset scales with an exponent $\gamma = -0.52 \pm 0.07$ as a function of the expression level (26) of the prion protein that determines its concentration in vivo; this exponent is fully consistent with the value of -0.5 observed for a range of in vitro systems (Fig. 4) and expected from the analysis developed in this paper for fragmentation-assisted growth, at least when other in vivo limiting factors, such as cellular clearance mechanisms, have been overwhelmed.

In conclusion, we have provided a unified theoretical framework to address complex biomolecular self-assembly processes that involve primary and secondary nucleation events coupled to linear growth, and demonstrated its value for predicting the kinetics and mechanisms of the proliferation of amyloid fibrils, an example of filamentous growth that is increasingly important in the context of understanding and managing some of the most common and debilitating diseases of the modern era.

Supporting Online Material

Materials and Methods


References and Notes

  1. The kinetic equation for the moment P(t) can be found by taking the sum over $j$ from $n_c$ to ∞ on both sides of Eq. 1. The analogous equation for the first moment M(t) is obtained by multiplying both sides by $j$ and then taking the sum. The telescopic sums vanish; rearranging the order in double sums, we obtain, after some algebra, the coupled system $\frac{dP(t)}{dt} = k[M(t) - (2n_c-1)P(t)] + k_n m(t)^{n_c}$ (3a) and $\frac{dM(t)}{dt} = 2[m(t)k_+ - n_c(n_c-1)k/2]P(t) + n_ck_n m(t)^{n_c}$ (3b), where $m(t) = m_{\mathrm{tot}} - M(t)$, and we set $f(t,j) = 0$ for all $j < n_c$, where $n_c \geq 2$ is the size of the smallest stable aggregate. Eqs. 3a and 3b make intuitive sense, e.g., the number of polymers at a given time, P(t), can increase if, out of $(M - P)$ total links, the chains break at any location more than $(n_c - 1)$ links distant from each end, thereby forming two stable polymers and thus accounting for the $[M - (2n_c - 1)P]$ factor. The general idea underlying the solution of Eqs. 3a and 3b comes from a fixed-point analysis (15). To reformulate the problem in terms of a fixed-point equation, we first formally solve the differential system $P(t) = \int_0^t e^{-(2n_c-1)k(t-\tau)}\,kM(\tau)\,d\tau + \tilde{B}_2e^{-(2n_c-1)kt}$ (4a) and $M(t) = m_{\mathrm{tot}} - n_c(n_c-1)k/(2k_+) + B_1e^{-2k_+\int_0^t P(\tau)\,d\tau}$ (4b), where $B_1$ and $\tilde{B}_2$ are integration constants, and we have neglected monomer consumption by nucleation terms $O(k_n)$. We introduce an operator $A$, acting on the space of the two principal moments and defined by the right-hand sides of Eqs. 4a and 4b; by construction, the fixed points $x^*(t) = [M^*(t), P^*(t)]$ of $A$ with $Ax^* = x^*$ are solutions to Eqs. 3a and 3b to within $O(k_n)$. The fixed-point equation is solved iteratively: $x^* = \lim_{n\to\infty}A^{n}(x_0)$; the resulting solution after one [for M(t)] and two [for P(t)] iterations is already in excellent agreement with the numerical result (Fig. 1) and with experiments (Figs. 2 to 4). Successful iteration requires a good starting value $x_0(t) = [M_0(t), P_0(t)]$, and we chose for this value the early aggregation limit, when $m_{\mathrm{tot}} - M(t) \approx m_{\mathrm{tot}}$ and Eqs. 3a and 3b give $P_0(t) = C_1e^{\kappa t} + C_2e^{-\kappa t} - \frac{n_ck_n m_{\mathrm{tot}}^{n_c-1}}{2k_+}$ (5a) and $M_0(t) = \frac{2k_+m_{\mathrm{tot}}C_1}{\kappa}e^{\kappa t} - \frac{2k_+m_{\mathrm{tot}}C_2}{\kappa}e^{-\kappa t} - \frac{k_n m_{\mathrm{tot}}^{n_c}}{k}$ (5b), where the effective rate is $\kappa = \sqrt{2k_+k\,m_{\mathrm{tot}}}$, and we have used the condition $m_{\mathrm{tot}}k_+ \gg k$, which guarantees the existence of an initially growing filament population. The constants $C_1$ and $C_2$ are related to the initial conditions by $C_{1,2} = \frac{1}{2}P(0) + n_ck_n m_{\mathrm{tot}}^{n_c-1}/(4k_+) \pm M(0)\kappa/(4m_{\mathrm{tot}}k_+) \pm k_n m_{\mathrm{tot}}^{n_c}\kappa/(4m_{\mathrm{tot}}k_+k)$. Using the starting values of Eqs. 5a and 5b for the fixed-point iteration, we arrive at the analytical result presented in Eqs. 2a and 2b. Although the agreement between Eqs. 2a and 2b and the numerical solution (Fig. 1) is good (typical differences comparable in magnitude to common experimental errors), systematic differences are present in the intermediate time regime. Qualitatively, these are primarily because the filament number concentration P(t) entering the expression for M(t) in Eqs. 4a and 4b in the first iteration is from the linearized solution and increases more rapidly than the true solution, and therefore the protein conversion rate dM(t)/dt is overestimated. These errors can be minimized by successive iterations of the fixed-point operator beyond the first-order solution discussed here.
  2. T.P.J.K. acknowledges support from St John’s College, Cambridge through a Research Fellowship. C.M.D. acknowledges support from the Leverhulme and Wellcome Trusts. We are grateful to A. Craig, A. Knowles, D. White, A. Buell, F. Chiti, A. Miranker, and J. Falsig for valuable discussions, and to M. Sandal for preparing a publicly available Web server that enables the use of the equations presented in this work to fit experimental measurements and to predict aggregation kinetics.
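The comparison described in the first note, between the moment equations (Eqs. 3a and 3b) and the closed-form result (Eq. 2b), can be sketched numerically. The code below is a minimal illustration and not the authors' numerical procedure: it integrates the two moment equations, rather than the full master equation, with a hand-written fourth-order Runge-Kutta scheme, uses the Fig. 1 parameters for an unseeded reaction, and all names (`moment_rhs`, `integrate_moments`, `analytical_mass`, `K_MINUS` for the fragmentation rate constant k) are hypothetical.

```python
import math

# Parameters from the Fig. 1 caption; M(0) = P(0) = 0.
K_PLUS, K_MINUS, K_N, N_C, M_TOT = 5e4, 2e-8, 2e-5, 2, 5e-6

def moment_rhs(P, M):
    """Right-hand sides of Eqs. 3a and 3b for the number (P) and mass (M) concentrations."""
    m = M_TOT - M  # free monomer concentration
    dP = K_MINUS * (M - (2 * N_C - 1) * P) + K_N * m ** N_C
    dM = 2.0 * (m * K_PLUS - N_C * (N_C - 1) * K_MINUS / 2.0) * P + N_C * K_N * m ** N_C
    return dP, dM

def integrate_moments(t_end, dt):
    """Classical fourth-order Runge-Kutta integration; returns a list of (t, P, M)."""
    P, M, t = 0.0, 0.0, 0.0
    trajectory = [(t, P, M)]
    for _ in range(int(round(t_end / dt))):
        k1 = moment_rhs(P, M)
        k2 = moment_rhs(P + 0.5 * dt * k1[0], M + 0.5 * dt * k1[1])
        k3 = moment_rhs(P + 0.5 * dt * k2[0], M + 0.5 * dt * k2[1])
        k4 = moment_rhs(P + dt * k3[0], M + dt * k3[1])
        P += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        M += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        t += dt
        trajectory.append((t, P, M))
    return trajectory

def analytical_mass(t):
    """Eq. 2b for an unseeded reaction, where C+ = -C- = k_n m_tot^(n_c-1) / (2k)."""
    kap = math.sqrt(2.0 * M_TOT * K_PLUS * K_MINUS)
    nuc = K_N * M_TOT ** (N_C - 1) / K_MINUS
    exponent = -0.5 * nuc * math.exp(kap * t) - 0.5 * nuc * math.exp(-kap * t) + nuc
    return M_TOT * (1.0 - math.exp(exponent))

trajectory = integrate_moments(t_end=1.5e5, dt=20.0)
# Largest deviation between the two curves, as a fraction of m_tot.
max_gap = max(abs(M - analytical_mass(t)) for t, P, M in trajectory) / M_TOT
```

In this sketch both curves start at zero and converge to nearly complete conversion, and the largest deviation is found in the intermediate time regime where the analytical curve runs ahead of the numerical one, which is the behavior the note attributes to the linearized P(t) used in the first iteration.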
