## Abstract

Finding the optimal solution to a complex optimization problem is of great importance in many fields, ranging from protein structure prediction to the design of microprocessor circuitry. Some recent progress in finding the global minima of potential energy functions is described, focusing on applications of the simple “basin-hopping” approach to atomic and molecular clusters and more complicated hypersurface deformation techniques for crystals and biomolecules. These methods have produced promising results and should enable larger and more complex systems to be treated in the future.

The global optimization problem is a subject of intense current interest. Applications of obvious economic importance include traveling salesman–type problems and the design of microprocessor circuitry. In the domain of atoms and molecules, discovering the lowest-energy isomer or crystal structure for a system with a given composition is frequently a goal. For example, it seems likely that the native structure of a protein is often related to the global minimum of its potential energy surface (PES). Hence, considerable research efforts are being made to predict the three-dimensional structure of a protein solely from its amino acid sequence by computer simulation. Also, experimental microcalorimetry for free sodium clusters (1) has revealed highly irregular thermodynamic properties as a function of size. To explain these results, the global potential energy minimum, which is expected to be the favored structure for the low-temperature experiments, must first be identified. As these two examples show, developing methods for treating a diverse range of systems spanning the fields of chemistry, biochemistry, and materials science is important.

In treating any nontrivial global optimization problem, the principal difficulty arises from the number of minima on the PES, which usually increases exponentially with the size of the system. An example is the cluster of 55 atoms interacting by a Lennard-Jones (LJ) potential, LJ_{55} (Fig. 1B), where the number of minima (excluding permutational isomers) is at least 10^{10}. Nevertheless, in this specific case the global minimum is easily located because of the form of the potential energy landscape (2). For systems of different sizes, locating the global minimum can be much harder and may require sophisticated search routines. Similarly, a random conformational search for the global minimum of a typical protein would take an unfeasibly long time. However, it is now generally agreed that the search is not random, but instead the PES is biased toward the global minimum, which results in efficient relaxation (2, 3). Hence, the amino acid sequences of naturally occurring proteins are presumed to have evolved to fold rapidly into a unique native structure. This does not mean, however, that the corresponding computational search becomes trivial. Although there is some evidence that it may be possible to simulate protein folding by brute force molecular dynamics (4), much faster computers will be needed before such tasks can be routinely undertaken.

Numerous approaches to solving the global optimization problem have been suggested (5–13). One difficulty with the global optimization literature is that it is spread over journals in many different fields; our citations reflect our knowledge of applications to problems in molecular science. Another problem is that detailed comparisons of different approaches are rare, partly because gathering reliable statistics for nontrivial problems is time-consuming. In this article we focus on a few particular strategies that we have found to work best for problems involving clusters, crystals, and biomolecules. We focus on the PES, rather than the free energy, to avoid the additional complications arising from finite temperature. We also emphasize that, for global optimization to succeed in predicting the properties of real systems, it is necessary to have a sufficiently accurate potential energy function. An efficient way to incorporate more realistic potentials has been suggested by Hartke (8).

The LJ model of inert gas clusters has been investigated intensively and provides a useful testing ground for putative global optimization algorithms (14). In terms of the topology of the energy landscape and the structure of stationary points, this potential also provides a useful model of noble gas clusters (15). Furthermore, some of the nonicosahedral global minima first discovered for this potential have recently been identified in nickel and gold clusters (16). A brief overview of global optimization strategies is therefore first presented in the context of LJ clusters (17). The prediction of crystal structures is also a problem of current interest (18) and is discussed subsequently. Finally, we describe several strategies for tackling the protein folding problem (6).

## Clusters

Most global minima for LJ clusters containing fewer than 100 atoms are based on icosahedral packing (Fig. 1B). The exceptions, LJ_{38} and LJ_{75–77}, serve as particularly interesting test cases, because the corresponding energy landscapes consist of two families of structures (2, 19). At these sizes, the lowest-energy minimum based on icosahedral packing acts as a trap and is widely separated from the true global minimum (Fig. 1, A and C). Actually, it is quite easy to find the global minima of these clusters by seeding the starting geometry with a core of the appropriate morphology. Indeed, most of the lowest known minima up to LJ_{147} were first obtained by Northby by counting nearest-neighbor interactions for icosahedral packing schemes (20). However, our focus here is on unbiased methods that may be transferable to other systems.

Simulated annealing (21) probably provided the first generally applicable technique for global optimization. In this approach the state of the system is followed by simulation as the temperature is decreased slowly from a high value, in the hope that it will eventually come to rest at the global potential energy minimum. A new global minimum was located for LJ_{24} with this approach (22), but otherwise simulated annealing does not appear to have been very successful for LJ clusters, even in more sophisticated forms (10, 23). The problem is that the free-energy global minimum can change at a temperature where the energy barriers are too high for the system to escape from a local minimum.

An alternative approach is based upon “hypersurface deformation” where the functional form of the potential energy is deliberately altered. Some transformations smooth the surface and reduce the number of minima, thereby making the global optimization problem easier (10, 24–28). However, the global minimum of the smoothed surface must then be mapped back to the real surface, and this reversing procedure is the key problem associated with such approaches. It is now known that the global minimum can change rather dramatically under some of the smoothing procedures (18, 29). Hence, it is necessary to couple smoothing with an efficient local search procedure, which can be applied in mapping minima back from the deformed hypersurface to the original one (18, 30). To improve efficiency, more than one minimum of the smoothed surface must be tracked backwards (18, 30).

One hypersurface deformation approach, the distance scaling method (DSM, see the section on biomolecules), has been applied with a reversing procedure that involves both structural perturbations and short molecular dynamics (MD) simulations (27). A short molecular dynamics run was carried out for each minimum being tracked back at each step of the reversing procedure, and another molecular dynamics run was carried out for the resulting minimum on the undeformed surface. The deformation lowered barriers between minima, allowing the molecular dynamics process to explore configurational space more efficiently. Coupled in this way to molecular dynamics, the DSM was used to locate the nonicosahedral global minimum for LJ_{38}; however, it failed in some other cases (27). Using local search and self-consistent mapping of deformed and undeformed minima, a deformation-based method has now been used to locate global minima for LJ clusters of sizes up to 100 atoms, except for those containing 75 to 77 atoms.

A third approach uses genetic algorithms (31) that mimic the evolutionary process by evolving a “genetic code” based on the structure and using the concepts of fitness, mutation, and crossover. A variety of different implementations have been proposed (32). In discrete genetic algorithms, the conformational space is subject to a binary encoding corresponding to the “genotype” (33). However, it is also possible to work directly in physical coordinate space, corresponding to the “phenotype” (34), and such a “continuous” genetic algorithm located new global minima for LJ_{88} and LJ_{98} (35). The latter study also includes an implicit “catchment basin” transformation of the energy landscape, which appears to be common to several of the more successful global optimization algorithms. This transformation can be separated from consideration of how the resulting surface is actually searched and is described in the following section.

## Basin-Hopping

We now outline the particular hypersurface deformation that appears to be a constituent of several studies in which new global minima have been found for LJ clusters (35–39). This “basin-hopping” approach has proved to be useful for a range of atomic and molecular clusters as well as biomolecules, and is easy to implement (40). The functional form is explained in the Appendix, part *A*, and in Fig. 2. In this transformation, the potential energy for every point in the catchment basin of each local minimum becomes the energy of that minimum. These catchment basins partition all of configuration space, and so the potential energy can vary only in discrete steps when the geometry moves from one basin to another. The transformation must be combined with a search strategy, and Monte Carlo sampling provides two possibilities if the structure is reset to that of the current local minimum, or allowed to vary continuously. We refer to search techniques coupled to the catchment basin transformation as basin-hopping. The “Monte Carlo plus energy minimization” (MCM) procedure of Li and Scheraga (41) corresponds to coordinate resetting and is generally found to be the more effective approach (19, 39, 42). Steps are proposed by perturbing the latest set of coordinates and carrying out a minimization from the resulting geometry. A step is accepted if the energy of the new minimum, E_{new}, is lower than the starting point, E_{old}. If E_{new} is greater than E_{old}, then the step is accepted if exp[( E_{old} − E_{new})/ kT ] is greater than a random number drawn from the interval [0,1]. The temperature, T , becomes an adjustable parameter, but is not used for annealing.

In an application of the MCM procedure to LJ clusters, all the lowest known minima were located up to 110 atoms, and global minima for LJ_{69}, LJ_{75–78}, LJ_{103–104}, and LJ_{107} were located for the first time in unbiased searches (39). We have found that an unbiased genetic algorithm can also find all the lowest LJ minima up to 110 atoms, if each member of the population is minimized after every step (43). Use of the minimization step means that the same catchment basin landscape is searched. This enabled Deaven and Ho (34) to find new global minima for LJ_{88} and LJ_{98}. Similarly, the “two-level simulated annealing” approach enabled Xue to find new global minima for LJ_{65} and LJ_{66}; this method basically corresponds to basin-hopping without resetting the coordinates to those of the current minimum (36). The “exponential tunneling” approach of Barrón *et al*. was also used to search the catchment basin surface and to locate the truncated octahedron for LJ_{38}(37) (Fig. 1A). The same authors also located some of the new global minima reported in (39) by a seeded search, which again involves local minimization of structures (38). Wolf and Landman successfully located the LJ global minima up to LJ_{110} using local minimization and a seeded genetic algorithm (44).

To explain why the catchment basin transformation is useful for global optimization, it is helpful to examine the thermodynamics of the deformed landscape. For LJ clusters with nonicosahedral global minima, Doye and Wales found that the catchment basin transformation broadens the occupation probabilities of the different morphologies (19). On the original surface the overlap between these distributions is small, and so the probability of escape from the large basin of icosahedral structures is rather low. Other groups have experimented with different sampling distributions within the framework of simulated annealing, with the same intention of broadening transition regions (45).

Some timings for the basic MCM procedure applied to LJ clusters are given in Fig. 3 and Table 1 (40). There are two free parameters: the temperature was fixed, and the maximum step allowed in the perturbation of coordinates that precedes each minimization was adjusted dynamically to give a particular acceptance ratio. The mean number of basin-hopping steps and cpu time required to find the global minimum at a fixed temperature, are shown in Fig. 3 for LJ_{n} up to n = 74. The results are averages over 100 different random starting points in each case. In fact the optimal temperature increases somewhat with size, and so Fig. 3 does not correspond to the best possible parameterization for each cluster. Some more detailed statistics are presented in Table 1 for selected sizes. These results illustrate how difficult it is to find the truncated octahedron for LJ_{38}, and how easy it is to find the Mackay icosahedron for LJ_{55} (Fig. 1B).

It will certainly be possible to improve upon these results by exploring the transformed landscape more efficiently, or perhaps by a quite different approach. One possibility is to find pathways by means of true transition states (46). Another is to use molecular dynamics to simulate the system's evolution between minimizations. The optimal algorithm in any given case is almost certainly system-dependent and may involve a combination of different methods. Unfortunately, testing the efficiency of different algorithms for nontrivial problems, such as LJ_{38} and LJ_{75}, is rather time-consuming.

## Crystals

Crystal engineering is an important branch of materials science whose aim is to design solids with particular properties (47). This effort would be greatly assisted if it were possible to predict crystal structures from the intermolecular potential alone. Unfortunately, many compounds of interest are polymorphic, exhibiting alternative packing modes. This phenomenon causes particular problems for the pharmaceutical industry, where polymorphic “impurities” can lead to undesirable physical properties, which, for example, led to litigation surrounding the drug Zantac (48). Polymorphism may also provide a stringent test of empirical intermolecular potentials.

The application of global optimization techniques to crystal structure prediction is at an early stage of development. Some studies have attempted to generate plausible starting points for energy minimization using common coordination geometries, most probable crystal symmetries, close-packing arguments, and statistical correlations (49). Monte Carlo-simulated annealing has also been considered (50) and molecular dynamics techniques have enabled solvent and kinetic effects to be simulated (51). Williams (52) has approached this global optimization problem by Monte Carlo sampling to predict the crystal structure of benzene without prior assumption of the space group.

Deformation procedures, initially adopted for treating biomolecules, have now been applied to predict crystal structures without making use of ancillary information such as the space group (18). In this work the diffusion equation method (DEM) and DSM (Appendix, part *B* were used to predict the crystal structures of hexasulfur and benzene, which were treated as rigid molecules (18). After fixing only the molecular geometry and the interaction potential, the unit cell dimensions, space groups, and the number of molecules in the unit cell were all computed, and the experimental crystal structures were successfully located. For benzene, the calculation succeeded even when the number of molecules in the unit cell was set to twice the experimental value, which made the global optimization problem considerably harder.

Genetic algorithms have also been applied to analyze powder diffraction data, which does not require an intermolecular potential at all. For example, Kariuki et al. correctly predicted the previously unknown crystal structure of ortho-thymotic acid (53). This approach could prove very useful when single crystals of sufficient size or quality for routine structure solution are impossible to obtain.

## Biomolecules

Predicting the native structure of a protein from its amino acid sequence alone is an area of intense current research. The potential savings of experimental time and effort alone have stimulated a number of approaches: sequence-homology employs the known structures of sequences that are similar to the one in question (54), whereas threading uses energy (or energy-like) functions to compare the sequence with structural motifs from a database of known structures (55). Some previous methods have combined the global optimization of a potential energy function with constraints based on secondary-structure prediction and multiple-sequence alignments, and these have also been quite successful (56). Here we focus on three global optimization procedures, DEM (25), DSM (27, 57) and conformational space annealing (CSA) (58) (see Appendix, part *C*), which have recently produced encouraging results without the use of any sequence or structure analogies and databases (59).

We have applied potential function deformation to this global optimization problem, with careful mapping between deformed and undeformed minima using local search techniques. The underlying principle is to locate large regions of conformational space containing low-energy minima by coupling them to some of the greatly reduced number of minima on the highly deformed surface. The DSM and the DEM have been implemented to carry out the deformation, each being applied to a different part of the potential function (see Appendix, part*B*).

The DSM has been applied to united-residue polyalanine chains with a length of up to 100 residues and to staphylococcal protein A (SPA) (57). It has successfully located low-energy structures of polyalanine chains, predicting that the most stable structure in the absence of solvent is a straight α-helix for up to 70 residues. For 70–80 residues the most stable form is bent in the middle of the α-helix and, from 80 residues upward, the most stable structure is a three-helix bundle. For SPA, a minimum very close to the experimental structure has been located. These results show that hypersurface deformation can be applied to the conformational analysis of globular proteins.

The greatest advantage of the CSA method (see Appendix, part*C*) is that it finds distinct families of low-energy conformations. It was successfully applied to all-atom polypeptide chains for the pentapeptide enkephalin (58) and to the 20-residue membrane-bound portion of melittin (60). An application using a united-residue representation (61) was also successful; the method located very low energy structures for the fragment of SPA consisting of residues 10 through 55 and for apo calbindin (ACB), with a root-mean-square deviation of 2.1 Å and 3.9 Å, respectively, from the α-carbon trace of the experimental structure (62). CSA is also a core part of a recently developed hierarchical approach to protein-structure prediction (59) first fully used on Critical Assessment of Techniques for Protein Structure Prediction (CASP3) (63) proteins.Figure 4 illustrates the quality of one of the blind predictions (59).

## Conclusions

We have provided an overview of some selected recent developments in the field of global optimization as applied to clusters, crystals, and biomolecules. For atomic and molecular clusters the basin-hopping approach coupled to search strategies based on Monte Carlo sampling or genetic algorithms seems to work well. Unbiased algorithms can often treat systems with at least 100 atoms or molecules reliably, and we expect biased or seeded approaches to be useful for significantly larger systems.

The basin-hopping approach has demonstrated transferability between atomic and molecular clusters and has also yielded useful results for biomolecules (11). However, more complicated methods such as conformational space annealing have been shown to perform better for SPA, ACB, the periplasmic protein HDEA, and a bipartite transcriptional activator MarA (59).

For biomolecules, development of better potential functions is clearly a priority. Improved potential functions must better describe structures that exist in nature, where the global minimum is expected to be separated by an energy gap from higher energy structures. The quality of the potential is also a serious issue in structure prediction for crystals.

With further improvements in both algorithms and potential energy functions, we expect to see solutions of previously intractable global optimization problems in many different fields (64).

*Note added in proof*: Using a basin-hopping approach, R. H. Leary has found a new nonicosahedral global minimum of T_{α}symmetry for LJ_{98} (14).

## Appendix

*A. A transformed energy landscape.* The following transformation of the energy landscape does not change the relative energies of any minima: *Ẽ(***X**) = min{*E(***X**)}, where **X** represents the three-dimensional vector of nuclear coordinates and “min” signifies that an energy minimization is carried out starting from **X**. The transformed energy, *Ẽ(***X**), at any point,**X**, becomes the energy of the structure obtained by minimization. Each local minimum is, therefore, surrounded by a catchment basin of constant energy consisting of all the neighboring geometries from which that particular minimum is obtained. The overall energy landscape becomes a set of plateaus, one for each catchment basin (Fig. 2), but the energies of the local minima are unaffected by the transformation. Aside from removing all the transition state regions from the surface, the catchment basin transformation also accelerates the dynamics, because the system can pass between basins all along their boundaries. Atoms can even pass through each other without encountering prohibitive energy barriers. *B. The diffusion equation and distance scaling methods.* The DEM achieves a smoothing of a potential function *f(***X**) by transforming it into the function*F(***X**, t), which is the solution of the diffusion equation with *f(***X**) as the starting condition and*t* (time) the deformation parameter. The DSM, which is applicable to pairwise interactions, transforms the distance between the centers of interactions according to the formula *r̃ = (r + r _{0}t)/(1 + t),* where

*r*is the equilibrium distance for a pairwise interaction. The whole procedure consists of macro-iterations in which the parameter

_{0}*t*controls the deformation changes between two extreme values,

*t*

_{max}and

*t*

_{min}(

*t*= 0 corresponds to the original energy surface).

*C. Conformational space annealing.*The CSA method (58, 60) searches the whole conformational space in its early stages and then narrows the search to smaller regions with low energy as the distance cut-off,

*D*

_{cut}, which defines the similarity of two conformations, is reduced. As for genetic algorithms (33), CSA starts with a preassigned number of randomly generated and subsequently energy-minimized conformations. This pool of conformations is called the bank. At the beginning, the bank is a sparse representation of the entire conformational space. A number of dissimilar conformations are then selected from the bank, excluding those that have already been used; they are called seeds. Each seed conformation is modified by changing from one variable to one-third of the total number of variables pertaining to a contiguous portion of the chain; the new variables are selected from one of the remaining bank conformations rather than being picked at random. Each conformation is energy minimized to give a trial conformation. For each trial conformation, α, the closest conformation

*A*from the bank (in terms of distance

*D*is determined. If

_{αA})*D*

_{αA}< D_{cut}(

*D*

_{cut}being the current cut-off criterion), α is considered similar to

*A*; in this case α replaces

*A*in the bank if it is also lower in energy. If α is not similar to

*A*, but its energy is lower than that of the highest-energy conformation in the bank,

*B*, α replaces

*B*. If neither of the above conditions holds, α is rejected. The narrowing of the search regions is accomplished by setting

*D*

_{cut}to a large value initially (usually one-half of the average pair distance in the bank) and gradually diminishing it as the search progresses. Special attention is paid to selecting seeds that are far from each other. One round of the procedure is completed when there is no seed to select, that is, all conformations from the bank have already been used. The round is repeated a predetermined number of times.

↵* To whom correspondence should be addressed. E-mail: dw34{at}cus.cam.ac.uk