## Abstract

A fundamental challenge in biological research is achieving an atomic-level description and mechanistic understanding of the function of biomolecules. Techniques for biomolecular simulations have undergone substantial developments, and their accuracy and scope have expanded considerably. Progress has been made through an increasingly tight integration of experiments and simulations, with experiments being used to refine simulations and simulations used to interpret experiments. Here we review the underpinnings of this progress, including methods for more efficient conformational sampling, accuracy of the physical models used, and theoretical approaches to integrate experiments and simulations. These developments are enabling detailed studies of complex biomolecular assemblies.

In modern biological research, a key goal is to understand the functional consequences of structure, dynamics, and interactions of biological macromolecules. Proteins, lipids, carbohydrates, and nucleic acids interact, rearrange, and modify their shape while effecting their various functions. Experimentalists face the daunting task of characterizing thermodynamic and kinetic properties of macromolecules in a complex environment. Computational simulation plays a role in these efforts, as modeling approaches can aid in understanding experimental data and designing and predicting the outcome of future experiments.

Here we review the progress and current challenges in computational modeling of biomolecules, focusing on the topic of atomistic biomolecular simulations and the relationship between experiments and simulations. We highlight recent technological and theoretical advances in the field and consider whether there is a perfect match between experiments and simulations. Disagreement between computation and experiment provides useful insights to further our understanding, and their complementary use yields a clearer picture than either does alone.

## Biomolecular simulations across length and time scales

Experimentalists often collect data that must then be synthesized into a coherent model through inverse problem-solving. Computational modelers deal with the forward problem: constructing a microscopic molecular model that can be compared with observed data (Fig. 1). In the best-case scenario, computational models are fully predictive and widely applicable. This would correspond to a model that correctly predicts the physical behavior of a system—for example, the catalytic power and stability of an enzyme as a function of pH and temperature. Such perfect models do not exist, so variants are developed with distinct strengths and areas of application.

Detailed information for reaction mechanisms and transition states in chemical reactions can be obtained via quantum mechanical (QM) calculations. These allow for simulation of the electronic properties of a subset of atoms within a macromolecule, which can be used to investigate bond cleavage and formation, distribution of charge and spin, and reaction mechanisms. Simulation of electronic properties of molecules requires a great deal of computational power, and thus the applicability of QM methods is, in general, limited to small systems or short time scales (*1*). Molecular dynamics (MD) simulations with empirical molecular mechanics force fields treat atoms as classical particles rather than considering their electronic structure: This approximation makes it possible to study the structure and dynamics of larger systems for longer periods of time, such as small proteins at the millisecond time scale. There are many relevant biological processes, however, that involve much larger biomolecular assemblies. The computational complexity of these problems can be decreased by grouping atoms together into single particles called beads. Such coarse-grained (CG) models range in resolution from one or a few beads per amino acid to one bead per hundreds or thousands of DNA bases. Despite their intrinsic approximations, such models are essential for tackling important problems in structural biology, including understanding complex formation between intrinsically disordered proteins (*2*) and rationalizing chromosome conformation–capture experimental data, thereby gaining insights into the internal chromosome organization (*3*). There also exist mixed, hybrid, and intermediate models that bridge together different resolutions. We focus below on the use of classical MD simulations to study biomolecules.

## Challenges and opportunities for biomolecular simulations

The successful use of MD simulations hinges on solving two distinct, yet related, problems: the “sampling problem” and the “force-field problem” (Fig. 2). The sampling problem refers to our ability to sample the relevant biomolecular configurations and to determine their relative populations. Exhaustive sampling is difficult to achieve, because it is not possible to know in advance the required amount of sampling needed to calculate precise statistical averages. It is even difficult to assess whether a simulation is converged, because one may never know whether there are motions occurring on time scales beyond those sampled and robust and generally applicable tools to monitor convergence are needed (*4*, *5*). Thus, active areas of development are theories, algorithms, and technological improvements to increase the precision of the simulations.

The force-field problem refers to the construction of the energy function that describes the physical interactions between atoms. Improvements in force fields thus increase the accuracy of simulations by providing a more realistic description of the molecular interactions. Although progress on solving these two problems requires distinct approaches, they are tightly related. Only after taking into account all relevant conformations, that is, those that contribute to thermodynamic averages, is it meaningful to calculate average quantities and compare them to experiments. Hence, our ability to improve force fields is tightly connected to improvements in sampling.

## Challenge 1: Improving physical models

The first fundamental challenge in biomolecular modeling is the construction of the physical model itself. Trade-off between computer power and spatial or temporal resolution requires a choice of model, ranging from all-atom representations to CG (*6*) and ultra-CG models in which multiple residues or nucleotides are represented by a single site (*7*). A long-standing goal of the field is to construct hybrid models that smoothly couple together different components at different resolutions (*8*).

In atomistic MD simulations, interactions between particles are modeled by “physics-based” terms that take into account chain connectivity, electrostatic interactions, London dispersion forces, and so on. Parameterized pairwise interaction terms are fitted against QM calculations and experimental data to generate a force field that describes interactions between individual particles. Parameters like the equilibrium distance between two covalently bonded atoms are known with high accuracy. Other parameters, such as partial charges, are difficult to establish, as they do not correspond to physical observables that can be directly probed through experiments.

Accurate force-field parameterization for proteins has benefited from benchmarking and direct optimization of MD simulations with experimental nuclear magnetic resonance (NMR) data on partially structured peptides (*9*, *10*). Simulations of peptides that are 10 to 40 residues long are possible to converge, yet can capture cooperative phenomena, such as helix-coil transitions or formation of small hydrophobic cores, which are difficult to parameterize from smaller molecules. Solution NMR experiments can provide residue-level information and are sensitive to the relative energies of conformations that correspond to local minima and have sizable populations. By optimizing the backbone potential to match the experimentally measured helicity of a 15-residue peptide, as measured by NMR, a small change of about 1 kJ mol^{−1} was found to be sufficient to balance the secondary-structure populations (*10*). This small change in energy leads to a substantial improvement in accuracy, as the forces are accumulated over multiple residues and the populations scale to energies exponentially. Corrections to force fields obtained from examining short peptides are transferable among different structural classes of proteins and have improved models of folding processes for small globular and fast-folding proteins (*11*, *12*).

Unfolded states and intrinsically disordered proteins (IDPs) have long appeared to be more compact and structured when observed in MD simulations than when observed through experiments. This discrepancy suggests that important physical effects were not modeled correctly in the simulations. Proposed solutions include improving the description of water or of protein-water interactions (*13*). These modifications have improved the accuracy of simulations of IDPs (*14*), which play important roles in biology and disease.

A side effect of increasing protein-water interactions is, however, destabilization of folded proteins. In practice, one might thus have to choose between one family of force fields for simulations of folded proteins and a separate set for disordered systems, complicating studies of partially folded systems or of order-disorder transitions such as folding upon binding. To tackle this problem, it is necessary to consider simultaneously proteins that span from fully ordered to completely disordered and to test and optimize parameters at the same time on all of these systems. A comprehensive study of model systems with diverse properties has recently produced a force field capable of handling both fully folded proteins and IDPs (*15*).

In parallel to the development of models to study the structure and dynamics of proteins, there has been a growing interest in modeling nucleic acids, particularly RNA because of its catalytic and regulatory activities. Although important improvements have been made, state-of-the-art RNA force fields remain less accurate that those for polypeptides (*16*). Here, too, artifacts of MD simulations have been uncovered by direct comparison against solution NMR data on small model systems (*17*). Similar to the case of IDPs, promising results have been obtained by balancing water-RNA and RNA-RNA interactions (*18*).

Systematic benchmarking of force fields against experiments has revealed a comforting trend: Force fields are getting better (*12*, *15*, *19*). It is worthwhile to note that these improvements have been possible even without substantial changes to the underlying model or mathematical function used in the force fields. Thus, despite the inherent simplicity and lack of, for example, taking polarization into account, it has been possible to improve force fields dramatically. Indeed, it is surprising that it is possible to parameterize force fields that work well across many different proteins and problems (*20*), and eventually, progress will require models that are more complex.

Improvements of force fields generally rely on ab initio QM calculations. Machine-learning approaches, particularly neural networks, make it possible to train simple potentials with QM-level accuracy (*21*). Training is typically done on small molecules, and encouraging results have been obtained by transferring these potentials to the study of larger organic molecules (*22*). Force fields that explicitly include polarization effects are also likely to benefit from automated methods for integrated parameterization from experiments and QM calculations and from improvements in software and algorithms from sampling with these potentials. Here, Bayesian methods for optimizing force fields against experimental data and QM calculations are expected to play an even larger role, by enabling systematic balancing of different sources of information (*23*–*25*).

## Challenge 2: Accessing long time scales

Atomistic biomolecular MD simulations are inherently costly, owing to the need to model forces between tens or hundreds of thousands of individual atoms or more. These forces are evaluated every few femtoseconds of simulation time, requiring about a billion steps to simulate a molecule for a microsecond. Although the speed of a simulation depends strongly on the size of the biomolecular system and the available computational resources, it is not uncommon to require weeks or months of computer time with hundreds or thousands of processors working simultaneously to obtain microsecond-length simulations.

Conceptually, the most straightforward means to increase the speed and throughput of molecular simulations is perhaps “simply” to decrease the time it takes to perform a single iteration of the simulation. Widely used software packages designed for biomolecular simulations, such as GROMACS (*26*), NAMD (*27*), Desmond (*28*), AceMD (*29*), and AMBER (*30*), use different levels of parallelization by taking advantage of multicore processors and high-performance computing facilities. Speedups can be achieved by off-loading calculations to graphics processing units, which provide high performance at reasonable cost. A different route to improve efficiency is to build hardware specifically adapted to molecular simulations such as MDGRAPE (*31*) and Anton (*32*). For example, Anton is a massively parallel supercomputer designed to perform fast and accurate simulations of biomolecules by simultaneously considering all parts of the calculations, including MD-specific integrated circuits for calculating the costly parts of the force-field interactions, a specialized communication network tailored to match the periodic boundary conditions used in simulation, and special parallelization algorithms developed for this architecture. Anton enabled the first millisecond-length all-atom MD simulation of a globular protein (*32*). Its successor, Anton 2, is optimized for larger biomolecular systems and can perform multimicrosecond simulations in a single day for systems such as a small virus or a solvated ribosome with more than 1 million atoms (*33*).

Massive parallelization has been exploited in the folding@home project, which utilizes hundreds of thousands of “stand-by” machines all over the globe (*34*). Such distributed computing studies may now reach multiple milliseconds of aggregate simulation time and consist of hundreds or thousands of simulations ranging from hundreds of nanoseconds to a few microseconds (*35*). Because each simulation may be much shorter than the time scales of interest, a key problem is how to extract information about slow, long–time scale processes from a combined analysis of many short simulations. One possible solution to this problem is to build a Markov state model (MSM) (*35*, *36*), which enables one to construct a “memoryless transition network” describing the populations and kinetics of interconversion between different conformational states. In recent years, MSMs have gained widespread use, thanks to improved algorithms and software (*37*, *38*) and several successful applications to biomolecular processes, including protein folding, ligand binding, and protein-protein association (*35*). Path-based methods such as transition path sampling (*39*) and milestoning (*40*) also use many short simulations to study kinetics and mechanisms of long–time scale processes. These and related methods exploit the fact that many conformational transitions are “rare events,” for which the time it takes to cross the barrier is substantially shorter than the waiting time between such events.

Sampling may also be enhanced by changing simulation parameters. Increasing the temperature increases the kinetic energy, making barrier-crossing events faster (*41*). This idea is at the basis of parallel tempering, perhaps the most widely used enhanced sampling approach. These alterations can be viewed as enhancing simulations along a progress variable, also known as a reaction coordinate or collective variable (CV), in this case related to the energy of the system. For some problems, rapid fluctuations of the energy, and similar energies in different distinct conformational states, mean that increased temperature does not transfer into efficient sampling. This problem can be exacerbated by the fact that the available conformational space at high temperatures is larger, so that increased rates of sampling are more than offset by the increased volume of conformational space. Accelerated MD may instead be used to “boost” the energy along internal degrees of freedom, such as the backbone dihedral angles, thus enhancing the ability to cross local barriers (*42*).

Enhancing sampling along one or more prespecified CVs that describe the process of interest is another widely used strategy (*43*). In a protein-folding simulation, the number of native contacts formed or the progress along an initial guess of the folding path might be used to guide the simulation, even if the path is imperfect, and thus provide detailed insight into the folding free-energy landscape. Metadynamics (*44*) uses a time-dependent potential to simultaneously enhance sampling and construct a free-energy profile along such CVs and is widely used both because of its applicability to a range of problems (e.g., biomolecular processes, molecular docking, chemical reactions, crystal growth, and proton diffusion) and the availability of efficient and easy-to-use software (*45*).

Long unbiased simulations performed with Anton represent a useful reference to benchmark and validate enhanced sampling methods and kinetic models. In such applications, one may compare a specific protocol for enhanced sampling or constructing a kinetic model with the results from an unbiased simulation with the same force field to focus on benchmarking the algorithms and avoiding complications from force-field uncertainty (*46*, *47*).

Approaches based on CVs are very powerful, but their optimal choice is a critical and nontrivial step. For complex biomolecular rearrangements, it is difficult to identify CVs that correspond to the relevant, slowly varying degrees of freedom. In this respect, deep learning approaches have recently been used to identify improved CVs (*48*). CVs are not only useful to enhance sampling: They are essential to rationalize the large amount of complex data generated in MD simulations. New approaches to create better low-dimensional representations of high-dimensional data are also useful to construct improved MSMs (*35*), and we expect the advances in machine-learning methods [e.g., low-dimensional embedding and clustering (*49*, *50*)] to play an increasingly important role in this field.

## Challenge 3: Integrating experiments and simulations

Although simulations and statistical mechanical theories are important and very powerful in their own right, direct integration of experimental data with molecular simulations can provide a rich description of the structure and dynamics of biomolecules. This field—also called integrative structural biology (*51*)—has benefited from recent technological advances in cryo–electron microscopy (cryo-EM) and is particularly important for studies of complex, dynamic systems for which multiple structural techniques provide complementary information. Formally, the problem consists of determining the three-dimensional structure or, more generally, an ensemble of molecular conformations and their associated weights, which are compatible with a set of experimental observations.

One strategy is to modify the simulation to match experimental data (Fig. 3). In this case, the force field is not considered a fixed, immutable model but instead a fitting function to be adjusted by experimental data. Indeed, this “pseudoenergy” approach underlies most structure-determination algorithms in which a physical energy function (often a simplified force field) is combined with an “experimental energy function” that measures the deviation between experiment and simulation (*52*). These integrative approaches enable accurate protein-structure determination when using chemical shifts (*53*) (Fig. 3A) or when using sparse, uncertain, and ambiguous experimental data (*54*). Similar approaches have been developed with the aim of providing molecular models of large molecular complexes constructed by using diverse sets of experimental data, including cross-linking, small-angle x-ray scattering (SAXS), and cryo-EM images (*55*, *56*). The availability of different sources of complementary experimental data is, in this context, important, as it allows one to cross-validate results and avoid overfitting.

Structural experiments such as SAXS, NMR, and x-ray diffraction report on quantities averaged over many molecules and long periods of time. For rigid molecules, the error may be small when interpreting ensemble-averaged quantities for individual structures. However, dynamical averaging is crucial when studying flexible molecules, such as IDPs or single-stranded RNA, because the structural interpretation of experimental data must be addressed by considering the coexistence of multiple conformational states (Fig. 3). One theoretical approach for dealing with the averaging problem is based on the maximum-entropy principle (*52*). The basic idea is to introduce a perturbation to the conformational ensemble generated by simulations in order to match a set of experimental data. The perturbation should be as small as possible: Mathematically, this is achieved by maximizing a quantity called relative Shannon entropy, hence the name maximum entropy. Thus, a minimal modification to the simulations to match the experimental data results in the least-biased combination of the force field and the experimental measurements. In practice, these approaches can remove much of the uncertainty associated with the choice of force fields so that conformational ensembles derived by combining experiments and simulations are more similar than ensembles derived solely from simulations (*5*, *57*).

Although the maximum-entropy principle provides a coherent framework to obtain conformational ensembles that combine force fields and experimental data, the basic formalism does not take sources of error into account. Another important development has thus been theory that considers not only experimental measurements but also the associated uncertainty. When combining data from multiple experimental techniques, uncertainties are essential to set the correct weights among them. For some sources of experimental data—for example, chemical shifts from NMR spectroscopy—the measurement itself is extremely precise, but our ability to relate the experimental quantity to molecular structure (i.e., the forward model that is used to calculate experimental quantities from three-dimensional structures) is associated with substantial uncertainty. Both experimental and modeling uncertainty can be treated by using Bayesian approaches such as those used in inferential structural determination protocols, leading to improved precision and a rigorous approach to integrate multiple sources of experimental data (*58*).

Combined Bayesian–maximum entropy integrative methods that consider uncertainty and averaging offer a promising route to reconstruct the conformational variability of complex biomolecular systems (*59*, *60*). These methods can be used with all-atom simulations or with CG models for larger assemblies. For instance, the structure and allosteric mechanism of a protein kinase were revealed by reweighting CG simulations using SAXS experimental data (*61*). An alternative approach is to construct an MSM that has also been biased by using experimental data (*62*).

An important challenge is how the information gleaned from these studies may be fed back into improved force fields—for example, by systematically analyzing differences between the experimentally restrained ensembles and those obtained from the models alone. For instance, we recently identified a specific dihedral angle in the RNA backbone whose distribution in MD simulations was markedly different from that found when reweighting the same simulations using a Bayesian–maximum entropy approach (*63*). This observation suggested that force-field errors for this specific term could explain part of the disagreement between experiment and simulations, and, indeed, parallel work on improving RNA force fields resulted in distributions for this dihedral angle that were in much better agreement with the experimentally derived results (*18*, *63*).

The discussion above pertains to experimental data that can be related to equilibrium properties and that can be represented by population-weighted averages over individual conformations in the ensemble. For example, distances probed via nuclear Overhauser effect (NOE) NMR experiments are typically calculated from the average of the inverse sixth power of the distances in each individual structure (*58*). In reality, NOEs and many other experimental quantities depend on kinetic properties that need to be taken into account for the most accurate calculations. Recent theoretical and practical advances make it possible to construct conformational ensembles also based on such information (*62*, *64*–*66*) and thus extend applications to new sources of experimental data.

## Conclusions and outlook

The complexity of biological systems often mandates the combined use of multiple techniques, including biomolecular simulations. Clearly, simulations are not ordinary experiments and often require a detailed knowledge of algorithms, underlying assumptions, and tricks that can be difficult to access and understand for nonspecialists. Much progress has been made on making these tools more user-friendly and accessible, though analyzing simulations often requires specialist knowledge. With a wide range of tools available, it is important to balance precision and accuracy when deciding on a simulation strategy (sampling method, force field, and level of resolution): What level of detail is relevant to the problem at hand, what are the relevant time scales, and can I address imperfections in the model by, for example, experimental restraints?

Substantial improvements in force fields have been made possible by using data from experimental studies on systems that are large enough to capture complex behavior yet simple enough to converge simulations. Future progress requires that experimentalist and computational chemists continue to work together to design experiments that are best suited to optimize force fields and to isolate properties that current models fail to describe (*9*, *17*). By testing and optimizing models broadly across different classes of problems and molecules, it will be possible to create force fields that are more transferable. Eventually, we will have to go beyond the current simple functional forms (*67*, *68*), but a surprising observation has been how much force fields could be improved by careful parameter optimization on an increasingly broad set of QM and experimental data. When reading the simulation literature, one should thus check whether a carefully validated force field has been used. Judging this is helped by the increased availability of systematic comparisons on a broad range of systems (*12*, *15*, *19*, *69*).

Further, as it remains difficult to sample conformational space sufficiently, particularly for complex systems, one should check whether convergence has been assessed and whether quantitative differences are backed up by sufficient sampling. This is inherently difficult because it is much easier to prove lack of convergence than the opposite (*70*). Nevertheless, useful questions to ask include (i) whether the same events are observed multiple times, (ii) the simulations are longer than the correlation times and the statistical analyses take time correlation into account, and (iii) whether the observed effects are greater than the statistical uncertainty.

We must, however, also be pragmatic in the way simulations are used. Like experiments, simulations are not perfect, and we will continue to live with uncertainty in sampling and force fields. Here the integration between experiment and simulations can help alleviate problems in both accuracy and sampling. We envision that these methods will play an increasingly important role in studying the relationship between structure and dynamics of large biomolecular assemblies or highly flexible molecules. The link between molecular simulations and cryo-EM, inherently a single-molecule technique, might be particularly fruitful for looking at conformational dynamics at high spatial resolution (*71*, *72*). Much can also be gained by carefully choosing systems that are amenable to both experimental and computational analysis. Recent examples include elucidating the molecular details that underlie the alternating access mechanism in a minimal sugar transporter (*73*) and an atomic-level description of interactions that lead to barrier roughness in protein folding (*74*).

The overwhelming growth of sequence data also presents new opportunities for computational chemists seeking to understand macromolecular structure and function. Evolution is, after all, governed by the same physical forces that simulations are constructed to model. One point of convergence has been the use of evolutionary records to construct statistical models of amino acid sequences (*75*, *76*). Conversely, computational biophysics can guide interpretations of what mutations do to proteins when analyzing exome sequencing for patient diagnosis (*76*–*78*). Finally, large-scale deep mutational scanning experiments can provide comprehensive maps of the mutational effects on protein stability across entire proteins (*79*) and enable a deeper understanding and benchmarking of our ability to predict the consequences of mutations (*76*, *80*).

We thus anticipate that simulations will eventually be commonplace when studying the effect of drugs and mutations and will play an essential role in the future of bioengineering in the same way that computer modeling is used today in computational prototyping of cars and buildings.

This is an article distributed under the terms of the Science Journals Default License.

## References and Notes

**Acknowledgments:**We thank Y. Wang for providing part of Fig. 1. Parts of Fig. 1A were designed by Freepik.

**Funding:**We acknowledge funding from the Velux Foundations, the Lundbeck Foundation BRAINSTRUC initiative, and a Hallas-Møller stipend from the Novo Nordisk Foundation.

**Competing interests:**None declared.