## Working out how to pack benzene in silico

Many organic compounds crystallize in several different energetically similar packing arrangements, or polymorphs. This complicates processes such as drug formulation that rely on reproducible crystallization. Yang *et al.* have now achieved the long-standing goal of calculating a crystal packing arrangement from first principles to an accuracy that can distinguish polymorphs (see the Perspective by Price). They used benzene as a prototypical test case and applied quantum chemical methods that improve estimates of multibody interactions. The results bode well for future applications of theory to optimization of crystallization protocols.

## Abstract

Computation of lattice energies to an accuracy sufficient to distinguish polymorphs is a fundamental bottleneck in crystal structure prediction. For the lattice energy of the prototypical benzene crystal, we combined the quantum chemical advances of the last decade to attain sub-kilojoule per mole accuracy, an order-of-magnitude improvement in certainty over prior calculations that necessitates revision of the experimental extrapolation to 0 kelvin. Our computations reveal the nature of binding by improving on previously inaccessible or inaccurate multibody and many-electron contributions and provide revised estimates of the effects of temperature, vibrations, and relaxation. Our demonstration raises prospects for definitive first-principles resolution of competing polymorphs in molecular crystal structure prediction.

Crystal structure prediction is a scientific challenge affecting diverse fields ranging from pharmaceuticals to energy research. Two decades ago, Maddox argued that failures in a priori prediction amounted to “one of the continuing scandals in the physical sciences” (*1*). The crystal structure prediction (CSP) competitions held by the Cambridge Crystallographic Data Centre serve as convenient snapshots of progress (*2*). The twin tasks are (i) to compute lattice energies accurately enough to distinguish between competing low-energy polymorphs and (ii) to efficiently sample the energy landscapes. Sufficient progress has been made toward the latter goal that, for moderately complex crystals, the recent CSP2010 competition showed that sampling the landscape is no longer the bottleneck and that “accurate energy models can often prove to be the most challenging aspect of successful crystal structure prediction” (*2*). Organic molecular crystals exemplify this challenge, because their polymorphs are often separated by energies of 1 kJ/mol or even less. Crystalline benzene, the simplest organic molecular crystal with aromatic π-π stacking (Fig. 1), is a prototypical system of this kind, with a wealth of low-energy polymorphs under pressure within the first few kJ/mol (*3*, *4*), and serves as the hydrogen atom of the field on which the accuracy of energy models is judged (*5*–*10*). The most successful energy method used in CSP2010, density functional theory with dispersion (DFT-D), is reliable to only ∼10 kJ/mol (*11*). Here, we show that, by using recent advances in electronic structure methods (*12*–*22*), we can now compute the benzene crystal lattice energy to better than 1 kJ/mol. The dominant remaining uncertainty at finite temperature then originates entirely from atomic motions.

We follow a fragment strategy, decomposing the Born-Oppenheimer (BO) lattice energy into monomer Δ*E _{i}*, dimer Δ

*E*, etc., contributions (

_{ij}*17*–

*24*) (1)as reviewed by Beran and colleagues (

*21*). Schweizer and Dunitz (

*5*) attempted the first ab initio fragment computation of crystalline benzene at the second-order Møller-Plesset level, obtaining a lattice energy >50 kJ/mol below the experimental estimate (−55.3 ± 2.2 kJ/mol, derived later in this work), concluding that reliable energies seemed remote (

*5*). Their work highlighted that, in organic molecular crystals, the lattice energy involves many small contributions, all of which must be converged to high accuracy. Specifically, this requires (i) convergence of the fragment types: dimers, trimers, etc.; (ii) for each type, convergence in the fragment cluster separation (range) out to infinite distance; and (iii) convergence of quantum chemical correlations of each fragment both to complete bases as well as to infinite-order electron processes.

Since Schweizer and Dunitz, many groups have tried to achieve convergence of these different contributions. Ringer and Sherrill considered dimers out to 9.4 Å at the level of coupled cluster (CC) with perturbative triples, obtaining an estimate of –56.4 kJ/mol (*6*). Bludsky, Rubes, and Soldán; Wen and Beran; and Podeszwa, Rice, and Szalewicz further included longer-range asymptotic corrections and partial trimer effects, obtaining lattice energies of –50.56 kJ/mol (*7*), –54.0 kJ/mol (*8*) (including lattice relaxation), and –50.33 kJ/mol (*9*) respectively. The current best theoretical estimates thus lie in the range of −50 to −56 kJ/mol. The uncertainty is improved but remains an order of magnitude larger than the 1 kJ/mol polymorph energy scale.

Here, we computed the benzene crystal lattice energy by using ab initio many-electron wave function methods. Wave function methods are essential because they can, in principle, be systematically improved until they converge to the exact result (*25*), which is not the case for DFT methods. We can now also achieve convergence in practice because of several recent advances made in our groups and in others. First, local correlation methods reduce the cost of quantum chemistry methods by orders of magnitude. We used the orbital-specific-virtual, explicitly correlated, local CC with perturbative triples [OSV-LCCSD(T0)-F12] method developed by us (*12*, *13*), where the dominant (triples) cost scales linearly with the number of atoms. Earlier work only partially treated trimers (if at all), but local methods now allow us to compute not only the very large numbers of trimers but even the tetramers needed to converge the fragment expansion. Second, arbitrary-order CC and density matrix renormalization group (DMRG) methods (*14*, *15*) enable the description of high-order quantum chemical correlation. Whereas previous efforts only considered at most triples (three-electron processes), we provide a description of electron correlations involving four or more electron processes, which turn out to be important. Third, explicit correlation methods (F12) (*16*) analytically capture short-range behavior of the electron-electron cusp. Compared with previous work, the use of explicitly correlated methods substantially reduces basis set errors. Together, these fundamental advances, combined with asymptotic expansions for distant fragments (*9*), allow us to converge contributions (i) to (iii) to beyond the polymorph energy scale.

The 0-K lattice energy is the difference in the total (electronic plus nuclear repulsion) energy at the 0-K equilibrium structure of the crystal and the corresponding total energy of the ideal molecular gas. Because the 0-K structure is not known, we have computed the lattice energy at the 138-K neutron diffraction structure of Bacon *et al*. (*26*), . Although lower-temperature (15- and 4-K) structures (*27*) are available, prior theoretical work has generally used the 138-K structure, enabling direct comparisons. Our objective was to demonstrate the attainable energy accuracy for this fixed crystal structure. However, we return to the lattice relaxation energy from 138 to 0 K, , in our discussion later.

We first decomposed the lattice energy into *E*_{HF}, the (mean-field) Hartree-Fock (HF) energy, and *E*_{c}, the correlation energy. For *E*_{HF}, we used the fragment expansion (Eq. 1) up to tetramers (supplementary section S1), which converged relatively quickly because of the nonpolar nature of the benzene monomer. The HF tetramer contribution (not previously considered) still contributes –0.10 kJ/mol. Comparison to a bulk periodic boundary condition HF calculation confirms convergence of the expansion to a value better than 0.05 kJ/mol. For each fragment, we recovered the basis set limit by using systematic Dunning basis sets (up to quintuple-zeta with extrapolation) (*28*, *29*), obtaining a final *E*_{HF} = 20.18 ± 0.04 kJ/mol.

Note that the mean-field lattice energy is positive; binding arises from electron correlations that cause van der Waals dispersion. We computed *E*_{c} by using the fragment expansion up to tetramers. For each fragment, we converged the correlation energy both with respect to the size of the one-particle basis (by using explicitly correlated F12 methods) and with respect to the level of the electron correlation treatment (doubles, triples, higher excitations, etc., in the CC framework, representing two-, three-, and higher-electron processes). For short-range dimers (<11 Å), we used canonical F12 CC theory up to perturbative triples excitations with augmented triple-zeta bases. We estimated the remaining basis set incompleteness error for the dimers to be 0.31 kJ/mol, an order of magnitude less for trimers and higher clusters (supplementary section S4). Correlation beyond triples was estimated from frozen-core CC up to perturbative quadruples and DMRG calculations in a small 6-31G basis (*30*), rescaled to an all-electron, complete basis estimate (supplementary section S5). Trimer and tetramer contributions were obtained from local F12 CC treatments (up to perturbative triples) with augmented triple-zeta bases (augmented only for trimers; see supplementary section S2). The trimer local approximation error is estimated at <0.1 kJ/mol. (supplementary section S3). Last, long-range fragments (>11 Å for dimers, >9 Å for trimers) used the asymptotic expansion of Podeszwa *et al*. (*9*) with an estimated error of <0.1 kJ/mol (supplementary section S6). The results are summarized in Table 1 and in Fig. 2.

Our total BO lattice energy at the 138-K structure amounts to = −54.58 ± 0.76 kJ/mol. Our error estimate sums errors in each contribution; thus, ±0.76 kJ/mol is an overestimate but is still less than the polymorph energy scale (1 kJ/mol), demonstrating the power of our energy prediction method. We can analyze our result in the light of representative earlier treatments. Our result is at the upper end of the earlier estimates ranging from −50 to −56 kJ/mol. The primary distinct contributions computed in our work (which have been either completely or partially neglected before) are the trimer, tetramer, four-electron, and core correlation contributions, which individually contribute +3.43, −0.38, +0.36, and −0.57 kJ/mol, respectively. Ringer and Sherrill’s lattice energy of −56.4 kJ/mol (*6*) is unexpectedly close to ours, even though they omit all the above effects, as well as long-range dimers (a further −1.48 kJ/mol), but the agreement is fortuituous because the separate components almost cancel (+3.43 − 0.38 + 0.36 − 0.57 − 1.48 = 1.36 kJ/mol). Multiple earlier estimates of the lattice energy are closer to −50.5 kJ/mol (*7*, *9*). The error in these lower numbers is from trimer estimates, which are more than 3.0 kJ/mol larger than our result, for example, resulting from earlier omission of attractive contributions to the trimer dispersion beyond third order in the symmetry-adapted perturbation theory (*31*). Furthermore, we see that the omitted contributions in earlier treatments are each significant and cannot be neglected, although there is some fortuitous cancellation. In total, this means a number of earlier estimates of the lattice energy need to be revised upward by more than 10%.

We have so far computed the lattice energy at the widely used experimental 138-K structure, but estimating the lattice energy at the hypothetical 0-K structure is also of interest. We obtained the relaxation of the crystal lattice energy from the 4-K neutron diffraction geometry (*26*, *27*), as well as the relaxation energy of the monomer (relaxation from the crystal monomer to the gas-phase optimized geometry), by using local coupled cluster [OSV-LCCSD(T0)-F12]. We find kJ/mol (the error estimate arises from uncertainty in the 0-K structure; see supplementary section S7). This gives kJ/mol, where we have separated the uncertainty due to the energy calculation and due to the unknown 0-K structure.

We now briefly compare to available experimental measurements. Experimental lattice energies derive from the equivalence of the sublimation enthalpy at 0 K to the lattice energy with zero-point energy (ZPE) contributions, and both the extrapolation of the sublimation enthalpy to 0 K as well as estimates of the ZPE introduce uncertainties (*10*, *32*). Sublimation enthalpies can be measured up to the benzene triple-point (278.674 K) but are commonly given as an extrapolation to room temperature (298.15 K). The 0-K lattice energy is then obtained from

Commonly reported experimental lattice energies range from −51 to −53 kJ/mol. This is in good agreement with earlier computations of the lattice energy but outside the error bars of the lattice energy computed in this work. Inaccuracies in the experimental extrapolation to 0 K are a possible culprit. We have carefully reanalyzed and reestimated the zero-point and thermal contributions (details in supplementary section S7), and these are summarized in Table 2. By using Δ*E*_{ZPE} = −4.8 ± 1.0 kJ/mol from (*10*) and kJ/mol [an average of the heat-capacity integration in (*9*) and the extrapolation of the equation of state in (*33*)], we find a total correction of −10.6 kJ/mol to the room-temperature sublimation enthalpy, more than 2.0 to 4.0 kJ/mol larger than earlier corrections of −6 to −8 kJ/mol. Our new extrapolated 0-K experimental lattice energy is then −55.3 ± 2.2 kJ/mol, in very good agreement with our theoretical lattice energy and almost 10% higher than earlier values. However, the sub-kJ/mol accuracy of the theoretical lattice energy remains much higher than that of the extrapolated experimental estimate, demonstrating the predictive power of converged theoretical calculations even in the condensed phase.

In CSP2010, several groups obtained correct structures for a polymorphic hydrate in their extended candidate lists but were unable to rank the candidates with sufficient accuracy to obtain the correct lowest energy structure, placing the true lowest energy polymorph more than 12 kJ/mol above others (*2*). The methods used in this work would eliminate this problem. When combined with efficient candidate structure generation, our work raises the prospects of the first-principles resolution of competing polymorphs, a long-standing goal of molecular crystal structure prediction.

## Supplementary Materials

www.sciencemag.org/content/345/6197/640/suppl/DC1

Materials and Methods

Supplementary Text

Fig. S1

Tables S1 to S5

## References and Notes

- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
- ↵
**Acknowledgments:**We acknowledge many helpful discussions with C. D. Sherrill, G. J. O. Beran, and R. Podeszwa. We especially thank R. Podeszwa for providing the detailed data on dimer and trimer geometries and A. Tkatchenko for providing unpublished ZPE data. Work performed by J.Y., W.H., and G.K.-L.C. was supported by the U.S. Department of Energy under grant no. DE-SC0008624, with secondary support from grant no. DE-SC0010530. The DMRG software package BLOCK used in part of this work was developed with primary funding from NSF grant no. OCI-1265278. BLOCK implements theoretical methods developed under NSF grant no. CHE-1265277. D.M. was supported by the U.S. Department of Energy through a Computational Science Graduate Fellowship, funded by grant no. DE-FG02-97ER25308. D.U. and M.S. acknowledge financial support from the Deutsche Forschungsgemeinschaft (DFG), grants US-103/1-1 and SCHU 1456/12-1.