
Solving the quantum many-body problem with artificial neural networks


Science  10 Feb 2017:
Vol. 355, Issue 6325, pp. 602-606
DOI: 10.1126/science.aag2302

Machine learning and quantum physics

Elucidating the behavior of quantum interacting systems of many particles remains one of the biggest challenges in physics. Traditional numerical methods often work well, but some of the most interesting problems leave them stumped. Carleo and Troyer harnessed the power of machine learning to develop a variational approach to the quantum many-body problem (see the Perspective by Hush). The method performed at least as well as state-of-the-art approaches, setting a benchmark for a prototypical two-dimensional problem. With further development, it may well prove a valuable piece in the quantum toolbox.

Science, this issue p. 602; see also p. 580


The challenge posed by the many-body problem in quantum physics originates from the difficulty of describing the nontrivial correlations encoded in the exponential complexity of the many-body wave function. Here we demonstrate that systematic machine learning of the wave function can reduce this complexity to a tractable computational form for some notable cases of physical interest. We introduce a variational representation of quantum states based on artificial neural networks with a variable number of hidden neurons. A reinforcement-learning scheme we demonstrate is capable of both finding the ground state and describing the unitary time evolution of complex interacting quantum systems. Our approach achieves high accuracy in describing prototypical interacting spin models in one and two dimensions.

The wave function $\Psi$ is a fundamental object in quantum physics and possibly the hardest to grasp in the classical world. $\Psi$ is a monolithic mathematical quantity that contains all of the information on a quantum state, be it a single particle or a complex molecule. In principle, an exponential amount of information is needed to fully encode a generic many-body quantum state. However, wave functions representing many physical many-body systems can be characterized by an amount of information much smaller than the maximum capacity of the corresponding Hilbert space. A limited amount of quantum entanglement and a small number of physical states in such systems enable modern approaches to solve the many-body Schrödinger’s equation with a limited amount of classical resources.

Numerical approaches directly relying on the wave function can either sample a finite number of physically relevant configurations or perform an efficient compression of the quantum state. Stochastic approaches, like quantum Monte Carlo (QMC) methods, belong to the first category and rely on probabilistic frameworks typically demanding a positive semidefinite wave function (1–3). Compression approaches instead rely on efficient representations of the wave function, such as in terms of matrix product states (MPS) (4–6) or more general tensor networks (7–9). However, examples of systems in which existing approaches fail are numerous, mostly owing to the sign problem in QMC (10) and to the inefficiency of current compression approaches in high-dimensional systems. As a result, despite the notable success of these methods, a large number of unexplored regimes exist, including many open problems. These encompass fundamental questions ranging from the dynamical properties of high-dimensional systems (11, 12) to the exact ground-state properties of strongly interacting fermions (13, 14). At the heart of this lack of understanding lies the difficulty in finding a general strategy to reduce the exponential complexity of the full many-body wave function down to its most essential features (15).

In a much broader context, the problem resides in the realm of dimensional reduction and feature extraction. Among the most successful techniques to attack these problems, artificial neural networks play a prominent role (16). They can perform exceedingly well in a variety of contexts ranging from image and speech recognition (17) to game playing (18). Very recently, applications of neural networks to the study of physical phenomena have been introduced (19–23). These have so far focused on the classification of complex phases of matter, when exact sampling of configurations from these phases is possible. The challenging goal of solving a many-body problem without prior knowledge of exact samples is nonetheless still unexplored, and the potential benefits of artificial intelligences in this task are, at present, substantially unknown. Therefore, it is of fundamental and practical interest to understand whether an artificial neural network can modify and adapt itself to describe and analyze such a quantum system. This ability could then be used to solve the quantum many-body problem in regimes that have traditionally been inaccessible to existing exact numerical approaches.

Here we introduce a representation of the wave function in terms of artificial neural networks specified by a set of internal parameters $\mathcal{W}$. We present a stochastic framework for reinforcement learning of the parameters $\mathcal{W}$, allowing for the best possible representation of both ground-state and time-dependent physical states of a given quantum Hamiltonian $\mathcal{H}$. The parameters of the neural network are then optimized (trained, in the language of neural networks), either by static variational Monte Carlo (VMC) sampling (24) or time-dependent VMC (25, 26), when dynamical properties are of interest. We validate the accuracy of this approach by studying the Ising and Heisenberg models in both one and two dimensions. The power of the neural-network quantum states (NQS) is demonstrated, obtaining state-of-the-art accuracy in both ground-state and out-of-equilibrium dynamics.

Neural-network quantum states

Consider a quantum system with N discrete-valued degrees of freedom $\mathcal{S} = (S_1, S_2, \ldots, S_N)$, which may be spins, bosonic occupation numbers, or similar. The many-body wave function is a mapping of the N-dimensional set $\mathcal{S}$ to (exponentially many) complex numbers that fully specify the amplitude and the phase of the quantum state. The point of view we take here is to interpret the wave function as a computational black box which, given an input many-body configuration $\mathcal{S}$, returns a phase and an amplitude according to $\Psi(\mathcal{S})$. Our goal is to approximate this computational black box with a neural network, trained to best represent $\Psi(\mathcal{S})$. Different possible choices for the artificial neural-network architectures have been proposed to solve specific tasks, and the best architecture to describe a many-body quantum system may vary from one case to another. For the sake of concreteness, we henceforth specialize our discussion to restricted Boltzmann machine (RBM) architectures and apply them to describe spin-½ quantum systems. In this case, RBM artificial networks are constituted by one visible layer of N nodes, corresponding to the physical spin variables in a chosen basis (e.g., $\mathcal{S} = \sigma^z_1, \ldots, \sigma^z_N$) and a single hidden layer of M auxiliary spin variables $(h_1, \ldots, h_M)$ (Fig. 1). This description corresponds to a variational expression for the quantum states

$$\Psi_M(\mathcal{S}; \mathcal{W}) = \sum_{\{h_i\}} \exp\Big( \sum_j a_j \sigma^z_j + \sum_i b_i h_i + \sum_{ij} W_{ij} h_i \sigma^z_j \Big),$$

where $h_i = \{-1, 1\}$ is a set of M hidden spin variables and the network parameters $\mathcal{W} = \{a_j, b_i, W_{ij}\}$ fully specify the response of the network to a given input state $\mathcal{S}$. Because this architecture features no intralayer interactions, the hidden variables can be explicitly traced out, and the wave function reads

$$\Psi(\mathcal{S}; \mathcal{W}) = e^{\sum_j a_j \sigma^z_j} \times \prod_{i=1}^{M} F_i(\mathcal{S}),$$

where $F_i(\mathcal{S}) = 2\cosh\big[ b_i + \sum_j W_{ij} \sigma^z_j \big]$. The network weights are, in general, to be taken complex-valued to provide a complete description of both the amplitude and the phase of the wave function.
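As a minimal sketch of this construction, the snippet below evaluates an RBM amplitude in the traced-out form, $e^{\sum_j a_j \sigma_j} \prod_i 2\cosh(b_i + \sum_j W_{ij}\sigma_j)$, and compares it with the explicit sum over all hidden-spin configurations. The function names, the toy sizes (N = 4, M = 3), and the random complex weights are illustrative assumptions, not the authors' code.

```python
import itertools
import numpy as np

def log_psi(sigma, a, b, W):
    """Log of the RBM amplitude after summing out the hidden spins.

    sigma: (N,) array of +/-1 visible spins
    a: (N,) visible biases, b: (M,) hidden biases, W: (M, N) couplings
    """
    theta = b + W @ sigma                     # effective field on each hidden unit
    return a @ sigma + np.sum(np.log(2.0 * np.cosh(theta)))

def psi_brute_force(sigma, a, b, W):
    """Same amplitude from the explicit sum over all 2^M hidden configurations."""
    total = 0.0
    for h in itertools.product([-1, 1], repeat=len(b)):
        h = np.array(h)
        total += np.exp(a @ sigma + b @ h + h @ W @ sigma)
    return total

# Toy sizes and random complex weights, chosen only for illustration.
rng = np.random.default_rng(0)
N, M = 4, 3
a = 0.1 * (rng.normal(size=N) + 1j * rng.normal(size=N))
b = 0.1 * (rng.normal(size=M) + 1j * rng.normal(size=M))
W = 0.1 * (rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N)))
sigma = np.array([1, -1, 1, 1])

amplitude = np.exp(log_psi(sigma, a, b, W))
reference = psi_brute_force(sigma, a, b, W)
print(amplitude, reference)
```

Working with the logarithm of the amplitude is a common numerical choice because the product over hidden units can otherwise overflow for larger networks; the cost of one evaluation scales as N times M rather than exponentially in M.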

Fig. 1 Artificial neural network encoding a many-body quantum state of N spins.

A restricted Boltzmann machine architecture that features a set of N visible artificial neurons (yellow dots) and a set of M hidden neurons (gray dots) is shown. For each value of the many-body spin configuration $\mathcal{S} = (\sigma^z_1, \sigma^z_2, \ldots, \sigma^z_N)$, the artificial neural network computes the value of the wave function $\Psi(\mathcal{S})$.

The mathematical foundations for the ability of NQS to describe intricate many-body wave functions are the established representability theorems (27–29), which guarantee the existence of network approximates of sufficiently smooth and regular high-dimensional functions. If these conditions are satisfied by the many-body wave function, we can reasonably expect the NQS form to be a sensible choice. One of the practical advantages of this representation is that its quality can, in principle, be systematically improved by increasing the number of hidden variables. The number M (or, equivalently, the density α = M/N) then plays a role analogous to the bond dimension for the MPS. However, the correlations induced by the hidden units are intrinsically nonlocal in space and are therefore well suited to describe quantum systems in arbitrary dimension. Another convenient point of the NQS representation is that it can be formulated in a way that conserves some specific symmetries. For example, lattice translation symmetry can be used to reduce the number of variational parameters of the NQS ansatz, in the spirit of shift-invariant RBMs (30, 31). Concretely, for integer hidden-variable density α = 1, 2, …, the weight matrix takes the form of feature filters $W_j^{(f)}$ for $f \in [1, \alpha]$. These filters have a total of $\alpha N$ variational elements in lieu of the $\alpha N^2$ elements of the asymmetric case (see supplementary materials).
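The parameter reduction from translation symmetry can be illustrated with a short sketch for a 1D chain with periodic boundaries. It assumes that each of the α filters is shared by the N hidden units obtained from its cyclic shifts, with one hidden bias per filter and a single visible bias; the details in the supplementary materials may differ, and all names and sizes below are hypothetical.

```python
import numpy as np

def expand_filters(filters):
    """Build the (alpha*N, N) weight matrix from alpha filters of length N.

    Row (f, t) is filter f cyclically shifted by t sites.
    """
    alpha, N = filters.shape
    return np.array([np.roll(filters[f], t) for f in range(alpha) for t in range(N)])

def log_psi_symmetric(sigma, a, b, filters):
    """Log-amplitude of a translation-symmetric RBM (hidden spins traced out)."""
    alpha, N = filters.shape
    W = expand_filters(filters)
    theta = np.repeat(b, N) + W @ sigma       # one hidden bias per filter (assumption)
    return a * np.sum(sigma) + np.sum(np.log(2.0 * np.cosh(theta)))

rng = np.random.default_rng(0)
N, alpha = 8, 2                               # toy sizes for illustration
a = 0.1 + 0.05j                               # single shared visible bias (assumption)
b = 0.1 * (rng.normal(size=alpha) + 1j * rng.normal(size=alpha))
filters = 0.1 * (rng.normal(size=(alpha, N)) + 1j * rng.normal(size=(alpha, N)))

sigma = rng.choice([-1, 1], size=N)
shifted = np.roll(sigma, 3)                   # the same configuration translated by 3 sites

n_symmetric = filters.size                    # alpha * N weights
n_full = expand_filters(filters).size         # alpha * N^2 weights
print(n_symmetric, n_full)
print(log_psi_symmetric(sigma, a, b, filters), log_psi_symmetric(shifted, a, b, filters))
```

Translating the input only permutes the hidden-unit fields among the shifted copies of each filter, so the product over hidden units, and hence the amplitude, is unchanged.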

Given a general expression for the quantum many-body state, we are now left with the task of solving the many-body problem by using machine learning to optimize the network parameters $\mathcal{W}$. In the most interesting applications, the exact many-body state is unknown, and it is typically found upon solving either the static Schrödinger equation $\mathcal{H}|\Psi\rangle = E|\Psi\rangle$ or the time-dependent one $\mathcal{H}|\Psi(t)\rangle = i\frac{d}{dt}|\Psi(t)\rangle$ for a given Hamiltonian $\mathcal{H}$. In the absence of samples drawn according to the exact wave function, supervised learning of $\Psi$ is therefore not a viable option. Instead, we derive a consistent reinforcement learning approach in which either the ground-state wave function or the time-dependent one is learned on the basis of feedback from variational principles.

Ground state

To demonstrate the accuracy of the NQS in the description of complex many-body quantum states, we first focus on the goal of finding the best neural-network representation of the unknown ground state of a given Hamiltonian $\mathcal{H}$. In this context, reinforcement learning is realized through minimization of the expectation value of the energy $E(\mathcal{W}) = \langle\Psi_M|\mathcal{H}|\Psi_M\rangle / \langle\Psi_M|\Psi_M\rangle$ with respect to the network weights $\mathcal{W}$. In the stochastic setting, this is achieved with an iterative scheme. At each iteration k, a Monte Carlo sampling of $|\Psi_M(\mathcal{S}; \mathcal{W}_k)|^2$ is realized for a given set of parameters $\mathcal{W}_k$. At the same time, stochastic estimates of the energy gradient are obtained. These are then used to propose a next set of weights $\mathcal{W}_{k+1}$ with an improved gradient-descent optimization (32). The overall computational cost of this approach is comparable to that of standard ground-state QMC simulations (see supplementary materials).
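The sampling step of such a scheme can be sketched with a plain single-spin-flip Metropolis algorithm that draws configurations from the squared modulus of an RBM amplitude. This is an illustration under stated assumptions (toy sizes, random weights, single-spin-flip proposals, hypothetical function names), not the sampler used in the paper. For a system this small, the sampled histogram can be compared with the exact distribution obtained by enumeration.

```python
import itertools
import numpy as np

def log_psi(sigma, a, b, W):
    """Log of the RBM amplitude with hidden spins traced out."""
    theta = b + W @ sigma
    return a @ sigma + np.sum(np.log(2.0 * np.cosh(theta)))

def metropolis_samples(a, b, W, n_samples, rng, n_burn=500):
    """Sample spin configurations from |Psi(S)|^2 with single-spin-flip moves."""
    N = len(a)
    sigma = rng.choice([-1, 1], size=N)
    log_p = 2.0 * np.real(log_psi(sigma, a, b, W))      # log |Psi|^2
    samples = np.empty((n_samples, N), dtype=int)
    for step in range(n_burn + n_samples):
        j = rng.integers(N)
        sigma[j] *= -1                                  # propose flipping spin j
        log_p_new = 2.0 * np.real(log_psi(sigma, a, b, W))
        if np.log(rng.random()) < log_p_new - log_p:
            log_p = log_p_new                           # accept the move
        else:
            sigma[j] *= -1                              # reject: undo the flip
        if step >= n_burn:
            samples[step - n_burn] = sigma
    return samples

rng = np.random.default_rng(1)
N, M = 4, 4                                             # toy sizes for illustration
a = 0.3 * (rng.normal(size=N) + 1j * rng.normal(size=N))
b = 0.3 * (rng.normal(size=M) + 1j * rng.normal(size=M))
W = 0.3 * (rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N)))
samples = metropolis_samples(a, b, W, 40000, rng)

# Exact |Psi|^2 by enumerating all 2^N configurations (feasible only for tiny N).
configs = np.array(list(itertools.product([-1, 1], repeat=N)))
p_exact = np.array([np.exp(2.0 * np.real(log_psi(s, a, b, W))) for s in configs])
p_exact /= p_exact.sum()

# Map each sample to its row index in `configs` and build the empirical histogram.
index = ((samples + 1) // 2) @ (2 ** np.arange(N - 1, -1, -1))
p_mc = np.bincount(index, minlength=2 ** N) / len(samples)
tv_distance = 0.5 * np.abs(p_mc - p_exact).sum()
print(tv_distance)
```

Only the ratio of squared amplitudes enters the acceptance test, so the normalization of the wave function is never needed.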

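To validate our scheme, we consider the problem of finding the ground state of two prototypical spin models, the transverse-field Ising (TFI) model and the antiferromagnetic Heisenberg (AFH) model. Their Hamiltonians are

$$\mathcal{H}_{\mathrm{TFI}} = -h \sum_i \sigma^x_i - \sum_{\langle i,j \rangle} \sigma^z_i \sigma^z_j$$

and

$$\mathcal{H}_{\mathrm{AFH}} = \sum_{\langle i,j \rangle} \left( \sigma^x_i \sigma^x_j + \sigma^y_i \sigma^y_j + \sigma^z_i \sigma^z_j \right),$$

respectively, where $\sigma^x$, $\sigma^y$, $\sigma^z$ are Pauli matrices.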
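For a chain short enough to enumerate, the variational principle can be checked directly. The sketch below assumes the TFI Hamiltonian $-h\sum_j \sigma^x_j - \sum_j \sigma^z_j \sigma^z_{j+1}$ on a 6-site ring, computes the RBM energy exactly instead of by Monte Carlo, and uses plain gradient descent rather than the improved optimization scheme cited in the text. The step size, sizes, and function names are illustrative choices.

```python
import itertools
import numpy as np

def tfi_hamiltonian(N, h):
    """Dense TFI Hamiltonian on a 1D ring in the sigma^z basis."""
    configs = np.array(list(itertools.product([1, -1], repeat=N)))
    H = np.zeros((2 ** N, 2 ** N))
    for s, sigma in enumerate(configs):
        H[s, s] = -np.sum(sigma * np.roll(sigma, -1))   # Ising bonds with PBCs
        for j in range(N):
            H[s, s ^ (1 << (N - 1 - j))] -= h           # sigma^x_j flips spin j
    return configs, H

def energy_and_force(params, configs, H, N, M):
    """Exact variational energy and its gradient with respect to conj(params)."""
    a, b, W = params[:N], params[N:N + M], params[N + M:].reshape(M, N)
    theta = b + configs @ W.T                           # shape (2^N, M)
    psi = np.exp(configs @ a + np.sum(np.log(2.0 * np.cosh(theta)), axis=1))
    prob = np.abs(psi) ** 2
    prob /= prob.sum()
    e_loc = (H @ psi) / psi                             # local energies
    energy = np.real(prob @ e_loc)
    t = np.tanh(theta)
    # Log-derivatives of Psi with respect to a_j, b_i and W_ij, in that order.
    O = np.concatenate(
        [configs, t, (t[:, :, None] * configs[:, None, :]).reshape(len(configs), -1)],
        axis=1)
    O_centered = O - prob @ O
    force = O_centered.conj().T @ (prob * e_loc)
    return energy, force

rng = np.random.default_rng(0)
N, M, h = 6, 6, 1.0                                     # toy sizes, alpha = 1
configs, H = tfi_hamiltonian(N, h)
e_exact = np.linalg.eigvalsh(H)[0]                      # exact ground-state energy

n_params = N + M + N * M
params = 0.1 * (rng.normal(size=n_params) + 1j * rng.normal(size=n_params))
e_initial, _ = energy_and_force(params, configs, H, N, M)
for _ in range(500):
    _, force = energy_and_force(params, configs, H, N, M)
    params = params - 0.01 * force                      # plain gradient-descent step
e_final, _ = energy_and_force(params, configs, H, N, M)
print(e_initial, e_final, e_exact, (e_final - e_exact) / abs(e_exact))
```

Because the energy is evaluated exactly here, the result can never fall below the exact ground-state energy; in the stochastic setting the same averages are replaced by Monte Carlo estimates over sampled configurations.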

In the following, we consider the case of both one- and two-dimensional (1D and 2D) lattices with periodic boundary conditions (PBCs). In Fig. 2, we show the optimal network structure of the ground states of the two spin models for hidden-variable density α = 4 and with imposed translational symmetries. We find that each filter $W_j^{(f)}$ learns specific correlation features emerging in the ground-state wave function. For example, in the 2D case (Fig. 2, rightmost panels) the neural network learns patterns corresponding to antiferromagnetic correlations. The general behavior of the NQS is completely analogous to that observed in convolutional neural networks, where different layers learn specific structures of the input data.

Fig. 2 Neural-network representation of the many-body ground states.

Results for prototypical spin models in one and two dimensions are shown. In the top group of panels, we show the feature maps for the 1D transverse-field Ising (TFI) model at the critical point h = 1, as well as for the antiferromagnetic Heisenberg (AFH) model. In both cases, the hidden-unit density is α = 4 and the lattices comprise 80 sites. Each horizontal colormap shows the values that the fth feature map $W_j^{(f)}$ takes on the jth lattice site (horizontal axis, broadened along the vertical direction for clarity). In the bottom group of panels, we show the feature maps for the 2D Heisenberg model on a square lattice, for α = 16. In this case, the horizontal (or vertical) axis of the colormaps corresponds to the x (or y) coordinates on a 10-by-10 square lattice. Each of the feature maps acts as an effective filter on the spin configurations, capturing the most important quantum correlations.

In Fig. 3, we show the accuracy of the NQS, quantified by the relative error on the ground-state energy $\epsilon_{\mathrm{rel}} = (E_{\mathrm{NQS}}(\alpha) - E_{\mathrm{exact}}) / |E_{\mathrm{exact}}|$, for several values of α and model parameters. In Fig. 3A, we compare the variational NQS energies with the exact result obtained by the fermionization of the TFI model, on a 1D chain with PBCs. The most notable result is that NQS achieve a controllable and arbitrary accuracy that is compatible with a power-law behavior in α. The hardest-to-learn ground state is at the quantum critical point h = 1, where nonetheless a notable accuracy of one part per million can be easily achieved with a relatively modest density of hidden units. The same accuracy is obtained for the more complex 1D AFH model (Fig. 3B). In this case, we also observe a systematic drop in the ground-state energy error, which, for a small α = 4, attains the same high precision obtained for the TFI model at the critical point. The accuracy of our model is several orders of magnitude higher than the spin-Jastrow ansatz (dashed line in Fig. 3B). It is also interesting to compare the value of α with the MPS bond dimension M needed to reach the same level of accuracy. For example, on the AFH model with PBCs, we find that with a standard density matrix renormalization group (DMRG) implementation (33), we need M ~ 160 to reach the accuracy NQS have at α = 4. This points toward a more compact representation of the many-body state in the NQS case, which features about three orders of magnitude fewer variational parameters than the corresponding MPS ansatz.

Fig. 3 Finding the many-body ground-state energy with neural-network quantum states (NQS).

The error of the NQS ground-state energy relative to the exact value is shown for several test cases. Arbitrary precision on the ground-state energy can be obtained upon increasing the hidden-unit density α. (A) Accuracy for the 1D TFI model, at a few values of the field strength h and for an 80-spin chain with periodic boundary conditions (PBCs). Points below $10^{-8}$ are not shown to enhance readability. (B) Accuracy for the 1D AFH model, for an 80-spin chain with PBCs, compared with the Jastrow ansatz (horizontal dashed line). (C) Accuracy for the AFH model on a 10-by-10 square lattice with PBCs, compared with the precision obtained by EPS [upper dashed line (35)] and PEPS [lower dashed line (36)]. For all cases considered here, the NQS approach reaches MPS-grade accuracies in one dimension and systematically improves the best known variational states for 2D finite lattice systems.

We next studied the AFH model on a 2D square lattice (for a comparison with QMC results, see Fig. 3C) (34). As expected from entanglement considerations, the 2D case proves harder for the NQS. Nonetheless, we always find a systematic improvement of the variational energy upon increasing α, qualitatively similar to the 1D case. The increased difficulty of the problem is reflected in a slower convergence. We still obtain results at the level of existing state-of-the-art methods or better. In particular, with a relatively small hidden-unit density (α ~ 4), we already obtain results at the same level as the best-known variational ansatz for finite clusters [the entangled plaquette states (EPS) of (35) and the projected entangled pair states (PEPS) of (36)]. Further increasing α then leads to a sizable improvement and, consequently, yields the best variational results reported to date for this 2D model on finite lattices.

Unitary dynamics

NQS are not limited to ground-state problems but can be extended to the time-dependent Schrödinger equation. For this purpose, we define complex-valued and time-dependent network weights $\mathcal{W}(t)$ that, at each time t, are trained to best reproduce the quantum dynamics, in the sense of the Dirac-Frenkel time-dependent variational principle (37, 38). In this context, the variational residuals $R(t; \dot{\mathcal{W}}(t)) = \mathrm{dist}(\partial_t \Psi(\mathcal{W}(t)), -i\mathcal{H}\Psi)$ are the objective functions to be minimized as a function of the time derivatives of the weights $\dot{\mathcal{W}}(t)$ (see supplementary materials). In the stochastic framework, this is achieved by a time-dependent VMC method (25, 26), which samples $|\Psi_M(\mathcal{S}; \mathcal{W}(t))|^2$ at each time and provides the best stochastic estimate of the $\dot{\mathcal{W}}(t)$ that minimizes $R^2(t)$, with a computational cost $\mathcal{O}(\alpha N^2)$. Once the time derivatives are determined, these can be conveniently used to obtain the full time evolution after time integration.
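A common way to write the outcome of this minimization, used in the time-dependent VMC literature and assumed here rather than taken from the text, is the linear system $S \dot{\mathcal{W}} = -iF$, where $S$ is the covariance matrix of the log-derivatives of the wave function and $F$ is their covariance with the local energy. The sketch below builds both by exact enumeration for a small TFI ring and takes one explicit Euler step; the diagonal shift that regularizes $S$, the toy sizes, and all names are illustrative assumptions.

```python
import itertools
import numpy as np

def tfi_hamiltonian(N, h):
    """Dense TFI Hamiltonian on a 1D ring in the sigma^z basis."""
    configs = np.array(list(itertools.product([1, -1], repeat=N)))
    H = np.zeros((2 ** N, 2 ** N))
    for s, sigma in enumerate(configs):
        H[s, s] = -np.sum(sigma * np.roll(sigma, -1))
        for j in range(N):
            H[s, s ^ (1 << (N - 1 - j))] -= h
    return configs, H

def tdvp_quantities(params, configs, H, N, M):
    """Covariance matrix S, force vector F and energy for an RBM state."""
    a, b, W = params[:N], params[N:N + M], params[N + M:].reshape(M, N)
    theta = b + configs @ W.T
    psi = np.exp(configs @ a + np.sum(np.log(2.0 * np.cosh(theta)), axis=1))
    prob = np.abs(psi) ** 2 / np.sum(np.abs(psi) ** 2)
    e_loc = (H @ psi) / psi
    t = np.tanh(theta)
    O = np.concatenate(
        [configs, t, (t[:, :, None] * configs[:, None, :]).reshape(len(configs), -1)],
        axis=1)
    O_c = O - prob @ O                                  # centered log-derivatives
    S = O_c.conj().T @ (prob[:, None] * O_c)
    F = O_c.conj().T @ (prob * e_loc)
    return S, F, np.real(prob @ e_loc)

def weight_velocity(params, configs, H, N, M, shift=1e-2):
    """Solve (S + shift * I) W_dot = -i F; the shift is a regularization assumption."""
    S, F, _ = tdvp_quantities(params, configs, H, N, M)
    return -1j * np.linalg.solve(S + shift * np.eye(len(F)), F)

rng = np.random.default_rng(0)
N, M, h_final = 6, 2, 0.5                               # toy quench parameters
configs, H = tfi_hamiltonian(N, h_final)
n_params = N + M + N * M
params = 0.1 * (rng.normal(size=n_params) + 1j * rng.normal(size=n_params))

w_dot = weight_velocity(params, configs, H, N, M)
dt = 1e-3
params_next = params + dt * w_dot                       # one explicit Euler step

# First-order rate of change of the energy along the flow; it should vanish.
S, F, energy = tdvp_quantities(params, configs, H, N, M)
energy_rate = 2.0 * np.real(np.vdot(F, w_dot))
print(energy, energy_rate)
```

The vanishing first-order energy drift reflects the fact that unitary dynamics conserves the energy. In practice a higher-order integrator and Monte Carlo estimates of $S$ and $F$ would replace the Euler step and the exact enumeration used here.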

To demonstrate the effectiveness of the NQS in the dynamical context, we consider the unitary dynamics induced by quantum quenches in the coupling constants of our spin models. In the TFI model, we induce nontrivial quantum dynamics by means of an instantaneous change in the transverse field: The system is initially prepared in the ground state of the TFI model for some transverse field $h_i$ and then evolves under the action of the TFI Hamiltonian with a transverse field $h_f \neq h_i$. We compare our results with the analytical solution obtained from fermionization of the TFI model for a 1D chain with PBCs. In Fig. 4A, the exact results for the time-dependent transverse spin polarization are compared to NQS with α = 4. In the AFH model (Fig. 4B), we study quantum quenches in the longitudinal coupling $J_z$ and monitor the time evolution of the nearest-neighbors correlations. Our results for the time evolution (with α = 4) are compared with the numerically exact MPS dynamics (39–41) for a system with open boundaries (Fig. 4B).

Fig. 4 Many-body unitary time evolution with NQS.

NQS results (solid lines) for the time evolution induced by a quantum quench in the microscopic parameters of the models we study (the transverse field h for the TFI model and the coupling constant $J_z$ in the AFH model) are shown. (A) Time-dependent transverse spin polarization in the TFI model, compared to exact results (dashed lines). (B) Time-dependent nearest-neighbors spin correlations in the AFH model, compared to exact numerical results obtained with t-DMRG (dashed lines). All results refer to 1D chains representative of the thermodynamic limit, with finite-size corrections smaller than the line widths.

The high accuracy also obtained for the unitary dynamics further confirms that neural-network–based approaches can be successfully used to solve the quantum many-body problem, not only for ground-state properties but also for modeling the evolution induced by a complex set of excited quantum states.


Variational quantum states based on artificial neural networks can be used to efficiently capture the complexity of entangled many-body systems in both one and two dimensions. Despite the simplicity of the restricted Boltzmann machines used here, very accurate results for both ground-state and dynamical properties of prototypical spin models can be readily obtained. Many paths for research can be envisaged in the near future. For example, the most recent advances in machine learning, like deep network architectures and convolutional neural networks, can constitute the basis of more advanced NQS and therefore have the potential for increasing their expressive power. Furthermore, the extension of our approach to treat quantum systems other than interacting spins is, in principle, straightforward. In this respect, applications to answer the most challenging questions concerning interacting fermions in two dimensions can already be anticipated. Finally, at variance with tensor network states, the NQS feature intrinsically nonlocal correlations, which can lead to substantially more compact representations of many-body quantum states. A formal analysis of the NQS entanglement properties might therefore bring about substantially new concepts in quantum information theory.

Supplementary Materials

Materials and Methods

Supplementary Text

Figs. S1 and S2

References (43–45)

Code and Data Files

References and Notes

Acknowledgments: We acknowledge discussions with F. Becca, J. F. Carrasquilla, M. Dolfi, J. Osorio, D. Patané, and S. Sorella. The time-dependent MPS results have been obtained with the open-source implementation available as a part of the Algorithms and Libraries for Physics Simulations (ALPS) project (33, 42). This work was supported by the European Research Council (ERC) through ERC Advanced Grant SIMCOFE, by the Swiss National Science Foundation through National Center of Competence in Research Quantum Science and Technology (QSIT), and by Microsoft Research. This paper is based on work supported in part by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA) via Massachusetts Institute of Technology Lincoln Laboratory Air Force contract no. FA8721-05-C-0002. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of ODNI, IARPA, or the U.S. government. The U.S. government is authorized to reproduce and distribute reprints for governmental purposes, notwithstanding any copyright annotation thereon. The authors agree to making the code used in this paper available upon reasonable request.