## Keeping the lid on infection spread

From February to April 2020, many countries introduced variations on social distancing measures to slow the ravages of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Publicly available data show that Germany has been particularly successful in minimizing death rates. Dehning *et al.* quantified three governmental interventions introduced to control the outbreak. The authors predicted that the third governmental intervention—a strict contact ban since 22 March—switched incidence from growth to decay. They emphasize that relaxation of controls must be done carefully, not only because there is a 2-week lag between a measure being enacted and the effect on case reports but also because the three measures used in Germany only just kept virus spread below the growth threshold.

*Science*, this issue p. eabb9789

## Structured Abstract

### INTRODUCTION

When faced with the outbreak of a novel epidemic such as coronavirus disease 2019 (COVID-19), rapid response measures are required by individuals, as well as by society as a whole, to mitigate the spread of the virus. During this initial, time-critical period, neither the central epidemiological parameters nor the effectiveness of interventions such as cancellation of public events, school closings, or social distancing is known.

### RATIONALE

As one of the key epidemiological parameters, we inferred the spreading rate λ from confirmed SARS-CoV-2 infections using the example of Germany. We apply Bayesian inference based on Markov chain Monte Carlo sampling to a class of compartmental models [susceptible-infected-recovered (SIR)]. Our analysis characterizes the temporal change of the spreading rate and allows us to identify potential change points. Furthermore, it enables short-term forecast scenarios that assume various degrees of social distancing. A detailed description is provided in the accompanying paper, and the models, inference, and forecasts are available on GitHub (https://github.com/Priesemann-Group/covid19_inference_forecast). Although we apply the model to Germany, our approach can be readily adapted to other countries or regions.

### RESULTS

In Germany, interventions to contain the COVID-19 outbreak were implemented in three steps over 3 weeks: (i) Around 9 March 2020, large public events such as soccer matches were canceled; (ii) around 16 March 2020, schools, childcare facilities, and many stores were closed; and (iii) on 23 March 2020, a far-reaching contact ban (*Kontaktsperre*) was imposed by government authorities; this included the prohibition of even small public gatherings as well as the closing of restaurants and all nonessential stores.

From the observed case numbers of COVID-19, we can quantify the impact of these measures on the disease spread using change point analysis. Essentially, we find that at each change point the spreading rate λ decreased by ~40%. At the first change point, assumed around 9 March 2020, λ decreased from 0.43 to 0.25, with 95% credible intervals (CIs) of [0.35, 0.51] and [0.20, 0.30], respectively. At the second change point, assumed around 16 March 2020, λ decreased to 0.15 (CI [0.12, 0.20]). Both changes in λ slowed the spread of the virus but still implied exponential growth (see red and orange traces in the figure).

To contain the disease spread, i.e., to turn exponential growth into a decline of new cases, the spreading rate has to be smaller than the recovery rate μ = 0.13 (CI [0.09, 0.18]). This critical transition was reached with the third change point, which resulted in λ = 0.09 (CI [0.06, 0.13]; see blue trace in the figure), assumed around 23 March 2020.

From the peak position of daily new cases, one could conclude that the transition from growth to decline was already reached at the end of March. However, the observed transient decline can be explained by a short-term effect that originates from a sudden change in the spreading rate (see Fig. 2C in the main text).

As long as interventions and the concurrent individual behavior frequently change the spreading rate, reliable short- and long-term forecasts are very difficult. As the figure shows, the three example scenarios (representing the effects up to the first, second, and third change point) quickly diverge from each other and, consequently, span a considerable range of future case numbers.

Inference and subsequent forecasts are further complicated by the delay of ~2 weeks between an intervention and the first useful estimates of the new λ (which are derived from the reported case numbers). Because of this delay, any uncertainty in the magnitude of social distancing in the previous 2 weeks can have a major impact on the case numbers in the subsequent 2 weeks. Beyond 2 weeks, the case numbers depend on our future behavior, for which we must make explicit assumptions. In sum, future interventions (such as lifting restrictions) should be implemented cautiously to respect the delayed visibility of their effects.

### CONCLUSION

We developed a Bayesian framework for the spread of COVID-19 to infer central epidemiological parameters and the timing and magnitude of intervention effects. With such an approach, the effects of interventions can be assessed in a timely manner. Future interventions and lifting of restrictions can be modeled as additional change points, enabling short-term forecasts for case numbers. In general, our approach may help to infer the efficiency of measures taken in other countries and inform policy-makers about tightening, loosening, and selecting appropriate measures for containment of COVID-19.

## Abstract

As coronavirus disease 2019 (COVID-19) is rapidly spreading across the globe, short-term modeling forecasts provide time-critical information for decisions on containment and mitigation strategies. A major challenge for short-term forecasts is the assessment of key epidemiological parameters and how they change when first interventions show an effect. By combining an established epidemiological model with Bayesian inference, we analyzed the time dependence of the effective growth rate of new infections. Focusing on COVID-19 spread in Germany, we detected change points in the effective growth rate that correlate well with the times of publicly announced interventions. Thereby, we could quantify the effect of interventions and incorporate the corresponding change points into forecasts of future scenarios and case numbers. Our code is freely available and can be readily adapted to any country or region.

During the initial outbreak of an epidemic, reliable short-term forecasts are key to estimate medical requirements and capacities and to inform and advise the public and decision makers (*1*). During this initial phase, three tasks are important to provide time-critical information for crisis mitigation: (i) establishing central epidemiological parameters, such as the basic reproduction number, that can be used for short-term forecasting; (ii) simulating the effects of different possible interventions aimed at the mitigation of the outbreak; and (iii) estimating the actual effects of the measures taken not only to make rapid adjustments but also to adapt short-term forecasts. Addressing these tasks is challenging because of the large statistical and systematic errors that occur during the initial stages of an epidemic, when case numbers are low. This is further complicated by the fact that mitigation measures are taken rapidly while the outbreak unfolds, but they take effect only after an unknown delay. To obtain reasonable parameter estimates for short-term forecasting and policy evaluation despite these complications, any prior knowledge available needs to be integrated into modeling efforts to reduce uncertainties. This includes knowledge about basic mechanisms of disease transmission, recovery, and preliminary estimates of epidemiological parameters from other countries or from closely related pathogens. The integration of prior knowledge, the quantitative assessment of the remaining uncertainties about epidemiological parameters, and the principled propagation of these uncertainties into forecasts is the domain of Bayesian modeling and inference (*2*, *3*).

We draw on an established class of models for epidemic outbreaks: The susceptible-infected-recovered (SIR) model (*4*–*7*) specifies population compartments and the rates at which they change (susceptible people becoming infectious and infectious people recovering). This simple model can be formulated in terms of coupled ordinary differential equations (in mean field), which enable analytical treatment (*8*, *9*) or fast evaluation (ideally suited for Bayesian inference). Accordingly, SIR-like models have been used to model epidemic spreads, from Bayesian Markov chain Monte Carlo (MCMC) parameter estimation (*10*–*12*) to detailed scenario discussions (*13*–*16*). This family of models has played a dominant role in the analyses of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic, from inference (*17*–*19*) through scenario forecasting (*20*–*27*) to control strategies (*28*, *29*).

Here, we combined the SIR model (and generalizations thereof) with Bayesian parameter inference and augmented the model with a time-dependent spreading rate. The time dependence was implemented as potential change points in the spreading rate, which we assume to be driven by governmental interventions and the associated change of individual behavior (nonpharmaceutical interventions). On the basis of three distinct measures taken in Germany, we detected three corresponding change points in the reported COVID-19 case numbers. By 1 April 2020, we had reported evidence for the first two change points and predicted the third (*30*). Now, with data until 21 April 2020, we have evidence for all three change points. First, the spreading rate decreased from 0.43 (with 95% credible interval, CI [0.35,0.51]) to 0.25 (CI [0.20,0.30]). This inferred decrease was initiated around 7 March 2020 (CI [3,10]) and matches the timing of cancellation of large public events such as trade fairs and soccer matches. Second, the spreading rate decreased further to 0.15 (CI [0.12,0.20]). This decrease was initiated around 16 March 2020 (CI [14,18]) and matches closure of schools, childcare facilities, and nonessential stores. Third, the spreading rate decreased further to 0.09 (CI [0.06,0.13]). This decrease was initiated around 24 March 2020 (CI [21,26]) and matches the strict contact ban (including the closing of all restaurants and nonessential stores), which was announced on 22 March 2020. Whereas already the first two change points strongly slowed the spread of the virus, the third change point can be associated with the start of a sustained decline in daily new cases.

Our framework is designed to infer the effectiveness of past measures and to explore potential future scenarios, along with propagating the respective uncertainties. In the following, we demonstrate the potential impact of timing and magnitude of change points and report our inference about the three past governmental interventions in Germany. Our framework can be readily adapted to any other country or region. The code (already including data sources from many other countries) and figures are available on GitHub (*31*).

## Basic inference of central epidemiological parameters during the initial phase of the COVID-19 outbreak in Germany

To assess the general effect of different possible interventions on the spread of COVID-19 in Germany, we first focused on the initial phase of the outbreak, when no serious mitigation measures had been implemented. In the absence of interventions, an epidemic outbreak can be described by SIR models with a constant spreading rate (see the materials and methods). In Germany, the first serious interventions occurred around 9 March 2020 and affected the case reports after an observation delay [a combination of incubation period with a median of 5 to 6 days (*32*)] and a test delay (the time until a doctor is visited plus test evaluation time, which we assumed to be ~2 to 3 days, combined). Thus, to infer central epidemiological parameters, we considered the initial phase to be 2 March 2020 to 15 March 2020. The central epidemiological parameters estimated here will also be estimated under the full model with change points on the data records up to 21 April 2020, allowing for a consistency check.

We performed Bayesian inference for the central epidemiological parameters of an SIR model using MCMC sampling (Fig. 1). The central parameters are the spreading rate λ, the recovery rate μ, the reporting delay *D*, and the number of initially infected people *I*_{0}. We chose informative priors based on available knowledge for λ, μ, and *D*, and we chose uninformative priors for the remaining parameters (see the materials and methods). Also, we intentionally kept the informative priors as broad as possible so that the data would constrain the parameters (Fig. 1).

As median estimates, we obtained λ = 0.41, μ = 0.12, *D* = 8.6, and *I*_{0} = 19 (see Fig. 1, C to H, for the posterior distributions and the 95% CIs). Converting to the basic reproduction number *R*_{0} = λ/μ, we found a median *R*_{0} = 3.4 (CI [2.4,4.7]), which is consistent with previous reports finding median values between 2.3 and 3.3 (*18*, *33*, *34*). Overall, the model exhibited good agreement with new cases (Fig. 1A) and cumulative cases (Fig. 1B), both of which show the expected exponential growth (linear in log-lin plot). The observed data were clearly informative about λ, *I*_{0}, and σ, as indicated by the difference between the priors (gray line) and posteriors (histograms) in Fig. 1, D to F. However, μ and *D* were largely determined by our prior choice of parameters (histograms match the gray line in Fig. 1, C and H). This is to be expected for the initial phase of an epidemic outbreak, which is dominated by exponential growth.

To quantify the impact of possible interventions, we concentrated on the effective growth of active infections before and after the intervention. As long as the number of infections and recoveries are small compared with the population size, the number of active infections can be approximated by an exponential growth with effective growth rate λ* = λ – μ (see the materials and methods). As a consequence, λ and μ cannot be estimated independently. This was further supported by a systematic scan of the model’s log-likelihood in the (λ – μ) space, which showed an equipotential line for the maximum likelihood (Fig. 1I). This strongly suggests that the growth rate λ* is the relevant free parameter with a median λ* = 28% (Fig. 1G). The control parameter of the dynamics in the exponential phase is thus the (effective) growth rate: If the growth rate is larger than zero (λ > μ), then case numbers grow exponentially; if, however, the growth rate is smaller than zero (λ < μ), then recovery dominates and new cases decrease. The two different dynamics (supercritical and subcritical, respectively) are separated by a critical point at λ* = 0 (λ = μ) (*35*).

## Magnitude and timing of interventions matter for the mitigation of the outbreak

To simulate the effect of different possible interventions, we modeled the effects of interventions as change points in the spreading rate (see the materials and methods). We considered different, hypothetical interventions after the initial phase to show that both the amount of change in behavior (leading to a change in spreading rate λ; Fig. 2A) and the exact timing of the change (Fig. 2B) determine future development. Hypothetical interventions build on the inferred parameters from the initial phase (in particular median λ_{0} = 0.41 and median μ = 0.12; Fig. 1) and were implemented as change points in the spreading rate from the inferred λ_{0} to a new value λ_{1}. With such a change point, we modeled three potential scenarios of public behavior.

### (i) No social distancing

Public behavior is unaltered and spread continues with the inferred rate (λ_{1} = λ_{0} with median λ_{1} = 0.41 > μ).

### (ii) Mild social distancing

The spreading rate decreases to 50% (λ_{1} = λ_{0}/2 with median λ_{1} = 0.21 > μ). Although people effectively reduce the number of contacts by a factor of two in this second scenario, the total number of reported cases continues to grow alongside scenario (i) for the time period of the reporting delay *D* (median *D* = 8.6 from initial phase; see below for a more constrained estimation). Also, we still observe an exponential increase of new infections after the intervention becomes effective, because the growth rate remains positive, λ_{1}* = λ_{1} –μ > 0.

### (iii) Strong social distancing

Here, the spreading rate decreases to 10% (λ_{1} = λ_{0}/10 with median λ_{1} = 0.04 < μ). The assumptions here are that contacts are severely limited, but even when people stay at home as much as possible, some contacts are still unavoidable. Even under such drastic policy changes, no effect is visible until the reporting delay *D* is over. Thereafter, a quick decrease in daily new infections manifests within 2 weeks (delay plus change point duration), and the total number of cases reaches a stable plateau. Only in this scenario (iii) a plateau is reached, because here the growth rate becomes negative, λ_{1}* < 0, which leads to decreasing numbers of new infections.

Furthermore, the timing of an intervention matters: Apart from the strength of an intervention, its onset time has great impact on the total case number (Fig. 2B). For example, focusing on the strong intervention (iii), by which a stable plateau is reached, the effect of advancing or delaying the change point by just five days leads to more than a threefold difference in cumulative cases.

Whereas we found that the timing of an intervention has a great effect on case numbers, the duration over which the change takes place has only minor effect if the intervals of change are centered around the same date. In Fig. 2C, we illustrate the adjustment of λ_{0} → λ_{1} for mild social distancing with durations of 14, 7, and 1 day(s). The change point duration is a simple way to incorporate variability in individual behavior and is not linked to the reporting delay *D*. As an interesting effect, a sudden change in the spreading rate can lead to a temporary decrease of new case numbers despite the fact that the effective growth rate remains positive at all times.

## Change point detection for the spread of COVID-19 in Germany

To model real-world data, we further refined the SIR model. We accounted for systematic variations of case reports throughout the week (lower case numbers toward the weekend) by explicitly modeling a weekly reporting modulation (see the materials and methods). Comparisons confirmed that models with this correction outperformed those without (see table S2). In the supplemental material, we further generalize our model to include an explicit incubation period [as in susceptible-exposed-infectious-recovered (SEIR) models; fig. S3] that yields results consistent with our main model.

We incorporated the effect of nonpharmaceutical interventions into our models by introducing flexible change points in the spreading rate (see the materials and methods). During the COVID-19 outbreak in Germany, governmental interventions occurred in three stages from (i) the cancellation of large events with more than 1000 participants (around 9 March 2020), through (ii) closing of schools, childcare facilities, and most stores (in effect 16 March 2020), to (iii) the contact ban and closing of all nonessential stores (in effect 23 March 2020). The aim of all these interventions was to reduce the (effective) growth rate λ* = λ – μ. As soon as the growth rate becomes notably negative (λ* < 0), the number of daily new cases decreases after the respective reporting delay.

Detecting change points in the spreading rate and quantifying the amount of change as quickly as possible becomes a central modeling challenge when short-term forecasts are required. To address this challenge, we assumed an initial spreading rate λ_{0} (the exponential growth phase; Fig. 1) and up to three potential change points motivated by German governmental interventions: In our modeling, the first change point (λ_{0} → λ_{1}) is expected around 9 March 2020 (*t*_{1}) as a result of the official recommendations to cancel large events. A second change point (λ_{1} → λ_{2}) is expected around 16 March 2020 (*t*_{2}), when schools and many stores were closed. A third change point (λ_{2} → λ_{3}) is expected around 23 March 2020 (*t*_{3}), when all nonessential stores were closed and a contact ban was enacted. We modeled the behavioral changes that were introduced at these change points to unfold over a few days (Δ*t _{i}*), but the changes in duration can be partly compensated by changes in the onset time (

*t*) (see Fig. 2C, scenarios). We chose priors for all parameters based on the information available to us up to 28 March 2020 (see the materials and methods). In addition, we performed a sensitivity analysis by using wider priors in the supplemental material (figs. S5 to S7 and table S2), which yielded consistent results. On 28 March 2020, the data were already informative about the first change point, and thereby helped to inform our forecast scenarios.

_{i}The inferred parameters for the models with change points are consistent with the inferred parameters from the exponential onset phase (Figs. 1 and 3 and figs. S1 and S2). In particular, all estimated λ_{0} values from models with multiple change points are compatible with the value of the model without change points (during the exponential onset phase, λ_{0} = 0.41, CI [0.32,0.51], assuming a stationary λ until 15 March 2020; Fig. 1E). Also, the scale factor σ and the number of initial infections *I _{0}* for the models with change points are consistent with the initial model inference during the exponential onset phase (Fig. 1, D to F).

## Models with two or three change points fit the observed data better

The models with three change points describe the data better than models with fewer change points, as indicated by the leave-one-out (LOO) cross-validation–based Bayesian model comparison (*36*) (lowest LOO score in Table 1). However, the LOO scores of the model with two and three change points differ by <1 SE. This originates from an extended duration of the second change point in the two-change-point model, which partially captures the effect of the third intervention. As expected, the models with none or a single change point had LOO scores that were at least 1 SE higher (worse) than those of the best models, and we did not consider them further.

When comparing our inference based on three change points with the number of confirmed cases, we found them to largely match (Fig. 3, B and C). The dominant periodic change in the daily new reported cases (Fig. 3B) was well described by the weekday modulation. In addition to the periodic change, the daily new case numbers also reflect the fairly sudden change of the spreading rate at the change points (compare Fig. 2 and fig. S4 for the effect of change points without the modulation). The cumulative effect of change points manifested in an overarching decay in new case numbers that was visible after 5 April 2020 and followed the third change point (with reporting delay).

## Change point detection quantifies the effect of nonpharmaceutical interventions

Ideally, detected changes can be related to specific mitigation measures so that one gains insights into the effectiveness of different measures (Fig. 3). Our model comparison favored the model with three change points with the following posteriors of the parameters: First, λ(*t*) decreased from λ_{0} = 0.43 (CI [0.35,0.51]) to λ_{1} = 0.25 (CI [0.20,0.30]). The date of the change point was inferred to be 7 March 2020 (CI [3,10]), and this inferred date matches the timing of the first governmental intervention, which included cancellations of large events, as well as increased awareness. After this first intervention, the (effective) growth rate λ*(*t*) = λ(*t*) – μ decreased by more than a factor of 2, from median λ_{0} – μ = 0.3 to median λ_{1} – μ = 0.12, given that the recovery rate was inferred as μ = 0.13 (CI [0.09,0.18]). At the second change point, λ(*t*) decreased from λ_{1} = 0.25 to λ_{2} = 0.15 (CI [0.12,0.20]), which is larger than our prior assumption. The date of this change point was inferred to be 16 March 2020 (CI [14,18]), and this inferred date matches the timing of the second governmental intervention including closing schools and some stores. After this second intervention, the median growth rate became λ*(*t*) = λ_{2} – μ = 0.02 ≈ 0 (CI [0.00, 0.06]) and is thus in the vicinity of the critical point yet still slightly positive. The first two interventions thereby mitigated the spread of COVID-19 in Germany by drastically reducing the growth rate, but did most likely not lead to a sustained decline of new infections. A third change point, when λ(*t*) decreased from λ_{2} = 0.15 to λ_{3} = 0.09 (CI [0.06,0.13]), was inferred on 24 March 2020 (CI [21,26]), and this inferred date matches the timing of the third governmental intervention including contact ban and closing of all nonessential stores. Only after this third intervention did the median (effective) growth rate, λ*(*t*) = λ_{3} – μ = –0.03 < 0 (CI [–0.05,–0.02]), become negative, enabling a sustained decrease in the number of new infections. In summary, we have related the change points to the nonpharmaceutical interventions to quantify their effect.

## Discussion

We have presented a Bayesian approach for monitoring of the effect of nonpharmaceutical interventions on the epidemic outbreak of an infectious disease. Using the example of the COVID-19 outbreak in Germany, we applied this approach to infer the central epidemiological parameters and three change points in the spreading rate from the number of reported cases. We showed that change points in the spreading rate affect the confirmed case numbers with a delay of ~2 weeks (median reporting delay of *D* = 11.4 days plus a median change point duration of 3 days). Thereby, we were able to relate the inferred change points to the three major governmental interventions in Germany: In our model with three change points, we found a clear reduction of the spreading rate related to each governmental intervention and the concurring adaptation of individual behavior (Fig. 3), (i) the cancellation of large events with >1000 participants (around 9 March 2020), (ii) the closing of schools, childcare centers, and most stores (in effect 16 March 2020), and (iii) the contact ban and closing of all nonessential stores (in effect 23 March 2020).

Our results suggest that the full extent of interventions and the concurring adaptation of behavior lead to a swift and sustained decrease of daily new cases. The first two interventions brought a reduction of the growth rate λ* from 30% to 12% and down to 2% (CI [0%, 6%]), respectively. These numbers still implied the possibility of exponential growth. Only with the third intervention did we find that the epidemic changed from growth to decay. However, the decay rate of ~–3% (CI [–5%,–2%]) remained close to zero. Therefore, even a minor increase in the spreading rate may again switch the dynamics to the unstable regime with exponential growth.

We used a formal Bayesian model comparison to validate the presence of change points. Our model comparison ruled out models with fewer than two change points (Table 1 and table S2). Although this may seem trivial, it has important consequences for making short-term forecasts that decision makers rely on. Demonstrating and quantifying the effect of past change points can be used to formulate priors for the effects of future, similar change points. These priors help to project the effects of more recent change points into future forecasts, even when these change points are not yet apparent in the reported case numbers. Consequently, it is important to look out for and identify potential change points as early as possible to incorporate them into forecasts.

The detection of change points and their interpretation depend crucially on an accurate estimate of the reporting delay *D*. Therefore, the validity of its estimate should be evaluated. In our model, *D* contains at least three distinct factors: the biological incubation period (median 5 to 6 days) (*32*), an additional delay from first symptoms to symptoms motivating a test (1 to 3 days), and a possible delay before testing results come in (1 to 4 days). The sum of these delays seems compatible with our inferred median delay of *D* = 11.4 days, especially given the wide range of reported incubation periods.

We chose to keep our main model comparatively simple because of the small number of data points initially available during an epidemic outbreak. With few data points, only a limited number of parameters can be effectively constrained. Therefore, we chose to approximate a time-dependent spreading rate λ(*t*) by using episodes of constant spreading rates λ* _{i}* that are separated by change points where a transition occurs. Although we introduced fairly broad priors on the spreading rates, we obtained comparably narrow posterior distributions for each spreading rate λ

*(Fig. 3). We additionally evaluated extensions of our main model with three change points, e.g., by explicitly taking into account the incubation period (fig. S3). These models yield consistent results for the three change points, and all have LOO scores within 1 SE of each other. Thus, we consider our main model to be sufficient to describe case numbers in Germany at present.*

_{i}Our framework can be easily adapted to other countries and enables incorporation of future developments. For other countries, or for forecasts within smaller communities (e.g., federal states or cities), additional details may become important, such as explicit modeling of incubation time distributions (*17*, *37*) (i.e., as in fig. S3), spatial heterogeneity (*17*, *21*), isolation effects (*20*, *37*), subsampling effects hiding undetected cases even beyond the reporting delay (*38*, *39*), or the age and contact structure of the population (*26*). In countries where major changes in test coverage are expected, this will have to be included as well. The methodology presented here is in principle capable of incorporating such details. It also lends itself to modeling of continuous drifts in the spreading rate, e.g., reflecting reactions of the public to news coverage of a catastrophic situation or people growing tired of mitigation measures. Such further adaptations, however, can only be performed on a per-country basis by experts with an intimate knowledge of the local situation. Our approach provides a solid and extensible base for this. For Germany, several developments that occurred after the time span of the presented analysis should be included in the model. First, people may have transiently changed their behavior over the Easter holidays; second, we expect a series of change points, as well as continuous drifts, with governments trying to ease and calibrate mitigation measures. Third, extensions to hierarchical models will enable regional assessments, e.g., on the level of federal states.

In Germany, following the three major governmental interventions in March, effective growth rates remained close to zero and warranted careful consideration of future measures. With the data available until 21 April 2020, we estimated an effective growth rate of ~–3% for the beginning of April; the growth rate remained close to zero—the threshold between exponential growth or decay. Together with the delay of ~2 weeks between infection and case report, a growth rate close to zero warrants caution in lifting restrictions for two reasons. First, lifting restrictions too much will quickly lead to renewed exponential growth, and second, we would be effectively blind to this worsened situation for nearly 2 weeks, during which time transmission would be uninhibited. This may result in a growth in case numbers beyond the level that the health system can cope with, especially if active cases are not close to zero before lifting restrictions.

In conclusion, our Bayesian approach allows the detection and quantification of the effect of nonpharmaceutical interventions and, combined with potential subsequent interventions, the forecasting of future case number scenarios. Our analysis highlights the importance of precise timing and magnitude of interventions for future case numbers. It also stresses the importance of including the reporting delay *D* between the date of infection and the date of the confirmed case in the model. The reporting delay *D*, together with the time required to implement interventions, means that changes in our behavior today can only be detected in confirmed cases in 2 weeks’ time. This delay, combined with a spreading rate close to zero, indicates that careful planning of future measures is essential.

## Materials and methods

As a basis for our Bayesian inference and the forecast scenarios, we used the differential equations of the well-established SIR model. We also tested the robustness of our results by means of more sophisticated models, in particular an SEIR-like model that explicitly incorporates an incubation period (fig. S3). Although SIR model dynamics are well understood in general, here, our main challenge was to estimate model parameters specifically for the COVID-19 outbreak and to use them for forecasting. To that end, we combined a Bayesian approach to incorporate prior knowledge with MCMC sampling to compute the posterior distribution of the parameters and to sample from it for forecasting. Put simply, we first estimated the parameter distribution that best described the observed situation, and then we used many samples from this parameter distribution to evolve the model equations and thus forecast future developments.

The data used come from the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) dashboard (*40*). The JHU CSSE provides up-to-date data on COVID-19 infections, usually a few days ahead of official German sources. The exact versions of the data and code are available at (*31*). Data were incorporated until 21 April 2020. Note that after this cutoff date, additional modeling of the effects of behavioral changes over the Easter holidays became necessary.

### Simple model: SIR model with stationary spreading rate

We consider a time-discrete version of the standard SIR model. In short, we assume that the disease spreads at rate λ from the infected population compartment (*I*) to the susceptible compartment (*S*), and that the infected population compartment recovers (*R*) at rate μ. This well-established model for disease spreading can be described by the following set of (deterministic) ordinary differential equations [see, e.g., (*5*), (*6*), (*20*)]. Within a population of size *N*,

During the onset phase of an epidemic, only a very small fraction of the population is infected (*I*) or recovered (*R*), and thus *S* ≈ *N* ≫ *I* such that *S*/*N* ≈ 1. Therefore, the differential equation for the infected reduces to a simple linear equation, exhibiting an exponential growth

Because our dataset is discrete in time (Δ*t* = 1 day), we solve the above differential equations with a discrete time step (d*I*/d*t* ≈ Δ*I*/Δ*t*), such that

*I _{t}* models the number of all (currently) active infected people, whereas

*D*between new infections

*C*) as

_{t}We begin our simulations at time *t* = 0 with *I*_{0} infected cases and start including real-word data of reported cases *t* > *D* (see below for a parameterization).

In our model, we do not explicitly incorporate the inflow of additional infected people by travel for two reasons. First, we implicitly model the initial surge of infections with *I*_{0}. Second, previous work showed that travel during the outbreak has only modest effects on the dynamics, e.g., travel restrictions in China merely delayed the exponential spread if not combined with reductions of spreading (*41*).

### Full model: SIR model with weekly reporting modulation and change points in spreading rate

Our change point detection builds on a generalization of the simple SIR model with stationary spreading rate. We now assume that the spreading rate λ* _{i}*,

*i*= 1, …,

*n*, may change at certain time points

*t*from λ

_{i}

_{i}_{– 1}to λ

*, linearly over a time window of Δ*

_{i}*t*days. Thereby, we account for mitigation measures, which were implemented in Germany step by step. Thus, the parameters

_{i}*t*, Δ

_{i}*t*, and λ

_{i}*are added to the parameter set of the simple model above, and the differential equations are augmented by the time-varying λ*

_{i}*.*

_{i}In addition, we include a weekly modulation to account for lower case reports around the weekend, which subsequently accumulate during the week. To model the systematic variation of case reports during the week, we adapt the newly reported cases by a reporting fraction*f _{w}* and Φ

*will also be constrained by the data.*

_{w}### Estimating model parameters with Bayesian MCMC

We estimate the set of model parameters θ = {λ* _{i}*,

*t*, μ,

_{i}*D*, σ,

*I*

_{0},

*f*, Φ

_{w}*} using Bayesian inference with MCMC. The parameter σ is the scale factor for the width of the likelihood*

_{w}*42*) with NUTS (No-U-Turn Sampling) (

*43*) using multiple, independent Markov chains. The structure of our approach is as follows.

*Initialization of the Markov chains through variational inference*

The posterior is approximated by Gaussian distributions ignoring correlations between parameters through automatic differentiation variational inference (ADVI) (*44*), which is implemented in PyMC3. From this distribution, four starting points for four chains are sampled.

*Burn-in phase*

Each chain performs 1000 burn-in (tuning) steps using NUTS, which are not recorded. This serves as equilibration to sample from an equilibrium distribution in the next step.

*Sampling phase*

Each chain performs 4000 steps, which are used to approximate the posterior distribution. To ensure that the chains are equilibrated and sampled from the whole posterior distribution (ergodicity), we verified that the R-hat statistic is below 1.05, which is implemented in PyMC3. The rank-normalized R-hat diagnostic tests for lack of convergence by comparing the variances within chains and between chains: For identical within-chain and between-chain variances, R-hat becomes 1, indicating convergence. For well-converged chains, the resulting samples will describe the real-world data well, so that consistent forecasts are possible in the forecast phase.

### Forecast using MCMC samples

For the forecast, we take all samples from the MCMC step and continue time integration according to different forecast scenarios. Note that the overall procedure yields an ensemble of forecasts as opposed to a single forecast that would be solely based on one set of (previously optimized) parameters.

### MCMC sampling details

Each MCMC step requires us to propose a new set of parameters θ to generate a (fully deterministic) time series of new infected cases *C*(θ) = {*C _{t}*(θ)} of the same length as the observed real-world data

*p*(θ) is the prior distribution of the parameters (see below). The likelihood quantifies the similarity between model outcome and the available real-world time series. Here, the likelihood is the product over local likelihoods

*t*,

*C*(θ), and the corresponding real-world data point

_{t}*t*distribution because it resembles a Gaussian distribution around the mean but features heavy tails, which make the MCMC more robust with respect to outliers (

*45*) and thus reporting noise. The case-number–dependent width is motivated by observation noise through random subsampling (

*38*), resulting in a variance proportional to the mean. Our likelihood neglects any noise in the dynamic process, because the SIR model is deterministic, but could be in principle extended to incorporate typical demographic noise from stochastic spreading dynamics (

*35*,

*46*).

### Priors that constrain model parameters

Because short-term forecasts are time critical at the onset of an epidemic, the available real-world data are typically not informative enough to identify all free parameters or to empirically find their underlying distributions. We therefore chose informative priors on initial model parameters where possible and complemented them with uninformative priors otherwise. Our choices are summarized in Table 2 for the simple model, i.e., the SIR model with stationary spreading rate for the exponential onset phase, and in Table 3 for the full model with change points and are discussed in the following.

### Priors for the simple model (Table 2)

To constrain our simple model, an SIR model with stationary spreading rate for the exponential onset phase, we chose the following informative priors. Because of the ambiguity between the spreading and recovery rate in the exponential onset phase (see description of the simple model), we chose a narrow log-normal prior for the recovery rate *20*). Note that our implementation of μ accounts for the recovery of infected people and isolation measures because it describes the duration during which a person can infect others. For the spreading rate, we assume a broad log-normal prior distribution *R*_{0} = λ/μ) has a median of 3.2, consistent with the broad range of previous estimates (*18*, *33*, *34*). In addition, we chose a log-normal prior for the reporting delay *32*) plus a delay from infected people waiting to contact the doctor and get tested (assumed as 3 days).

The remaining model parameters are constrained by uninformative priors, in practice the half-Cauchy distribution (*47*). The half-Cauchy distribution, HalfCauchy(*x*,β) = 2/πβ[1 + (*x*/β)^{2}], is essentially a flat prior from zero to *O*(β) with heavy tails beyond. Thereby, β merely sets the order of magnitude that should not be exceeded for a given parameter. We chose for the number of initially infected people in the model (16 days before first data point) *I*_{0} ~ HalfCauchy(100) assuming an order of magnitude *O*(100) and below. In addition, we chose the scale factor of the width of the likelihood function as σ ~ HalfCauchy(10); this choice means that the variance in reported numbers may be up to a factor of 100 larger than the actual reported number.

### Priors for the full model (Table 3)

To constrain our full model, an SIR model with weekly reporting modulation and change points in the spreading rate, we chose the same priors as for the simple model but added the required priors associated with the change points. In general, we assume that each set of interventions (together with the increasing awareness) leads to a reduction (and not an increase) of λ at a change point. Because we cannot know yet the precise reduction factor, we adhere to assuming a reduction by ~50%, but always with a fairly wide uncertainty so that, in principle, even an increase at the change point would be possible. We model the time dependence of λ as change points, not as continuous changes, because the policy changes were implemented in three discrete steps, which were presumably followed by the public in a timely fashion.

For the spreading rates, we chose log-normal distributed priors as in the simple model. In particular, we chose for the initial spreading rate the same prior as in the simple model, _{0} ≈ 0.4) with a broad prior distribution; after the second change point _{3} to be smaller than the recovery rate μ, causing a decrease in new case numbers and a saturation of the cumulative number of infections.

For the timing of change points, we chose normally distributed priors. In particular, we chose *t*_{1} ~ Normal(2020/03/09,3) for the first change point because on the weekend of 8 March 2020, large public events such as soccer matches or fairs were canceled. For the second change point, we chose *t*_{2} ~ Normal(2020/03/16,1), because on 15 March 2020, the closing of schools and other educational institutions, along with the closing of nonessential stores, was announced and implemented on the following day. Restaurants were allowed to stay open until 6:00 p.m. For the third change point, we chose *t*_{3} ~ Normal(2020/03/23,1), because on 23 March 2020, a far-reaching contact ban (*Kontaktsperre*), which includes the prohibition of even small public gatherings as well as complete closing of restaurants and nonessential stores, was imposed by the government. Further policy changes may occur in future; however, for now, we do not include more change points.

The change points take effect over a certain time period Δ*t _{i}* for which we chose

The number of tests that are performed and reported vary regularly over the course of a week and are especially low during weekends. To account for this periodic variation, we modulated the number of inferred cases by the absolute value of a sine function with, in total, a period of 7 days. We chose this function because it is a nonsymmetric oscillation, fitting the weekly variation of cases on a phenomenological level. For the amplitude of the modulation, we chose a weakly informative Beta prior: *f _{w}* ~ Beta(mean = 0.7, std = 0.17) and for the phase a nearly flat circular distribution: Φ

*~ VonMises(mean = 0, κ = 0.01).*

_{w}### Model comparison

Because change-point detection entails evaluating models with different numbers of parameters, some form of fair model comparison is needed. This is necessary to compensate for the higher flexibility of more complex models, because this flexibility carries the risk of overfitting and overconfident forecasts. The standard approach to avoid overfitting in machine learning is cross-validation, which has recently also been advocated for Bayesian model comparison [e.g., (*3*, *36*)], especially for models used for predictions and forecasts. Thus, one would ideally like to compare the models with different numbers of change points by the probability they assign to previously unobserved data points. Technically this is measured by their out-of-sample prediction accuracy, i.e., their log pointwise predictive density (lppd):*N* new data points and *p*_{post}(θ) = *p*_{post}(θ | *y*, *M _{j}*) is the posterior distribution of the parameters given the in-sample data

*y*and the model

*M*. In practice, the integral is approximated using a sufficient number of samples from

_{j}*p*

_{post}(θ). However, this approach is only reasonable if a sufficient amount of out-of-sample data are available, which is not the case in the early stages of a disease outbreak. Therefore, the pointwise out-of-sample prediction accuracy was approximated using LOO in PyMC3 to compute Eq. 6 individually for each left-out data point based on the model fit to the other data points. The sum of these values, multiplied by a factor of –2, then yields the LOO cross-validation (LOO-CV) score. Lower LOO-CV scores imply better models.

## Supplementary Materials

science.sciencemag.org/content/369/6500/eabb9789/suppl/DC1

Tables S1 and S2

Figs. S1 to S7

MDAR Reproducibility Checklist

References

This is an open-access article distributed under the terms of the Creative Commons Attribution license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## References and Notes

**Acknowledgments:**We thank T. Friede, T. Geisel, K. Heidemann, M. Linkmann, M. Loidolt, and V. Zykov for carefully and promptly reviewing our work internally; the Priesemann group, especially D. Gonzalez Marx, F. Mikulasch, L. Rudelt, and A. Schneider, for exciting discussions and valuable comments; our colleagues at the Göttingen campus, H. Bickeböller, E. Bodenschatz, W. Brück, A. Ecker, A. Leha, R. Golestanian, H. Grubmüller, S. Herminghaus, R. Jahn, N. Lossau, and S. Scheithauer, with whom we discussed the project and the COVID-19 case forecast in the past weeks very intensively; and N. Bertschinger for stimulating discussion on Bayesian modeling and model comparison.

**Funding:**All authors received support from the Max-Planck-Society. J.D. and P.S. acknowledge funding by SMARTSTART, the joint training program in computational neuroscience by the VolkswagenStiftung and the Bernstein Network. J.Z. received financial support from the Joachim Herz Stiftung. M. Wibral is employed at the Campus Institute for Dynamics of Biological Networks funded by the VolkswagenStiftung.

**Author contributions:**J.D., J.Z., M. Wibral, M. Wilczek, and V.P. designed the research. J.D., J.Z., and P.S. conducted the research. J.D., J.Z., P.S., J.P.N., M. Wibral, M. Wilczek, and V.P. analyzed the data. J.D., P.S., and J.P.N. created the figures. All authors wrote the paper.

**Competing interests:**V.P. is currently active in various groups to advise the government but is not receiving payment.

**Data and materials availability:**Both data (

*48*) and analysis code (

*31*) are available online. This work is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/. This license does not apply to figures/photos/artwork or other content included in the article that is credited to a third party; obtain authorization from the rights holder before using such material.