Improved Surface Temperature Prediction for the Coming Decade from a Global Climate Model


Science  10 Aug 2007:
Vol. 317, Issue 5839, pp. 796-799
DOI: 10.1126/science.1139540


Previous climate model projections of climate change accounted for external forcing from natural and anthropogenic sources but did not attempt to predict internally generated natural variability. We present a new modeling system that predicts both internal variability and externally forced changes and hence forecasts surface temperature with substantially improved skill throughout a decade, both globally and in many regions. Our system predicts that internal variability will partially offset the anthropogenic global warming signal for the next few years. However, climate will continue to warm, with at least half of the years after 2009 predicted to exceed the warmest year currently on record.

It is very likely that the climate will warm over the coming century in response to changes in radiative forcing arising from anthropogenic emissions of greenhouse gases and aerosols (1). There is, however, particular interest in the coming decade, which represents a key planning horizon for infrastructure upgrades, insurance, energy policy, and business development. On this time scale, climate could be dominated by internal variability (2) arising from unforced natural changes in the climate system such as El Niño, fluctuations in the thermohaline circulation, and anomalies of ocean heat content. This could lead to short-term changes, especially regionally, that are quite different from the mean warming (3–5) expected over the next century in response to anthropogenic forcing. Idealized studies (6–12) show that some aspects of internal variability could be predictable several years in advance, but actual predictive skill assessed against real observations has not previously been reported beyond a few seasons (13). Global climate models have been used to make predictions of climate change on decadal (14, 15) or longer time scales (4, 5, 16), but these only accounted for projections of external forcing, neglecting initial condition information needed to predict internal variability. We examined the potential skill of decadal predictions using the newly developed Decadal Climate Prediction System (DePreSys), based on the Hadley Centre Coupled Model, version 3 (HadCM3) (17), a dynamical global climate model (GCM). DePreSys (18) takes into account the observed state of the atmosphere and ocean in order to predict internal variability, together with plausible changes in anthropogenic sources of greenhouse gases and aerosol concentrations (19) and projected changes in solar irradiance and volcanic aerosol (20).

We assessed the accuracy of DePreSys in a set of 10-year hindcasts (21), starting from the first of March, June, September, and December from 1982 to 2001 (22) inclusive (80 start dates in total, although those that project into the future cannot be assessed at all lead times). We also assessed the impact of initial condition information by comparing DePreSys against an additional hindcast set (hereafter referred to as NoAssim), which is identical to DePreSys but does not assimilate the observed state of the atmosphere or ocean. Each NoAssim hindcast consists of four ensemble members, with initial conditions at the same 80 start dates as the DePreSys hindcasts taken from four independent transient integrations (3) of HadCM3, which covered the period from 1860 to 2001 (18). The NoAssim hindcasts sampled a range of initial states of the atmosphere and ocean that were consistent with the internal variability of HadCM3 but were independent of the observed state. In contrast, the DePreSys hindcasts were initialized by assimilating atmosphere and ocean observations into one of the transient integrations (18). In order to sample the effects of error growth arising from imperfect knowledge of the observed state, four DePreSys ensemble members were initialized from consecutive days preceding and including each hindcast start date (23). Fig. S1 summarizes our experimental procedure.
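The count of 80 start dates follows directly from four seasonal start months in each of the 20 years from 1982 to 2001. As a minimal sketch (the function name `hindcast_start_dates` and its parameters are illustrative and do not come from the paper), the schedule can be enumerated as:

```python
from datetime import date

def hindcast_start_dates(first_year=1982, last_year=2001, months=(3, 6, 9, 12)):
    """Return the first day of each start month for every year in the range (inclusive)."""
    return [date(year, month, 1)
            for year in range(first_year, last_year + 1)
            for month in months]

start_dates = hindcast_start_dates()
print(len(start_dates))                  # 80
print(start_dates[0], start_dates[-1])   # 1982-03-01 2001-12-01
```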

We measured the skill of the hindcasts in terms of the root mean square error (RMSE) (24) of the ensemble average and tested for differences over our hindcast period between DePreSys and NoAssim that were unlikely to be accounted for by uncertainties arising from a finite ensemble size and a finite number of validation points (18). We found that global anomalies (25) of annual mean surface temperature (Ts) were predicted with significantly more skill by DePreSys than by NoAssim throughout the range of the hindcasts (compare the solid red curve with the blue shading in Fig. 1A). Averaged over all forecast lead times, the RMSE of global annual mean Ts is 0.132°C for NoAssim as compared with 0.105°C for DePreSys, representing a 20% reduction in RMSE and a 36% reduction in error variance (E). Furthermore, the improvement was even greater for multiannual means: For 5-year means, the RMSE was reduced by 38% (a 61% reduction in E), from 0.106°C to 0.066°C; and for 9-year means, the RMSE was reduced by 49% (a 74% reduction in E), from 0.090°C to 0.046°C.
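The relation between the quoted RMSE values and the percentage reductions can be checked with a few lines of arithmetic. The sketch below assumes that the error variance E is the square of the RMSE, which is the reading under which the quoted percentages line up; the helper names (`rmse`, `ensemble_mean`, `percent_reduction`) are illustrative and are not taken from the paper or its supporting material.

```python
import math

def rmse(forecasts, observations):
    """Root mean square error between paired forecast and observed values."""
    if len(forecasts) != len(observations):
        raise ValueError("forecasts and observations must have the same length")
    squared = [(f - o) ** 2 for f, o in zip(forecasts, observations)]
    return math.sqrt(sum(squared) / len(squared))

def ensemble_mean(members):
    """Average ensemble members (each a list over validation points) point by point."""
    return [sum(values) / len(values) for values in zip(*members)]

def percent_reduction(reference, improved):
    """Percentage by which `improved` is smaller than `reference`."""
    return 100.0 * (1.0 - improved / reference)

# RMSE values quoted in the text (deg C), as (NoAssim, DePreSys) pairs.
published = {
    "annual": (0.132, 0.105),
    "5-year": (0.106, 0.066),
    "9-year": (0.090, 0.046),
}

for label, (noassim, depresys) in published.items():
    rmse_cut = percent_reduction(noassim, depresys)
    # Assumption: E is taken as RMSE squared.
    variance_cut = percent_reduction(noassim ** 2, depresys ** 2)
    print(f"{label}: RMSE reduced by {rmse_cut:.1f}%, E reduced by {variance_cut:.1f}%")
```

Because the inputs here are the rounded values printed in the text, the computed percentages agree with the quoted ones only to within about one percentage point.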

Fig. 1.

Impact of initial conditions on hindcast skill. (A) RMSE (24) of globally averaged annual mean Ts anomalies (relative to 1979–2001) as a function of forecast period. We compare DePreSys (solid red curve) with the NoAssim hindcasts [the blue shading shows the 5 to 95% CI region where differences between DePreSys and NoAssim are not significant (18)]. The dashed red curve shows the effect of removing from the DePreSys hindcasts differences between DePreSys and NoAssim that are linearly attributable to the state of El Niño. The dotted red curve shows the effect of removing from the DePreSys hindcasts the mean difference between DePreSys and NoAssim hindcasts of Ts for the coming 9 years. Observations are taken from the HadCRUT2vOA data set (36–38). (B) As (A), but for H (relative to 1941–1996). Observations of H are computed from analyses of ocean temperature observations (39). (C) Time series of rolling decadal mean global anomalies (relative to 1941–1996) of H from observations (39) and the four transient HadCM3 simulations (models 1 to 4) (3) that provided initial conditions for the NoAssim hindcasts. Values are plotted annually, with the year representing the mean of the next 10 years.

Because the internal variability of the atmosphere is essentially unpredictable beyond a couple of weeks (26), and the external forcing in DePreSys and NoAssim is identical, differences in predictive skill are very likely to be caused by differences in the initialization and evolution of the ocean. During 600 years of the HadCM3 control integration, Ts is highly correlated (correlation R = 0.89) with global annual mean ocean heat content in the upper 113 m (H). Furthermore, the correlation is higher when H leads Ts by 1 year (R = 0.56) than when Ts leads H by 1 year (R = 0.32), providing strong evidence that variations in H can force Ts. We also find that H is predicted with significantly more skill by DePreSys than by NoAssim (Fig. 1B), and we conclude that the improvement of DePreSys over NoAssim in predicting Ts on interannual-to-decadal time scales results mainly from initializing upper ocean heat content.
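The lead-lag comparison described above amounts to correlating one annual series with a time-shifted copy of the other. The following is a minimal sketch of that calculation, not the authors' code; the function names and the short synthetic series are purely illustrative and carry no physical meaning.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def lagged_correlation(leader, follower, lag):
    """Correlate leader[t] with follower[t + lag] for a non-negative lag in years."""
    if lag < 0:
        raise ValueError("lag must be non-negative; swap the arguments instead")
    if lag == 0:
        return pearson(leader, follower)
    return pearson(leader[:-lag], follower[lag:])

# Synthetic annual anomalies (illustrative only): ts simply repeats h one year later.
h = [0.1, 0.4, 0.2, 0.9, 0.3, 0.7, 0.5, 1.2, 0.6, 1.0]
ts = [0.0] + h[:-1]

r_h_leads = lagged_correlation(h, ts, 1)    # H leads Ts by 1 year
r_ts_leads = lagged_correlation(ts, h, 1)   # Ts leads H by 1 year
print(r_h_leads > r_ts_leads)               # True for this constructed example
```

In this constructed case the asymmetry is built in by design. In the control integration the same comparison gives R = 0.56 against R = 0.32, and that asymmetry is the evidence the text relies on.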

We now examine the factors that control the predictability of H and Ts on annual-to-decadal time scales. Time series of hindcasts of Ts for 1 year ahead (Fig. 2A) show that both DePreSys and NoAssim capture the observed general warming trend, but the interannual variability of Ts is predicted better by DePreSys (detrended RMSE = 0.066°C) than by NoAssim (detrended RMSE = 0.094°C). A statistical forecast method (18) is also able to capture the trend and interannual variability of Ts for the coming year (green triangles in Fig. 2A). The statistical method accounts for interannual variability using predictors based on the state of El Niño and recent volcanic activity. Volcanic activity cannot explain the difference between DePreSys and NoAssim because both include forcing from volcanic aerosol in the same way. We assess the impact of El Niño on the difference between DePreSys and NoAssim as follows. From the transient HadCM3 simulations, we compute linear regression coefficients that relate the state of El Niño, as measured by SST in the Niño3 region (210° to 270°E, 5°S to 5°N), to Ts. Using these coefficients, we compute the contribution to Ts from El Niño for each DePreSys and NoAssim hindcast and remove the difference from the DePreSys hindcasts. We find that the increased skill of DePreSys over NoAssim is consistent with an improved ability to predict El Niño for the first 15 to 18 months, but not at longer lead times (compare the dashed red curve with the blue shading in Fig. 1A).
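The El Niño adjustment described above can be sketched as a regression followed by a subtraction. The example below is a simplified illustration under loudly stated assumptions: it uses a single regression slope with no lag or seasonal dependence (the paper speaks of "coefficients", whose exact form is given in the supporting material and is not reproduced here), and all function names and numbers are hypothetical.

```python
def regression_slope(predictor, response):
    """Least-squares slope of `response` regressed on `predictor`."""
    n = len(predictor)
    mean_p, mean_r = sum(predictor) / n, sum(response) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predictor, response))
    var = sum((p - mean_p) ** 2 for p in predictor)
    return cov / var

def remove_enso_difference(depresys_ts, depresys_nino3, noassim_nino3, slope):
    """Subtract from DePreSys Ts the DePreSys-minus-NoAssim difference that is
    linearly attributable to Nino3 SST (assumption: one slope for all lead times)."""
    return [ts - slope * (d - n)
            for ts, d, n in zip(depresys_ts, depresys_nino3, noassim_nino3)]

# Hypothetical anomalies (deg C) standing in for the transient simulations.
nino3_sim = [-1.0, -0.5, 0.0, 0.5, 1.0]
ts_sim = [-0.10, -0.05, 0.00, 0.05, 0.10]
slope = regression_slope(nino3_sim, ts_sim)   # deg C of Ts per deg C of Nino3 SST

# Hypothetical hindcast values for two start dates.
adjusted = remove_enso_difference(
    depresys_ts=[0.50, 0.20],
    depresys_nino3=[1.0, -0.5],
    noassim_nino3=[0.0, 0.0],
    slope=slope,
)
print(adjusted)
```

Recomputing the RMSE of the adjusted hindcasts then shows how much of the DePreSys advantage survives once the El Niño contribution is removed, which is what the dashed red curve in Fig. 1A represents.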

Fig. 2.

Time series of hindcast and observed anomalies (relative to 1979–2001) of globally averaged surface temperature. (A) Hindcasts of the first annual mean (forecast period of 1 year) compared with observations from HadCRUT2vOA (black curve). Rolling annual mean observations and DePreSys and NoAssim hindcasts are plotted seasonally from March, June, September, and December. Statistical hindcasts are plotted each January. The CI (27) (red shading) is diagnosed from the standard deviation of the DePreSys ensemble, assuming a t distribution centered on the ensemble mean (white curve). Only the ensemble mean is shown for the NoAssim hindcasts (blue curve). The mean uncertainty in the observations is ±0.056°C (5 to 95% CI range). (B) As (A), but for year 9 of the hindcasts. (C) As (A), but for the first 9-year mean of the hindcasts.

The hindcasts for year 9 capture the observed mean warming but not the interannual variability (Fig. 2B). This is expected because the main factors governing interannual variability, namely El Niño and volcanic eruptions, are not predictable at this lead time. The 90% confidence limits (27) diagnosed from the ensemble spread (red shading) generally capture the observations [supporting online material (SOM) text and fig. S5], apart from the cooling after the eruption of Mount Pinatubo (28). This is unavoidable unless volcanic eruptions can be predicted, and we note that our decadal forecasts assume that no major volcanic eruptions will occur during the forecast period. We therefore expect both NoAssim and DePreSys hindcasts of periods containing volcanic eruptions to be too warm on average. However, the warm bias is significantly smaller in DePreSys. This is clearly illustrated in hindcasts of 9-year mean Ts (Fig. 2C), for which the DePreSys bias of 0.016°C represents a 79% reduction from the NoAssim bias of 0.075°C. If we remove the difference in these biases (–0.059°C) from the DePreSys hindcasts of annual mean Ts (dotted red curve in Fig. 1A), the RMSE is no longer significantly different from NoAssim at forecast periods greater than 15 months. The cooling of DePreSys relative to NoAssim is consistent with a warm bias of H in the NoAssim initial conditions provided by the transient HadCM3 integrations (Fig. 1C from 1982 onward). Furthermore, the magnitude of this bias is consistent with the level of internal multidecadal variability of H found in both the observations and the individual HadCM3 integrations used to initialize the NoAssim hindcasts (Fig. 1C). We therefore conclude that the increased predictive skill of DePreSys over NoAssim at forecast periods longer than 15 months results mainly from initializing the low-frequency variability of H, thereby removing errors of H from the NoAssim initial conditions (SOM text).

Because forecast errors generally grow with time, differences between the RMSE of NoAssim and DePreSys would be expected to be largest at short lead times. This was not the case in our experiments (Fig. 1, A and B). We investigated this unexpected behavior using a simple energy balance model (EBM) (29) to predict the evolution of the average difference in H between the DePreSys and NoAssim initial conditions (SOM text and fig. S2). We found that the detailed evolution of this difference [which increases in magnitude for the first 4 years, decreasing thereafter (fig. S3)] is governed by an atmospheric feedback response to the initial anomaly of H (fig. S4). Furthermore, the RMSE of the trend in global Ts during the first 5 years of the hindcasts is lower in DePreSys than NoAssim (30). These results indicate that the evolution of the climate system is predicted better by DePreSys than NoAssim and that some of this improvement results from atmospheric feedbacks simulated by the coupled climate model.

Although global Ts is important for informing greenhouse gas emissions policy, many applications in industry and commerce require regional predictions. We found significant differences between DePreSys and NoAssim RMSE in 9-year mean Ts in many regions (Fig. 3, A to C). Much of the regional improvement in DePreSys relative to NoAssim is coincident with improvements in H (Fig. 3D), particularly in the Indian Ocean and Australasian sector of the Southern Hemisphere [consistent with (12)], although there are also some regions where DePreSys gives larger errors than NoAssim. Furthermore, there are significant differences in RMSE over land, the largest improvements occurring in North and South America and eastern Australia (Fig. 3C).

Fig. 3.

Impact of initial conditions on regional hindcast skill. (A) RMSE of 9-year mean Ts anomalies (relative to 1979–2001) for the ensemble mean NoAssim hindcasts, verified against observations from HadCRUT2v (36–38). (B) As (A), but for DePreSys. (C) NoAssim minus DePreSys RMSE of 9-year mean Ts. Differences are shown only where they are significant at the 5% level (18). (D) As (C), but for 9-year mean H anomalies (relative to 1941–1996). In all panels, each 5° latitude by 5° longitude pixel represents the RMSE for predictions of Ts spatially averaged over the 35° latitude by 35° longitude box centered on that pixel.

The strong correspondence (R = 0.75) between regional differences in Ts and H (Fig. 3, C and D) further supports our conclusion that improvements in DePreSys relative to NoAssim on decadal time scales result mainly from initializing H. Although our hindcast period is limited to 20 years, the existence of natural low-frequency variability of H (31) (Fig. 1C) strongly suggests that DePreSys would also improve on NoAssim in other decades, although the regional details could be different. Furthermore, a substantial increase in the number of subsurface ocean observations through the Argo program (32) should substantially improve our ability to initialize the ocean in future, thereby leading to further improvements in DePreSys relative to NoAssim both globally and regionally.

Having established the predictive skill of DePreSys, we issued the first GCM-based forecast of global Ts for the coming decade (33, 34) (Fig. 4). The DePreSys forecast is based on 20 ensemble members, 10 starting from consecutive days leading to 1 June 2005, combined with 10 from consecutive days leading to 1 March 2005. We assessed the impact of initial conditions on this forecast by comparing it with a NoAssim forecast, consisting of eight ensemble members. We also compared two eight-member DePreSys and NoAssim hindcasts with observations. The DePreSys hindcast starting from June 1985 correctly predicted a rapid warming during the transition from the weak La Niña of 1985 to the El Niño of 1986–1987 and correctly predicted the warming trend throughout the period until the eruption of Mount Pinatubo. The DePreSys hindcast starting from June 1995 correctly predicted an initial cooling, followed by a general warming. As expected, the NoAssim hindcasts predicted only the general warming trend, although the NoAssim hindcast from June 1995 is generally too warm. In the DePreSys forecast, internal variability offsets the effects of anthropogenic forcing in the first few years, leading to no net warming before 2008 (Fig. 4). In contrast, the NoAssim forecast warms during this period. Regional assessment to February 2007 (fig. S8) indicates that this initial cooling in DePreSys relative to NoAssim results from the development of cooler anomalies in the tropical Pacific and the persistence of neutral conditions in the Southern Ocean. In both cases, the DePreSys forecast is closer to the verifying changes observed since the forecast start date. Both NoAssim and DePreSys, however, predict further warming during the coming decade, with the year 2014 predicted to be 0.30° ± 0.21°C [5 to 95% confidence interval (CI)] warmer than the observed value for 2004. Furthermore, at least half of the years after 2009 are predicted to be warmer than 1998, the warmest year currently on record.

Fig. 4.

Globally averaged annual mean surface temperature anomaly (relative to 1979–2001) forecast by DePreSys starting from June 2005. The CI (red shading) is diagnosed from the standard deviation of the DePreSys ensemble, assuming a t distribution centered on the ensemble mean (white curve). Also shown are DePreSys and ensemble mean NoAssim (blue curves) hindcasts starting from June 1985 and June 1995, together with observations from HadCRUT2vOA (black curve). Rolling annual mean values are plotted seasonally from March, June, September, and December. The mean bias as a function of lead time was computed from those DePreSys hindcasts that were unaffected by Mount Pinatubo (SOM text) and removed from the DePreSys forecast (but not the hindcasts).

Supporting Online Material

Materials and Methods

SOM Text

Figs. S1 to S8


References and Notes
