Skill assessment#

These pages present a preliminary assessment of the skill of the CAFE-f6 hindcasts. The approach largely follows the assessment of the CanESM5 decadal hindcasts in Sospedra-Alfonso et al.1. For comparison, skill scores are computed for the CAFE-f6 hindcasts and for the CanESM5 1 (40 ensemble members) and EC-Earth3 2 (10 ensemble members) hindcast submissions to the CMIP6 Decadal Climate Prediction Project (DCPP). Note that the CanESM5 hindcasts are initialised at the end of December every year, while the CAFE-f6 and EC-Earth3 hindcasts are initialised at the beginning of November every year. Only November-initialised CAFE-f6 forecasts are considered here because of the reproducibility issue with many of the May-initialised forecasts.

Forced historical simulations#

For each hindcast dataset, a forced historical simulation is used to help quantify the skill added by initialisation of the hindcasts relative to the uninitialised simulations. For the CMIP6 DCPP hindcasts, historical simulations are taken from the corresponding CMIP6 “historical” experiment, ensuring that the same number of ensemble members is used for each. For the CAFE-f6 hindcasts, a dedicated 96-member forced historical simulation and a corresponding 20-member control simulation have been generated (see here for the run scripts). These simulations were initialised from CAFE60v1 on 1960-11-01 and run for 80 years, with the assumption that ensemble members will be independent by the beginning of the CAFE-f6 hindcast period (1981). The CAFE historical simulation experiences the same external forcing as the CAFE-f6 hindcasts at initialisation (see Application of forcing) and the control simulation experiences fixed 1960 forcing. The ensemble-mean control run climatology is subtracted from the historical run to account for model drift over the relatively short simulation period.

Skill assessment methods#

Unless otherwise specified, skill assessment is performed on hindcast anomalies that are computed relative to each model’s own ensemble-mean climatology as a function of lead time. 30-year climatological and verification periods are used for both the CAFE-f6 and DCPP data. However, because the historical CMIP6 data end in 2014, these periods differ slightly for the different model hindcasts: 1991-2020 for CAFE-f6; 1985-2014 for CanESM5 and EC-Earth3. Anomalies of the reference data (the “truth” to verify the hindcasts against, e.g. observations) are computed relative to their climatological mean over the same period as the hindcast data. All model and reference data are bi-linearly interpolated to the CAFE-f6 atmospheric grid prior to the calculation of skill scores.
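The lead-dependent anomaly calculation described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the code used to produce these pages. The function name, the nested-list layout `hindcast[init][lead][member]`, and the choice to select climatology samples by valid year (initialisation year plus lead) are all assumptions made for the example.

```python
from statistics import fmean


def lead_dependent_anomalies(hindcast, init_years, clim_period):
    """Remove an ensemble-mean climatology that depends on lead time.

    hindcast[i][l] is a list of ensemble member values for initialisation i
    and lead l (hypothetical layout). init_years[i] is the year assigned to
    lead 0 of initialisation i. clim_period is (first_year, last_year).
    """
    first, last = clim_period
    n_lead = len(hindcast[0])
    # Ensemble mean for every (init, lead) pair.
    ens_mean = [[fmean(members) for members in leads] for leads in hindcast]
    # One climatological value per lead, averaged over the initialisations
    # whose valid year falls inside the climatological period (assumption).
    clim = []
    for lead in range(n_lead):
        in_period = [
            ens_mean[i][lead]
            for i, year in enumerate(init_years)
            if first <= year + lead <= last
        ]
        clim.append(fmean(in_period))
    # Subtract the lead-dependent climatology from every member.
    return [
        [[value - clim[lead] for value in members]
         for lead, members in enumerate(leads)]
        for leads in hindcast
    ]


# Toy hindcasts with a drift that grows with lead time.
init_years = list(range(1981, 2021))
toy = [[[float(lead) + 0.1 * member for member in range(3)]
        for lead in range(3)] for _ in init_years]
anomalies = lead_dependent_anomalies(toy, init_years, (1991, 2020))
```

Because the climatology is computed separately for each lead, a drift that grows with lead time is removed along with the mean bias.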

The following reference datasets are used to assess the hindcast skill:

Reference datasets#
Variable(s)	Reference dataset
Global SST, NINO 3.4, DMI, AMV, IPO	HadISST 3
Global upper ocean heat content	EN.4.2.2 c14 4
Global 2m temperature, global SLP, SAM, NAO, Australian 10m wind, Australian FFDI	JRA55 5 6
Global precipitation	GPCP v02r03 7
Australian 2m temperature, Australian extreme 2m temperature, Australian precipitation, Australian extreme precipitation, Australian drought, Australian EHF severity	AGCDv2 8

A number of deterministic and probabilistic skill metrics were calculated, but only a few deterministic metrics, computed on the ensemble mean, are shown in this documentation. Following Sospedra-Alfonso et al.1, the reference anomalies are denoted as \(X\), the ensemble-mean hindcast anomalies as \(Y\), the ensemble-mean simulation anomalies as \(U\), \(C_{AB}\) denotes the covariance of \(A\) and \(B\), and \(\sigma_{A}\) denotes the standard deviation of \(A\). The following table defines the skill metrics presented in this documentation.

Skill metrics#
Metric	Definition	Interpretation	Reference
Anomaly cross correlation (ACC)	\(r_{XY} = \frac{C_{XY}}{\sigma_{X}\sigma_{Y}}\)	How in phase are the hindcasts and observations?	Wilks9
Initialised component of the ACC	\(r_{i} = r_{XY} - \theta \, r_{XU} \, r_{YU}\) where \(\theta = 0\) if \(r_{YU} < 0\) and \(\theta = 1\) otherwise	How much of the correlation skill came from initialisation?	Sospedra-Alfonso and Boer10
Mean squared skill score (MSSS)	\(\mathrm{MSSS}(Y, R, X) = 1 - \frac{\mathrm{MSE}(Y, X)}{\mathrm{MSE}(R,X)}\) where \(\mathrm{MSE}\) is the mean squared error and \(R\) corresponds to baseline predictions of either observed climatology, persistence or uninitialised simulations.	Is the forecast error smaller than a baseline prediction?	Goddard et al.11
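
The metric definitions in the table above translate directly into code. The following is a minimal sketch for a single time series of anomalies, written in plain Python for illustration. The function names are hypothetical and this is not the implementation used to generate the results in these pages.

```python
from math import sqrt


def _mean(values):
    return sum(values) / len(values)


def acc(a, b):
    """Correlation r_AB = C_AB / (sigma_A * sigma_B) of two anomaly series."""
    mean_a, mean_b = _mean(a), _mean(b)
    cov = _mean([(p - mean_a) * (q - mean_b) for p, q in zip(a, b)])
    sigma_a = sqrt(_mean([(p - mean_a) ** 2 for p in a]))
    sigma_b = sqrt(_mean([(q - mean_b) ** 2 for q in b]))
    return cov / (sigma_a * sigma_b)


def initialised_acc(x, y, u):
    """r_i = r_XY - theta * r_XU * r_YU, with theta = 0 if r_YU < 0 else 1."""
    r_xy, r_xu, r_yu = acc(x, y), acc(x, u), acc(y, u)
    theta = 0.0 if r_yu < 0 else 1.0
    return r_xy - theta * r_xu * r_yu


def mse(a, b):
    """Mean squared error between two series."""
    return _mean([(p - q) ** 2 for p, q in zip(a, b)])


def msss(y, r, x):
    """MSSS(Y, R, X) = 1 - MSE(Y, X) / MSE(R, X) for a baseline R."""
    return 1.0 - mse(y, x) / mse(r, x)


# Toy anomalies: X reference, Y hindcast, U uninitialised simulation.
x = [0.3, -0.1, 0.4, -0.5, 0.2, -0.3]
y = [0.2, -0.2, 0.5, -0.3, 0.1, -0.2]
u = [0.1, 0.0, 0.1, -0.1, 0.0, -0.1]
climatology = [0.0] * len(x)  # baseline of zero anomaly
scores = (acc(x, y), initialised_acc(x, y, u), msss(y, climatology, x))
```

An MSSS of 1 indicates a perfect forecast, 0 indicates no improvement over the baseline, and negative values indicate that the baseline has the smaller error.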

Statistical significance of the skill scores is evaluated using a non-parametric circular moving-block bootstrap approach 1 11. Following Sospedra-Alfonso et al.1, 1000 repetitions are performed using 5-year blocks. Skill scores that are found to be significant at the 95% confidence level are indicated in the following pages (using hatching on map plots and dots on line plots).
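
To illustrate the idea, the sketch below resamples the time axis in circular blocks and recomputes a skill score on each resample. It is a simplified illustration under several assumptions: only the time dimension is resampled, the function names are hypothetical, the interval uses a simple nearest-rank percentile, and a score is treated as significant when the resulting interval excludes zero. The exact procedure used for these pages follows the cited references and may differ in these details.

```python
import math
import random


def circular_block_indices(n_time, block_length, rng):
    """Draw time indices in consecutive blocks that wrap around the series end."""
    indices = []
    while len(indices) < n_time:
        start = rng.randrange(n_time)
        indices.extend((start + k) % n_time for k in range(block_length))
    return indices[:n_time]


def correlation(a, b):
    """Pearson correlation, used here as the example skill metric."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def bootstrap_metric(x, y, metric, n_repeats=1000, block_length=5, seed=0):
    """Recompute metric(x, y) on n_repeats block-resampled copies of the series."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_repeats):
        idx = circular_block_indices(len(x), block_length, rng)
        samples.append(metric([x[i] for i in idx], [y[i] for i in idx]))
    return samples


def percentile_interval(samples, level=0.95):
    """Nearest-rank interval containing the central `level` fraction of samples."""
    ordered = sorted(samples)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(tail * (len(ordered) - 1))]
    upper = ordered[int((1.0 - tail) * (len(ordered) - 1))]
    return lower, upper


# Toy 30-year series: the "hindcast" is the reference plus a small perturbation.
reference = [math.sin(0.7 * t) + 0.05 * t for t in range(30)]
hindcast = [v + 0.1 * math.cos(3.0 * t) for t, v in enumerate(reference)]
samples = bootstrap_metric(reference, hindcast, correlation)
lower, upper = percentile_interval(samples, 0.95)
significant = lower > 0.0 or upper < 0.0  # assumed significance criterion
```

Resampling in blocks rather than individual years preserves autocorrelation within each block, which matters for quantities with multi-year persistence.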

Note

We define the “0th” lead period of a forecast as the period that includes the initialisation. For example, for annual forecasts initialised on 2022-11-01, “lead year 0” refers to the period 2022-11-01 to 2023-10-31 and “lead year 1” refers to the period 2023-11-01 to 2024-10-31. This differs from some existing studies (e.g. Sospedra-Alfonso et al.1), whose “lead year 1” is equivalent to our “lead year 0”.
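
The convention can be expressed as a small helper. This is an illustrative sketch only. The function name is hypothetical, and it assumes the initialisation date is not 29 February.

```python
from datetime import date, timedelta


def lead_year_period(init_date, lead):
    """Return the (start, end) dates of lead year `lead`, counting from 0.

    Lead year 0 is the 12-month period that begins on the initialisation date.
    """
    start = date(init_date.year + lead, init_date.month, init_date.day)
    next_start = date(init_date.year + lead + 1, init_date.month, init_date.day)
    return start, next_start - timedelta(days=1)


init = date(2022, 11, 1)
lead_0 = lead_year_period(init, 0)  # 2022-11-01 to 2023-10-31
lead_1 = lead_year_period(init, 1)  # 2023-11-01 to 2024-10-31
```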

Hindcast skill results#

The skill assessment is split into two sections:

Generic hindcast skill - Assessment of the skill of commonly assessed quantities, including gridded global variables and indices for key climate drivers.
Australian hindcast skill - Assessment of the skill of Australian regionally-averaged quantities. This analysis has some focus on climate extremes and was motivated by deliverables for the Australian Climate Service.

Documentation of the code and workflows used to carry out this skill assessment can be found in Producing these docs.

References#

1: Reinel Sospedra-Alfonso, William J Merryfield, George J Boer, Viatsheslav V Kharin, Woo-Sung Lee, Christian Seiler, and James R Christian. Decadal climate predictions with the Canadian Earth System Model version 5 (CanESM5). Geoscientific Model Development, 14(11):6863–6891, 2021.
2: Roberto Bilbao, Simon Wild, Pablo Ortega, Juan Acosta-Navarro, Thomas Arsouze, Pierre-Antoine Bretonnière, Louis-Philippe Caron, Miguel Castrillo, Rubén Cruz-García, Ivana Cvijanovic, and others. Assessment of a full-field initialized decadal climate prediction system with the CMIP6 version of EC-Earth. Earth System Dynamics, 12(1):173–196, 2021.
3: NAA Rayner, DE Parker, EB Horton, Chris K Folland, Lisa V Alexander, DP Rowell, Elizabeth C Kent, and A Kaplan. Global analyses of sea surface temperature, sea ice, and night marine air temperature since the late nineteenth century. Journal of Geophysical Research: Atmospheres, 2003.
4: Simon A Good, Matthew J Martin, and Nick A Rayner. EN4: quality controlled ocean temperature and salinity profiles and monthly objective analyses with uncertainty estimates. Journal of Geophysical Research: Oceans, 118(12):6704–6716, 2013.
5: Shinya Kobayashi, Yukinari Ota, Yayoi Harada, Ayataka Ebita, Masami Moriya, Hirokatsu Onoda, Kazutoshi Onogi, Hirotaka Kamahori, Chiaki Kobayashi, Hirokazu Endo, and others. The JRA-55 reanalysis: general specifications and basic characteristics. Journal of the Meteorological Society of Japan. Ser. II, 93(1):5–48, 2015.
6: Yayoi Harada, Hirotaka Kamahori, Chiaki Kobayashi, Hirokazu Endo, Shinya Kobayashi, Yukinari Ota, Hirokatsu Onoda, Kazutoshi Onogi, Kengo Miyaoka, and Kiyotoshi Takahashi. The JRA-55 reanalysis: representation of atmospheric circulation and climate variability. Journal of the Meteorological Society of Japan. Ser. II, 94(3):269–302, 2016.
7: Robert F Adler, Mathew RP Sapiano, George J Huffman, Jian-Jian Wang, Guojun Gu, David Bolvin, Long Chiu, Udo Schneider, Andreas Becker, Eric Nelkin, and others. The Global Precipitation Climatology Project (GPCP) monthly analysis (new version 2.3) and a review of 2017 global precipitation. Atmosphere, 9(4):138, 2018.
8: Alex Evans, David Jones, Rob Smalley, and Stephen Lellyett. An enhanced gridded rainfall analysis scheme for Australia. Volume 66. Bureau of Meteorology, 2020.
9: Daniel S Wilks. Statistical methods in the atmospheric sciences. Volume 100. Academic Press, 2011.
10: Reinel Sospedra-Alfonso and George J Boer. Assessing the impact of initialization on decadal prediction skill. Geophysical Research Letters, 47(4):e2019GL086361, 2020.
11: Lisa Goddard, A Kumar, A Solomon, D Smith, G Boer, P Gonzalez, V Kharin, W Merryfield, Clara Deser, Simon J Mason, and others. A verification framework for interannual-to-decadal predictions experiments. Climate Dynamics, 40(1):245–272, 2013.