src.utils
Module Contents
Functions
- load_config: Load a config .yml file for a specified dataset
- composite_function: Return a composite function of all functions and kwargs specified in a provided dictionary
- extract_lon_lat_box: Return a region specified by a range of longitudes and latitudes
- calculate_nino34: Calculate the NINO3.4 index
- calculate_dmi: Calculate the Dipole Mode Index (DMI) for the Indian Ocean Dipole
- calculate_sam: Calculate the Southern Annular Mode index from monthly data as defined by Gong, D. and Wang, S., 1999
- calculate_nao: Calculate the North Atlantic Oscillation index from monthly data as defined by Jianping, L. & Wang, J. X. L. (2003)
- calculate_amv: Calculate the Atlantic Multi-decadal Variability (AMV), also known as the Atlantic Multi-decadal Oscillation (AMO)
- calculate_ipo: Calculate the tripolar Pacific index for the Interdecadal Pacific Oscillation (IPO)
- calculate_ohc300: Calculate the ocean heat content above 300m
- calculate_wind_speed: Calculate the wind speed
- calculate_tmean_from_tmin_tmax: Estimate tmean as the average of tmin and tmax
- calculate_ffdi: Return the McArthur Forest Fire Danger Index following the formula provided in Dowdy (2018)
- calculate_EHF: Calculate the Excess Heat Factor (EHF) index
- calculate_EHF_severity: Calculate the severity of the Excess Heat Factor index
- ensemble_mean: Return the ensemble mean of the input array
- greater_than: Return a boolean array with True where elements > value
- where_greater_than: Return array with elements <= value masked to nan
- add_CAFE_grid_info: Add CAFE grid info to a CAFE dataset that doesn't already have it
- normalise_by_days_in_month: Normalise input array by the number of days in each month
- convert_time_to_lead: Return provided array with time dimension converted to lead time dimension and time added as additional coordinate
- truncate_latitudes: Return provided array with latitudes truncated to specified dp
- convert_calendar: Convert calendar, dropping invalid/surplus dates or inserting missing dates
- rechunk: Rechunk a dataset
- select: Return a new dataset with each array indexed by tick labels along the specified dimension(s)
- add_attrs: Add attributes to a dataset
- rename: Rename all variables etc that have an entry in names
- convert: Convert variables in a dataset according to provided dictionary
- keep_period: Keep only times within a specified period
- _get_groupby_and_reduce_dims: Get the groupby and reduction dimensions for performing operations like calculating anomalies and percentile thresholds
- anomalise: Return the anomalies of ds relative to its climatology over clim_period
- calculate_percentile_thresholds: Return the percentile values of ds over a provided period
- over_percentile_threshold: Find which values in the input array are over a specified percentile calculated over a specified period
- under_percentile_threshold: Find which values in the input array are under a specified percentile calculated over a specified period
- correct_bias: Correct the mean bias of ds relative to observations over a provided period
- interpolate_to_grid_from_file
- round_to_start_of_day: Return provided array with specified time dimension rounded to the start of the day
- round_to_start_of_month: Return provided array with specified time dimension rounded to the start of the month
- coarsen: Coarsen data, applying 'max' to all relevant coords and optionally starting at a particular time point in the array
- rolling_mean: Apply a rolling mean to the data, applying 'max' to all relevant coords and optionally starting at a particular time point in the array
- resample: Resample data to a different temporal frequency by taking the mean over all values at the downsampled frequency
- get_region_masks_from_shp: Extract region masks according to a shapefile
- average_over_NRM_super_clusters: Average the provided array over the NRM super cluster regions
- mask_CAFEf6_reduced_dt: Mask out the ensemble members of CAFE-f6 that were run with a reduced timestep
- gridarea_cdo: Return the area weights computed using cdo's gridarea function
- add_area_using_cdo_gridarea: Add an area coordinate to the provided dataset containing the cell areas estimated by cdo's gridarea function
- max_chunk_size_MB: Get the max chunk size in a dataset
Attributes
- src.utils.PROJECT_DIR
- src.utils.load_config(name)
Load a config .yml file for a specified dataset
- Parameters
- name: str
The path to the config file to load
- src.utils.composite_function(function_dict)
Return a composite function of all functions and kwargs specified in a provided dictionary
- Parameters
- function_dict: dict
Dictionary with functions in this module to composite as keys and kwargs as values
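The idea of building one callable from a dictionary of steps can be sketched as follows. This is a minimal illustration, not the module's implementation: the `REGISTRY` lookup, the `scale` and `offset` helpers, and the choice to apply steps in dictionary order are all assumptions made for the example.

```python
from functools import reduce


def scale(values, factor=1.0):
    """Multiply every element by factor."""
    return [v * factor for v in values]


def offset(values, amount=0.0):
    """Add amount to every element."""
    return [v + amount for v in values]


# Hypothetical registry standing in for "functions in this module"
REGISTRY = {"scale": scale, "offset": offset}


def composite_function_sketch(function_dict):
    """Return a function that applies each named step in dictionary order."""
    steps = [(REGISTRY[name], kwargs or {}) for name, kwargs in function_dict.items()]

    def composite(data):
        # Feed the output of each step into the next one
        return reduce(lambda acc, step: step[0](acc, **step[1]), steps, data)

    return composite


pipeline = composite_function_sketch(
    {"scale": {"factor": 2.0}, "offset": {"amount": 1.0}}
)
result = pipeline([1.0, 2.0])
```

Here `result` holds the input scaled first and offset second; swapping the dictionary order would change the outcome, which is why the ordering assumption matters.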
- src.utils.extract_lon_lat_box(ds, box, weighted_average, lon_dim='lon', lat_dim='lat')
Return a region specified by a range of longitudes and latitudes.
- Parameters
- ds: xarray Dataset or DataArray
The data to subset and average. Assumed to include an “area” Variable
- box: iterable
Iterable with the following elements in this order: [lon_lower, lon_upper, lat_lower, lat_upper] where longitudes are specified between 0 and 360 deg E and latitudes are specified between -90 and 90 deg N
- weighted_average: boolean
If True, return the area weighted average over the region, otherwise return the region
- lon_dim: str, optional
The name of the longitude dimension
- lat_dim: str, optional
The name of the latitude dimension
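The box convention and the area-weighted average can be illustrated with plain Python. This is a sketch under stated assumptions: grid cells are represented as hypothetical `(lon, lat, area, value)` tuples, the box bounds are treated as inclusive, and boxes that wrap across 0/360 deg E are not handled.

```python
def box_weighted_mean(cells, box):
    """Area-weighted mean of the cells that fall inside the box.

    cells: iterable of (lon, lat, area, value) tuples (hypothetical layout)
    box:   [lon_lower, lon_upper, lat_lower, lat_upper]
    """
    lon_lower, lon_upper, lat_lower, lat_upper = box
    inside = [
        (area, value)
        for lon, lat, area, value in cells
        if lon_lower <= lon <= lon_upper and lat_lower <= lat <= lat_upper
    ]
    total_area = sum(area for area, _ in inside)
    return sum(area * value for area, value in inside) / total_area


cells = [
    (190.0, 0.0, 2.0, 1.0),    # inside the box
    (200.0, 2.0, 1.0, 4.0),    # inside the box
    (100.0, 0.0, 5.0, 100.0),  # outside the box, ignored
]
nino34_box = [190, 240, -5, 5]  # 170°W–120°W expressed in deg E
mean_value = box_weighted_mean(cells, nino34_box)
```

Note how the NINO3.4 longitudes are written as 190 to 240 deg E to match the 0 to 360 convention described for `box`.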
- src.utils.calculate_nino34(sst_anom, sst_name='sst')
Calculate the NINO3.4 index. The NINO3.4 index is calculated as the spatial average of SST anomalies over the tropical Pacific region (5°S–5°N and 170–120°W).
- Parameters
- sst_anom: xarray Dataset
Array of sst anomalies
- sst_name: str, optional
The name of the sst variable in sst_anom
- src.utils.calculate_dmi(sst_anom, sst_name='sst')
Calculate the Dipole Mode Index (DMI) for the Indian Ocean Dipole. The DMI is calculated as the difference between the spatial averages of SST anomalies over two regions of the tropical Indian Ocean: (10°S–10°N and 50°E–70°E) and (10°S–0°S and 90°E–110°E).
- Parameters
- sst_anom: xarray Dataset
Array of sst anomalies
- sst_name: str, optional
The name of the sst variable in sst_anom
- src.utils.calculate_sam(slp, clim_period, groupby_dim='time', slp_name='slp', lon_dim='lon', lat_dim='lat')
Calculate the Southern Annular Mode index from monthly data as defined by Gong, D. and Wang, S., 1999. The SAM index is defined as the difference between the normalized monthly zonal mean sea level pressure at 40°S and 65°S.
- Parameters
- slp: xarray Dataset
Array of sea level pressures
- clim_period: iterable
Size 2 iterable containing strings indicating the start and end dates of the climatological period used to normalise the SAM index
- groupby_dim: str
The dimension to compute the normalisation over
- slp_name: str, optional
The name of the slp variable in the input slp Dataset
- lon_dim: str, optional
The name of the longitude dimension
- lat_dim: str, optional
The name of the latitude dimension
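The "difference of normalized zonal means" in the SAM definition can be sketched as below. Several choices here are assumptions for illustration: the inputs are already zonal-mean pressure series at the two latitudes, the climatological period is given as a hypothetical index slice rather than date strings, and normalisation uses the population standard deviation over the whole period with no per-month grouping.

```python
from statistics import mean, pstdev


def normalise(series, clim_values):
    """Subtract the climatological mean and divide by its standard deviation."""
    mu = mean(clim_values)
    sigma = pstdev(clim_values)
    return [(x - mu) / sigma for x in series]


def sam_index_sketch(slp_40s, slp_65s, clim_slice):
    """Normalised zonal-mean SLP at 40°S minus that at 65°S."""
    norm_40s = normalise(slp_40s, slp_40s[clim_slice])
    norm_65s = normalise(slp_65s, slp_65s[clim_slice])
    return [a - b for a, b in zip(norm_40s, norm_65s)]


# Made-up zonal-mean pressures (hPa) for four months
slp_40s = [1014.0, 1016.0, 1018.0, 1016.0]
slp_65s = [990.0, 986.0, 982.0, 986.0]
index = sam_index_sketch(slp_40s, slp_65s, slice(0, 4))
```

The same pattern would apply to the NAO index, with the two latitudes replaced by 35°N and 65°N and the zonal mean restricted to the stated longitude band.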
- src.utils.calculate_nao(slp, clim_period, groupby_dim='time', slp_name='slp', lon_dim='lon', lat_dim='lat')
Calculate the North Atlantic Oscillation index from monthly data as defined by Jianping, L. & Wang, J. X. L. (2003). The NAO index is defined as the difference between the normalized monthly mean sea level pressure at 35°N and 65°N, averaged over the zonal band spanning 80°W–30°E.
- Parameters
- slp: xarray Dataset
Array of sea level pressures
- clim_period: iterable
Size 2 iterable containing strings indicating the start and end dates of the climatological period used to normalise the NAO index
- groupby_dim: str
The dimension to compute the normalisation over
- slp_name: str, optional
The name of the slp variable in the input slp Dataset
- lon_dim: str, optional
The name of the longitude dimension
- lat_dim: str, optional
The name of the latitude dimension
- src.utils.calculate_amv(sst_anom, sst_name='sst')
Calculate the Atlantic Multi-decadal Variability (AMV), also known as the Atlantic Multi-decadal Oscillation (AMO), according to Trenberth and Shea (2006). The AMV is calculated as the spatial average of SST anomalies over the North Atlantic (Equator–60°N and 80–0°W) minus the spatial average of SST anomalies averaged from 60°S to 60°N.
Note that the SST anomalies are typically smoothed in time using a 10-year moving average (Goldenberg et al., 2001; Enfield et al., 2001), a low-pass filter (Trenberth and Shea 2006) or a 4-year temporal average (Bilbao et al., 2021).
- Parameters
- sst_anom: xarray Dataset
Array of sst anomalies
- sst_name: str, optional
The name of the sst variable in sst_anom
- src.utils.calculate_ipo(sst_anom, sst_name='sst')
Calculate the tripolar Pacific index for the Interdecadal Pacific Oscillation (IPO) following Henley et al. (2015). The IPO is calculated as the average of SST anomalies over the central equatorial Pacific (region 2: 10°S–10°N, 170°E–90°W) minus the average of the SST anomalies in the northwestern (region 1: 25–45°N, 140°E–145°W) and southwestern Pacific (region 3: 50–15°S, 150°E–160°W).
Note that the IPO index is typically smoothed in time using a 13-year Chebyshev low-pass filter (Henley et al., 2015) or by first applying a 4-year temporal average to the sst anomalies (Bilbao et al., 2021).
- src.utils.calculate_ohc300(temp, depth_dim='depth', temp_name='temp')
Calculate the ocean heat content above 300m.
The input DataArray or Dataset is assumed to be in Kelvin.
- Parameters
- temp: xarray Dataset
Array of temperature values in Kelvin
- depth_dim: str, optional
The name of the depth dimension
- temp_name: str, optional
The name of the temperature variable in temp
- src.utils.calculate_wind_speed(u_v, u_name, v_name, lon_dim='lon', lat_dim='lat')
Calculate the wind speed
- Parameters
- u_v: xarray Dataset
Dataset containing the longitudinal and latitudinal components of the wind
- u_name: str
The name of the u-velocity variable in u_v
- v_name: str
The name of the v-velocity variable in u_v
- lon_dim: str, optional
The name of the longitude dimension for u and v
- lat_dim: str, optional
The name of the latitude dimension for u and v
- src.utils.calculate_tmean_from_tmin_tmax(ds, tmin_name='tmin', tmax_name='tmax', tmean_name='tmean')
Estimate tmean as the average of tmin and tmax
- Parameters
- ds: xarray Dataset
Dataset containing tmin and tmax variables
- tmin_name: str
The name of the tmin variable
- tmax_name: str
The name of the tmax variable
- tmean_name: str
The name of the output tmean variable
- src.utils.calculate_ffdi(ds, clim_period, wind_from_components, precip_name='precip', rh_name='rh', tmax_name='t_ref_max', wmax_name='V_ref_max', u_name='u_ref', v_name='v_ref')
Returns the McArthur Forest Fire Danger Index following the formula provided in Dowdy (2018):
FFDI = D ** 0.987 * exp(0.0338 * T - 0.0345 * H + 0.0234 * W + 0.243147)
- Parameters
- ds: xarray Dataset
Dataset containing the following variables:
  - precip: Daily total precipitation [mm]. This is used to estimate the drought factor, D, as the 20-day accumulated rainfall scaled to lie between 0 and 10, with larger values indicating less precipitation (see Richardson et al. (2021) and Squire et al. (2021)). The drought factor is used as D in the above equation.
  - tmax: Daily max 2 m temperature [deg C]. This is used as T in the above equation.
  - rh: Daily max relative humidity at 2 m [%] (or similar, depending on data availability). Richardson et al. (2021) uses mid-afternoon relative humidity at 2 m, Squire et al. (2021) uses daily mean relative humidity at 1000 hPa. This is used as H in the above equation.
  - wmax: Daily max 10 m wind speed [km/h] (or similar, depending on data availability). Squire et al. (2021) uses daily mean wind speed. This is used as W in the above equation.
- clim_period: iterable
Size 2 iterable containing strings indicating the start and end dates of the climatological period used to calculate the drought factor
- wind_from_components: boolean
Whether to calculate the wmax estimate from provided individual components of wind or whether to use a provided max estimate. If True, variables with names matching those provided as parameters ‘u_name’ and ‘v_name’ must exist in ds. If False, the variable named by the wmax_name parameter is used for wmax.
- precip_name: str, optional
The name of the precip variable
- rh_name: str, optional
The name of the rh variable
- tmax_name: str, optional
The name of the tmax variable
- wmax_name: str, optional
The name of the wmax variable. This is only used if wind_from_components=False. Otherwise an estimate of wmax is calculated from the variables u_name and v_name
- u_name: str, optional
The name of the u-component of wind variable to use to estimate wmax when wind_from_components=True. Not used if wind_from_components=False.
- v_name: str, optional
The name of the v-component of wind variable to use to estimate wmax when wind_from_components=True. Not used if wind_from_components=False.
References
Dowdy, A. J. (2018). “Climatological Variability of Fire Weather in Australia”. Journal of Applied Meteorology and Climatology 57.2, pp. 221–234. issn: 1558-8424. doi: 10.1175/JAMC-D-17-0167.1.
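The FFDI formula quoted above can be evaluated directly for scalar inputs. The sketch below only implements that one expression; the function name `ffdi_from_inputs` is hypothetical, and the estimation of the drought factor D from 20-day rainfall is not reproduced here.

```python
import math


def ffdi_from_inputs(drought_factor, tmax_degc, rh_percent, wind_kmh):
    """Evaluate FFDI = D ** 0.987 * exp(0.0338 T - 0.0345 H + 0.0234 W + 0.243147)."""
    exponent = (
        0.0338 * tmax_degc
        - 0.0345 * rh_percent
        + 0.0234 * wind_kmh
        + 0.243147
    )
    return drought_factor ** 0.987 * math.exp(exponent)


# Illustrative values only: D = 10, T = 30 deg C, H = 20 %, W = 30 km/h
hot_dry_windy = ffdi_from_inputs(10.0, 30.0, 20.0, 30.0)
# Same day with higher humidity gives a lower index
more_humid = ffdi_from_inputs(10.0, 30.0, 60.0, 30.0)
```

The index rises with temperature, wind speed and drought factor and falls with relative humidity, which follows from the signs of the coefficients.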
- src.utils.calculate_EHF(T, T_p95_file=None, T_p95_period=None, T_p95_dim=None, rolling_dim='time', T_name='t_ref')
Calculate the Excess Heat Factor (EHF) index, defined as:
EHF = max(0, EHI_sig) * max(1, EHI_accl)
with
EHI_sig = (T_i + T_{i+1} + T_{i+2}) / 3 - T_p95
EHI_accl = (T_i + T_{i+1} + T_{i+2}) / 3 - (T_{i-1} + … + T_{i-30}) / 30
T is the daily mean temperature (commonly calculated as the mean of the daily minimum and maximum temperatures, with the daily maximum typically preceding the daily minimum and both observations relating to the same 9am-to-9am 24-h period) and T_p95 is the 95th percentile of T using all days in the year.
- Parameters
- T: xarray DataArray
Array of daily mean temperature
- T_p95_file: str, optional
Path to a file with the 95th percentiles of T using all days in the year. This should be relative to the project directory. If not provided, T_p95_period and T_p95_dim must be provided
- T_p95_period: list of str, optional
Size 2 iterable containing strings indicating the start and end dates of the period over which to calculate T_p95. Only used if T_p95_file is None
- T_p95_dim: str or list of str, optional
The dimension(s) over which to calculate T_p95. Only used if T_p95_file is None
- rolling_dim: str, optional
The dimension over which to compute the rolling averages in the definition of EHF
- T_name: str, optional
The name of the temperature variable in T
References
Nairn et al. 2015: https://doi.org/10.3390/ijerph120100227
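The EHF definition can be traced on a short daily series in plain Python. This is a sketch of the arithmetic only: `excess_heat_factor_sketch` is a hypothetical helper, T_p95 is supplied as a single number, and days without 30 preceding and 2 following values are simply skipped (an assumption about edge handling).

```python
def excess_heat_factor_sketch(T, T_p95):
    """Return {day index: EHF} for every day with enough surrounding data."""
    ehf = {}
    for i in range(30, len(T) - 2):
        three_day_mean = (T[i] + T[i + 1] + T[i + 2]) / 3
        # Significance: three-day mean relative to the 95th percentile
        ehi_sig = three_day_mean - T_p95
        # Acclimatisation: three-day mean relative to the previous 30 days
        ehi_accl = three_day_mean - sum(T[i - 30:i]) / 30
        ehf[i] = max(0.0, ehi_sig) * max(1.0, ehi_accl)
    return ehf


# 30 mild days followed by a 3-day hot spell (deg C, made-up values)
temperatures = [20.0] * 30 + [35.0] * 3
ehf = excess_heat_factor_sketch(temperatures, T_p95=30.0)
```

For the hot spell, EHI_sig is 5 and EHI_accl is 15, so the single computable day has an EHF of 75. The severity index described for `calculate_EHF_severity` would then divide this by the 85th percentile of positive EHF values.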
- src.utils.calculate_EHF_severity(T, T_p95_file=None, EHF_p85_file=None, T_p95_period=None, T_p95_dim=None, EHF_p85_period=None, EHF_p85_dim=None, rolling_dim='time', T_name='t_ref')
Calculate the severity of the Excess Heat Factor index, defined as:
EHF_severity = EHF / EHF_p85
where “_p85” denotes the 85th percentile of all positive values using all days in the year and the Excess Heat Factor (EHF) is defined as:
EHF = max(0, EHI_sig) * max(1, EHI_accl)
with
EHI_sig = (T_i + T_{i+1} + T_{i+2}) / 3 - T_p95
EHI_accl = (T_i + T_{i+1} + T_{i+2}) / 3 - (T_{i-1} + … + T_{i-30}) / 30
T is the daily mean temperature (commonly calculated as the mean of the daily minimum and maximum temperatures, with the daily maximum typically preceding the daily minimum and both observations relating to the same 9am-to-9am 24-h period) and T_p95 is the 95th percentile of T using all days in the year.
- Parameters
- T: xarray DataArray
Array of daily mean temperature
- T_p95_file: str, optional
Path to a file with the 95th percentiles of T using all days in the year. This should be relative to the project directory. If not provided, T_p95_period and T_p95_dim must be provided
- EHF_p85_file: str, optional
Path to a file with the 85th percentiles of positive EHF using all days in the year. This should be relative to the project directory. If not provided, EHF_p85_period and EHF_p85_dim must be provided
- T_p95_period: list of str, optional
Size 2 iterable containing strings indicating the start and end dates of the period over which to calculate T_p95. Only used if T_p95_file is None
- T_p95_dim: str or list of str, optional
The dimension(s) over which to calculate T_p95. Only used if T_p95_file is None
- EHF_p85_period: list of str, optional
Size 2 iterable containing strings indicating the start and end dates of the period over which to calculate EHF_p85. Only used if EHF_p85_file is None
- EHF_p85_dim: str or list of str, optional
The dimension(s) over which to calculate EHF_p85. Only used if EHF_p85_file is None
- rolling_dim: str, optional
The dimension over which to compute the rolling averages in the definition of EHF
- T_name: str, optional
The name of the temperature variable in T
References
Nairn et al. 2015: https://doi.org/10.3390/ijerph120100227
- src.utils.ensemble_mean(ds, ensemble_dim='member')
Return the ensemble mean of the input array
- Parameters
- ds: xarray Dataset
Array to take the ensemble mean of
- ensemble_dim: str, optional
The name of the ensemble dimension
- src.utils.greater_than(ds, value)
Return a boolean array with True where elements > value
- Parameters
- ds: xarray Dataset
The array to mask
- value: float, xarray Dataset
The value(s) to use to mask ds
- src.utils.where_greater_than(ds, value)
Return array with elements <= value masked to nan
- Parameters
- ds: xarray Dataset
The array to mask
- value: float, xarray Dataset
The value(s) to use to mask ds
- src.utils.add_CAFE_grid_info(ds)
Add CAFE grid info to a CAFE dataset that doesn’t already have it
- Parameters
- ds: xarray Dataset
The dataset to add grid info to
- src.utils.normalise_by_days_in_month(ds)
Normalise input array by the number of days in each month
- Parameters
- ds: xarray Dataset
The array to normalise
- src.utils.convert_time_to_lead(ds, time_dim='time', time_freq=None, init_dim='init', lead_dim='lead')
Return provided array with time dimension converted to lead time dimension and time added as additional coordinate
- Parameters
- ds: xarray Dataset
A dataset with a time dimension
- time_dim: str, optional
The name of the time dimension
- time_freq: str, optional
The frequency of the time dimension. If not provided, will try to use xr.infer_freq to determine the frequency. This is only used to add a freq attr to the lead time coordinate
- init_dim: str, optional
The name of the initial date dimension in the output
- lead_dim: str, optional
The name of the lead time dimension in the output
- src.utils.truncate_latitudes(ds, dp=10, lat_dim='lat')
Return provided array with latitudes truncated to specified dp.
This is necessary due to precision differences from running forecasts on different systems
- Parameters
- ds: xarray Dataset
A dataset with a latitude dimension
- dp: int, optional
The number of decimal places to truncate at
- lat_dim: str, optional
The name of the latitude dimension
- src.utils.convert_calendar(ds, calendar, time_dim='time')
Convert calendar, dropping invalid/surplus dates or inserting missing dates
- Parameters
- ds: xarray Dataset
A dataset with a time dimension
- time_dim: str, optional
The name of the time dimension
- src.utils.rechunk(ds, **chunks)
Rechunk a dataset
- Parameters
- ds: xarray Dataset
A dataset to be rechunked
- chunks: dict
Dictionary of {dim: chunksize}
- src.utils.select(ds, **selection)
Returns a new dataset with each array indexed by tick labels along the specified dimension(s)
- Parameters
- ds: xarray Dataset
A dataset to select from
- selection: dict
A dict with keys matching dimensions and values given by scalars, slices or arrays of tick labels
- src.utils.add_attrs(ds, attrs, variable=None)
Add attributes to a dataset
- Parameters
- ds: xarray Dataset
The data to add attributes to
- attrs: dict
The attributes to add
- variable: str, optional
The name of the variable or coordinate to add the attributes to. If None, the attributes will be added as global attributes
- src.utils.rename(ds, **names)
Rename all variables etc that have an entry in names
- Parameters
- ds: xarray Dataset
A dataset to be renamed
- names: dict
Dictionary of {old_name: new_name}
- src.utils.convert(ds, **conversion)
Convert variables in a dataset according to provided dictionary
- Parameters
- ds: xarray Dataset
A dataset to be converted
- conversion: dict
Dictionary of {variable: oper} where oper is a dictionary specifying the operation and the value. Current possible operations are ‘multiply_by’ and ‘add’.
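The shape of the `conversion` dictionary can be shown with a small stand-in that operates on a plain dict of lists instead of an xarray Dataset. `convert_sketch` is a hypothetical helper, and applying multiple operations in dictionary order is an assumption made for this example.

```python
def convert_sketch(data, **conversion):
    """Apply {variable: {operation: value}} conversions to a dict of lists."""
    converted = dict(data)
    for variable, oper in conversion.items():
        values = converted[variable]
        for operation, value in oper.items():
            if operation == "multiply_by":
                values = [v * value for v in values]
            elif operation == "add":
                values = [v + value for v in values]
            else:
                raise ValueError(f"Unsupported operation: {operation}")
        converted[variable] = values
    return converted


raw = {"precip": [0.001, 0.002], "t_ref": [273.15, 300.15]}
converted = convert_sketch(
    raw,
    precip={"multiply_by": 1000},  # e.g. m to mm
    t_ref={"add": -273.15},        # e.g. Kelvin to deg C
)
```

Variables that are not named in `conversion` pass through unchanged, and an unknown operation name raises an error rather than being silently ignored.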
- src.utils.keep_period(ds, period)
Keep only times within a specified period
- Parameters
- ds: xarray Dataset
The data to mask
- period: iterable
Size 2 iterable containing strings indicating the start and end dates of the period to retain
- src.utils._get_groupby_and_reduce_dims(ds, frequency)
Get the groupby and reduction dimensions for performing operations like calculating anomalies and percentile thresholds
- src.utils.anomalise(ds, clim_period, frequency=None)
Returns the anomalies of ds relative to its climatology over clim_period.
Uses a shortcut for calculating hindcast climatologies that will not work for hindcasts with initialisation frequencies more frequent than monthly.
- Parameters
- ds: xarray Dataset
The data to anomalise
- clim_period: iterable
Size 2 iterable containing strings indicating the start and end dates of the climatological period
- frequency: str, optional
The frequency at which to bin the climatology, e.g. per month. Must be an available attribute of the datetime accessor. Specify “None” to indicate no frequency (climatology calculated by averaging all times). Note, setting to “None” for hindcast data can be dangerous, since only certain times may be available at each lead.
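For a simple time series, binning the climatology per month and subtracting it can be sketched as follows. This covers only the time-series case with a monthly frequency; `anomalise_monthly` is a hypothetical helper, the period bounds are treated as inclusive ISO date strings, and the hindcast shortcut mentioned above is not reproduced.

```python
from collections import defaultdict
from datetime import date


def anomalise_monthly(times, values, clim_period):
    """Subtract the per-month mean over clim_period from every value."""
    start, end = (date.fromisoformat(s) for s in clim_period)

    # Collect the values inside the climatological period by calendar month
    by_month = defaultdict(list)
    for t, v in zip(times, values):
        if start <= t <= end:
            by_month[t.month].append(v)
    climatology = {month: sum(vs) / len(vs) for month, vs in by_month.items()}

    # Anomalies are computed for all times, not only those in the period
    return [v - climatology[t.month] for t, v in zip(times, values)]


times = [date(2000, 1, 1), date(2000, 7, 1), date(2001, 1, 1),
         date(2001, 7, 1), date(2002, 1, 1)]
values = [10.0, 20.0, 12.0, 22.0, 15.0]
anomalies = anomalise_monthly(times, values, ("2000-01-01", "2001-12-31"))
```

The January 2002 value lies outside the climatological period but is still anomalised against the January mean from 2000 to 2001.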
- src.utils.calculate_percentile_thresholds(ds, percentile, percentile_period, percentile_dim=None, frequency=None)
Returns the percentile values of ds over a provided period.
- Parameters
- ds: xarray Dataset
The data to calculate the percentiles of
- percentile: float
The percentile to calculate
- percentile_period: iterable
Size 2 iterable containing strings indicating the start and end dates of the period over which to calculate the percentile thresholds
- percentile_dim: str or list of str, optional
The dimension(s) over which to compute the percentile thresholds. If None, these will be determined automatically based on the type of input data:
  - timeseries: percentile_dim = “time”
  - forecasts: percentile_dim = “init” [, “member”]
- frequency: str, optional
The frequency at which to bin the percentiles, e.g. per month. Must be an available attribute of the datetime accessor. Specify “None” to indicate no frequency (percentiles calculated over all times). Note, setting to “None” for hindcast data can be dangerous, since only certain times may be available at each lead.
- src.utils.over_percentile_threshold(ds, percentile, percentile_period, percentile_dim=None, frequency=None)
Find which values in the input array are over a specified percentile calculated over a specified period. Returns a boolean array with True where values are over the specified percentile and False elsewhere.
- Parameters
- ds: xarray Dataset
The data to threshold based on its percentiles
- percentile: float
The percentile used to threshold the data
- percentile_period: iterable
Size 2 iterable containing strings indicating the start and end dates of the period over which to calculate the percentile thresholds
- frequency: str, optional
The frequency at which to bin the percentiles, e.g. per month. Must be an available attribute of the datetime accessor. Specify “None” to indicate no frequency (percentiles calculated over all times). Note, setting to “None” for hindcast data can be dangerous, since only certain times may be available at each lead.
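Thresholding against a percentile computed over a reference period can be sketched like this. The helper names are hypothetical, the percentile uses linear interpolation between sorted values (an assumption about the method), there is no frequency binning, and "over" is taken to mean strictly greater than the threshold.

```python
def percentile_sketch(values, q):
    """Linearly interpolated percentile of values, with q between 0 and 100."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def over_percentile_sketch(values, q, period_values):
    """True where a value exceeds the q-th percentile of period_values."""
    threshold = percentile_sketch(period_values, q)
    return [v > threshold for v in values]


reference = [float(v) for v in range(11)]  # 0.0, 1.0, ..., 10.0
flags = over_percentile_sketch([8.0, 9.0, 9.5, 12.0], 90, reference)
```

With this reference period the 90th percentile is 9.0, so a value equal to the threshold is not flagged. The "under" variant would flip the comparison to `v < threshold`.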
- src.utils.under_percentile_threshold(ds, percentile, percentile_period, percentile_dim=None, frequency=None)
Find which values in the input array are under a specified percentile calculated over a specified period. Returns a boolean array with True where values are under the specified percentile and False elsewhere.
- Parameters
- ds: xarray Dataset
The data to threshold based on its percentiles
- percentile: float
The percentile used to threshold the data
- percentile_period: iterable
Size 2 iterable containing strings indicating the start and end dates of the period over which to calculate the percentile thresholds
- frequency: str, optional
The frequency at which to bin the percentiles, e.g. per month. Must be an available attribute of the datetime accessor. Specify “None” to indicate no frequency (percentiles calculated over all times). Note, setting to “None” for hindcast data can be dangerous, since only certain times may be available at each lead.
- src.utils.correct_bias(ds, obsv_file, period, frequency, method)
Correct the mean bias of ds relative to observations over a provided period.
Will not work for hindcasts with initialisation frequencies more frequent than monthly.
- Parameters
- ds: xarray Dataset
The hindcast data to correct
- obsv_file: str
Path to a file with the appropriate observation data to correct to. This should be relative to the project directory
- period: iterable
Size 2 iterable containing strings indicating the period over which to calculate the biases
- frequency: str
The frequency at which to bin the biases, e.g. per month. Must be an available attribute of the datetime accessor. Specify “None” to indicate no frequency (climatology calculated by averaging all times). Note, setting to “None” can be dangerous, since only certain times may be available at each lead and there is no check that the same times are available between the observations and forecasts.
- method: str
The method to use to correct the mean bias. Options are:
  - “additive”: the difference between the ds and obsv climatology is subtracted from ds
  - “multiplicative”: ds is divided by the ratio of the ds and obsv climatologies
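The two correction methods differ only in how the climatological bias is removed. A minimal sketch with scalar climatologies, assuming they have already been computed over the chosen period (`correct_mean_bias_sketch` is a hypothetical helper, not the module's function):

```python
def correct_mean_bias_sketch(forecast, forecast_clim, obsv_clim, method):
    """Remove the mean bias of forecast relative to observations."""
    if method == "additive":
        bias = forecast_clim - obsv_clim
        return [x - bias for x in forecast]
    if method == "multiplicative":
        ratio = forecast_clim / obsv_clim
        return [x / ratio for x in forecast]
    raise ValueError(f"Unknown method: {method}")


forecast = [12.0, 15.0]
additive = correct_mean_bias_sketch(forecast, 12.0, 10.0, "additive")
multiplicative = correct_mean_bias_sketch(forecast, 12.0, 10.0, "multiplicative")
```

Both methods map a forecast equal to its own climatology onto the observed climatology; they diverge for departures from it, with the multiplicative form scaling anomalies as well as the mean.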
- src.utils.interpolate_to_grid_from_file(ds, file, add_area=True, ignore_degenerate=True)
- src.utils.round_to_start_of_day(ds, dim)
Return provided array with specified time dimension rounded to the start of the day
- Parameters
- ds: xarray Dataset
The dataset with the dimension(s) to round
- dim: str
The name of the dimension to round
- src.utils.round_to_start_of_month(ds, dim)
Return provided array with specified time dimension rounded to the start of the month
- Parameters
- ds: xarray Dataset
The dataset with the dimension(s) to round
- dim: str
The name of the dimension to round
- src.utils.coarsen(ds, window_size, start_points=None, dim='time')
Coarsen data, applying ‘max’ to all relevant coords and optionally starting at a particular time point in the array
- Parameters
- ds: xarray Dataset
The dataset to coarsen
- start_points: list
Value(s) of coordinate dim to start the coarsening from. If these fall outside the range of the coordinate, coarsening starts at the beginning of the array
- dim: str, optional
The name of the dimension to coarsen along
- src.utils.rolling_mean(ds, window_size, start_points=None, dim='time')
Apply a rolling mean to the data, applying ‘max’ to all relevant coords and optionally starting at a particular time point in the array
- Parameters
- ds: xarray Dataset
The dataset to apply the rolling mean to
- start_points: str or list of str
Value(s) of coordinate dim to start the rolling mean from. If these fall outside the range of the coordinate, the rolling mean starts at the beginning of the array
- dim: str, optional
The name of the dimension to apply the rolling mean along
- src.utils.resample(ds, freq, start_points=None, min_samples=None, dim='time')
Resample data to a different temporal frequency by taking the mean over all values at the downsampled frequency and optionally starting at a particular time point in the array
- Parameters
- ds: xarray Dataset
The dataset to resample
- freq: str
Resample frequency expressed using pandas offset alias
- start_points: str or list of str
Value(s) of coordinate dim to start the resampling from. If these fall outside the range of the coordinate, resampling starts at the beginning of the array
- min_samples: int, optional
The minimum number of samples that must occur within a resampled group. If there are fewer samples, a nan will be assigned.
- dim: str, optional
The name of the time dimension to resample along
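The effect of `min_samples` on a resampled mean can be illustrated with a monthly grouping in plain Python. This sketch fixes the target frequency to calendar months and ignores `start_points`; `monthly_mean_sketch` is a hypothetical helper rather than the module's function.

```python
import math
from collections import defaultdict
from datetime import date


def monthly_mean_sketch(times, values, min_samples=None):
    """Mean per (year, month); groups with too few samples become nan."""
    groups = defaultdict(list)
    for t, v in zip(times, values):
        groups[(t.year, t.month)].append(v)

    resampled = {}
    for key, group in sorted(groups.items()):
        if min_samples is not None and len(group) < min_samples:
            resampled[key] = math.nan
        else:
            resampled[key] = sum(group) / len(group)
    return resampled


times = [date(2000, 1, 1), date(2000, 1, 2), date(2000, 1, 3), date(2000, 2, 1)]
values = [1.0, 2.0, 3.0, 10.0]
monthly = monthly_mean_sketch(times, values, min_samples=2)
```

January has three samples and keeps its mean, while February has only one and is set to nan, which avoids reporting a monthly mean built from a partial month.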
- src.utils.get_region_masks_from_shp(ds, shapefile, header)
Extract region masks according to a shapefile
- Parameters
- ds: xarray Dataset
The array with the grid to build the masks for
- shapefile: str
The path to the shapefile to use
- header: str
Name of the shapefile column to use to name the regions
- src.utils.average_over_NRM_super_clusters(ds)
Average the provided array over the NRM super cluster regions
- Parameters
- ds: xarray Dataset
The array to average over the NRM super cluster regions
- src.utils.mask_CAFEf6_reduced_dt(ds)
Mask out the ensemble members of CAFE-f6 that were run with a reduced timestep, since reducing the timestep was found to produce a different model equilibrium
- Parameters
- ds: xarray Dataset
The CAFE-f6 data to mask
- src.utils.gridarea_cdo(ds)
Returns the area weights computed using cdo’s gridarea function. Note, this function writes ds to disk, so strip back ds to only what is needed
- Parameters
- ds: xarray Dataset
The dataset to pass to cdo gridarea
- src.utils.add_area_using_cdo_gridarea(ds, lon_dim='lon', lat_dim='lat')
Add an area coordinate to the provided dataset containing the cell areas estimated by cdo’s gridarea function
- Parameters
- ds: xarray Dataset
The data to use to estimate the cell areas
- lon_dim: str, optional
The name of the longitude dimension on ds
- lat_dim: str, optional
The name of the latitude dimension on ds
- src.utils.max_chunk_size_MB(ds)
Get the max chunk size in a dataset