
dodola.core

Core logic for bias adjustment and downscaling

Mathematical routines and core business logic live here.

Functions:

Name Description
adjust_analogdownscaling

Apply QPLAD to downscale bias corrected output.

adjust_quantiledeltamapping

Apply QDM to adjust a range of years within a simulation.

adjust_quantiledeltamapping_year

Apply QDM to adjust a year within a simulation.

apply_precip_ceiling

Converts all precip values above a threshold to the threshold value, uniformly across space and time.

apply_wet_day_frequency_correction

Apply wet day frequency correction to precipitation before ("pre") or after ("post") bias adjustment.

dtr_floor

Converts all diurnal temperature range (DTR) values strictly below a floor to that floor.

non_polar_dtr_ceiling

Converts all non-polar (regions between the 60th south and north parallel) diurnal temperature range (DTR) values strictly above a ceiling to that ceiling.

standardize_gcm

Standardize a GCM Dataset; 360-day calendar conversion requires that there are no chunks in the 'time' dimension of ds.

test_dtr_range

Ensure DTR values are in a valid range

test_for_nans

Tests for presence of NaNs

test_maximum_precip

Tests that max precip is reasonable

test_negative_values

Tests for presence of negative values

test_temp_range

Ensure temperature values are in a valid range

test_timesteps

Tests that Dataset contains the correct number of timesteps (number of days on a noleap calendar)

test_variable_names

Test that the correct variable name exists in the file

train_analogdownscaling

Train Quantile-Preserving, Localized Analogs Downscaling (QPLAD)

train_quantiledeltamapping

Train quantile delta mapping

xclim_convert_360day_calendar_interpolate

Convert a Dataset on a 360-day calendar to a target calendar, interpolating the inserted days.

xclim_remove_leapdays

Remove leap days from a Dataset and convert its calendar to noleap.

xclim_units_any2pint

Convert a Dataset's units attribute to pint-compatible notation.

xclim_units_pint2cf

Convert a Dataset's pint-style units attribute back to CF notation.

xesmf_regrid

Regrid a Dataset.

dodola.core.adjust_analogdownscaling

adjust_analogdownscaling(simulation, qplad, variable)

Apply QPLAD to downscale bias corrected output.

Parameters:

Name Type Description Default
simulation Dataset

Daily bias corrected data to be downscaled. Target variable must have a units attribute.

required
qplad Dataset or QuantilePreservingAnalogDownscaling

Trained xclim.sdba.adjustment.QuantilePreservingAnalogDownscaling, or Dataset representation that will instantiate xclim.sdba.adjustment.QuantilePreservingAnalogDownscaling.

required
variable str

Target variable in simulation to downscale. Downscaled output will share the same name.

required

Returns:

Name Type Description
out Dataset

QPLAD-downscaled values from simulation. May be a lazy-evaluated future, not yet computed.

Source code in dodola/core.py
def adjust_analogdownscaling(simulation, qplad, variable):
    """Apply QPLAD to downscale bias corrected output.

    Parameters
    ----------
    simulation : xr.Dataset
        Daily bias corrected data to be downscaled. Target variable must have a units attribute.
    qplad : xr.Dataset or sdba.adjustment.QuantilePreservingAnalogDownscaling
        Trained ``xclim.sdba.adjustment.QuantilePreservingAnalogDownscaling``, or
        Dataset representation that will instantiate
        ``xclim.sdba.adjustment.QuantilePreservingAnalogDownscaling``.
    variable : str
        Target variable in `simulation` to downscale. Downscaled output will share the
        same name.

    Returns
    -------
    out : xr.Dataset
        QPLAD-downscaled values from `simulation`. May be a lazy-evaluated future, not
        yet computed.
    """
    variable = str(variable)

    if isinstance(qplad, xr.Dataset):
        qplad = sdba.adjustment.QuantilePreservingAnalogDownscaling.from_dataset(qplad)

    out = qplad.adjust(simulation[variable]).to_dataset(name=variable)

    out = out.transpose(*simulation[variable].dims)
    # Overwrite QPLAD output attrs with input simulation attrs.
    out.attrs = simulation.attrs
    for k, v in simulation.variables.items():
        if k in out:
            out[k].attrs = v.attrs

    return out

dodola.core.adjust_quantiledeltamapping

adjust_quantiledeltamapping(simulation, variable, qdm, years, astype=None, quantile_variable='sim_q', **kwargs)

Apply QDM to adjust a range of years within a simulation.

Parameters:

Name Type Description Default
simulation Dataset

Daily simulation data to be adjusted. Must have sufficient observations around year to adjust. Target variable must have a units attribute.

required
variable str

Target variable in simulation to adjust. Adjusted output will share the same name.

required
qdm Dataset or QuantileDeltaMapping

Trained xclim.sdba.adjustment.QuantileDeltaMapping, or Dataset representation that will instantiate xclim.sdba.adjustment.QuantileDeltaMapping.

required
years sequence of ints

Years of simulation to adjust, with rolling years and day grouping.

required
astype str, numpy.dtype, or None

Typecode or data-type to which the regridded output is cast.

None
quantile_variable str or None

Name of quantile coordinate to reset to data variable. Not reset if None.

'sim_q'
kwargs

Keyword arguments passed to dodola.core.adjust_quantiledeltamapping_year.

{}

Returns:

Name Type Description
out Dataset

QDM-adjusted values from simulation. May be a lazy-evaluated future, not yet computed. In addition to adjusted original variables, this includes "sim_q" variable giving quantiles from QDM bias adjustment.

Source code in dodola/core.py
def adjust_quantiledeltamapping(
    simulation,
    variable,
    qdm,
    years,
    astype=None,
    quantile_variable="sim_q",
    **kwargs,
):
    """Apply QDM to adjust a range of years within a simulation.

    Parameters
    ----------
    simulation : xr.Dataset
        Daily simulation data to be adjusted. Must have sufficient observations
        around `year` to adjust. Target variable must have a units attribute.
    variable : str
        Target variable in `simulation` to adjust. Adjusted output will
        share the same name.
    qdm : xr.Dataset or sdba.adjustment.QuantileDeltaMapping
        Trained ``xclim.sdba.adjustment.QuantileDeltaMapping``, or Dataset
        representation that will instantiate
        ``xclim.sdba.adjustment.QuantileDeltaMapping``.
    years : sequence of ints
        Years of simulation to adjust, with rolling years and day grouping.
    astype : str, numpy.dtype, or None, optional
        Typecode or data-type to which the regridded output is cast.
    quantile_variable : str or None, optional
        Name of quantile coordinate to reset to data variable. Not reset
        if ``None``.
    kwargs :
        Keyword arguments passed to
        ``dodola.core.adjust_quantiledeltamapping_year``.

    Returns
    -------
    out : xr.Dataset
        QDM-adjusted values from `simulation`. May be a lazy-evaluated future, not
        yet computed. In addition to adjusted original variables, this includes
        "sim_q" variable giving quantiles from QDM biasadjustment.
    """
    # This loop is a candidate for dask.delayed. Beware, xclim had issues with saturated scheduler.
    qdm_list = []
    for yr in years:
        adj = adjust_quantiledeltamapping_year(
            simulation=simulation, qdm=qdm, year=yr, variable=variable, **kwargs
        )
        if astype:
            adj = adj.astype(astype)
        qdm_list.append(adj)

    # Combine years and ensure output matches input data dimension order.
    adjusted_ds = xr.concat(qdm_list, dim="time").transpose(*simulation[variable].dims)

    if quantile_variable:
        adjusted_ds = adjusted_ds.reset_coords(quantile_variable)
        # Analysts said sim_q needed no attrs.
        adjusted_ds[quantile_variable].attrs = {}

    # Overwrite QDM output attrs with input simulation attrs.
    adjusted_ds.attrs = simulation.attrs
    for k, v in simulation.variables.items():
        if k in adjusted_ds:
            adjusted_ds[k].attrs = v.attrs

    return adjusted_ds

dodola.core.adjust_quantiledeltamapping_year

adjust_quantiledeltamapping_year(simulation, qdm, year, variable, halfyearwindow_n=10, include_quantiles=False)

Apply QDM to adjust a year within a simulation.

Parameters:

Name Type Description Default
simulation Dataset

Daily simulation data to be adjusted. Must have sufficient observations around year to adjust. Target variable must have a units attribute.

required
qdm Dataset or QuantileDeltaMapping

Trained xclim.sdba.adjustment.QuantileDeltaMapping, or Dataset representation that will instantiate xclim.sdba.adjustment.QuantileDeltaMapping.

required
year int

Target year to adjust, with rolling years and day grouping.

required
variable str

Target variable in simulation to adjust. Adjusted output will share the same name.

required
halfyearwindow_n int

Half-length of the annual rolling window to extract along either side of year.

10
include_quantiles bool

Whether or not to output quantiles (sim_q) as a coordinate on the bias corrected data variable in output.

False

Returns:

Name Type Description
out Dataset

QDM-adjusted values from simulation. May be a lazy-evaluated future, not yet computed.

Source code in dodola/core.py
def adjust_quantiledeltamapping_year(
    simulation, qdm, year, variable, halfyearwindow_n=10, include_quantiles=False
):
    """Apply QDM to adjust a year within a simulation.

    Parameters
    ----------
    simulation : xr.Dataset
        Daily simulation data to be adjusted. Must have sufficient observations
        around `year` to adjust. Target variable must have a units attribute.
    qdm : xr.Dataset or sdba.adjustment.QuantileDeltaMapping
        Trained ``xclim.sdba.adjustment.QuantileDeltaMapping``, or
        Dataset representation that will instantiate
        ``xclim.sdba.adjustment.QuantileDeltaMapping``.
    year : int
        Target year to adjust, with rolling years and day grouping.
    variable : str
        Target variable in `simulation` to adjust. Adjusted output will share the
        same name.
    halfyearwindow_n : int, optional
        Half-length of the annual rolling window to extract along either
        side of `year`.
    include_quantiles : bool, optional
        Whether or not to output quantiles (sim_q) as a coordinate on
        the bias corrected data variable in output.

    Returns
    -------
    out : xr.Dataset
        QDM-adjusted values from `simulation`. May be a lazy-evaluated future, not
        yet computed.
    """
    year = int(year)
    variable = str(variable)
    halfyearwindow_n = int(halfyearwindow_n)

    if isinstance(qdm, xr.Dataset):
        qdm = sdba.adjustment.QuantileDeltaMapping.from_dataset(qdm)

    # Slice `halfyearwindow_n` years on each side of the target year,
    # padded by 15 days to cover the 31-day rolling window.
    timeslice = slice(
        f"{year - halfyearwindow_n - 1}-12-17", f"{year + halfyearwindow_n + 1}-01-15"
    )
    simulation = simulation[variable].sel(
        time=timeslice
    )  # TODO: Need a check to ensure we have all the data in this slice!
    if include_quantiles:
        # include quantile information in output
        with set_options(sdba_extra_output=True):
            out = qdm.adjust(simulation, interp="nearest").sel(time=str(year))
            # make quantiles a coordinate of bias corrected output variable
            out = out["scen"].assign_coords(sim_q=out.sim_q)
    else:
        out = qdm.adjust(simulation, interp="nearest").sel(time=str(year))

    return out.to_dataset(name=variable)
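The window arithmetic above can be read in isolation. This is a minimal sketch; `qdm_window_bounds` is a hypothetical helper, not part of dodola:

```python
def qdm_window_bounds(year, halfyearwindow_n=10):
    """Reproduce the time-slice bounds built inside
    adjust_quantiledeltamapping_year: halfyearwindow_n years on each
    side of `year`, padded by 15 days so the 31-day day-of-year
    rolling window is fully covered at the edges."""
    start = f"{year - halfyearwindow_n - 1}-12-17"
    stop = f"{year + halfyearwindow_n + 1}-01-15"
    return start, stop

print(qdm_window_bounds(2050))  # ('2039-12-17', '2061-01-15')
```

With the default `halfyearwindow_n=10`, adjusting 2050 therefore reads 21 years of simulation plus the two 15-day pads.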

dodola.core.apply_precip_ceiling

apply_precip_ceiling(ds, ceiling)

Converts all precip values above a threshold to the threshold value, uniformly across space and time.

Parameters:

Name Type Description Default
ds Dataset
required
ceiling int or float
required

Returns:

Type Description
Dataset
Source code in dodola/core.py
def apply_precip_ceiling(ds, ceiling):
    """
    Converts all precip values above a threshold to the threshold value, uniformly across space and time.

    Parameters
    ----------
    ds : xr.Dataset
    ceiling : int or float

    Returns
    -------
    xr.Dataset

    """
    ds_corrected = ds.where(ds <= ceiling, ceiling)
    return ds_corrected
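The `where` call reads as a plain elementwise substitution. A sketch of the same logic on a Python list rather than a Dataset:

```python
def precip_ceiling(values, ceiling):
    # Mirrors ds.where(ds <= ceiling, ceiling): keep values where the
    # condition holds, substitute the ceiling everywhere else.
    return [v if v <= ceiling else ceiling for v in values]

print(precip_ceiling([0.0, 1500.0, 3200.0], 3000.0))  # [0.0, 1500.0, 3000.0]
```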

dodola.core.apply_wet_day_frequency_correction

apply_wet_day_frequency_correction(ds, process, variable='pr')

Parameters:

Name Type Description Default
ds Dataset
required
process (pre, post)
"pre"
variable
'pr'

Returns:

Type Description
Dataset
Notes

[1] A.J. Cannon, S.R. Sobie, and T.Q. Murdock (2015), "Bias correction of GCM precipitation by quantile mapping: How well do methods preserve changes in quantiles and extremes?", Journal of Climate, vol. 28, Issue 17, pp. 6938-6959. [2] S. Hempel, K. Frieler, L. Warszawski, J. Schewe, and F. Piontek (2013), "A trend-preserving bias correction - The ISI-MIP approach", Earth Syst. Dynam., vol. 4, pp. 219-236.

Source code in dodola/core.py
def apply_wet_day_frequency_correction(ds, process, variable="pr"):
    """

    Parameters
    ----------
    ds : xr.Dataset
    process : {"pre", "post"}
    variable: str

    Returns
    -------
    xr.Dataset

    Notes
    -----
    [1] A.J. Cannon, S.R. Sobie, and T.Q. Murdock (2015), "Bias correction of GCM
        precipitation by quantile mapping: How well do methods preserve
        changes in quantiles and extremes?", Journal of Climate, vol.
        28, Issue 17, pp. 6938-6959.
    [2] S. Hempel, K. Frieler, L. Warszawski, J. Schewe, and F. Piontek (2013),
        "A trend-preserving bias correction - The ISI-MIP approach",
        Earth Syst. Dynam., vol. 4, pp. 219-236.
    """
    # threshold from Hempel et al 2013
    threshold = 1.0  # mm/day
    # adjusted "low" value from the original epsilon in Cannon et al 2015 to
    # avoid having some values get extremely large
    low = threshold / 2.0

    if process == "pre":
        # includes very small values that are negative in CMIP6 output
        ds[variable] = ds[variable].where(
            ds[variable] >= threshold,
            np.random.uniform(
                low=low,
                high=threshold,
                size=ds[variable].shape,
            ).astype(ds[variable].data.dtype),
        )
    elif process == "post":
        ds[variable] = ds[variable].where(ds[variable] >= threshold, 0.0)
    else:
        raise ValueError("this processing option is not implemented")
    return ds
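The two passes can be sketched with plain lists; `wet_day_pre` and `wet_day_post` are hypothetical names for illustration, using the same threshold and "low" value as the function above:

```python
import random

THRESHOLD = 1.0  # mm/day, from Hempel et al. (2013)
LOW = THRESHOLD / 2.0  # adjusted from the Cannon et al. (2015) epsilon

def wet_day_pre(values, rng=random):
    # "pre": replace trace (and slightly negative) precip with random
    # draws in [LOW, THRESHOLD) so quantile mapping sees no spike at zero.
    return [v if v >= THRESHOLD else rng.uniform(LOW, THRESHOLD) for v in values]

def wet_day_post(values):
    # "post": after adjustment, send sub-threshold values back to dry (0.0).
    return [v if v >= THRESHOLD else 0.0 for v in values]

pre = wet_day_pre([-1e-6, 0.2, 5.0])  # first two entries become random wet-ish values
```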

dodola.core.dtr_floor

dtr_floor(ds, floor)

Converts all diurnal temperature range (DTR) values strictly below a floor to that floor.

Parameters:

Name Type Description Default
ds Dataset
required
floor int or float
required

Returns:

Type Description
Dataset
Source code in dodola/core.py
def dtr_floor(ds, floor):
    """
    Converts all diurnal temperature range (DTR) values strictly below a floor
    to that floor.

    Parameters
    ----------
    ds : xr.Dataset
    floor : int or float

    Returns
    -------
    xr.Dataset

    """

    ds_corrected = ds.where(ds >= floor, floor)
    return ds_corrected

dodola.core.non_polar_dtr_ceiling

non_polar_dtr_ceiling(ds, ceiling)

Converts all non-polar (regions between the 60th south and north parallel) diurnal temperature range (DTR) values strictly above a ceiling to that ceiling.

Parameters:

Name Type Description Default
ds Dataset
required
ceiling int or float
required

Returns:

Type Description
Dataset
Source code in dodola/core.py
def non_polar_dtr_ceiling(ds, ceiling):
    """
    Converts all non-polar (regions between the 60th south and north parallel) diurnal temperature range (DTR) values strictly above a ceiling
    to that ceiling.

    Parameters
    ----------
    ds : xr.Dataset
    ceiling : int or float

    Returns
    -------
    xr.Dataset

    """

    # NumPy ufuncs operate elementwise on DataArrays directly
    # (``xr.ufuncs`` is no longer available in recent xarray).
    ds_corrected = ds.where(
        np.logical_or(
            ds <= ceiling, np.logical_or(ds["lat"] <= -60, ds["lat"] >= 60)
        ),
        ceiling,
    )

    return ds_corrected
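The mask condition is easiest to check pointwise. A scalar sketch with a hypothetical helper name:

```python
def non_polar_dtr_ceiling_point(dtr, lat, ceiling):
    # The ceiling applies only equatorward of the 60th parallels;
    # polar values (lat <= -60 or lat >= 60) pass through unchanged.
    if -60 < lat < 60 and dtr > ceiling:
        return ceiling
    return dtr

print(non_polar_dtr_ceiling_point(80.0, 10.0, 70.0))  # 70.0 (capped)
print(non_polar_dtr_ceiling_point(80.0, 75.0, 70.0))  # 80.0 (polar, untouched)
```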

dodola.core.standardize_gcm

standardize_gcm(ds, leapday_removal=True)

Standardize a GCM Dataset: drop cruft variables, convert precipitation units to mm day-1, and convert the calendar. 360-day calendar conversion requires that there are no chunks in the 'time' dimension of ds.

Parameters:

Name Type Description Default
ds Dataset
required
leapday_removal bool
True

Returns:

Type Description
Dataset
Source code in dodola/core.py
def standardize_gcm(ds, leapday_removal=True):
    """

    360 calendar conversion requires that there are no chunks in
    the 'time' dimension of `ds`.

    Parameters
    ----------
    ds : xr.Dataset
    leapday_removal : bool, optional

    Returns
    -------
    xr.Dataset
    """
    # Remove cruft coordinates, variables, dims.
    cruft_vars = ("height", "member_id", "time_bnds")

    dims_to_squeeze = []
    coords_to_drop = []
    for v in cruft_vars:
        if v in ds.dims:
            dims_to_squeeze.append(v)
        elif v in ds.coords:
            coords_to_drop.append(v)

    ds_cleaned = ds.squeeze(dims_to_squeeze, drop=True).reset_coords(
        coords_to_drop, drop=True
    )

    # Cleanup time.

    # if variable is precip, need to update units to mm day-1
    if "pr" in ds_cleaned.variables:
        # units should be kg/m2/s in CMIP6 output
        if ds_cleaned["pr"].units == "kg m-2 s-1":
            # convert to mm/day
            mmday_conversion = 24 * 60 * 60
            ds_cleaned["pr"] = ds_cleaned["pr"] * mmday_conversion
            # update units attribute
            ds_cleaned["pr"].attrs["units"] = "mm day-1"
        else:
            # we want this to fail, as pr units are something we don't expect
            raise ValueError("check units: pr units attribute is not kg m-2 s-1")

    cal = get_calendar(ds_cleaned)

    if (
        cal == "360_day" or leapday_removal
    ):  # calendar conversion is necessary in either case
        # if calendar is just integers, xclim cannot understand it
        if ds_cleaned.time.dtype == "int64":
            ds_cleaned["time"] = xr.decode_cf(ds_cleaned).time
        if cal == "360_day":
            # Cannot have chunks in time dimension for 360 day calendar conversion so loading
            # data into memory.
            ds_cleaned.load()

            if leapday_removal:  # 360 day -> noleap
                ds_converted = xclim_convert_360day_calendar_interpolate(
                    ds=ds_cleaned,
                    target="noleap",
                    align_on="random",
                    interpolation="linear",
                )
            else:  # 360 day -> standard
                ds_converted = xclim_convert_360day_calendar_interpolate(
                    ds=ds_cleaned,
                    target="standard",
                    align_on="random",
                    interpolation="linear",
                )
        else:  # any -> noleap
            # remove leap days and update calendar
            ds_converted = xclim_remove_leapdays(ds_cleaned)

        # rechunk, otherwise chunks are different sizes
        ds_out = ds_converted.chunk(
            {"time": 730, "lat": len(ds_cleaned.lat), "lon": len(ds_cleaned.lon)}
        )

    else:
        ds_out = ds_cleaned

    return ds_out
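The precipitation unit conversion above is a single constant factor. A minimal sketch, with a hypothetical helper name:

```python
# 1 kg of water over 1 m^2 forms a 1 mm layer, so kg m-2 s-1 is
# numerically mm s-1; a daily total only needs the seconds-per-day
# factor (the mmday_conversion used above).
SECONDS_PER_DAY = 24 * 60 * 60  # 86400

def pr_to_mm_per_day(pr_kg_m2_s):
    return pr_kg_m2_s * SECONDS_PER_DAY

print(pr_to_mm_per_day(1e-4))  # ~8.64 mm/day
```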

dodola.core.test_dtr_range

test_dtr_range(ds, var, data_type)

Ensure DTR values are in a valid range. Test polar values separately since some polar values can be much higher post bias adjustment.

Source code in dodola/core.py
def test_dtr_range(ds, var, data_type):
    """
    Ensure DTR values are in a valid range.
    Test polar values separately since some polar values can be much higher post bias adjustment.
    """
    # test that DTR values are greater than 0 or equal to 0 depending on the data type
    # note that some CMIP6 DTR will equal 0 at polar latitudes, this will be adjusted
    # before bias adjustment with the DTR small values correction

    dtr_min = ds[var].min()
    if data_type == "cmip6":
        # may be equal to zero in polar regions and if tasmax < tasmin (only occurs for GFDL models)
        assert (
            dtr_min >= 0
        ), "diurnal temperature range minimum is {} and thus not greater than or equal to 0 for CMIP6".format(
            dtr_min
        )
    else:
        # this must be greater than 0 for bias corrected and downscaled
        assert (
            dtr_min > 0
        ), "diurnal temperature range minimum is {} and must be greater than zero".format(
            dtr_min
        )

    # test polar DTR values
    southern_polar_max = ds[var].where(ds.lat < -60).max()
    if (southern_polar_max is not None) and (southern_polar_max >= 100):
        assert (
            southern_polar_max < 100
        ), "diurnal temperature range max is {} for polar southern latitudes".format(
            southern_polar_max
        )

    northern_polar_max = ds[var].where(ds.lat > 60).max()
    if (northern_polar_max is not None) and (northern_polar_max >= 100):
        assert (
            northern_polar_max < 100
        ), "diurnal temperature range max is {} for polar northern latitudes".format(
            northern_polar_max
        )

    # test all but polar regions
    non_polar_max = ds[var].where((ds.lat > -60) & (ds.lat < 60)).max()
    assert (
        non_polar_max <= 70
    ), "diurnal temperature range max is {} for non-polar regions".format(non_polar_max)

dodola.core.test_for_nans

test_for_nans(ds, var)

Tests for presence of NaNs

Source code in dodola/core.py
def test_for_nans(ds, var):
    """
    Tests for presence of NaNs
    """
    assert ds[var].isnull().sum() == 0, "there are nans!"

dodola.core.test_maximum_precip

test_maximum_precip(ds, var)

Tests that max precip is reasonable

Source code in dodola/core.py
def test_maximum_precip(ds, var):
    """
    Tests that max precip is reasonable
    """
    threshold = 3000  # in mm; the record observed daily max is 1.825 m, most maxima fall between 0.5-0.8 m
    max_precip = ds[var].max().load().values
    num_precip_values_over_threshold = (
        ds[var].where(ds[var] > threshold).count().load().values
    )
    assert (
        num_precip_values_over_threshold == 0
    ), "maximum precip is {} mm and there are {} values over 3000mm".format(
        max_precip, num_precip_values_over_threshold
    )

dodola.core.test_negative_values

test_negative_values(ds, var)

Tests for presence of negative values

Source code in dodola/core.py
def test_negative_values(ds, var):
    """
    Tests for presence of negative values
    """
    # this is not set to 0 to deal with floating point error
    neg_values = ds[var].where(ds[var] < -0.001).count()
    assert neg_values == 0, "there are {} negative values!".format(neg_values)

dodola.core.test_temp_range

test_temp_range(ds, var)

Ensure temperature values are in a valid range

Source code in dodola/core.py
def test_temp_range(ds, var):
    """
    Ensure temperature values are in a valid range
    """
    # This high 377 K temperature range is to allow UKESM1-0-LL runs, which
    # apparently run very hot.
    assert (ds[var].min() > 130) and (
        ds[var].max() < 377
    ), "{} values are invalid".format(var)

dodola.core.test_timesteps

test_timesteps(ds, data_type, time_period)

Tests that Dataset contains the correct number of timesteps (number of days on a noleap calendar) for the data_type/time_period combination.

Source code in dodola/core.py
def test_timesteps(ds, data_type, time_period):
    """
    Tests that Dataset contains the correct number of timesteps (number of days on a noleap calendar)
    for the data_type/time_period combination.
    """
    if time_period == "future":
        # bias corrected/downscaled data has 2015 - 2099 or 2100 depending on the model
        # CMIP6 future data has an additional ten years from the historical model run
        if data_type == "cmip6":
            assert (
                len(ds.time) >= 35040  # some CMIP6 data ends in 2099
            ), "projection {} file is missing timesteps, has {}".format(
                data_type, len(ds.time)
            )
            if len(ds.time) > 35405:
                warnings.warn(
                    "projection {} file has excess timesteps, has {}".format(
                        data_type, len(ds.time)
                    )
                )
        else:
            assert (
                len(ds.time) >= 31025  # 2015 - 2099
            ), "projection {} file is missing timesteps, has {}".format(
                data_type, len(ds.time)
            )
            if len(ds.time) > 31390:  # 2015 - 2100
                warnings.warn(
                    "projection {} file has excess timesteps, has {}".format(
                        data_type, len(ds.time)
                    )
                )

    elif time_period == "historical":
        # bias corrected/downscaled data should have 1950 - 2014
        # CMIP6 historical data has an additional ten years from SSP 370 (or 245 if 370 not available)
        if data_type == "cmip6":
            assert (
                len(ds.time) >= 27740
            ), "historical {} file is missing timesteps, has {}".format(
                data_type, len(ds.time)
            )
            if len(ds.time) > 27740:
                warnings.warn(
                    "historical {} file has excess timesteps, has {}".format(
                        data_type, len(ds.time)
                    )
                )
        else:
            assert (
                len(ds.time) >= 23725
            ), "historical {} file is missing timesteps, has {}".format(
                data_type, len(ds.time)
            )
            if len(ds.time) > 23725:
                warnings.warn(
                    "historical {} file has excess timesteps, has {}".format(
                        data_type, len(ds.time)
                    )
                )
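The thresholds above are plain noleap-calendar day counts. A sketch of the arithmetic; the year ranges paired with each count are inferred from the surrounding comments and are illustrative:

```python
DAYS_PER_YEAR = 365  # noleap calendar

def noleap_days(start_year, end_year):
    # Inclusive span of whole years on a 365-day calendar.
    return (end_year - start_year + 1) * DAYS_PER_YEAR

assert noleap_days(2015, 2099) == 31025  # downscaled future ending in 2099
assert noleap_days(2015, 2100) == 31390  # downscaled future ending in 2100
assert noleap_days(1950, 2014) == 23725  # downscaled historical
assert noleap_days(1950, 2025) == 27740  # CMIP6 historical threshold (76 noleap years)
```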

dodola.core.test_variable_names

test_variable_names(ds, var)

Test that the correct variable name exists in the file

Source code in dodola/core.py
def test_variable_names(ds, var):
    """
    Test that the correct variable name exists in the file
    """
    assert var in ds.data_vars, "{} not in Dataset".format(var)

dodola.core.train_analogdownscaling

train_analogdownscaling(coarse_reference, fine_reference, variable, kind, quantiles_n=620, window_n=31)

Train Quantile-Preserving, Localized Analogs Downscaling (QPLAD)

Parameters:

Name Type Description Default
coarse_reference Dataset

Dataset to use as resampled (to fine resolution) coarse reference. Target variable must have a units attribute.

required
fine_reference Dataset

Dataset to use as fine-resolution reference. Target variable must have a units attribute.

required
variable str

Name of target variable to extract from coarse_reference and fine_reference.

required
kind ('+', '*')

Kind of variable. Used for creating QPLAD adjustment factors.

"+"
quantiles_n int

Number of quantiles for QPLAD.

620
window_n int

Centered window size for day-of-year grouping.

31

Returns:

Type Description
QuantilePreservingAnalogDownscaling
Source code in dodola/core.py
def train_analogdownscaling(
    coarse_reference, fine_reference, variable, kind, quantiles_n=620, window_n=31
):
    """Train Quantile-Preserving, Localized Analogs Downscaling (QPLAD)

    Parameters
    ----------
    coarse_reference : xr.Dataset
        Dataset to use as resampled (to fine resolution) coarse reference. Target variable must have a units attribute.
    fine_reference : xr.Dataset
        Dataset to use as fine-resolution reference. Target variable must have a units attribute.
    variable : str
        Name of target variable to extract from `coarse_reference` and `fine_reference`.
    kind : {"+", "*"}
        Kind of variable. Used for creating QPLAD adjustment factors.
    quantiles_n : int, optional
        Number of quantiles for QPLAD.
    window_n : int, optional
        Centered window size for day-of-year grouping.

    Returns
    -------
    xclim.sdba.adjustment.QuantilePreservingAnalogDownscaling
    """

    # QPLAD method requires that the number of quantiles equals
    # the number of days in each day group
    # e.g. 20 years of data and a window of 31 = 620 quantiles

    # check that lengths of input data are the same, then only check years for one
    if len(coarse_reference.time) != len(fine_reference.time):
        raise ValueError("coarse and fine reference data inputs have different lengths")

    # check number of years in input data (subtract 2 for the +/- 15 days on each end)
    num_years = len(np.unique(fine_reference.time.dt.year)) - 2
    if (num_years * int(window_n)) != quantiles_n:
        raise ValueError(
            "number of quantiles {} must equal number of years {} * window length {}".format(
                quantiles_n, num_years, int(window_n)
            )
        )

    qplad = sdba.adjustment.QuantilePreservingAnalogDownscaling.train(
        ref=coarse_reference[variable],
        hist=fine_reference[variable],
        kind=str(kind),
        group=sdba.Grouper("time.dayofyear", window=int(window_n)),
        nquantiles=quantiles_n,
    )
    return qplad
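The quantiles check above is simple arithmetic. A sketch with a hypothetical helper name:

```python
def qplad_quantiles_n(total_years, window_n=31):
    # Each day-of-year group holds (usable years) * window_n samples;
    # two buffer years are subtracted for the +/- 15-day padding at the
    # ends of the record, matching the validation in the function above.
    return (total_years - 2) * window_n

print(qplad_quantiles_n(22))  # 620: 20 usable years * 31-day window
```

This is why the default `quantiles_n=620` corresponds to a 22-year reference record.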

dodola.core.train_quantiledeltamapping

train_quantiledeltamapping(reference, historical, variable, kind, quantiles_n=100, window_n=31)

Train quantile delta mapping

Parameters:

Name Type Description Default
reference Dataset

Dataset to use as model reference. Target variable must have a units attribute.

required
historical Dataset

Dataset to use as historical simulation. Target variable must have a units attribute.

required
variable str

Name of target variable to extract from historical and reference.

required
kind ('+', '*')

Kind of variable. Used for QDM scaling.

"+"
quantiles_n int

Number of quantiles for QDM.

100
window_n int

Centered window size for day-of-year grouping.

31

Returns:

Type Description
QuantileDeltaMapping
Source code in dodola/core.py
def train_quantiledeltamapping(
    reference, historical, variable, kind, quantiles_n=100, window_n=31
):
    """Train quantile delta mapping

    Parameters
    ----------
    reference : xr.Dataset
        Dataset to use as model reference. Target variable must have a units attribute.
    historical : xr.Dataset
        Dataset to use as historical simulation. Target variable must have a units attribute.
    variable : str
        Name of target variable to extract from `historical` and `reference`.
    kind : {"+", "*"}
        Kind of variable. Used for QDM scaling.
    quantiles_n : int, optional
        Number of quantiles for QDM.
    window_n : int, optional
        Centered window size for day-of-year grouping.

    Returns
    -------
    xclim.sdba.adjustment.QuantileDeltaMapping
    """
    qdm = sdba.adjustment.QuantileDeltaMapping.train(
        ref=reference[variable],
        hist=historical[variable],
        kind=str(kind),
        group=sdba.Grouper("time.dayofyear", window=int(window_n)),
        nquantiles=equally_spaced_nodes(int(quantiles_n), eps=None),
    )
    return qdm
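A minimal usage sketch. The variable name `tasmax`, the offsets, and the date range are illustrative only, and the actual training call is guarded since `dodola` and its xclim dependency may not be installed; the input-building pattern (daily data with a `units` attribute) is the part xclim requires:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Toy daily reference and historical datasets. The target variable must
# carry a "units" attribute or xclim will refuse to train.
time = pd.date_range("2000-01-01", "2004-12-31", freq="D")
rng = np.random.default_rng(0)

def toy_dataset(offset):
    da = xr.DataArray(
        20.0 + offset + rng.standard_normal(time.size),
        coords={"time": time},
        dims="time",
        attrs={"units": "K"},
    )
    return da.to_dataset(name="tasmax")

reference = toy_dataset(0.0)
historical = toy_dataset(-1.5)  # historical runs cooler than reference

try:
    from dodola.core import train_quantiledeltamapping
    qdm = train_quantiledeltamapping(
        reference, historical, "tasmax", kind="+", quantiles_n=100
    )
except ImportError:
    qdm = None  # dodola/xclim not installed; the input setup above still applies
```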

dodola.core.xclim_convert_360day_calendar_interpolate

xclim_convert_360day_calendar_interpolate(ds, target='noleap', align_on='random', interpolation='linear', return_indices=False, ignore_nans=True)

Parameters:

- `ds` (Dataset, required)
- `target` (str, default 'noleap'): see xclim.core.calendar.convert_calendar.
- `align_on` (str, default 'random'): determines which days in the target calendar will have missing values or be filled by interpolation, if any. It could be the same calendar days every year, or the days could change randomly. See xclim.core.calendar.convert_calendar.
- `interpolation` (None or str, default 'linear'): passed to xr.Dataset.interpolate_na if not None.
- `return_indices` (bool, default False): in addition to the converted dataset, return a boolean mask identifying values that were inserted. This assumes there were no NaNs before conversion.
- `ignore_nans` (bool, default True): if False and there are any NaNs in `ds` variables, an assertion error is raised. NaNs are ignored otherwise.

Returns:

Type Description
tuple(xr.Dataset, xr.Dataset) if return_indices is True, xr.Dataset otherwise.
Notes

The default values of target, align_on and interpolation mean that our default approach is equivalent to that of the LOCA calendar conversion [1] for conversion from 360-day calendars to noleap calendars. In that approach, 5 calendar days are added to each year (noleap calendars always have 365 days), but those added days are not necessarily the ones whose values are filled by interpolation. The days to interpolate are selected randomly within every block of 72 days, so that they change every year.

[1] http://loca.ucsd.edu/loca-calendar/

Source code in dodola/core.py
def xclim_convert_360day_calendar_interpolate(
    ds,
    target="noleap",
    align_on="random",
    interpolation="linear",
    return_indices=False,
    ignore_nans=True,
):
    """
    Parameters
    ----------
    ds : xr.Dataset
    target : str
        see xclim.core.calendar.convert_calendar
    align_on : str
        this determines which days in the target calendar will have missing values or be filled by interpolation, if any.
        It could be the same calendar days every year, or the days could change randomly. See xclim.core.calendar.convert_calendar.
    interpolation : None or str
        passed to xr.Dataset.interpolate_na if not None
    return_indices : bool
        in addition to the converted dataset, return a boolean mask identifying values that were inserted.
        This assumes there were no NaNs before conversion.
    ignore_nans : bool
        if False and there are any NaNs in `ds` variables, an assertion error will be raised. NaNs are ignored otherwise.
    Returns
    -------
    tuple(xr.Dataset, xr.Dataset) if return_indices is True, xr.Dataset otherwise.

    Notes
    -----
    The default values of `target`, `align_on` and `interpolation` mean that our default approach is equivalent to that of the LOCA
    calendar conversion [1] for conversion from 360-day calendars to noleap calendars. In that approach, 5 calendar days are added to
    each year (noleap calendars always have 365 days), but those added days are not necessarily the ones whose values are filled by
    interpolation. The days to interpolate are selected randomly within every block of 72 days, so that they change every year.

    [1] http://loca.ucsd.edu/loca-calendar/
    """

    if get_calendar(ds) != "360_day":
        raise ValueError(
            "tried to use 360 day calendar conversion for a non-360-day calendar dataset"
        )

    if not ignore_nans:
        for var in ds:
            assert (
                ds[var].isnull().sum() == 0
            ), "360 days calendar conversion with interpolation : there are nans !"

    ds_converted = convert_calendar(
        ds, target=target, align_on=align_on, missing=np.nan
    )

    if interpolation:
        ds_out = ds_converted.interpolate_na("time", interpolation)
    else:
        ds_out = ds_converted

    if return_indices:
        # NaNs mark inserted values; xr.ufuncs.isnan was removed from xarray
        return (ds_out, ds_converted.isnull())
    else:
        return ds_out

dodola.core.xclim_remove_leapdays

xclim_remove_leapdays(ds)

Parameters:

- `ds` (Dataset, required)

Returns:

- Dataset
Source code in dodola/core.py
def xclim_remove_leapdays(ds):
    """

    Parameters
    ----------
    ds : xr.Dataset

    Returns
    -------
    xr.Dataset
    """
    ds_noleap = convert_calendar(ds, target="noleap")
    return ds_noleap

dodola.core.xclim_units_any2pint

xclim_units_any2pint(ds, var)

Parameters:

- `ds` (Dataset, required)
- `var` (str, required)

Returns:

Type Description
xr.Dataset with `var` units str attribute converted to xclim's pint registry format
Source code in dodola/core.py
def xclim_units_any2pint(ds, var):
    """
    Parameters
    ----------
    ds : xr.Dataset
    var : str

    Returns
    -------
    xr.Dataset with `var` units str attribute converted to xclim's pint registry format
    """

    logger.info(f"Reformatting {var} unit string representation")
    ds[var].attrs["units"] = str(xclim_units.units2pint(ds[var].attrs["units"]))
    return ds

dodola.core.xclim_units_pint2cf

xclim_units_pint2cf(ds, var)

Parameters:

- `ds` (Dataset, required)
- `var` (str, required)

Returns:

Type Description
xr.Dataset with `var` units str attribute converted to CF format
Source code in dodola/core.py
def xclim_units_pint2cf(ds, var):
    """
    Parameters
    ----------
    ds : xr.Dataset
    var : str

    Returns
    -------
    xr.Dataset with `var` units str attribute converted to CF format
    """
    logger.info(f"Reformatting {var} unit string representation")
    ds[var].attrs["units"] = xclim_units.pint2cfunits(
        xclim_units.units2pint(ds[var].attrs["units"])
    )
    return ds
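Both unit helpers follow the same pattern: rewrite the `units` attribute of `ds[var]` in place and return the same Dataset object. A minimal sketch of that round trip, with the xclim call guarded since xclim may not be installed; the variable name `pr` and the `mm d-1` unit string are arbitrary examples:

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"pr": ("time", np.zeros(3))})
ds["pr"].attrs["units"] = "mm d-1"  # CF-style unit string

try:
    from xclim.core import units as xclim_units
    # any2pint direction: CF string -> pint registry string
    pint_units = str(xclim_units.units2pint(ds["pr"].attrs["units"]))
    # pint2cf direction: back to a CF-formatted string
    cf_units = xclim_units.pint2cfunits(xclim_units.units2pint(pint_units))
    ds["pr"].attrs["units"] = cf_units
except ImportError:
    pass  # xclim not installed; the attrs-rewrite pattern above still applies
```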

dodola.core.xesmf_regrid

xesmf_regrid(x, domain, method, weights_path=None, astype=None, add_cyclic=None, keep_attrs=True)

Regrid a Dataset.

Parameters:

- `x` (Dataset, required)
- `domain` (Dataset, required): Domain to regrid to.
- `method` (str, required): Method of regridding. Passed to xesmf.Regridder.
- `weights_path` (str, default None): Local path to netCDF file of pre-calculated XESMF regridding weights.
- `astype` (str, numpy.dtype, or None, default None): Typecode or data-type to which the regridded output is cast.
- `add_cyclic` (str or None, default None): Add cyclic point (aka wrap-around pixel) to given dimension before regridding. Useful for avoiding dateline artifacts along longitude in global datasets.
- `keep_attrs` (bool, default True): Whether to pass attrs from input to regridded output.

Returns:

- Dataset
Source code in dodola/core.py
def xesmf_regrid(
    x, domain, method, weights_path=None, astype=None, add_cyclic=None, keep_attrs=True
):
    """
    Regrid a Dataset.

    Parameters
    ----------
    x : xr.Dataset
    domain : xr.Dataset
        Domain to regrid to.
    method : str
        Method of regridding. Passed to ``xesmf.Regridder``.
    weights_path : str, optional
        Local path to netCDF file of pre-calculated XESMF regridding weights.
    astype : str, numpy.dtype, or None, optional
        Typecode or data-type to which the regridded output is cast.
    add_cyclic : str, or None, optional
        Add cyclic point (aka wrap-around pixel) to given dimension before
        regridding. Useful for avoiding dateline artifacts along longitude
        in global datasets.
    keep_attrs : bool, optional
        Whether to pass attrs from input to regridded output.

    Returns
    -------
    xr.Dataset
    """
    if add_cyclic:
        x = _add_cyclic(x, add_cyclic)

    regridder = xe.Regridder(
        x,
        domain,
        method=method,
        filename=weights_path,
    )
    if astype:
        return regridder(x, keep_attrs=keep_attrs).astype(astype)
    return regridder(x, keep_attrs=keep_attrs)
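xesmf is a heavier optional dependency, so the sketch below only builds the kind of target `domain` Dataset (1-D `lat`/`lon` coordinate variables, from which xESMF infers a rectilinear grid) that `xesmf.Regridder` accepts, and leaves the actual regrid call guarded. The 2-degree global resolution is an arbitrary example:

```python
import numpy as np
import xarray as xr

# Target domain: a global 2-degree grid. xESMF infers the grid from
# 1-D "lat"/"lon" coordinate variables on the Dataset.
domain = xr.Dataset(
    coords={
        "lat": ("lat", np.arange(-89.0, 90.0, 2.0)),
        "lon": ("lon", np.arange(0.0, 360.0, 2.0)),
    }
)

try:
    import xesmf as xe
    # e.g. regridder = xe.Regridder(x, domain, method="bilinear")
except ImportError:
    xe = None  # xesmf/ESMF not installed; domain construction still applies
```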