{ "cells": [ { "cell_type": "markdown", "id": "f697d2a4-51c4-4fe0-acbe-afe9e67b7c84", "metadata": { "tags": [] }, "source": [ "# Regrid to regular\n", "\n", "This notebook shows how to regrid ERA5 data from a reduced Gaussian grid to a regular lon-lat grid using linear interpolation along longitude." ] }, { "cell_type": "code", "execution_count": null, "id": "296ee82f-863f-44dd-866a-d37978d74468", "metadata": { "tags": [] }, "outputs": [], "source": [ "import intake\n", "import dask\n", "import logging\n", "from distributed import Client\n", "import xarray as xr\n", "#client.close()\n", "#client=Client(silence_logs=logging.ERROR)" ] }, { "cell_type": "markdown", "id": "7200b771-5d8a-4f1a-ba30-bf3db9dc5272", "metadata": {}, "source": [ "## User parameters:\n", "\n", "- **catalog**: The catalog to open. The one from git is always reachable, but it uses simplecache, which can cause problems with async\n", "- **catalog_entry**: One of the ERA5 time series listed by `list(cat)`\n", "- **target_global_vars**: A list of variables that are defined on 1280 longitudes at the equator and should be interpolated\n", "- **openchunks**: Chunk sizes for all dimensions that are not related to lat and lon.
Larger values mean fewer chunks but need more memory\n", "- **to_load_selection**: The selection of the time series to which the workflow is applied" ] }, { "cell_type": "code", "execution_count": null, "id": "4479eb7d-f82e-4f13-ac27-3c0b815fcf7f", "metadata": {}, "outputs": [], "source": [ "catalog=\"https://gitlab.dkrz.de/data-infrastructure-services/era5-kerchunks/-/raw/main/main.yaml\"\n", "#catalog=\"/work/bm1344/DKRZ/git/era5-kerchunks/main.yaml\"\n", "catalog_entry=\"surface_analysis_daily\"\n", "target_global_vars=[\"2t\"]\n", "openchunks=dict(\n", " time=4,\n", " #level=5 #for 3d data\n", ")\n", "to_load_selection=dict(\n", " time=\"2010\"\n", ")" ] }, { "cell_type": "markdown", "id": "f4e9cac3-02ea-4add-8d5b-5182205a8a3c", "metadata": {}, "source": [ "Open the catalog and load the first record, which serves as the template for the dask functions:" ] }, { "cell_type": "code", "execution_count": null, "id": "16c0affd-d069-4d1c-9f4f-bcae774e7989", "metadata": { "tags": [] }, "outputs": [], "source": [ "dask.config.set({'logging.distributed': 'error'})\n", "cat=intake.open_catalog(catalog)\n", "dssource=cat[catalog_entry](chunks=openchunks).to_dask()\n", "template_source=dssource[target_global_vars].isel(**{a:0 for a in openchunks.keys()}).load()" ] }, { "cell_type": "markdown", "id": "887faa6a-0859-46e1-8bfb-77a9ac8c2857", "metadata": {}, "source": [ "1. 
Unstack: Define the function and its template\n", "- Select the equator longitudes as interpolation targets\n", "- Use a single chunk for each entire record (lon x lat)" ] }, { "cell_type": "code", "execution_count": null, "id": "e858757e-e0b5-4ce8-a1f5-f56891b6b510", "metadata": { "tags": [] }, "outputs": [], "source": [ "def unstack(ds):\n", " return ds.rename({'value':'latlon'}).set_index(latlon=(\"lat\",\"lon\")).unstack(\"latlon\")\n", "\n", "template_unstack=unstack(template_source)\n", "equator_lons=template_unstack[target_global_vars].sel(lat=0.0,method=\"nearest\").dropna(dim=\"lon\")[\"lon\"]" ] }, { "cell_type": "code", "execution_count": null, "id": "feabace6-e3bf-4706-a560-2457bf2ecfad", "metadata": { "tags": [] }, "outputs": [], "source": [ "latlonchunks={\n", " a:len(template_unstack[a])\n", " for a in template_unstack.dims\n", "}\n", "\n", "nolatlonchunks={\n", " a:dssource[target_global_vars].chunksizes[a]\n", " for a in openchunks.keys()\n", "}" ] }, { "cell_type": "code", "execution_count": null, "id": "22d177a6-7f5d-4130-b6c8-7b9d5d25b2c4", "metadata": { "tags": [] }, "outputs": [], "source": [ "template_unstack=template_unstack.chunk(**latlonchunks)" ] }, { "cell_type": "markdown", "id": "a5adae35-afe1-406e-a5c6-d0e9272d969c", "metadata": {}, "source": [ "2. Interp: Linearly interpolate all NaNs along longitude and keep only the longitudes of the latitude nearest to the equator." 
] }, { "cell_type": "code", "execution_count": null, "id": "8275f5af-4784-42ff-b46e-d9b81ffb8112", "metadata": {}, "outputs": [], "source": [ "def interp(ds):\n", " #reindexed_block=ds.dropna(dim=\"lon\").reindex(lon=xr.concat([ds[\"lon\"],equator_lons],\"lon\")[\"lon\"]).sortby(\"lon\").drop_duplicates('lon')\n", " #interped=ds.interpolate_na(dim=\"lon\",method=\"linear\",period=360.0)\n", " #interped=dask.optimize(interped)[0]\n", " #return ds.groupby(\"lat\").apply(\n", " # lambda dslat: dslat.dropna(dim=\"lon\").interp(lon=equator_lons.values,method=\"linear\",kwargs={\"fill_value\": \"extrapolate\"})\n", " #)\n", " return ds.interpolate_na(dim=\"lon\",method=\"linear\",period=360.0).reindex(lon=equator_lons)\n", "\n", "template_interp=interp(template_unstack)" ] }, { "cell_type": "code", "execution_count": null, "id": "be2e7869-87ba-4333-9a18-8343dbfe7141", "metadata": { "tags": [] }, "outputs": [], "source": [ "template_unstack=template_unstack.expand_dims(**{a:dssource[a] for a in openchunks.keys()}).chunk(nolatlonchunks)\n", "template_interp=template_interp.expand_dims(**{a:dssource[a] for a in openchunks.keys()}).chunk(nolatlonchunks)" ] }, { "cell_type": "markdown", "id": "829fa7dc-4649-4aae-9860-af8515b72566", "metadata": {}, "source": [ "## Define workflow here" ] }, { "cell_type": "code", "execution_count": null, "id": "d9355142-3835-487d-a604-0c81dc9c9261", "metadata": { "tags": [] }, "outputs": [], "source": [ "original=dssource[target_global_vars].sel(**to_load_selection)\n", "#unstacked=dssource[target_global_vars].map_blocks(unstack,template=template_unstack[target_global_vars])\n", "#dataset:\n", "unstacked=original.map_blocks(unstack,template=template_unstack[target_global_vars].sel(**to_load_selection))\n", "#dataarray:\n", "#unstacked=original.map_blocks(unstack,template=template_unstack.sel(time=\"2010\"))\n", 
"#unstacked=dssource[target_global_vars].sel(time=\"2010\").map_blocks(unstack,template=template_unstack[target_global_vars]).chunk(lat=1)\n", "#unstacked=dssource[target_global_vars].sel(time=\"2010\").map_blocks(unstack,template=template_unstack.sel(time=\"2010\"))\n", "#interped=unstacked.map_blocks(interp,template=template_interp[target_global_vars])\n", "interped=unstacked.map_blocks(interp,template=template_interp.sel(time=\"2010\"))\n", "#interped=unstacked.map_blocks(interp,template=template_interp.sel(time=\"2010\"))" ] }, { "cell_type": "code", "execution_count": null, "id": "5b232a97-5a5b-4d5b-9177-524d535803d1", "metadata": { "tags": [] }, "outputs": [], "source": [ "interped=dask.optimize(interped)[0]" ] }, { "cell_type": "code", "execution_count": null, "id": "55573b86-7350-497e-a966-140cca481553", "metadata": { "tags": [] }, "outputs": [], "source": [ "interped" ] }, { "cell_type": "markdown", "id": "63e1f957-3f5f-47ba-9694-df90f5720339", "metadata": {}, "source": [ "## Run workflow here" ] }, { "cell_type": "code", "execution_count": null, "id": "677ca62c-323f-457e-b5f2-b4c46ae4e5c9", "metadata": { "tags": [] }, "outputs": [], "source": [ "from dask.diagnostics import ProgressBar\n", "with ProgressBar():\n", " t2=interped.compute()" ] }, { "cell_type": "code", "execution_count": null, "id": "37d38f3c-70b3-411e-b6a1-e2865fbfc117", "metadata": { "tags": [] }, "outputs": [], "source": [ "import hvplot.xarray\n", "t2.hvplot.image(x=\"lon\",y=\"lat\")" ] }, { "cell_type": "code", "execution_count": null, "id": "a955216a-ddea-4ec3-98d5-78408c722929", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "dkrzcatalog", "language": "python", "name": "dkrzcatalog" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.10" } }, "nbformat": 4, 
"nbformat_minor": 5 }