Merge pull request #22 from tinaok/update_chunking

Update obsolete commands; make 'Zarr' visible in the table of contents and/or title
pangeo-data · Oct 25, 2023 · d24283a
2 parents 5d0c4b3 + a3569ca
Showing 2 changed files with 24 additions and 25 deletions.
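
Note on the notebook text touched by this change: it argues that a dataset stored in chunks does not have to be loaded in full for every access. A minimal sketch of that arithmetic follows, assuming a regular chunk grid. The array shape, the chunk shape, and the helper `chunks_touched` are hypothetical and are not taken from the notebook.

```python
import math


def chunks_touched(selection, chunk_shape):
    """Count the chunks overlapped by a rectangular selection.

    selection: one (start, stop) index pair per dimension, stop exclusive.
    chunk_shape: chunk length along each dimension.
    """
    count = 1
    for (start, stop), size in zip(selection, chunk_shape):
        first = start // size        # index of the first chunk overlapped
        last = (stop - 1) // size    # index of the last chunk overlapped
        count *= last - first + 1
    return count


# Hypothetical array: 36 time steps on a 4000 x 4000 grid.
shape = (36, 4000, 4000)
chunk_shape = (1, 1000, 1000)

# Total number of chunks in the whole array.
total = math.prod(math.ceil(n / c) for n, c in zip(shape, chunk_shape))

# One time step and a 500 x 500 window that lies inside one spatial chunk.
window = [(0, 1), (1000, 1500), (2000, 2500)]
needed = chunks_touched(window, chunk_shape)

print(f"{needed} of {total} chunks need to be read")
```

How much is saved depends on how well the chunk shape matches the access pattern: a selection that straddles chunk boundaries, or one that spans all time steps, touches many more chunks.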
diff --git a/tutorial/_toc.yml b/tutorial/_toc.yml
@@ -38,8 +38,8 @@ parts:
     title: The Easy-peasy life with OpenEO
   - file: part3/data_exploitability_pangeo
     title: How to exploit data on Pangeo
-  - file: part3/chunking_introduction
-    title: Data and preprocessing general knowledge
+  - file: part3/chunking_introduction
+    title: Data chunking with Zarr and kerchunk
   - file: part3/scaling_dask
     title: Scaling with Dask
   - file: part3/scaling_openEO
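
Note on the `reference://` calls in the notebook hunks of this change: they hand a reference set to fsspec through the `fo` option. Below is a minimal sketch of the idea behind such a reference set, assuming the version 1 layout in which each Zarr key maps either to inline text or to a `[path, offset, length]` triple. The helper `read_reference`, the file `original.bin`, and the key `ndvi/0.0` are hypothetical and do not come from the notebook. This is not how fsspec implements it, and base64-encoded inline entries are ignored.

```python
import json
import os
import shutil
import tempfile


def read_reference(refs, key):
    """Return the bytes that a reference entry points to.

    An entry is either inline content (a string) or a
    [path, offset, length] triple pointing into another file.
    """
    entry = refs[key]
    if isinstance(entry, str):
        # Small metadata documents are stored inline as text.
        return entry.encode()
    path, offset, length = entry
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


# Hypothetical stand-in for one original data file:
# 8 header bytes, one 16-byte chunk, then trailing bytes.
chunk = bytes(range(16))
tmpdir = tempfile.mkdtemp()
data_path = os.path.join(tmpdir, "original.bin")
with open(data_path, "wb") as f:
    f.write(b"HEADER00" + chunk + b"TRAILER")

# Reference set: Zarr keys mapped to inline metadata or to a byte range
# inside the original file, which is left untouched.
references = {
    "version": 1,
    "refs": {
        ".zgroup": json.dumps({"zarr_format": 2}),
        "ndvi/0.0": [data_path, 8, len(chunk)],
    },
}

metadata = json.loads(read_reference(references["refs"], ".zgroup"))
first_chunk = read_reference(references["refs"], "ndvi/0.0")
shutil.rmtree(tmpdir)  # remove the temporary stand-in file

print(metadata, first_chunk == chunk)
```

The point of the sketch is that the original file is never rewritten: only the small reference document is created, and each chunk request becomes a byte-range read.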

diff --git a/tutorial/part3/chunking_introduction.ipynb b/tutorial/part3/chunking_introduction.ipynb
@@ -5,7 +5,7 @@
    "id": "1bfbae7a-12f1-4787-a520-c3de7529168d",
    "metadata": {},
    "source": [
-    "# Data chunking"
+    "# Data chunking with Zarr and kerchunk"
    ]
   },
   {
@@ -409,7 +409,7 @@
     "\n",
     "If we can have our original dataset already 'chunked' and accessed in an optimized way according to its actual byte storage on disk, we won't need to load the entire dataset every time, and our data analysis, even working on the entire dataset, will be greatly optimized.\n",
     "\n",
-    "Let's convert our input data into Zarr format so that we can learn what it is."
+    "Let's convert our input data into Zarr format so that we can learn what it is. We can keep the data as a DataArray or convert it into a Dataset before storing it."
    ]
   },
   {
@@ -424,21 +424,22 @@
    },
    "outputs": [],
    "source": [
-    "test.to_dataset().to_zarr('test.zarr',mode='w')"
+    "test.to_zarr('test_DataArray.zarr',mode='w')"
    ]
   },
   {
-   "cell_type": "markdown",
-   "id": "42b738c4-0639-433a-a2a9-b188d9459119",
-   "metadata": {},
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "698991c9-d910-44a0-a7b2-569464614bd1",
+   "metadata": {
+    "collapsed": false,
+    "jupyter": {
+     "outputs_hidden": false
+    }
+   },
+   "outputs": [],
    "source": [
-    "<div class=\"alert alert-info\">\n",
-    "<i class=\"fa-check-circle fa\" style=\"font-size: 22px;color:#666;\"></i> <b>Warning</b>\n",
-    "<br>\n",
-    "<ul>\n",
-    "<li>DataArray can not be saved as 'zarr'. Before saving your data to zarr, you will need to convert it into a DataSet</li>\n",
-    "</ul>\n",
-    "</div>"
+    "test.to_dataset().to_zarr('test.zarr',mode='w')"
    ]
   },
   {
@@ -605,10 +606,6 @@
    "execution_count": null,
    "id": "33c5ce25",
    "metadata": {
-    "collapsed": false,
-    "jupyter": {
-     "outputs_hidden": false
-    },
     "tags": [
      "hide-output"
     ]
@@ -653,8 +650,9 @@
    },
    "outputs": [],
    "source": [
-    "LTS = xr.open_mfdataset(\n",
-    "    \"reference://\", engine=\"zarr\",\n",
+    "LTS = xr.open_dataset(\n",
+    "    \"reference://\", \n",
+    "    engine=\"zarr\",\n",
     "    backend_kwargs={\n",
     "        \"storage_options\": {\n",
     "            \"fo\": chunk_info,\n",
@@ -809,8 +807,9 @@
    "outputs": [],
    "source": [
     "%%time\n",
-    "LTS = xr.open_mfdataset(\n",
-    "    \"reference://\", engine=\"zarr\",\n",
+    "LTS = xr.open_dataset(\n",
+    "    \"reference://\", \n",
+    "    engine=\"zarr\",\n",
     "    backend_kwargs={\n",
     "        \"storage_options\": {\n",
     "            \"fo\": out,\n",
@@ -882,7 +881,7 @@
    "outputs": [],
    "source": [
     "import xarray as xr\n",
-    "LTS = xr.open_mfdataset(\n",
+    "LTS = xr.open_dataset(\n",
     "    \"reference://\", engine=\"zarr\",\n",
     "    backend_kwargs={\n",
     "        \"storage_options\": {\n",
@@ -920,7 +919,7 @@
    "outputs": [],
    "source": [
     "catalogue=\"https://object-store.cloud.muni.cz/swift/v1/foss4g-catalogue/c_gls_NDVI-LTS_1999-2019.json\"\n",
-    "LTS = xr.open_mfdataset(\n",
+    "LTS = xr.open_dataset(\n",
     "    \"reference://\", engine=\"zarr\",\n",
     "    backend_kwargs={\n",
     "        \"storage_options\": {\n",