diff --git a/README.md b/README.md
index d159b5b..e86e6f0 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# Scaling Big Data Analysis with Pangeo and OpenEO: Unlocking the Power of Space Data
+# Unlocking the Power of Space Data with Pangeo & OpenEO
 
 [![All Contributors](https://img.shields.io/badge/all_contributors-10-orange.svg?style=flat-square)](#contributors-)
 
@@ -11,47 +11,57 @@ This repository contains the documentation and jupyter notebooks used for delive
 
 The content of this repository (folder `tutorial`) is rendered as an online document using [Jupyter Book](https://jupyterbook.org/en/stable/intro.html). **You can access it [here](https://pangeo-data.github.io/pangeo-openeo-BiDS-2023)**.
 
-## Agenda
+# Timeline of the workshops
+
+The programmes for each workshop are given below for your information. Each workshop is held separately.
+
+## Introduction to Pangeo
+
+| Time | Activity |
+| ---- | -------- |
+| 09:00 | 👋 Welcome (5 minutes) |
+| 9:05 | Introduction and Motivation (15 minutes) |
+| 9:20 | Overview of the Pangeo ecosystem (20 minutes) |
+| 9:40 | Understanding Xarray to avoid common pitfalls (30 minutes) |
+| 10:10 | Interactive Visualization with Hvplot (15 minutes) |
+| 10:30 | Break (30 minutes) | |
+
+## Introduction to OpenEO
+
+| Time | Activity |
+| ---- | -------- |
+| 11:00 | 👋 Introduction and motivation (5 minutes) |
+| 11:05 | Getting started with OpenEO (10 minutes) |
+| 11:15 | Accessing and processing data with OpenEO (30 minutes) |
+| 11:45 | ntegrate custom code into your workflow using User Defined Functions (30 minutes) |
+| 12:15 | Q&A session - feedbacks (20 minutes) |
+| 12:30 | 🍽️ Lunch |
+
+## Unlocking the Power of Space Data with Pangeo & OpenEO
+
+Please note that this workshop assumes some prior knowledge of Pangeo and OpenEO. If you are not familiar with any of these technologies, we suggest to check the content of the two other workshops (taught in the morning).
+
+| Time | Activity |
+| ---- | -------- |
+| 14:00 | 👋 Introduction and motivation (5 minutes) |
+| 14:05 | Data discoverability and searchability (25 minutes) |
+| | An overview of STAC and the different available sources|
+| 14:30 | Data and pre-processing general knowledge (60 minutes) |
+| | Introduction to chunking with netCDF, ZARR and Kerchunk |
+| | Parallelization with Dask |
+| 15:30 | ☕️ Break |
+| 16:00 | Different data exploitation approaches (60 minutes) |
+| | How to exploit data with OpenEO: snow coverage example |
+| | How to exploit data on Pangeo: pure Xarray version |
+| 17:00 | How to go beyond (45 minutes)|
+| | Custom algorithms: UDF (OpenEO) and ufunc (Xarray) |
+| | Scaling with OpenEO (how it works underneath) |
+| 17:40 | Wrap-up and feedback survey (15 minutes) |
+
+These timelines are purely approximative and given for indication purpose only. We will adjust depending on the audience.
+There will be additional breaks (5 minutes) regularly and time for questions during the workshops.
-**Part-1: Pangeo**
-- 9:00 Welcome (5 minutes)
-- 9:05 Introduction and Motivation (15 minutes)
-- 9:20 Overview of the Pangeo ecosystem (20 minutes)
-- 9:40 Understanding Xarray to avoid common pitfalls (30 minutes)
-- 10:10 Interactive Visualization with Hvplot (20 minutes)
-- 10:30 Break (30 minutes)
-
-**Part-2: OpenEO**
-
-- 11:00 Getting started with OpenEO
-- 11:15 Finding Data, Running first graphs, difference to client-side processing
-- 11:45 Integrate custom code into your workflow using User Defined Functions
-- 12:15 Feedback Block
-- 12:30 Lunch
-
-**Part-3: Unlocking the Power of Space Data with Pangeo & OpenEO**
-
-- 14:00 Introduction to the afternoon session
-
-- 14:05 Data discoverability and searchability
-  - An overview of STAC and different sources and platforms (openeo.cloud, STAC browser, STAC Index ...)
-
-- 14:30 Data and pre-processing general knowledge
-  - Introduction to chunking examples with netcdf, zarr and Kerchunk
-  - Parallelization with Dask
-
-- 16:00 Different data exploitation approaches
-  - How to exploit data on OpenEO (30 minutes)
-    - Snow coverage example
-  - How to exploit data on Pangeo (30 minutes)
-    - Snow coverage pure xarray version
-
-- 17:00 How to go beyond
-  - Understanding how to create a custom algorithm: UDF (OpenEO) and ufunc (Xarray) (20 minutes)
-  - Scaling with OpenEO (how it works underneath) (30 minutes)
-
-- 17:45 Wrap-up and feedback survey (15 minutes)
 
 ## Contributors ✨
 
diff --git a/tutorial/_toc.yml b/tutorial/_toc.yml
index 4215730..9de44de 100644
--- a/tutorial/_toc.yml
+++ b/tutorial/_toc.yml
@@ -29,8 +29,12 @@ parts:
 
   - caption: Part 2 - OpenEO
     chapters:
-      - file: part2/agenda_and_links
-        title: Agenda
+      - file: part2/openEO - boa_sentinel_2
+        title: Sentinel-2 Bottom Of Atmosphere (BOA)
+      - file: part2/openEO - Corine Land Cover Alps
+        title: Corine Land Cover Alps
+      - file: part2/openEO - Client Side Land Cover Alps
+        title: Client Side Land Cover Alps
       - file: part2/advanced_workflows
         title: Advanced workflows
       - file: part2/stac_metadata
@@ -40,18 +44,16 @@ parts:
     chapters:
       - file: part3/data_discovery
         title: Data discoverability and searchability
+      - file: part3/chunking_introduction
+        title: Introduction to chunking with netCDF, ZARR and Kerchunk
+      - file: part3/scaling_dask
+        title: Parallelization with Dask
       - file: part3/data_exploitability_openEO
-        title: How to exploit data on openEO
+        title: How to exploit data with openEO
       - file: part3/data_exploitability_pangeo
         title: How to exploit data on Pangeo
-      - file: part3/chunking_introduction
-        title: Data chunking with zarr
-      - file: part3/scaling_dask
-        title: Scaling with Dask
       - file: part3/advanced_udf
-        title: Advanced OpenEO (UDF)
-      - file: part3/advanced_ufunc
-        title: Advanced OpenEO (Ufunc)
+        title: Custom algorithms: UDF (OpenEO) and ufunc (Xarray)
       - file: part3/scaling_openeo
         title: Scaling with OpenEO
diff --git a/tutorial/about/timeline.md b/tutorial/about/timeline.md
index 5e39216..9ff4af6 100644
--- a/tutorial/about/timeline.md
+++ b/tutorial/about/timeline.md
@@ -11,36 +11,39 @@ The programmes for each workshop are given below for your information. Each work
 | 9:20 | Overview of the Pangeo ecosystem (20 minutes) |
 | 9:40 | Understanding Xarray to avoid common pitfalls (30 minutes) |
 | 10:10 | Interactive Visualization with Hvplot (15 minutes) |
-| 10:25 | Wrap-up and feedback survey (5 minutes) | |
+| 10:30 | Break (30 minutes) | |
 
 ## Introduction to OpenEO
 
 | Time | Activity |
 | ---- | -------- |
-| 11:00 | 👋 Welcome (5 minutes) |
+| 11:00 | 👋 Introduction and motivation (5 minutes) |
 | 11:05 | Getting started with OpenEO (10 minutes) |
-| 11:15 | Accessing data with OpenEO (25 minutes) |
-| 11:40 | Processing data with OpenEO (30 minutes) |
-| 12:10 | Working with data cubes with OpenEO (20 minutes) |
+| 11:15 | Accessing and processing data with OpenEO (30 minutes) |
+| 11:45 | ntegrate custom code into your workflow using User Defined Functions (30 minutes) |
+| 12:15 | Q&A session - feedbacks (20 minutes) |
 | 12:30 | 🍽️ Lunch |
 
-## Scaling Big Data Analysis with Pangeo & OpenEO: Unlocking the Power of Space Data
+## Unlocking the Power of Space Data with Pangeo & OpenEO
 
 Please note that this workshop assumes some prior knowledge of Pangeo and OpenEO. If you are not familiar with any of these technologies, we suggest to check the content of the two other workshops (taught in the morning).
 
 | Time | Activity |
 | ---- | -------- |
-| 14:00 | 👋 Welcome (5 minutes) |
-| 14:05 | Understanding what OpenEO does best and how to exploit it to easily streamline your data analysis (20 minutes) |
-| 14:25 | Scaling with OpenEO (25 minutes) |
-| 14:50 | Understanding when and how to exploit Pangeo to customise your algorithm and analyse multiple data sources (20 minutes) |
-| 15:10 | Introduction to chunking (20 minutes) |
+| 14:00 | 👋 Introduction and motivation (5 minutes) |
+| 14:05 | Data discoverability and searchability (25 minutes) |
+| | An overview of STAC and the different available sources|
+| 14:30 | Data and pre-processing general knowledge (60 minutes) |
+| | Introduction to chunking with netCDF, ZARR and Kerchunk |
+| | Parallelization with Dask |
 | 15:30 | ☕️ Break |
-| 16:00 | Scaling with Dask (30 minutes) |
-| 16:30 | Cloud-friendly access to archival data with kerchunk (25 minutes) |
-| 16:55 | Create Analysis Ready Cloud Optimised (ARCO) data (25 minutes) |
-| 17:20 | Common workflow that combines the best of the two “worlds” (30 minutes) |
-| 17:50 | Wrap-up and feedback survey (10 minutes) |
+| 16:00 | Different data exploitation approaches (60 minutes) |
+| | How to exploit data with OpenEO: snow coverage example |
+| | How to exploit data on Pangeo: pure Xarray version |
+| 17:00 | How to go beyond (45 minutes)|
+| | Understanding how to create a custom algorithm: UDF (OpenEO) and ufunc (Xarray) |
+| | Scaling with OpenEO (how it works underneath) |
+| 17:40 | Wrap-up and feedback survey (15 minutes) |
 
 These timelines are purely approximative and given for indication purpose only. We will adjust depending on the audience.
 There will be additional breaks (5 minutes) regularly and time for questions during the workshops.
diff --git a/tutorial/intro.md b/tutorial/intro.md
index 51a3ceb..1337f72 100644
--- a/tutorial/intro.md
+++ b/tutorial/intro.md
@@ -14,9 +14,9 @@ More information can be found on the [BiDS'23](https://www.bigdatafromspace2023.
 ### Pangeo & OpenEO tutorial
 
 The tutorials are divided in 3 parts:
-1. Introduction to the Pangeo ecosystem
-2. Introduction to the OpenEO platform
-3. Scaling Big Data Analysis with Pangeo and OpenEO: Unlocking the Power of Space Data
+1. Introduction to Pangeo
+2. Introduction to OpenEO
+3. Unlocking the Power of Space Data with Pangeo & OpenEO
 
 The workshop timelines, setup and content are accessible via the left menu of this webpage.
 
diff --git a/tutorial/part2/agenda_and_links.md b/tutorial/part2/agenda_and_links.md
index 5188315..a2d1e11 100644
--- a/tutorial/part2/agenda_and_links.md
+++ b/tutorial/part2/agenda_and_links.md
@@ -1,4 +1,4 @@
-# Part-2: OpenEO (Comment: Intro Session)
+# Part-2: Introduction to OpenEO
 
 This Part is going to introduce participants to openEO Platform. Attendees will learn what openEO Platform is, how to find data and run first process graphs. We will also show attendees how to integrate custom code into your workflow using User Defined Functions.
 
diff --git a/tutorial/part3/data_discovery.md b/tutorial/part3/data_discovery.md
index dff697c..c95018f 100644
--- a/tutorial/part3/data_discovery.md
+++ b/tutorial/part3/data_discovery.md
@@ -1,5 +1,5 @@
 # Data discoverability and searchability
 
-## An overview of STAC
+## An overview of STAC and the different available sources
 
-## Overview of different sources and platforms
\ No newline at end of file
+The presentation is available here: https://docs.google.com/presentation/d/1bUTlmvrDMm1affvr8tq5gePqBigCTY9YVN3mtcscpJk/edit?usp=sharing
\ No newline at end of file