From 41cdce31bc1071496437d94ec84fd577f07f2dd1 Mon Sep 17 00:00:00 2001
From: George Verghese <109232709+gv2325@users.noreply.github.com>
Date: Sat, 7 Sep 2024 15:49:50 -0400
Subject: [PATCH] added tempo notebook in qmd
---
m203-grdiv1-pm25.qmd | 68 ++++++++++-----
m204-egis-tempo.qmd | 203 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 251 insertions(+), 20 deletions(-)
create mode 100644 m204-egis-tempo.qmd
diff --git a/m203-grdiv1-pm25.qmd b/m203-grdiv1-pm25.qmd
index 5a0f391..b053a36 100644
--- a/m203-grdiv1-pm25.qmd
+++ b/m203-grdiv1-pm25.qmd
@@ -7,18 +7,17 @@ bibliography: references.bib
## Overview
-In this lesson, you will use NASA socioeconomic and environmental Earthdata available at NASA SEDAC to compare relationships between levels of socioeconomic deprivation agaisnts air quality data of particulate matter (PM) in different international administrative areas.
+In this lesson, you will use NASA socioeconomic and environmental Earthdata available at NASA SEDAC to compare relationships between levels of socioeconomic deprivation agaisnts air quality data of particulate matter (PM) in different international administrative areas.
This lesson walks through the process of calculating and visualizing zonal statistics for a set of countries using raster data, focusing on GRDI country quintiles and PM2.5 concentration levels within these quintile areas. It begins by subsetting data by country and iterating over each country to extract relevant zonal statistics like mean, median, and various percentiles for each quintile. These statistics are stored in a GeoDataFrame, which is later used to create a choropleth map that visualizes specific GRDI metrics across countries. The lesson includes a detailed analysis of PM2.5 concentrations within different GRDI quartiles for selected countries. This involves clipping the raster data to each country's geometry, filtering the data based on the GRDI quartiles, and calculating the mean PM2.5 levels for each quartile. The results are then visualized using customized plots to highlight the relationship between air quality and GRDI metrics across the selected countries.
-
## Learning Objectives
After completing this lesson, you should be able to:
- Gain a general understanding of what is particulate matter (PM) in the air and how it impacts human health.
-- Learn about global socioeconomic dimensions of deprivation and how they are spatially represented.
-- Find statistical thresholds in socioeconomic data.
+- Learn about global socioeconomic dimensions of deprivation and how they are spatially represented.
+- Find statistical thresholds in socioeconomic data.
- Perform zonal statistics to summarize spatial data
- Resample spatial data to harmoniza and compare socioeconomic data against environmental data.
- Display data on a maps to get a general understanding of the spatial distribution of data.
@@ -26,8 +25,6 @@ After completing this lesson, you should be able to:
## Introduction
-
-
## Data Collection and Integration
The **Global (GL) Annual PM2.5 Grids from MODIS, MISR and SeaWiFS Aerosol Optical Depth (AOD), v4.03 (1998 – 2019)** can can be downloaded from the Socioeconomic Data and Applications Center (\[SEDAC\]()) [@centerforinternationalearthscienceinformationnetwork-ciesin-columbiauniversity2022].
@@ -39,6 +36,7 @@ Gather comprehensive datasets from reliable sources such as the US EPA's EJSCREE
### Preparing Environment and Variables
Importing python packages required:
+
```{python}
import xarray as xr
import rioxarray
@@ -55,21 +53,26 @@ import plotly.graph_objects as go
```
Load the GRDIv1 and PM2.5 data from local sources:
+
```{python}
# Load rasters
grdi_path = r"Z:\Sedac\GRDI\data\povmap-grdi-v1-geotiff\final data\povmap-grdi-v1.tif"
pm25_path = r"F:\TOPSSCHOOL\git\TOPSTSCHOOL-air-quality\data\sdei-global-annual-gwr-pm2-5-modis-misr-seawifs-aod-v4-gl-03-2019-geotiff\sdei-global-annual-gwr-pm2-5-modis-misr-seawifs-aod-v4-gl-03-2019.tif"
```
-Using the package rasterio to load the data into memory. This allows us to read the data and use it for processing.
+
+Using the package rasterio to load the data into memory. This allows us to read the data and use it for processing.
+
```{python}
# Open the input and reference rasters
grdi_raster = rioxarray.open_rasterio(grdi_path, mask_and_scale=True)
pm25_raster = rioxarray.open_rasterio(pm25_path, mask_and_scale=True)
```
-### Matching Data Points using Bilinear Resample
-The GRDI raster and PM2.5 rasters are incompatible in resolution. One method of harmonizing data is by using the `Resampling` bethod with a *bilinear* method. In this case, we reduce, or coarsen, the resolution of the GRDI raster to match the PM2.5 raster.
+### Matching Data Points using Bilinear Resample
+
+The GRDI raster and PM2.5 rasters are incompatible in resolution. One method of harmonizing data is by using the `Resampling` bethod with a *bilinear* method. In this case, we reduce, or coarsen, the resolution of the GRDI raster to match the PM2.5 raster.
+
```{python}
# Resample the input raster to match the reference raster
grdi_raster = grdi_raster.rio.reproject_match(pm25_raster,resampling=Resampling.bilinear)
@@ -77,6 +80,7 @@ grdi_raster = grdi_raster.rio.reproject_match(pm25_raster,resampling=Resampling.
```
## Previewing Spatial Data in a Plot
+
```{python}
# Plotting the rasters
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 20))
@@ -96,31 +100,35 @@ fig.colorbar(im2, ax=ax2, orientation='horizontal', label='PM2.5 Values')
plt.tight_layout()
plt.show()
```
+
## Working with administrative Data
`pygadm` is a package that has international administrative units from levels 0 to 2. We can search the available countries by listing the Names.
+
```{python}
country_table = gpd.GeoDataFrame(pygadm.Names())
len(country_table)
```
-Some available areas with a unique `GID_0` code share Names; therefore we drop the rows that contain digits.
+Some available areas with a unique `GID_0` code share Names; therefore we drop the rows that contain digits.
+
```{python}
country_table = country_table[~country_table['GID_0'].str.contains('\d', na=False)]
len(country_table)
```
### Subset Data From a Table
+
Doing Zonal statistics for more than 200 countries may take a while, therefore, we can subset the data randomly with the `.sample()` method.
```{python}
country_sample = country_table.sample(n=15)
country_sample
```
-## Zonal Statistics for Each Administrative Area
-`rasterstats` has a funcion `zonal_stats()` which allows you to use vectors to summarize raster data. We summarize GRDIv1 data to calculate the following statistics: count, minimum, mean, max, median, standard deviation, range, and percentiles 20, 40, 60, and 80.
+## Zonal Statistics for Each Administrative Area
+`rasterstats` has a funcion `zonal_stats()` which allows you to use vectors to summarize raster data. We summarize GRDIv1 data to calculate the following statistics: count, minimum, mean, max, median, standard deviation, range, and percentiles 20, 40, 60, and 80.
```{python}
@@ -172,7 +180,9 @@ for index, row in country_sample.iloc[:].iterrows():
```
+
Let's use the `.head()` method from Pandas to check the top of our table
+
```{python}
stats_results.head()
```
@@ -263,7 +273,8 @@ def calculate_country_stats(country_sample, grdi_raster, pm25_raster=None):
```
-From the table above, we can choose an attribute, or column, to display it in a map plot. In this case, I'm choosing the GRDI Max
+From the table above, we can choose an attribute, or column, to display it in a map plot. In this case, I'm choosing the GRDI Max
+
```{python}
column_chosen = 'GRDI_Max' #GRDI_Max, GRDI_Min, GRDI_Median
# Plotting
@@ -275,7 +286,9 @@ ax.set_title('Choropleth Map Showing GRDI Mean per country')
ax.set_axis_off() # Turn off the axis numbers and ticks
plt.show()
```
+
### Selecting Data by Column
+
Start my creating a list of countries that you are interested in to Subset data from the DataFrame that match the values in the `NAME_0` column. The `.isin()` mehthod checks each element in the DataFrame's column for the item present in the list and returns matching rows.
```{python}
@@ -286,16 +299,21 @@ selected_countries = ["Algeria", "Somalia", "Colombia", "Timor Leste", "Finland"
#use the list above to subset the country_table DataFrame by the column NAME_0
selected_countries = country_table[country_table['NAME_0'].isin(selected_countries)]
```
+
### Using a Defined Custom Function
-Recalling the defined fucntion `calculate_country_stats`, we can use our `selected_countries` list, and the GRDI and PM2.5 rasters, to create a new table of zonal statistics.
+
+Recalling the defined fucntion `calculate_country_stats`, we can use our `selected_countries` list, and the GRDI and PM2.5 rasters, to create a new table of zonal statistics.
+
```{python}
stats_results = calculate_country_stats(selected_countries, grdi_raster)
```
+
Show the head of the table again:
```{python}
stats_results.head()
```
+
Plot the map again choosing a column to plot:
```{python}
@@ -303,10 +321,11 @@ column_chosen = 'GRDI_Max' #GRDI_Max, GRDI_Min, GRDI_Median
stats_results.plot(column=column_chosen, legend=True)
plt.show()
```
+
## Creating a Table with Results
-We can create a list of **tuples** that we can use to refer to the GRDI statistical values, and the name, color, and symbol we want to assign.
-In this case, we are using the GRDI zonal statistics of each country we selected that include the Mean, Minimum, Maximum, and interquartiles.
+We can create a list of **tuples** that we can use to refer to the GRDI statistical values, and the name, color, and symbol we want to assign. In this case, we are using the GRDI zonal statistics of each country we selected that include the Mean, Minimum, Maximum, and interquartiles.
+
```{python}
# List of GRDI values and their corresponding properties
@@ -321,7 +340,9 @@ grdi_data = [
('GRDI_P80', 'Q80', 'red', '142')
]
```
+
We can create a figure to display the data based on the names colors and symbols we selected.
+
```{python}
# Create a figure
fig = go.Figure()
@@ -350,10 +371,13 @@ fig.update_layout(
# Show plot
fig.show()
```
+
## Summarizing PM2.5 Values by Socioeconomic Deprivation
-Considering the GRDI quartile values as a level of socieoeconomic deprivation within each country, we can use the `stats_results` GeoDataFrame, the GRDI raster, and the PM2.5 raster to calculate the Mean PM.25 value within each of those areas in each country. This can describe how the air quality for different socioeconomic strata compare within the country, as well as against other countries.
-The results will be added to the `stats_results` with the corresponting columns.
+Considering the GRDI quartile values as a level of socieoeconomic deprivation within each country, we can use the `stats_results` GeoDataFrame, the GRDI raster, and the PM2.5 raster to calculate the Mean PM.25 value within each of those areas in each country. This can describe how the air quality for different socioeconomic strata compare within the country, as well as against other countries.
+
+The results will be added to the `stats_results` with the corresponting columns.
+
```{python}
# iterate through the stats_results table rows
for index, row in stats_results.iloc[:].iterrows():
@@ -406,8 +430,11 @@ for index, row in stats_results.iloc[:].iterrows():
```{python}
stats_results.head()
```
+
## Plot Results of Mean PM2.5 in Socieceonomic Deprivation Quartiles per country
+
Similarly, we create a list of tuples of how we want to display the data, and create a figure based on the tuples. This plot would show each country in the y axis and the Log of Mean PM2.5 values in each country's GRDI quarties.
+
```{python}
# List of GRDI values and their corresponding properties
@@ -459,9 +486,10 @@ fig.update_layout(
# Show plot
fig.show()
```
-Use the plotly controls to take a closer look at the results.
-With this shapely plot, We can examine differences between countires and PM2.5 values. The plot displays the coutnries on the Y axis and log values of the average PM2.5 value on the X axis. Each country displays PM2.5 values averaged within the quartile areas based on GRDI values of each country. A higher quartile (Q) implies a higher degree of deprivation, 1 being the lowest and 5 the highest.
+Use the plotly controls to take a closer look at the results.
+
+With this shapely plot, We can examine differences between countires and PM2.5 values. The plot displays the coutnries on the Y axis and log values of the average PM2.5 value on the X axis. Each country displays PM2.5 values averaged within the quartile areas based on GRDI values of each country. A higher quartile (Q) implies a higher degree of deprivation, 1 being the lowest and 5 the highest.
Congratulations! .... Now you should be able to:
diff --git a/m204-egis-tempo.qmd b/m204-egis-tempo.qmd
new file mode 100644
index 0000000..8db50d7
--- /dev/null
+++ b/m204-egis-tempo.qmd
@@ -0,0 +1,203 @@
+---
+title: " Accessing Tropospheric Emissions: Monitoring of Pollution (TEMPO, v3, NO2 Vertical Column) through Earthdata GIS"
+author: "George Verghese"
+format: html
+bibliography: references.bib
+---
+
+## Brief
+
+The Tropospheric Emissions: Monitoring of Pollution (TEMPO) Nitrogen Dioxide Vertical Column Troposphere Beta layer provides information on the amount of nitrogen dioxide in the troposphere, available as approximately one-hour scans for daylight hours over North America, from May 2024 to present. These data should be considered as beta products and are not optimized for operational use.
+
+### **TEMPO Mission Overview**
+
+The Tropospheric Emissions: Monitoring of Pollution (TEMPO) instrument is a grating spectrometer, sensitive to visible (VIS) and ultraviolet (UV) wavelengths of light with a spectral range of 290-490 + 540-740 nm and 0.6 nm spectral resolution. The TEMPO instrument is attached to the Earth-facing side of a commercial telecommunications satellite (Intelsat 40e) in geostationary orbit over 91˚ W longitude (about 22,000 miles above Earth’s equator). This allows TEMPO to maintain a continuous view of North America so that the instrument's light-collecting mirror can make a complete east-to-west scan of the field of regard hourly during daylight hours. By measuring sunlight reflected and scattered from the Earth's surface and atmosphere back to the instrument's detectors, TEMPO's ultraviolet and visible light sensors provide measurements of ozone, nitrogen dioxide, formaldehyde, and other constituents involved in the chemical dynamics of Earth’s atmosphere.
+
+The primary mission objectives of TEMPO involve understanding the dynamics of air quality, pollution sources, and their impact on climate change. By providing near real-time data and comprehensive atmospheric composition measurements, TEMPO will assist scientists in studying pollution patterns, evaluating the efficacy of environmental policies, and predicting future trends in air quality.
+
+### Layer Overview
+
+EGIS Layer Access:
+
+The Tropospheric Emissions: Monitoring of Pollution (TEMPO) Nitrogen Dioxide Vertical Column Troposphere layer provides information on the amount of nitrogen dioxide in the troposphere. This is provided as the total number of nitrogen dioxide molecules in the tropospheric column of air above one square centimeter on the Earth’s surface (units: molecules/cm\^2). Nitrogen dioxide Level 3 files provide trace gas information on a regular grid. Level 3 files are derived by combining information from all Level 2 files constituting a TEMPO East-West scan cycle, using an area-weighted re-gridding approach. The data have been converted from their native file format (netCDF4) to Cloud Raster Format (CRF).
+
+**Temporal Coverage**
+
+The temporal resolution of a nominal scan is approximately one hour during daylight hours, with more frequent scans in the morning over the eastern portion of the field of regard (FOR) and in the evenings over the western portion of the FOR. Each image is presented with the starting timestamp of the corresponding TEMPO scan. Due to the nature of the TEMPO instrument’s east to west scanning pattern, each image is a composite of measurements taken over a period of 40-60 minutes, depending on the spatial coverage of the scan. Data are updated daily with the previous day's data. Data are available from May 2024 to present.
+
+**Geographic Coverage**
+
+ Imagery are available for North America. This layer is presented in its native geographic coordinate system (WGS 1984) and resolution. The sensor’s native spatial resolution is \~2 km x 4.75 km at the center of TEMPO’s FOR and the Level 3 product resolution is 0.02 x 0.02 degrees.
+
+**Data Filtering**
+
+The layer is filtered to display high-quality pixels using the main data quality flag (removing low confidence measurements), solar zenith angle (removing data retrieved at high solar zenith angles), and effective cloud fraction (removing where clouds obscure the tropospheric column) variables. The filters applied are set to remove pixels based on the following: main_data_quality_flag \> 1, eff_cloud_fraction \> 0.5, and solar_zenith_angle \> 80.
+
+**Data Validation**
+
+These data should be considered as beta products per the Beta Product Maturity level defined in the [TEMPO validation plan](https://tempo.si.edu/documents/SAO-DRD-11_TEMPO%20Science%20Validation_Plan_Baseline.pdf). The products are not optimized for operational use and anomalies may exist. Users should refrain from making conclusive public statements regarding science and applications of the TEMPO data products until the products are designated to be at the provisional validation status according to the validation plan. Users may consult the [TEMPO User Guide](https://asdc.larc.nasa.gov/documents/tempo/guide/TEMPO_Level-2-3_trace_gas_clouds_user_guide_V1.0.pdf) for descriptions of the data and associated known issues.
+
+**Recommended Usage Notes**
+
+When viewing the image service in the ESRI online map viewer, it is recommended to use the multidimensional slider rather than the default time slider. The multidimensional slider can be accessed via the “multidimensional” icon in the right-hand menu.
+
+**Contact**
+
+For inquiries about this service, please contact [larc-dl-asdc-tempo\@mail.nasa.gov](mailto:larc-dl-asdc-tempo@mail.nasa.gov) or post/view questions on the [Earthdata Forum](https://forum.earthdata.nasa.gov/).
+
+#### Import libraries
+
+```{python}
+import requests
+import json
+
+import pandas as pd
+import matplotlib.pyplot as plt
+
+from datetime import datetime
+```
+
+#### Use ArcGIS RESTful capabilities
+
+```{python}
+# Load data from REST end point
+service_url = "https://gis.earthdata.nasa.gov/UAT/rest/services/TEMPO_testing/L3_No2_tropo_2day_sample/ImageServer"
+md_raster = Raster(service_url, is_multidimensional=True, engine=None, gis=gis)
+```
+
+```{python}
+#Print Image Service Multidimensional Info
+print (md_raster.multidimensional_info)
+```
+
+```{python}
+# Print basic information about the multidimensional dataset
+print(f"Dataset extent: {md_raster.extent}")
+print(f"Variable names: {md_raster.variable_names}")
+#print(f"Time extent: {md_raster.time_extent}")
+```
+
+```{python}
+# Check your Raster function templates
+md_imagery = ImageryLayer(service_url, gis=gis)
+
+for fn in md_imagery.properties.rasterFunctionInfos:
+ print(fn['name'])
+```
+
+```{python}
+# Build url to retrieve the Multidimensional Information as a JSON
+multidimensional_info = base_url + '/multidimensionalInfo?f=pjson'
+
+# Send the request for Multidimensional Information
+multidimensional_get = requests.get(multidimensional_info)
+
+# Reset the content of the request as a JSON
+multidimensional_content = multidimensional_get.json()
+
+# You can view the full Multidimensional Information by:
+print(json.dumps(multidimensional_content, indent=2))
+
+# Filter JSON to output only the Variables
+variables = multidimensional_content['multidimensionalInfo']['variables']
+short_names = [var.get('name') for var in variables]
+long_names = [var.get('attributes').get('long_name') for var in variables]
+full_names = list(zip(short_names, long_names))
+print(full_names[:])
+```
+
+```{python}
+# Extract relevant information into a DataFrame
+samples = [
+ {
+ "StdTime": sample["attributes"]["StdTime"],
+ variable_name: float(sample["attributes"][variable_name])
+ }
+ for sample in data["samples"] if "attributes" in sample
+]
+
+df = pd.DataFrame(samples)
+```
+
+```{python}
+# Convert StdTime from Unix timestamp (milliseconds) to datetime
+df['StdTime'] = pd.to_datetime(df['StdTime'], unit='ms')
+
+# Plotting
+plt.figure(figsize=(10, 6))
+plt.plot(df['StdTime'], df[variable_name], marker='o', linestyle='-')
+plt.title(f'{variable_name} Over Time')
+plt.xlabel('Time')
+plt.ylabel(variable_name)
+plt.grid(True)
+plt.xticks(rotation=45)
+plt.tight_layout()
+plt.show()
+```
+
+```{python}
+import folium
+from folium.raster_layers import ImageOverlay
+
+# Create a map centered at a given latitude and longitude
+m = folium.Map(location=[0, 0], zoom_start=2,) # Centered on a global view crs="EPSG4326"
+
+# Parameters for the Export Image request
+variable_name = "T2M" # Example variable
+#image_date_time_str = "2022-01-01 23:59:59"
+#time_milliseconds = convert_to_milliseconds(image_date_time_str)
+time_milliseconds = "349747200000"
+bbox = "-180%2C-90%2C180%2C90" # Example bounding box for global coverage
+format_string = "jpgpng"
+size = "" # Example size, adjust as needed
+imageSR = ""
+bboxSR = ""
+
+# Construct the Export Image URL
+export_image_url = f"https://gis.earthdata.nasa.gov/image/rest/services/POWER/POWER_901_MONTHLY_METEOROLOGY_UTC/ImageServer/exportImage?bbox={bbox}&bboxSR={bboxSR}&size={size}&imageSR={imageSR}&time={time_milliseconds}&format={format}&pixelType=F64&noData=&noDataInterpretation=esriNoDataMatchAny&interpolation=RSP_BilinearInterpolation&compression=&compressionQuality=&bandIds=&sliceId=&mosaicRule=%7B%22multidimensionalDefinition%22%3A%5B%7B%22variableName%22%3A%22{variable_name}%22%7D%5D%7D&renderingRule=&adjustAspectRatio=true&validateExtent=false&lercVersion=1&compressionTolerance=&f=image"
+
+print(export_image_url)
+
+### Start of Jesters example code
+#NOTE: This is probably easier for the user to undestand than one really long string. Python can turn a dictionary into API endpoint parameters.
+import urllib
+api_endpoint = "https://gis.earthdata.nasa.gov/server/rest/services/POWER/POWER_901_MONTHLY_METEOROLOGY_UTC/ImageServer/exportImage?"
+args = {
+ "bbox": bbox,
+ "bboxSR": bboxSR,
+ "size": size,
+ "imageSR": imageSR,
+ "time": time_milliseconds,
+ "format": format_string,
+ "pixelType": "F64",
+ "noData": "",
+ "noDataInterpretation": "esriNoDataMatchAny",
+ "interpolation": "RSP_BilinearInterpolation",
+ "compression": "",
+ "compressionQuality": "",
+ "bandIds": "",
+ "sliceId": "",
+ f"mosaicRule": f"%7B%22multidimensionalDefinition%22%3A%5B%7B%22variableName%22%3A%22{variable_name}%22%7D%5D%7D",
+ "renderingRule": "",
+ "adjustAspectRatio": "true",
+ "validateExtent": "false",
+ "lercVersion": "1",
+ "compressionTolerance": "",
+ "f": "image"
+}
+url = api_endpoint + urllib.parse.urlencode(args)
+### End of Jesters example code
+
+# Add ImageOverlay to the map
+ImageOverlay(
+ image=export_image_url,
+ bounds=[[-90, -180], [90, 180]],
+ opacity=0.5,
+).add_to(m)
+
+# Save the map as an HTML file
+m.save('map_with_image_overlay.html')
+
+# Display the map in Jupyter Notebook or JupyterLab
+m
+```
\ No newline at end of file