Feature/regions landmask soil harvesters rev2 #69

sherrieF · 2024-09-25T20:44:30Z

This branch now has the following four new tests:
test_harvester_global_soil_moisture.py
test_harvester_global_soil_temperature.py
test_harvester_regional_soil_moisture.py
test_harvester_regional_soil_temperature.py
These Python scripts test the following soil surface variables: soill4, soilm, soilt4, tg3.
The global scripts test the global values returned from daily_bfg.py.
The regional scripts test the values returned from daily_bfg.py for the following regions:
'north_hemi': {'north_lat': 90.0, 'south_lat': 0.0, 'west_long': 0.0, 'east_long': 360.0},
'south_meni': {'north_lat': 0.0, 'south_lat': -90.0, 'west_long': 0.0, 'east_long': 360.0},
'eastern_hemis': {'north_lat': 90.0, 'south_lat': -90.0, 'west_long': 0.0, 'east_long': 180.0},
'western_hemis': {'north_lat': 90.0, 'south_lat': -90.0, 'west_long': 180.0, 'east_long': 360.0},
It is important to note that the sotyp variable in the bfg files is used to remove grid cell values where sotyp is equal to
0 (ocean) or 16 (ice). The ocean and ice values are removed as soon as the variables above are read in, so they are not included in any of the statistics.
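
A minimal sketch of the ocean/ice removal described above. It uses plain nested lists to stay dependency-free, whereas the harvester itself works on arrays read with xarray. The function name `mask_ocean_and_ice` and the sample values are hypothetical; only the sotyp codes 0 (ocean) and 16 (ice) come from this description.

```python
import math

OCEAN_SOIL_TYPE = 0   # sotyp code for ocean, as stated in this description
ICE_SOIL_TYPE = 16    # sotyp code for ice, as stated in this description

def mask_ocean_and_ice(values, soil_types):
    """Return a copy of `values` with ocean and ice grid cells set to NaN.

    Hypothetical helper: `values` and `soil_types` are nested lists with
    the same (latitude, longitude) layout.
    """
    masked = []
    for value_row, type_row in zip(values, soil_types):
        masked.append([
            math.nan if soil_type in (OCEAN_SOIL_TYPE, ICE_SOIL_TYPE) else value
            for value, soil_type in zip(value_row, type_row)
        ])
    return masked

# Made-up 2 x 3 grid of soil moisture values and matching soil types.
soilm = [[310.0, 295.5, 120.0],
         [280.25, 0.0, 305.0]]
sotyp = [[3, 16, 0],
         [7, 0, 12]]

masked_soilm = mask_ocean_and_ice(soilm, sotyp)
```

Setting the removed cells to NaN rather than deleting them keeps the grid shape intact, so the same indices still line up with the gridcell area weights.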

CLASSES that have been added to the src/score-hv directory:
The following methods are in the Python script stats_utils.py.
The class attributes that are initialized in this script are:
self.weighted_averages=[]
self.variances=[]
self.minimum=[]
self.maximum=[]
self.stats=stats_list

def clear_requested_statistics(self):
    This method clears the class lists so that the statistics for
    multiple variables can be calculated and returned. It is called from
    daily_bfg.py.

def calculate_requested_statistics(self,weights,temporal_mean):
    This method takes the weights and the temporal mean for 
    the variable passed in from the calling routine and 
    calculates the user requested statistics for that 
    variable. The following methods are called from this
    method.  The statistics that are calculated from the
    methods below are put into the class lists.
    def calculate_weighted_average(self,weights,temporal_mean):
        This method takes the weights and temporal mean of a
        variable and calculates its weighted average, computed as a weighted sum.
    def calculate_var_variance(self,weights,temporal_mean):
        This method takes the weights and temporal mean of a
        variable and calculates the variance.
        variance = sum_R{ w_i * (x_i - xbar)^2 }
    def find_minimum_value(self,temporal_mean):
        This method finds the minimum value of the temporal
        mean of a variable.
    def find_maximum_value(self,temporal_mean):
        This method finds the maximum value of the temporal
        mean of a variable.
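
The four statistics above can be sketched as follows. This is an illustration, not the code in stats_utils.py: it assumes the weights have already been normalized to sum to 1 over the region and that masked (NaN) cells have been dropped, and it uses flat Python lists instead of arrays. The function names and sample numbers are hypothetical.

```python
def weighted_average(weights, values):
    """Weighted sum of the values; equals the mean when the weights sum to 1."""
    return sum(w * x for w, x in zip(weights, values))

def weighted_variance(weights, values):
    """variance = sum_R{ w_i * (x_i - xbar)^2 }, with xbar the weighted average."""
    xbar = weighted_average(weights, values)
    return sum(w * (x - xbar) ** 2 for w, x in zip(weights, values))

# Made-up normalized weights and temporal-mean values for three grid cells.
weights = [0.25, 0.25, 0.5]
temporal_mean = [1.0, 3.0, 4.0]

stats = {
    'mean': weighted_average(weights, temporal_mean),
    'variance': weighted_variance(weights, temporal_mean),
    'minimum': min(temporal_mean),
    'maximum': max(temporal_mean),
}
```

With these inputs the weighted average is 3.0 and the variance is 1.5; the minimum and maximum do not depend on the weights.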

The following methods are in the python script region_utils.py

The class initialization:
def __init__(self,dataset):
    """
    Here we initialize the region class as a dictionary.
    Parameter: dataset - This is a dataset that has been
    opened with xarray.
    """
    self.name = []  # The list to store the name of the user region. This keyword is required.
    self.north_lat = []  # The list to store the user requested northern latitude of the region.
    self.south_lat = []  # The list to store the user requested southern latitude of the region.
    self.west_long = []  # The list to store the western longitude of the user region, in degrees East.
    self.east_long = []  # The list to store the eastern longitude of the user region, in degrees East.
    self.region_indices = []  # The list to store the region indices. These are passed back to the calling routine.
    self.latitude_values = dataset['grid_yt'].values  # The array of latitude values in the original dataset.
    self.longitude_values = dataset['grid_xt'].values  # The array of longitude values in the original dataset.
The methods called from within this class:
def test_user_latitudes(self,north_lat,south_lat):
    This method tests the user input latitudes to make sure they are reasonable.
    It exits with an error if the latitudes are out of bounds.
    If the values pass the tests, they are added to the region dictionary defined
    in the __init__ method above.
def test_user_longitudes(self,west_long,east_long):
    This method tests the user input longitudes to make sure they are reasonable.
    It exits with an error if the longitudes are out of bounds.
    If the values pass the tests, they are added to the region dictionary defined
    in the __init__ method above.
def get_region_indices(self,region_index):
    This method is called from the method get_region_data, which is a member of
    this class.
    It calculates the start and end indices in the latitude and longitude
    arrays for the region index passed in from get_region_data.

   Methods called from an external Python script:
   def add_user_region(self,dictionary):
        This method is called from the main calling Python script. It tests the
        region dictionary passed in from the calling script for validity. It calls
        test_user_latitudes and test_user_longitudes to make sure the user defined region
        is valid. If the user defined region is valid, it populates the region dictionary
        as defined above.
   def get_region_indices(self,region_index):
       This method is called from the main calling Python script.
       It calculates the start and end indices in the latitude and longitude
       arrays for the region index passed in. It returns the indices to the calling Python script.
   def get_region_data(self,region_index,data):
       This method subsets the full grid variable data into the requested region.
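
As a rough illustration of the index lookup and subsetting described above, here is a dependency-free sketch. It is not the code in region_utils.py: `find_index_range` and `subset_region` are hypothetical names, the coordinates are made up, and the sketch assumes monotonic coordinate arrays and regions that do not wrap across the 0/360 meridian. The region dictionaries follow the format used in the regional tests.

```python
def find_index_range(coordinates, lower, upper):
    """Return (start, end) indices of the coordinates that lie in [lower, upper]."""
    inside = [i for i, c in enumerate(coordinates) if lower <= c <= upper]
    if not inside:
        raise ValueError(f"no grid points between {lower} and {upper}")
    return inside[0], inside[-1]

def subset_region(data, latitudes, longitudes, region):
    """Cut the rows (latitude) and columns (longitude) that fall inside the region."""
    lat_start, lat_end = find_index_range(
        latitudes, region['south_lat'], region['north_lat'])
    lon_start, lon_end = find_index_range(
        longitudes, region['west_long'], region['east_long'])
    return [row[lon_start:lon_end + 1] for row in data[lat_start:lat_end + 1]]

# Made-up 4 x 4 grid; latitudes run north to south, longitudes in degrees East.
latitudes = [60.0, 30.0, -30.0, -60.0]
longitudes = [45.0, 135.0, 225.0, 315.0]
data = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]

north_hemi = {'north_lat': 90.0, 'south_lat': 0.0,
              'west_long': 0.0, 'east_long': 360.0}
eastern_hemis = {'north_lat': 90.0, 'south_lat': -90.0,
                 'west_long': 0.0, 'east_long': 180.0}

north_data = subset_region(data, latitudes, longitudes, north_hemi)
east_data = subset_region(data, latitudes, longitudes, eastern_hemis)
```

The same index ranges would be applied to the gridcell area weights so that the variable and its weights stay aligned.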

The following methods are in the Python script mask_utils.py.
The valid masks are land, ocean, sea and ice.

def __init__(self,user_mask_value,soil_type_values):
    """
    Here we initialize the MaskCatalog class.
    """
    self.name = None 
    self.user_mask = user_mask_value
    self.data_mask = soil_type_values

def initial_mask_of_variable(self,var_name,variable_data,dataset):
    This method sets the ocean and ice grid cells to missing for the soil variables.
    This is done automatically for the soil variables soill4, soilm, soilt4 and tg3.
    This method is called from the main Python script.

def replace_bad_values_with_nan(self,variable_data):
    This method replaces missing or fill values with NaN.
    This is done so that any statistics the user has requested will
    be calculated correctly.
    This method is called from the main Python script.

def user_mask_array(self,region_mask):
    This method takes the sotyp variable data from the dataset.
    It returns an array of boolean values. The grid points that
    the user wants are set to True and the grid points the user
    does not want are set to False.
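
The boolean mask described above might look like the sketch below. It is an assumption-laden illustration rather than the mask_utils.py code: `build_user_mask` is a hypothetical name, the sotyp codes 0 (ocean) and 16 (ice) come from this description, and treating every other sotyp value as land is an assumption. The 'sea' mask is left out because its sotyp rule is not stated here.

```python
OCEAN_SOIL_TYPE = 0   # from this description
ICE_SOIL_TYPE = 16    # from this description

# Assumed mapping from mask name to a test on a single sotyp value.
MASK_TESTS = {
    'ocean': lambda soil_type: soil_type == OCEAN_SOIL_TYPE,
    'ice': lambda soil_type: soil_type == ICE_SOIL_TYPE,
    # Assumption: anything that is neither ocean nor ice counts as land.
    'land': lambda soil_type: soil_type not in (OCEAN_SOIL_TYPE, ICE_SOIL_TYPE),
}

def build_user_mask(user_mask, soil_types):
    """Return nested booleans: True where the grid cell matches `user_mask`."""
    if user_mask not in MASK_TESTS:
        raise ValueError(f"unsupported surface mask: {user_mask!r}")
    wanted = MASK_TESTS[user_mask]
    return [[wanted(soil_type) for soil_type in row] for row in soil_types]

# Made-up 2 x 3 grid of soil types.
sotyp = [[3, 16, 0],
         [7, 0, 12]]

land_mask = build_user_mask('land', sotyp)
ocean_mask = build_user_mask('ocean', sotyp)
```

The resulting mask can be applied to both the variable and the weights, so cells the user does not want contribute to neither.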

daily_bfg.py

This Python script is the main script for the harvesters in score-hv.
This script uses the following classes:
from score_hv.config_base import ConfigInterface
from score_hv.stats_utils import VarStatsCatalog
from score_hv.region_utils import GeoRegionsCatalog
from score_hv.mask_utils import MaskCatalog
This script reads the VALID_CONFIG_DICT that is set up in the
harvester tests. At present the VALID_CONFIG_DICT has the following values:
VALID_CONFIG_DICT = {'harvester_name': hv_registry.DAILY_BFG,
'filenames' : BFG_PATH,
'statistic': ['mean','variance', 'minimum', 'maximum'],
'variable': ['var1',...'varn'],
'regions': {name, latitude values, longitude values},
(There can be more than one region.)
'surface_mask': land, ocean or ice
}

daily_bfg.py then opens and reads the dataset requested
by the user. The path to the data files is given by the 'filenames' entry of VALID_CONFIG_DICT.
The Python package xarray is used to open and read in the dataset.
The script then reads the rest of the VALID_CONFIG_DICT.
A general rundown of the processing in daily_bfg.py is as follows:
The gridcell area weights file is read in.
Each requested variable is processed one at a time.
If a soil variable has been requested, it is masked.
If a surface mask has been requested, the variable grid points
and weights grid points are masked.
If a region or regions have been requested, they are applied
to the variable data and weights.
The requested statistics are then calculated.
The following information is sent back to the harvester that called daily_bfg.py:
harvested_data.append(HarvestedData(
self.config.harvest_filenames,
statistic,
variable,
np.float32(value),
units,
dt.fromisoformat(median_cftime.isoformat()),
longname,
self.config.surface_mask,
self.config.regions))

The following are methods that are called from this main python script:
def get_gridcell_area_data_path():
    Returns the path to the gridcell area data file.

def get_median_cftime(xr_dataset):
    Returns the median cftime from the xr_dataset.

def check_variable_exists(var_name,dataset_variable_names):
    Makes sure the requested variable is in the user's dataset.

def calculate_surface_energy_balance(xr_dataset,dataset_variable_names):
    This method calculates the surface energy balance.  The surface energy balance
    is a derived field.

def calculate_toa_radative_flux(xr_dataset,dataset_variable_names):
    This method calculates the top of the atmosphere radiative energy flux (netrf_avetoa).
    This is a derived field.

def check_array_dimensions(region_variable,region_weights):
    This method makes sure that the region variable and the region weights 
    have the same dimensions.  If their dimensions are different we exit 
    the main script.  The dimensions must be the same to calculate the 
    statistics requested by the user.
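
A small sketch of such a dimension check, using nested lists in place of arrays. It is illustrative only: the helper `grid_shape` is hypothetical, and the sketch raises a `ValueError` where the description above says the main script exits.

```python
def grid_shape(grid):
    """Return (rows, columns) of a rectangular nested list."""
    return (len(grid), len(grid[0]) if grid else 0)

def check_array_dimensions(region_variable, region_weights):
    """Raise ValueError when the variable and weights grids differ in shape."""
    variable_shape = grid_shape(region_variable)
    weights_shape = grid_shape(region_weights)
    if variable_shape != weights_shape:
        raise ValueError(
            f"variable shape {variable_shape} does not match "
            f"weights shape {weights_shape}")

region_variable = [[1.0, 2.0], [3.0, 4.0]]
region_weights = [[0.1, 0.2], [0.3, 0.4]]
check_array_dimensions(region_variable, region_weights)  # same shape, no error

try:
    check_array_dimensions(region_variable, [[0.5, 0.5]])
except ValueError as error:
    mismatch_message = str(error)
```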

def calculate_and_normalize_solid_angle(sum_global_weights,region_weights):
    This method calculates the solid angle for the regional weights
    and normalizes them.  The normalized regional weights are returned.
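
One possible reading of this step is sketched below. The exact formula used in daily_bfg.py is not given in this description, so the sketch makes two assumptions: the region's solid angle is taken as its share of the global weight sum times 4π steradians, and the regional weights are normalized by dividing by their own sum so that they add up to 1. The name `normalize_region_weights` is hypothetical, and unlike the method above it also returns the solid angle for illustration.

```python
import math

def normalize_region_weights(sum_global_weights, region_weights):
    """Return (solid_angle, normalized_weights) for a flat list of weights.

    NaN weights (masked cells) are skipped in the sum and stay NaN in the output.
    """
    region_sum = sum(w for w in region_weights if not math.isnan(w))
    # Assumption: the full sphere (4*pi steradians) corresponds to the global sum.
    solid_angle = 4.0 * math.pi * region_sum / sum_global_weights
    normalized = [w / region_sum for w in region_weights]
    return solid_angle, normalized

# Made-up weights: the region holds half of the global weight, one cell is masked.
sum_global_weights = 8.0
region_weights = [1.0, math.nan, 1.0, 2.0]

solid_angle, normalized = normalize_region_weights(sum_global_weights, region_weights)
```

Normalizing by the regional sum means the weighted average over the region can be computed as a plain weighted sum of the unmasked cells.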

sherrieF added 6 commits September 24, 2024 12:03

added regional capabilities to daily_bfg.py harvester, updated util f…

cbba33d

…iles, adding masking, included new unit tests and associated data.

working on addeding sotyp to netcdf files

a2eaab6

working on masking and regions

084090a

added the correct path to the gridcell area weights file in the tests

264b7ae

changed prateb to prate

44a70eb

changed a small t to a large T

7d8fe5a
