Feature/regions landmask soil harvesters rev2 #69
This branch now has the following four new tests:
test_harvester_global_soil_moisture.py
test_harvester_global_soil_temperature.py
test_harvester_regional_soil_moisture.py
test_harvester_regional_soil_temperature.py
These Python scripts test the following soil surface variables: soill4, soilm, soilt4, tg3.
The global scripts test the global values returned from daily_bfg.py.
The regional scripts test the values returned from daily_bfg.py for the following regions:
'north_hemi': {'north_lat': 90.0, 'south_lat': 0.0, 'west_long': 0.0, 'east_long': 360.0},
'south_meni': {'north_lat': 0.0, 'south_lat': -90.0, 'west_long': 0.0, 'east_long': 360.0},
'eastern_hemis': {'north_lat': 90.0, 'south_lat': -90.0, 'west_long': 0.0, 'east_long': 180.0},
'western_hemis': {'north_lat': 90.0, 'south_lat': -90.0, 'west_long': 180.0, 'east_long': 360.0},
Note that the sotyp variable in the bfg files is used to remove grid cells where sotyp equals
0 (ocean) or 16 (ice). The ocean and ice values are removed as soon as the variables above are read in, so they are not included in any of the statistics.
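As a rough illustration of the ocean and ice removal described above, a minimal NumPy sketch follows. The function name `mask_ocean_and_ice` and the use of NaN as the "removed" marker are assumptions made for this example; the actual code in mask_utils.py may do this differently.

```python
import numpy as np

OCEAN_SOTYP = 0   # sotyp code for ocean, as stated in this description
ICE_SOTYP = 16    # sotyp code for ice, as stated in this description

def mask_ocean_and_ice(variable_data, sotyp):
    """Return a float copy of variable_data with ocean and ice cells set to NaN.

    Hypothetical helper for illustration only.
    """
    excluded = np.isin(sotyp, [OCEAN_SOTYP, ICE_SOTYP])
    return np.where(excluded, np.nan, variable_data.astype(float))

# Tiny 2x2 example grid: one ocean cell, one ice cell, two land cells.
sotyp = np.array([[0, 3], [16, 7]])
soilm = np.array([[0.1, 0.2], [0.3, 0.4]])
masked = mask_ocean_and_ice(soilm, sotyp)

# NaN-aware reductions then ignore the removed cells.
land_mean = np.nanmean(masked)
```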
CLASSES that have been added to the src/score-hv directory:
The following are in the Python script stats_utils.py.
The attributes initialized by the class in this script are:
self.weighted_averages=[]
self.variances=[]
self.minimum=[]
self.maximum=[]
self.stats=stats_list
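To make the four statistics concrete, here is a small sketch of an area-weighted mean, variance, minimum and maximum over the valid (non-NaN) cells. The function `weighted_stats` is hypothetical, and the variance shown is the simple weighted population variance; the formula actually used in stats_utils.py may differ.

```python
import numpy as np

def weighted_stats(values, weights):
    """Weighted mean/variance plus min/max over non-NaN cells (illustrative only)."""
    valid = ~np.isnan(values)
    v = values[valid]
    w = weights[valid]
    w = w / w.sum()                      # normalize weights over valid cells only
    mean = np.sum(w * v)
    variance = np.sum(w * (v - mean) ** 2)
    return {'mean': mean, 'variance': variance,
            'minimum': v.min(), 'maximum': v.max()}

values = np.array([1.0, 3.0, np.nan, 5.0])   # NaN marks a masked cell
weights = np.array([1.0, 1.0, 2.0, 2.0])     # e.g. grid cell areas
stats = weighted_stats(values, weights)
```

Renormalizing the weights after dropping masked cells keeps the mean unbiased by cells that were removed.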
The following methods are in the Python script region_utils.py.
The class initialization:
def __init__(self, dataset):
    """
    Here we initialize the region class as a dictionary.
    Parameter: dataset - This is a dataset that has been
    opened with xarray.
    """
    self.name = []  # The list to store the name of the user region. This is a keyword and is required.
    self.north_lat = []  # The list to store the user requested northern latitude of the region.
    self.south_lat = []  # The list to store the user requested southern latitude of the region.
    self.west_long = []  # The list to store the western longitude of the user region, in degrees East.
    self.east_long = []  # The list to store the eastern longitude of the user region, in degrees East.
    self.region_indices = []  # The list to store the region indices. These are passed back to the calling routine.
    self.latitude_values = dataset['grid_yt'].values  # The array of latitude values on the original dataset.
    self.longitude_values = dataset['grid_xt'].values  # The array of longitude values on the original dataset.
The methods called from this class:
def test_user_latitudes(self,north_lat,south_lat):
This method tests the user input latitudes to make sure they are reasonable.
It exits with an error if the latitudes are out of bounds.
If the values pass the tests they are added to the region dictionary defined
in the __init__ method above.
def test_user_longitudes(self,west_long,east_long):
This method tests the user input longitudes to make sure they are reasonable.
It exits with an error if the longitudes are out of bounds.
If the values pass the tests they are added to the region dictionary defined
in the __init__ method above.
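The kind of bounds checking described above might look like the sketch below. The function names, the exact limits (-90 to 90 for latitude, 0 to 360 degrees East for longitude), the ordering check on latitudes, and raising `ValueError` instead of exiting are all assumptions for illustration; region_utils.py may apply different rules.

```python
def check_latitudes(north_lat, south_lat):
    """Validate a latitude pair (hypothetical stand-in for test_user_latitudes)."""
    for lat in (north_lat, south_lat):
        if not -90.0 <= lat <= 90.0:
            raise ValueError(f"latitude {lat} is outside [-90, 90]")
    if south_lat >= north_lat:
        raise ValueError("south_lat must be less than north_lat")
    return north_lat, south_lat

def check_longitudes(west_long, east_long):
    """Validate a longitude pair in degrees East (hypothetical stand-in)."""
    for lon in (west_long, east_long):
        if not 0.0 <= lon <= 360.0:
            raise ValueError(f"longitude {lon} is outside [0, 360]")
    return west_long, east_long

# The northern hemisphere region from the tests passes both checks.
north_hemi_lats = check_latitudes(90.0, 0.0)
north_hemi_lons = check_longitudes(0.0, 360.0)
```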
def get_region_indices(self,region_index):
This method is called from the method get_region_data, which is a member of
this class.
It calculates the start and end indices in the latitude and longitude
arrays for the region index passed in from get_region_data.
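One way to compute start and end indices for a region is sketched below. It assumes the latitude and longitude arrays are monotonic, so that the cells inside the bounds form one contiguous block; the function `region_index_bounds` and the inclusive comparisons are assumptions, not the actual implementation of get_region_indices.

```python
import numpy as np

def region_index_bounds(latitudes, longitudes, region):
    """Return (lat_start, lat_end, lon_start, lon_end), all inclusive.

    Hypothetical helper; assumes monotonic coordinate arrays.
    """
    lat_idx = np.where((latitudes >= region['south_lat']) &
                       (latitudes <= region['north_lat']))[0]
    lon_idx = np.where((longitudes >= region['west_long']) &
                       (longitudes <= region['east_long']))[0]
    if lat_idx.size == 0 or lon_idx.size == 0:
        raise ValueError("region contains no grid points")
    return int(lat_idx[0]), int(lat_idx[-1]), int(lon_idx[0]), int(lon_idx[-1])

# Coarse example grid standing in for grid_yt and grid_xt.
latitudes = np.linspace(-87.5, 87.5, 8)      # -87.5, -62.5, ..., 87.5
longitudes = np.arange(0.0, 360.0, 45.0)     # 0, 45, ..., 315
north_hemi = {'north_lat': 90.0, 'south_lat': 0.0,
              'west_long': 0.0, 'east_long': 360.0}
bounds = region_index_bounds(latitudes, longitudes, north_hemi)
```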
The following methods are in the Python script mask_utils.py.
The valid masks are land, ocean, sea and ice.
def initial_mask_of_variable(self,var_name,variable_data,dataset):
This method sets the ocean and ice grid cells to missing for the soil variables.
This is done automatically for the soil variables: soill4, soilm, soilt4 and tg3.
This method is called from the main python script.
def replace_bad_values_with_nan(self,variable_data):
This method replaces missing or fill values with NaN.
This is done so any statistics the user has requested will
be calculated correctly.
This method is called from the main python script.
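The reason for converting fill values to NaN can be seen in a short sketch. The helper `replace_fill_with_nan` and the fill value used here are hypothetical; how replace_bad_values_with_nan actually identifies missing values is not stated in this description.

```python
import numpy as np

def replace_fill_with_nan(variable_data, fill_value):
    """Return a float copy with entries equal to fill_value replaced by NaN."""
    data = np.asarray(variable_data, dtype=float)
    return np.where(np.isclose(data, fill_value), np.nan, data)

FILL = 9.99e20                                # assumed fill value for this example
raw = np.array([280.0, FILL, 275.0])
cleaned = replace_fill_with_nan(raw, FILL)

naive_mean = raw.mean()                       # dominated by the fill value
clean_mean = np.nanmean(cleaned)              # ignores the missing cell
```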
def user_mask_array(self,region_mask):
This method takes the sotyp variable data from the dataset.
It returns an array of boolean values. The grid points that
the user wants are set to True and the grid points the user
does not want are set to False.
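A boolean mask of this kind could be built from sotyp as in the sketch below. The function `surface_mask_from_sotyp` is hypothetical, and treating every cell that is neither ocean (0) nor ice (16) as land is an assumption; the sketch also does not cover the sea mask.

```python
import numpy as np

def surface_mask_from_sotyp(sotyp, region_mask):
    """Return True where a grid cell belongs to the requested surface type."""
    if region_mask == 'ocean':
        return sotyp == 0
    if region_mask == 'ice':
        return sotyp == 16
    if region_mask == 'land':
        # Assumption: land is everything that is not ocean or ice.
        return ~np.isin(sotyp, [0, 16])
    raise ValueError(f"unsupported mask: {region_mask}")

sotyp = np.array([[0, 3], [16, 7]])
land = surface_mask_from_sotyp(sotyp, 'land')
ocean = surface_mask_from_sotyp(sotyp, 'ocean')
```

The same boolean array can be applied to both the variable data and the weights so the two stay aligned.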
daily_bfg.py
This Python script is the main script for the harvesters in score-hv.
This script uses the following classes:
from score_hv.config_base import ConfigInterface
from score_hv.stats_utils import VarStatsCatalog
from score_hv.region_utils import GeoRegionsCatalog
from score_hv.mask_utils import MaskCatalog
This script reads the VALID_CONFIG_DICT that is set up in the
harvester tests. At present the VALID_CONFIG_DICT has the following values:
VALID_CONFIG_DICT = {'harvester_name': hv_registry.DAILY_BFG,
'filenames' : BFG_PATH,
'statistic': ['mean','variance', 'minimum', 'maximum'],
'variable': ['var1',...'varn'],
'regions': {name,latitude values, longitude values}
There can be more than one region.
'surface_mask': land, ocean or ice
}
daily_bfg.py then opens and reads the dataset requested
by the user. The path to the data files is given by the 'filenames' entry of VALID_CONFIG_DICT.
The Python package xarray is used to open and read the dataset.
The script then reads the rest of the VALID_CONFIG_DICT.
A general rundown of the processing in the daily_bfg.py is as follows:
The gridcell area weight file is read in.
Each requested variable is processed one at a time.
If a soil variable has been requested it is masked.
If a surface mask has been requested, the variable grid points
and weights grid points are masked.
If a region or regions have been requested they are applied
to the variable data and weights.
The statistics requested by the user are then calculated.
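The steps above can be sketched end to end on a toy grid. Everything in this example (the function `harvest_means`, the argument names, and computing only the weighted mean) is an illustrative assumption and not the code in daily_bfg.py.

```python
import numpy as np

def harvest_means(variables, weights, keep, lats, lons, regions):
    """Weighted mean per (variable, region); keep is a boolean surface mask."""
    results = {}
    for var_name, data in variables.items():               # one variable at a time
        data = np.where(keep, data.astype(float), np.nan)  # apply the mask
        for region_name, r in regions.items():              # then each region
            in_lat = (lats >= r['south_lat']) & (lats <= r['north_lat'])
            in_lon = (lons >= r['west_long']) & (lons <= r['east_long'])
            sub = data[np.ix_(in_lat, in_lon)]               # region subset of data
            w = weights[np.ix_(in_lat, in_lon)]              # matching weights
            valid = ~np.isnan(sub)
            results[(var_name, region_name)] = (
                np.sum(sub[valid] * w[valid]) / np.sum(w[valid]))
    return results

lats = np.array([-45.0, 45.0])
lons = np.array([90.0, 270.0])
variables = {'soilm': np.array([[1.0, 2.0], [3.0, 4.0]])}
weights = np.ones((2, 2))                                    # equal areas for simplicity
keep = np.array([[True, True], [True, False]])               # one cell masked out
regions = {
    'north_hemi': {'north_lat': 90.0, 'south_lat': 0.0,
                   'west_long': 0.0, 'east_long': 360.0},
    'eastern_hemis': {'north_lat': 90.0, 'south_lat': -90.0,
                      'west_long': 0.0, 'east_long': 180.0},
}
means = harvest_means(variables, weights, keep, lats, lons, regions)
```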
The following information is sent back to the harvester that called daily_bfg.py:
harvested_data.append(HarvestedData(
self.config.harvest_filenames,
statistic,
variable,
np.float32(value),
units,
dt.fromisoformat(median_cftime.isoformat()),
longname,
self.config.surface_mask,
self.config.regions))
The following are methods that are called from this main python script:
def get_gridcell_area_data_path():
returns the path to the gridcell area data file.